Okay so I will open the pull request soon. -Priyanka
On Wed, Mar 23, 2016 at 1:35 PM, Yogi Devendra <[email protected] > wrote: > This looks OK. Let us build it incrementally. > > ~ Yogi > > On 23 March 2016 at 13:24, Sandeep Deshmukh <[email protected]> > wrote: > > > I would suggest that we go ahead with design as suggested by Priyanka > where > > we have bandwidth setup for each operator separately. We can later extend > > this for bandwidth to be shared with different input operators or for the > > DAG as a whole. > > > > Regards, > > Sandeep > > > > On Wed, Mar 23, 2016 at 11:51 AM, Priyanka Gugale < > > [email protected]> > > wrote: > > > > > Right now it's not for output operator, but one can very well use > > bandwidth > > > manager to keep track of bandwidth usage and limit your output speed. > The > > > bigger challenge there would be, you won't be able to process window > data > > > sent by upstream operator in same window. For that you need to do more > > than > > > just bandwidth control. > > > So I would say, bandwidth control feature can be used as it is for > output > > > operator as well, only we need to do more than just bandwidth > limitation > > > for output operators. > > > > > > -Priyanka > > > > > > On Wed, Mar 23, 2016 at 11:47 AM, Priyanka Gugale < > > > [email protected]> > > > wrote: > > > > > > > That's a good question Chaitanya, Right now the bandwidth control is > at > > > > Input Operator level and not application level. So if you have two > > input > > > > operator you need to set bandwidth on both separately by this design. > > > > May be it would be good to have bandwidth control at Application > level > > > > than operator level. Let me think if I can modify the design to do > > that. > > > If > > > > you have any ideas for same, please share them. > > > > > > > > -Priyanka > > > > > > > > On Wed, Mar 23, 2016 at 11:47 AM, Yogi Devendra < > > > > [email protected]> wrote: > > > > > > > >> Priyanka, > > > >> > > > >> From the design description it is not clear how it will be used to > > > control > > > >> output bandwidth (point #2,3,4 mentioned by Sandeep) > > > >> > > > >> ~ Yogi > > > >> > > > >> On 23 March 2016 at 11:39, Chaitanya Chebolu < > > [email protected] > > > > > > > >> wrote: > > > >> > > > >> > This is very useful feature. > > > >> > I would like to know, how you are distributing the bandwidth for > the > > > >> below > > > >> > situation: > > > >> > - Two input operators say i1 and i2 are deployed on same node and > > both > > > >> the > > > >> > operators have bandwidthManager as plugin. > > > >> > > > > >> > On Fri, Mar 18, 2016 at 5:43 PM, Priyanka Gugale < > > > >> [email protected] > > > >> > > > > > >> > wrote: > > > >> > > > > >> > > Hi, > > > >> > > > > > >> > > Thanks for inputs Sandeep, would take care of those points. > > > >> > > > > > >> > > Here is high level design we are considering, We would have > > > following > > > >> > > components: > > > >> > > *1.* *BandwidthManager* > > > >> > > This keeps track of current bandwidth usage of system and takes > > > >> decision > > > >> > if > > > >> > > requested data bandwidth can be used right away or not. To do > this > > > it > > > >> > > used Leaky > > > >> > > bucket <https://en.wikipedia.org/wiki/Leaky_bucket> algorithm > > where > > > >> it > > > >> > > emits data as long as it has not overused bandwidth (i.e. > > bandwidth > > > >> > > consumption is >=0) and then wait to accumulate bandwidth for a > > > while > > > >> > (till > > > >> > > bandwidth goes from -ve value to +ve). > > > >> > > > > > >> > > *2. BandwidthLimitingInputOperator* > > > >> > > Any Input operator which want to implement bandwidth restriction > > > >> should > > > >> > > implement BandwidthLimitingInputOperator. The operator have > > abstract > > > >> > method > > > >> > > to initialize instance of BandwidthManager and a method to emit > > > tuple > > > >> > with > > > >> > > bandwidth restriction to emit tuples as per available bandwidth. > > > >> > > > > > >> > > *3. BandwidthPartitioner* > > > >> > > Bandwidth partitioner is introduced for static partitioning. If > > > static > > > >> > > partitioning is used by default StatelessPartitioner class is > > > >> > initialized. > > > >> > > With bandwidth restriction we want to equally divide bandwidth > > > amongst > > > >> > > available partitions. BandwidthPartitioner should take care of > it. > > > It > > > >> > > extends StatelessPartitioner, it just sets right bandwidth on > all > > > >> > > partitions after StatelessPartitioner creates/deletes > partitiolns. > > > In > > > >> > case > > > >> > > of dynamic partitioning the operator implementing > > definePartitions, > > > >> > should > > > >> > > take care of bandwidth distribution. > > > >> > > > > > >> > > This design takes care of basic bandwidth restriction, also > takes > > > >> care of > > > >> > > partitions by equally distributing available bandwidth among all > > > >> > > partitions. Also this is open enough to do further modifications > > to > > > >> take > > > >> > > care of complex situations. > > > >> > > > > > >> > > Let me know your opinion on what else we can do to design it > > better. > > > >> > > > > > >> > > -Priyanka > > > >> > > > > > >> > > On Thu, Mar 3, 2016 at 10:11 AM, Sandeep Deshmukh < > > > >> > [email protected] > > > >> > > > > > > >> > > wrote: > > > >> > > > > > >> > > > The main purpose is not to handle back pressure but to limit > > > >> bandwidth > > > >> > > > usage by applications. This is useful in ingestion use cases. > > > >> Typically > > > >> > > > user needs to ingest say up to 1GB per sec and not more. The > > > tuple > > > >> > size > > > >> > > > may vary based on messages based tuples (few KBs) or block > > tuples > > > >> for > > > >> > > files > > > >> > > > (few MBs). Bandwidth manager will take max bandwidth that can > be > > > >> > utilized > > > >> > > > by the application and will take care of sharing that across > > > >> partitions > > > >> > > > etc. > > > >> > > > > > > >> > > > Priyanka: You could also consider following in your design > > > >> > > > > > > >> > > > 1. Limiting input rate (across partitions) > > > >> > > > 2. Limiting output rate (across partitions) > > > >> > > > 3. Specifying total bandwidth that the Application can > > utilize > > > >> > > including > > > >> > > > input and output? Not sure if this is required. Need > comments > > > >> from > > > >> > > > others > > > >> > > > here. > > > >> > > > 4. Include default implementation that will handle 1 and 2, > > and > > > >> > anyone > > > >> > > > interested in having their own Bandwidth manager should be > > able > > > >> to > > > >> > > > extend > > > >> > > > the default one. > > > >> > > > 5. Can you also look at including/extending tuples per sec > as > > > >> > pointed > > > >> > > > out by Tim/Chinmay. > > > >> > > > > > > >> > > > Regards, > > > >> > > > Sandeep > > > >> > > > > > > >> > > > On Thu, Mar 3, 2016 at 12:23 AM, Timothy Farkas < > > > >> [email protected]> > > > >> > > > wrote: > > > >> > > > > > > >> > > > > Not sure if this is helpful, but there is already a utility > in > > > >> Malhar > > > >> > > for > > > >> > > > > converting tuples per second to tuples per window. This > allows > > > the > > > >> > user > > > >> > > > to > > > >> > > > > define a property in tuples per second, then the operator > can > > > >> convert > > > >> > > > that > > > >> > > > > to tuples per window so it emits the correct number of > tuples > > > per > > > >> > > window. > > > >> > > > > > > > >> > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > > > > https://github.com/apache/incubator-apex-malhar/blob/master/library/src/main/java/com/datatorrent/lib/util/time/WindowUtils.java > > > >> > > > > > > > >> > > > > On Wed, Mar 2, 2016 at 10:41 AM, Chinmay Kolhatkar < > > > >> > > > > [email protected]> > > > >> > > > > wrote: > > > >> > > > > > > > >> > > > > > Hi Priyanka, > > > >> > > > > > > > > >> > > > > > Indeed this is a useful feature. > > > >> > > > > > > > > >> > > > > > I believe number bytes consumed per sec can as well > > translate > > > to > > > >> > > number > > > >> > > > > of > > > >> > > > > > tuples consumed per sec. > > > >> > > > > > > > > >> > > > > > If above is correct, won't back pressure that is handled > by > > > >> > > > bufferserver > > > >> > > > > > help in your use case? > > > >> > > > > > > > > >> > > > > > Thanks, > > > >> > > > > > Chinmay. > > > >> > > > > > On 2 Mar 2016 4:49 p.m., "Priyanka Gugale" < > > > >> > [email protected] > > > >> > > > > > > >> > > > > > wrote: > > > >> > > > > > > > > >> > > > > > > Many times we need to put bandwidth restrictions or put > > some > > > >> > limit > > > >> > > on > > > >> > > > > > input > > > >> > > > > > > operator for number of bytes to be consumed per second. > > As I > > > >> > > > understand > > > >> > > > > > in > > > >> > > > > > > Apex there is no direct support for this feature. > > > >> > > > > > > > > > >> > > > > > > I am planning to write a bandwidth manager which will > help > > > in > > > >> > > > limiting > > > >> > > > > > > bandwidth at Input operator. Let me know if there are > any > > > >> better > > > >> > > > > > > alternative ways. I will soon publish design for > Bandwidth > > > >> > Manager > > > >> > > I > > > >> > > > am > > > >> > > > > > > planning to write. > > > >> > > > > > > > > > >> > > > > > > -Priyanka > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > > > > > > > > > > > >
