Priyanka,

From the design description, it is not clear how it will be used to control
output bandwidth (points #2, 3 and 4 mentioned by Sandeep).
~ Yogi

On 23 March 2016 at 11:39, Chaitanya Chebolu <[email protected]> wrote:

> This is a very useful feature.
> I would like to know how you are distributing the bandwidth in the
> situation below:
> - Two input operators, say i1 and i2, are deployed on the same node and
>   both operators have bandwidthManager as a plugin.
>
> On Fri, Mar 18, 2016 at 5:43 PM, Priyanka Gugale <[email protected]>
> wrote:
>
> > Hi,
> >
> > Thanks for the inputs, Sandeep; I will take care of those points.
> >
> > Here is the high-level design we are considering. We would have the
> > following components:
> >
> > *1. BandwidthManager*
> > This keeps track of the current bandwidth usage of the system and
> > decides whether the requested data bandwidth can be used right away or
> > not. To do this it uses the Leaky bucket
> > <https://en.wikipedia.org/wiki/Leaky_bucket> algorithm, where it emits
> > data as long as it has not overused bandwidth (i.e. the remaining
> > bandwidth is >= 0) and then waits for a while to accumulate bandwidth
> > (until the remaining bandwidth goes from a -ve value to a +ve one).
> >
> > *2. BandwidthLimitingInputOperator*
> > Any input operator which wants to implement bandwidth restriction
> > should implement BandwidthLimitingInputOperator. The operator has an
> > abstract method to initialize an instance of BandwidthManager and a
> > method to emit tuples as per the available bandwidth.
> >
> > *3. BandwidthPartitioner*
> > The bandwidth partitioner is introduced for static partitioning. If
> > static partitioning is used, the StatelessPartitioner class is
> > initialized by default. With bandwidth restriction we want to divide
> > bandwidth equally amongst the available partitions.
> > BandwidthPartitioner should take care of that. It extends
> > StatelessPartitioner; it just sets the right bandwidth on all
> > partitions after StatelessPartitioner creates/deletes partitions. In
> > case of dynamic partitioning, the operator implementing
> > definePartitions should take care of bandwidth distribution.
> >
> > This design takes care of basic bandwidth restriction, and also takes
> > care of partitions by equally distributing the available bandwidth
> > among all partitions. It is also open enough to allow further
> > modifications for more complex situations.
> >
> > Let me know your opinion on what else we can do to design it better.
> >
> > -Priyanka
> >
> > On Thu, Mar 3, 2016 at 10:11 AM, Sandeep Deshmukh <[email protected]>
> > wrote:
> >
> > > The main purpose is not to handle back pressure but to limit
> > > bandwidth usage by applications. This is useful in ingestion use
> > > cases. Typically a user needs to ingest, say, up to 1GB per sec and
> > > not more. The tuple size may vary between message-based tuples (a
> > > few KBs) and block tuples for files (a few MBs). The bandwidth
> > > manager will take the max bandwidth that can be utilized by the
> > > application and will take care of sharing that across partitions
> > > etc.
> > >
> > > Priyanka: You could also consider the following in your design:
> > >
> > >    1. Limiting input rate (across partitions)
> > >    2. Limiting output rate (across partitions)
> > >    3. Specifying the total bandwidth that the application can
> > >    utilize, including input and output? Not sure if this is
> > >    required. Need comments from others here.
> > >    4. Include a default implementation that will handle 1 and 2;
> > >    anyone interested in having their own bandwidth manager should be
> > >    able to extend the default one.
> > >    5. Can you also look at including/extending tuples per sec, as
> > >    pointed out by Tim/Chinmay.
> > >
> > > Regards,
> > > Sandeep
> > >
> > > On Thu, Mar 3, 2016 at 12:23 AM, Timothy Farkas <[email protected]>
> > > wrote:
> > >
> > > > Not sure if this is helpful, but there is already a utility in
> > > > Malhar for converting tuples per second to tuples per window.
> > > > This allows the user to define a property in tuples per second;
> > > > the operator can then convert that to tuples per window so it
> > > > emits the correct number of tuples per window.
> > > >
> > > > https://github.com/apache/incubator-apex-malhar/blob/master/library/src/main/java/com/datatorrent/lib/util/time/WindowUtils.java
> > > >
> > > > On Wed, Mar 2, 2016 at 10:41 AM, Chinmay Kolhatkar <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi Priyanka,
> > > > >
> > > > > Indeed this is a useful feature.
> > > > >
> > > > > I believe the number of bytes consumed per sec can as well
> > > > > translate to the number of tuples consumed per sec.
> > > > >
> > > > > If the above is correct, won't the back pressure that is
> > > > > handled by bufferserver help in your use case?
> > > > >
> > > > > Thanks,
> > > > > Chinmay.
> > > > >
> > > > > On 2 Mar 2016 4:49 p.m., "Priyanka Gugale" <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > Many times we need to put bandwidth restrictions, or put some
> > > > > > limit on an input operator for the number of bytes to be
> > > > > > consumed per second. As I understand, in Apex there is no
> > > > > > direct support for this feature.
> > > > > >
> > > > > > I am planning to write a bandwidth manager which will help in
> > > > > > limiting bandwidth at the input operator. Let me know if
> > > > > > there are any better alternative ways. I will soon publish
> > > > > > the design for the Bandwidth Manager I am planning to write.
> > > > > >
> > > > > > -Priyanka
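
For reference while discussing the design quoted above, here is a minimal
sketch of the two mechanisms it describes: a budget that may go negative
after a large tuple and must climb back to >= 0 before the next emit, and an
equal split of one bandwidth limit across partitions. The class and method
names (BandwidthBucket, tryConsume, BandwidthSplit, splitEvenly) are
hypothetical and are not part of Apex or Malhar. The one-second cap on
accumulated budget, the caller-supplied clock and the handling of the
division remainder are assumptions made only for this illustration.

```java
// Hypothetical sketch; names are illustrative, not the Apex/Malhar API.
final class BandwidthBucket {
    private final long bytesPerSecond; // configured limit for this operator/partition
    private long availableBytes;       // remaining budget; may go negative after a large tuple
    private long lastRefillMillis;     // time up to which budget has already been credited

    BandwidthBucket(long bytesPerSecond, long nowMillis) {
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("bytesPerSecond must be positive");
        }
        this.bytesPerSecond = bytesPerSecond;
        this.availableBytes = bytesPerSecond; // assumption: start with one second of budget
        this.lastRefillMillis = nowMillis;
    }

    // Credits the budget earned since the last refill, capped at one second's worth
    // (the cap is an assumption; the thread does not specify a burst size).
    private void refill(long nowMillis) {
        long elapsed = nowMillis - lastRefillMillis;
        if (elapsed <= 0) {
            return;
        }
        long earned = elapsed * bytesPerSecond / 1000;
        if (earned > 0) {
            availableBytes = Math.min(bytesPerSecond, availableBytes + earned);
            lastRefillMillis = nowMillis;
        }
    }

    // Returns true and charges the tuple if the budget is not overdrawn (>= 0);
    // returns false if the caller should wait for the budget to recover.
    boolean tryConsume(long tupleBytes, long nowMillis) {
        refill(nowMillis);
        if (availableBytes < 0) {
            return false;
        }
        availableBytes -= tupleBytes;
        return true;
    }
}

final class BandwidthSplit {
    // Divides totalBytesPerSecond across partitions as evenly as integer math allows;
    // giving the remainder to the first partitions is an assumption of this sketch.
    static long[] splitEvenly(long totalBytesPerSecond, int partitions) {
        if (partitions <= 0) {
            throw new IllegalArgumentException("partitions must be positive");
        }
        long base = totalBytesPerSecond / partitions;
        long extra = totalBytesPerSecond % partitions;
        long[] shares = new long[partitions];
        for (int i = 0; i < partitions; i++) {
            shares[i] = base + (i < extra ? 1 : 0);
        }
        return shares;
    }
}
```

An input operator could call tryConsume with the tuple size and the current
time before each emit, and skip emitting for that call when it returns false.
A partitioner could construct one bucket per partition from the shares
returned by splitEvenly. This does not address the output-side limits or the
two-operators-on-one-node case raised above.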
