This looks OK. Let us build it incrementally.

~ Yogi

On 23 March 2016 at 13:24, Sandeep Deshmukh <[email protected]> wrote:

> I would suggest that we go ahead with design as suggested by Priyanka where
> we have bandwidth setup for each operator separately. We can later extend
> this for bandwidth to be shared with different input operators or for the
> DAG as a whole.
>
> Regards,
> Sandeep
>
> On Wed, Mar 23, 2016 at 11:51 AM, Priyanka Gugale <[email protected]> wrote:
>
> > Right now it's not for output operator, but one can very well use bandwidth
> > manager to keep track of bandwidth usage and limit your output speed. The
> > bigger challenge there would be, you won't be able to process window data
> > sent by upstream operator in same window. For that you need to do more than
> > just bandwidth control.
> > So I would say, bandwidth control feature can be used as it is for output
> > operator as well, only we need to do more than just bandwidth limitation
> > for output operators.
> >
> > -Priyanka
> >
> > On Wed, Mar 23, 2016 at 11:47 AM, Priyanka Gugale <[email protected]> wrote:
> >
> > > That's a good question Chaitanya, Right now the bandwidth control is at
> > > Input Operator level and not application level. So if you have two input
> > > operator you need to set bandwidth on both separately by this design.
> > > May be it would be good to have bandwidth control at Application level
> > > than operator level. Let me think if I can modify the design to do that.
> > > If you have any ideas for same, please share them.
> > >
> > > -Priyanka
> > >
> > > On Wed, Mar 23, 2016 at 11:47 AM, Yogi Devendra <[email protected]> wrote:
> > >
> > >> Priyanka,
> > >>
> > >> From the design description it is not clear how it will be used to control
> > >> output bandwidth (point #2,3,4 mentioned by Sandeep)
> > >>
> > >> ~ Yogi
> > >>
> > >> On 23 March 2016 at 11:39, Chaitanya Chebolu <[email protected]> wrote:
> > >>
> > >> > This is very useful feature.
> > >> > I would like to know, how you are distributing the bandwidth for the
> > >> > below situation:
> > >> > - Two input operators say i1 and i2 are deployed on same node and both
> > >> >   the operators have bandwidthManager as plugin.
> > >> >
> > >> > On Fri, Mar 18, 2016 at 5:43 PM, Priyanka Gugale <[email protected]> wrote:
> > >> >
> > >> > > Hi,
> > >> > >
> > >> > > Thanks for inputs Sandeep, would take care of those points.
> > >> > >
> > >> > > Here is high level design we are considering, We would have following
> > >> > > components:
> > >> > >
> > >> > > *1. BandwidthManager*
> > >> > > This keeps track of current bandwidth usage of system and takes decision
> > >> > > if requested data bandwidth can be used right away or not. To do this it
> > >> > > used Leaky bucket <https://en.wikipedia.org/wiki/Leaky_bucket> algorithm
> > >> > > where it emits data as long as it has not overused bandwidth (i.e.
> > >> > > bandwidth consumption is >=0) and then wait to accumulate bandwidth for
> > >> > > a while (till bandwidth goes from -ve value to +ve).
> > >> > >
> > >> > > *2. BandwidthLimitingInputOperator*
> > >> > > Any Input operator which want to implement bandwidth restriction should
> > >> > > implement BandwidthLimitingInputOperator. The operator have abstract
> > >> > > method to initialize instance of BandwidthManager and a method to emit
> > >> > > tuple with bandwidth restriction to emit tuples as per available
> > >> > > bandwidth.
> > >> > >
> > >> > > *3. BandwidthPartitioner*
> > >> > > Bandwidth partitioner is introduced for static partitioning. If static
> > >> > > partitioning is used by default StatelessPartitioner class is
> > >> > > initialized. With bandwidth restriction we want to equally divide
> > >> > > bandwidth amongst available partitions. BandwidthPartitioner should take
> > >> > > care of it. It extends StatelessPartitioner, it just sets right
> > >> > > bandwidth on all partitions after StatelessPartitioner creates/deletes
> > >> > > partitions. In case of dynamic partitioning the operator implementing
> > >> > > definePartitions, should take care of bandwidth distribution.
> > >> > >
> > >> > > This design takes care of basic bandwidth restriction, also takes care
> > >> > > of partitions by equally distributing available bandwidth among all
> > >> > > partitions. Also this is open enough to do further modifications to take
> > >> > > care of complex situations.
> > >> > >
> > >> > > Let me know your opinion on what else we can do to design it better.
> > >> > >
> > >> > > -Priyanka
> > >> > >
> > >> > > On Thu, Mar 3, 2016 at 10:11 AM, Sandeep Deshmukh <[email protected]> wrote:
> > >> > >
> > >> > > > The main purpose is not to handle back pressure but to limit bandwidth
> > >> > > > usage by applications. This is useful in ingestion use cases. Typically
> > >> > > > user needs to ingest say up to 1GB per sec and not more. The tuple size
> > >> > > > may vary based on messages based tuples (few KBs) or block tuples for
> > >> > > > files (few MBs). Bandwidth manager will take max bandwidth that can be
> > >> > > > utilized by the application and will take care of sharing that across
> > >> > > > partitions etc.
> > >> > > >
> > >> > > > Priyanka: You could also consider following in your design
> > >> > > >
> > >> > > >    1. Limiting input rate (across partitions)
> > >> > > >    2. Limiting output rate (across partitions)
> > >> > > >    3. Specifying total bandwidth that the Application can utilize
> > >> > > >       including input and output? Not sure if this is required. Need
> > >> > > >       comments from others here.
> > >> > > >    4. Include default implementation that will handle 1 and 2, and
> > >> > > >       anyone interested in having their own Bandwidth manager should be
> > >> > > >       able to extend the default one.
> > >> > > >    5. Can you also look at including/extending tuples per sec as
> > >> > > >       pointed out by Tim/Chinmay.
> > >> > > >
> > >> > > > Regards,
> > >> > > > Sandeep
> > >> > > >
> > >> > > > On Thu, Mar 3, 2016 at 12:23 AM, Timothy Farkas <[email protected]> wrote:
> > >> > > >
> > >> > > > > Not sure if this is helpful, but there is already a utility in Malhar
> > >> > > > > for converting tuples per second to tuples per window. This allows the
> > >> > > > > user to define a property in tuples per second, then the operator can
> > >> > > > > convert that to tuples per window so it emits the correct number of
> > >> > > > > tuples per window.
> > >> > > > >
> > >> > > > > https://github.com/apache/incubator-apex-malhar/blob/master/library/src/main/java/com/datatorrent/lib/util/time/WindowUtils.java
> > >> > > > >
> > >> > > > > On Wed, Mar 2, 2016 at 10:41 AM, Chinmay Kolhatkar <[email protected]> wrote:
> > >> > > > >
> > >> > > > > > Hi Priyanka,
> > >> > > > > >
> > >> > > > > > Indeed this is a useful feature.
> > >> > > > > >
> > >> > > > > > I believe number bytes consumed per sec can as well translate to
> > >> > > > > > number of tuples consumed per sec.
> > >> > > > > >
> > >> > > > > > If above is correct, won't back pressure that is handled by
> > >> > > > > > bufferserver help in your use case?
> > >> > > > > >
> > >> > > > > > Thanks,
> > >> > > > > > Chinmay.
> > >> > > > > >
> > >> > > > > > On 2 Mar 2016 4:49 p.m., "Priyanka Gugale" <[email protected]> wrote:
> > >> > > > > >
> > >> > > > > > > Many times we need to put bandwidth restrictions or put some limit
> > >> > > > > > > on input operator for number of bytes to be consumed per second.
> > >> > > > > > > As I understand in Apex there is no direct support for this
> > >> > > > > > > feature.
> > >> > > > > > >
> > >> > > > > > > I am planning to write a bandwidth manager which will help in
> > >> > > > > > > limiting bandwidth at Input operator. Let me know if there are any
> > >> > > > > > > better alternative ways. I will soon publish design for Bandwidth
> > >> > > > > > > Manager I am planning to write.
> > >> > > > > > >
> > >> > > > > > > -Priyanka
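
For reference, the leaky-bucket behaviour described for the BandwidthManager in the quoted design (emit while the bandwidth balance is >= 0, then wait until it climbs back from negative) could be sketched roughly as below. This is only an illustration under assumptions: the class name LeakyBucketSketch, its methods, the one-second cap on accumulated budget, and the explicit nowMillis parameter are all hypothetical and are not taken from Apex or Malhar.

```java
// Hypothetical sketch: names and structure are illustrative only and are
// not the Apex/Malhar BandwidthManager API.
class LeakyBucketSketch {
    private final long bytesPerSecond; // configured bandwidth limit
    private long availableBytes;       // balance; may go negative after an emit
    private long lastRefillMillis;     // time at which budget was last credited

    LeakyBucketSketch(long bytesPerSecond, long nowMillis) {
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("bytesPerSecond must be positive");
        }
        this.bytesPerSecond = bytesPerSecond;
        this.availableBytes = bytesPerSecond; // assumption: start with one second of budget
        this.lastRefillMillis = nowMillis;
    }

    // Credit budget for the elapsed time, capped at one second's worth.
    void refill(long nowMillis) {
        long elapsed = nowMillis - lastRefillMillis;
        if (elapsed <= 0) {
            return;
        }
        long credit = bytesPerSecond * elapsed / 1000;
        if (credit == 0) {
            return; // too little time has passed; keep the old timestamp
        }
        availableBytes = Math.min(bytesPerSecond, availableBytes + credit);
        lastRefillMillis = nowMillis;
    }

    // Returns true and charges the bytes if the balance is not overdrawn
    // (>= 0); otherwise returns false and the caller should retry later.
    boolean tryConsume(long bytes, long nowMillis) {
        refill(nowMillis);
        if (availableBytes < 0) {
            return false;
        }
        availableBytes -= bytes; // a large tuple may push the balance negative
        return true;
    }

    long availableBytes() {
        return availableBytes;
    }
}
```

With a 100 bytes/sec limit, a 250-byte tuple is emitted immediately and leaves the balance at -150, so further emits are refused until about 1.5 seconds of budget has accumulated. Passing the clock in explicitly is just a choice made here to keep the sketch deterministic.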

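
The equal division of bandwidth among static partitions, which the quoted design assigns to BandwidthPartitioner, amounts to the arithmetic sketched below. The class BandwidthSplitSketch and the method splitEvenly are hypothetical names used only for illustration. How the real partitioner sets the value on each partition is not shown in the thread, and handing the remainder to the first partitions is an assumption.

```java
// Hypothetical sketch of dividing a total bandwidth limit across partitions.
// It does not use or reproduce any Apex/Malhar partitioner API.
class BandwidthSplitSketch {
    // Returns one share per partition; shares differ by at most one byte/sec
    // and always sum to totalBytesPerSecond.
    static long[] splitEvenly(long totalBytesPerSecond, int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        if (totalBytesPerSecond < 0) {
            throw new IllegalArgumentException("totalBytesPerSecond must not be negative");
        }
        long base = totalBytesPerSecond / partitionCount;
        long remainder = totalBytesPerSecond % partitionCount;
        long[] shares = new long[partitionCount];
        for (int i = 0; i < partitionCount; i++) {
            // Assumption: the first `remainder` partitions absorb the leftover bytes.
            shares[i] = base + (i < remainder ? 1 : 0);
        }
        return shares;
    }
}
```

For example, a limit of 10 bytes/sec over 3 partitions gives shares of 4, 3 and 3. The split would need to be recomputed whenever partitions are created or deleted.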