This looks OK. Let us build it incrementally.

~ Yogi

On 23 March 2016 at 13:24, Sandeep Deshmukh <[email protected]> wrote:

> I would suggest that we go ahead with the design suggested by Priyanka, where
> bandwidth is set up for each operator separately. We can later extend this so
> that bandwidth is shared across different input operators or across the DAG as
> a whole.
>
> Regards,
> Sandeep
>
> On Wed, Mar 23, 2016 at 11:51 AM, Priyanka Gugale <[email protected]> wrote:
>
> > Right now it is not meant for output operators, but one can very well use the
> > bandwidth manager to keep track of bandwidth usage and limit output speed. The
> > bigger challenge there is that you won't be able to process the window data
> > sent by the upstream operator within the same window. For that you need to do
> > more than just bandwidth control.
> > So I would say the bandwidth control feature can be used as is for output
> > operators as well; output operators just need more than bandwidth limiting.
> >
> > -Priyanka
> >
> > On Wed, Mar 23, 2016 at 11:47 AM, Priyanka Gugale <[email protected]> wrote:
> >
> > > That's a good question, Chaitanya. Right now the bandwidth control is at the
> > > input operator level and not at the application level. So if you have two
> > > input operators, you need to set bandwidth on both separately with this design.
> > > Maybe it would be better to have bandwidth control at the application level
> > > rather than the operator level. Let me think about whether I can modify the
> > > design to do that. If you have any ideas for this, please share them.
> > >
> > > -Priyanka
> > >
> > > On Wed, Mar 23, 2016 at 11:47 AM, Yogi Devendra <[email protected]> wrote:
> > >
> > >> Priyanka,
> > >>
> > >> From the design description it is not clear how it will be used to control
> > >> output bandwidth (points #2, 3 and 4 mentioned by Sandeep).
> > >>
> > >> ~ Yogi
> > >>
> > >> On 23 March 2016 at 11:39, Chaitanya Chebolu <[email protected]> wrote:
> > >>
> > >> > This is a very useful feature.
> > >> > I would like to know how you distribute the bandwidth in the following
> > >> > situation:
> > >> > - Two input operators, say i1 and i2, are deployed on the same node and both
> > >> > operators have bandwidthManager as a plugin.
> > >> >
> > >> > On Fri, Mar 18, 2016 at 5:43 PM, Priyanka Gugale <[email protected]> wrote:
> > >> >
> > >> > > Hi,
> > >> > >
> > >> > > Thanks for the inputs, Sandeep; I will take care of those points.
> > >> > >
> > >> > > Here is the high-level design we are considering. We would have the
> > >> > > following components:
> > >> > > *1.* *BandwidthManager*
> > >> > > This keeps track of the current bandwidth usage of the system and decides
> > >> > > whether the requested data bandwidth can be used right away or not. To do
> > >> > > this it uses the Leaky bucket <https://en.wikipedia.org/wiki/Leaky_bucket>
> > >> > > algorithm: it emits data as long as it has not overused bandwidth (i.e. the
> > >> > > remaining bandwidth is >= 0) and then waits for a while to accumulate
> > >> > > bandwidth (until the remaining bandwidth goes from a negative value back to
> > >> > > a positive one).
> > >> > >
> > >> > > *2. BandwidthLimitingInputOperator*
> > >> > > Any input operator that wants to implement bandwidth restriction should
> > >> > > implement BandwidthLimitingInputOperator. The operator has an abstract
> > >> > > method to initialize an instance of BandwidthManager and a method that
> > >> > > emits tuples under the bandwidth restriction, i.e. as per the available
> > >> > > bandwidth.
> > >> > >
> > >> > > *3. BandwidthPartitioner*
> > >> > > BandwidthPartitioner is introduced for static partitioning. If static
> > >> > > partitioning is used, by default the StatelessPartitioner class is
> > >> > > initialized. With bandwidth restriction we want to divide the bandwidth
> > >> > > equally amongst the available partitions. BandwidthPartitioner should take
> > >> > > care of that. It extends StatelessPartitioner and just sets the right
> > >> > > bandwidth on all partitions after StatelessPartitioner creates/deletes
> > >> > > partitions. In case of dynamic partitioning, the operator implementing
> > >> > > definePartitions should take care of bandwidth distribution.
> > >> > >
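As an illustration only, the leaky-bucket behaviour described for BandwidthManager above could be sketched roughly as below. This is not the proposed BandwidthManager: the class name LeakyBucketSketch, its methods, and the choice to cap the accumulated credit at one second's worth are all assumptions made for the example. The caller passes in the current time so the sketch stays deterministic.

```java
/** Hypothetical sketch of a leaky-bucket style limiter; not the thread's BandwidthManager. */
final class LeakyBucketSketch {
    private final long bytesPerSecond; // configured bandwidth limit
    private long balanceBytes;         // remaining bandwidth; may go negative
    private long lastRefillMillis;     // time of the last credit

    LeakyBucketSketch(long bytesPerSecond, long nowMillis) {
        this.bytesPerSecond = bytesPerSecond;
        this.balanceBytes = bytesPerSecond; // start with one second's worth (assumption)
        this.lastRefillMillis = nowMillis;
    }

    /** Credits bandwidth for the elapsed time, capped at one second's worth. */
    private void refill(long nowMillis) {
        long elapsed = nowMillis - lastRefillMillis;
        if (elapsed <= 0) {
            return;
        }
        long credit = elapsed * bytesPerSecond / 1000;
        if (credit > 0) {
            balanceBytes = Math.min(bytesPerSecond, balanceBytes + credit);
            lastRefillMillis = nowMillis;
        }
    }

    /** True while bandwidth has not been overused, i.e. the balance is >= 0. */
    boolean canConsume(long nowMillis) {
        refill(nowMillis);
        return balanceBytes >= 0;
    }

    /** Charges an emitted tuple of the given size; the balance may drop below zero. */
    void consume(long bytes) {
        balanceBytes -= bytes;
    }
}
```

An input operator would call canConsume before emitting and consume after emitting; once the balance is negative it simply skips emitting until enough time has passed for the balance to recover.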
> > >> > > This design takes care of basic bandwidth restriction and also handles
> > >> > > partitions by distributing the available bandwidth equally among all
> > >> > > partitions. It is also open enough to allow further modifications to take
> > >> > > care of complex situations.
> > >> > >
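To make the equal distribution among partitions concrete, here is a minimal sketch of the arithmetic. The class and method names are hypothetical and this is not the BandwidthPartitioner from the design; in particular, handing the remainder of the integer division to the first few partitions is an assumption, since the thread only says the bandwidth is divided equally.

```java
/** Hypothetical helper showing an equal split of bandwidth across partitions. */
final class BandwidthSplitSketch {
    /** Returns one share per partition; the shares always sum to totalBytesPerSecond. */
    static long[] split(long totalBytesPerSecond, int partitionCount) {
        if (totalBytesPerSecond < 0 || partitionCount <= 0) {
            throw new IllegalArgumentException("need a non-negative total and at least one partition");
        }
        long[] shares = new long[partitionCount];
        long base = totalBytesPerSecond / partitionCount;
        long remainder = totalBytesPerSecond % partitionCount;
        for (int i = 0; i < partitionCount; i++) {
            // The first `remainder` partitions get one extra byte per second (assumption).
            shares[i] = base + (i < remainder ? 1 : 0);
        }
        return shares;
    }
}
```

After partitions are created or removed, the split would be recomputed with the new partition count and each share set on the corresponding partition.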
> > >> > > Let me know your opinion on what else we can do to design it better.
> > >> > >
> > >> > > -Priyanka
> > >> > >
> > >> > > On Thu, Mar 3, 2016 at 10:11 AM, Sandeep Deshmukh <[email protected]> wrote:
> > >> > >
> > >> > > > The main purpose is not to handle back pressure but to limit bandwidth
> > >> > > > usage by applications. This is useful in ingestion use cases. Typically
> > >> > > > a user needs to ingest, say, up to 1 GB per sec and not more. The tuple
> > >> > > > size may vary between message-based tuples (a few KBs) and block tuples
> > >> > > > for files (a few MBs). The bandwidth manager will take the max bandwidth
> > >> > > > that can be utilized by the application and will take care of sharing it
> > >> > > > across partitions etc.
> > >> > > >
> > >> > > > Priyanka: You could also consider the following in your design:
> > >> > > >
> > >> > > >    1. Limiting input rate (across partitions)
> > >> > > >    2. Limiting output rate (across partitions)
> > >> > > >    3. Specifying total bandwidth that the Application can utilize, including
> > >> > > >    input and output? Not sure if this is required. Need comments from others
> > >> > > >    here.
> > >> > > >    4. Include a default implementation that will handle 1 and 2; anyone
> > >> > > >    interested in having their own Bandwidth manager should be able to extend
> > >> > > >    the default one.
> > >> > > >    5. Can you also look at including/extending tuples per sec, as pointed
> > >> > > >    out by Tim/Chinmay?
> > >> > > >
> > >> > > > Regards,
> > >> > > > Sandeep
> > >> > > >
> > >> > > > On Thu, Mar 3, 2016 at 12:23 AM, Timothy Farkas <[email protected]> wrote:
> > >> > > >
> > >> > > > > Not sure if this is helpful, but there is already a utility in Malhar for
> > >> > > > > converting tuples per second to tuples per window. This allows the user to
> > >> > > > > define a property in tuples per second; the operator can then convert that
> > >> > > > > to tuples per window so it emits the correct number of tuples per window.
> > >> > > > >
> > >> > > > > https://github.com/apache/incubator-apex-malhar/blob/master/library/src/main/java/com/datatorrent/lib/util/time/WindowUtils.java
> > >> > > > >
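The conversion mentioned above (tuples per second to tuples per window) comes down to scaling the rate by the window length. The sketch below is only an illustration of that arithmetic: it does not reproduce the WindowUtils API linked above, the class and method names are hypothetical, and the window duration is passed in explicitly as an assumption rather than read from the operator context.

```java
/** Hypothetical sketch of a tuples-per-second to tuples-per-window conversion. */
final class TuplesPerWindowSketch {
    /**
     * Scales a per-second rate to a window of the given length.
     * The result is rounded down, so very low rates can yield 0 tuples per window.
     */
    static long tuplesPerWindow(long tuplesPerSecond, long windowMillis) {
        if (tuplesPerSecond < 0 || windowMillis <= 0) {
            throw new IllegalArgumentException("rate must be >= 0 and window length > 0");
        }
        return tuplesPerSecond * windowMillis / 1000;
    }
}
```

For example, a limit of 1000 tuples per second with a 500 ms window gives 500 tuples per window.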
> > >> > > > > On Wed, Mar 2, 2016 at 10:41 AM, Chinmay Kolhatkar <[email protected]> wrote:
> > >> > > > >
> > >> > > > > > Hi Priyanka,
> > >> > > > > >
> > >> > > > > > Indeed this is a useful feature.
> > >> > > > > >
> > >> > > > > > I believe the number of bytes consumed per sec can as well translate to
> > >> > > > > > the number of tuples consumed per sec.
> > >> > > > > >
> > >> > > > > > If the above is correct, won't the back pressure that is handled by the
> > >> > > > > > bufferserver help in your use case?
> > >> > > > > >
> > >> > > > > > Thanks,
> > >> > > > > > Chinmay.
> > >> > > > > > On 2 Mar 2016 4:49 p.m., "Priyanka Gugale" <[email protected]> wrote:
> > >> > > > > >
> > >> > > > > > > Many times we need to put bandwidth restrictions or some limit on an
> > >> > > > > > > input operator for the number of bytes to be consumed per second. As I
> > >> > > > > > > understand, there is no direct support for this feature in Apex.
> > >> > > > > > >
> > >> > > > > > > I am planning to write a bandwidth manager which will help in limiting
> > >> > > > > > > bandwidth at the input operator. Let me know if there are any better
> > >> > > > > > > alternative ways. I will soon publish the design for the bandwidth
> > >> > > > > > > manager I am planning to write.
> > >> > > > > > >
> > >> > > > > > > -Priyanka
> > >> > > > > > >
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
> > >
> >
>
