Pramod,

How would a dynamic property change that uses OperatorRequest as part of
StatsListener work with the new approach?

Thanks
- Gaurav

> On Sep 28, 2015, at 10:30 AM, Pramod Immaneni <[email protected]> wrote:
> 
> One possible optimization: the steps below are needed only when there is
> more than one input operator. In the more common case of a single input
> operator, the property change tuple can be inserted at the next possible
> window without having to temporarily pause the flow.
> 
> On Mon, Sep 28, 2015 at 10:27 AM, Timothy Farkas <[email protected]>
> wrote:
> 
>> Furthermore, this approach is not limited to DAGs with a single input
>> operator. In the case where a DAG has multiple input operators, property
>> changes can be set within the same window across all input operators by
>> enforcing some synchronization at the input operator level when setting the
>> property. This synchronization would look like the following:
>> 
>>   1. When receiving a property change request, ask all input operators to
>> stop and send their current window.
>>   2. Take the max window + 1 (not technically correct but you get the
>> idea)
>>   3. Send the property change request to all the input operators and tell
>> them to apply the change at the maximum window id + 1.
>>   4. Resume the input operators.
>> 
>> This ensures that the change is applied at the same window ID on every
>> input operator, and also that it is applied at a window ID that no input
>> operator has played before. Therefore property changes will not interfere
>> with the idempotence of operators.
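
A minimal sketch of the four synchronization steps above, in Java. Nothing here is Apex API: InputOperatorStub, coordinate and all field names are hypothetical, and window IDs are treated as plain counters, which (as the list itself notes) is not technically how "max window + 1" would be computed in practice.

```java
import java.util.Arrays;
import java.util.List;

class PropertyChangeSync {

    /** Hypothetical stand-in for an input operator; not an Apex class. */
    static class InputOperatorStub {
        long currentWindowId;       // last window this operator has played
        boolean paused;
        long applyAtWindowId = -1;  // window at which the pending change takes effect
        String pendingProperty;
        String pendingValue;

        InputOperatorStub(long currentWindowId) {
            this.currentWindowId = currentWindowId;
        }

        // Step 1: stop emitting and report the current window.
        long pauseAndReportWindow() {
            paused = true;
            return currentWindowId;
        }

        // Step 3: remember the change and the window at which to apply it.
        void scheduleChange(String property, String value, long windowId) {
            pendingProperty = property;
            pendingValue = value;
            applyAtWindowId = windowId;
        }

        // Step 4: continue emitting windows.
        void resume() {
            paused = false;
        }
    }

    /** Runs steps 1-4 and returns the window chosen for the change. */
    static long coordinate(List<InputOperatorStub> inputs, String property, String value) {
        if (inputs.isEmpty()) {
            throw new IllegalArgumentException("no input operators");
        }
        long maxWindow = Long.MIN_VALUE;
        for (InputOperatorStub op : inputs) {
            maxWindow = Math.max(maxWindow, op.pauseAndReportWindow());
        }
        // Step 2: simplification, assumes window IDs are plain counters.
        long target = maxWindow + 1;
        for (InputOperatorStub op : inputs) {
            op.scheduleChange(property, value, target);
        }
        for (InputOperatorStub op : inputs) {
            op.resume();
        }
        return target;
    }

    public static void main(String[] args) {
        List<InputOperatorStub> inputs = Arrays.asList(
            new InputOperatorStub(10), new InputOperatorStub(12), new InputOperatorStub(11));
        long target = coordinate(inputs, "threshold", "5");
        System.out.println("Change scheduled for window " + target);
    }
}
```

Because the target is one past the highest window any input operator has reported, no operator has already played it, which is the property the idempotence argument relies on.
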
>> 
>> 
>> On Mon, Sep 28, 2015 at 9:17 AM, Pramod Immaneni <[email protected]>
>> wrote:
>> 
>>> Apex supports modification of operator properties at runtime, but the
>>> current implementation has the following shortcomings.
>>> 
>>> 1. A property is not set across all partitions on the same window, because
>>> individual partitions can be on different windows when the property change
>>> is initiated from the client. This results in inconsistent data for those
>>> windows. I am being generous using the word inconsistent.
>>> 2. Sometimes properties need to be set on more than one logical operator
>>> at the same time to achieve the change the user is seeking. Today these are
>>> two separate changes happening on two different windows, again resulting
>>> in inconsistent data for some windows. They would need to happen as a
>>> single transaction.
>>> 3. If an operator fails after a property is dynamically changed but before
>>> a checkpoint containing the change is committed, the operator will restart
>>> with the old property and the change will not be re-applied.
>>> 
>>> Tim and I did some brainstorming and we have a proposal to overcome
>>> these shortcomings. The main problem in all the above cases is that the
>>> property changes happen out-of-band of the data flow and hence
>>> independently of windowing. The proposal is to bring the property change
>>> requests into the in-band dataflow so that they are handled consistently
>>> with windowing and in a distributed manner.
>>> 
>>> The idea is to inject a special property change tuple, containing the
>>> property changes and the identification information of the operators they
>>> affect, into the dataflow at the input operator. The tuple will be injected
>>> at a window boundary, after end window and before begin window, and as this
>>> tuple flows through the DAG the intended operators' properties will be
>>> modified. They will all be modified consistently at the same window. The
>>> tuple can contain more than one property change for more than one logical
>>> operator, and the changes will be applied consistently to the different
>>> logical operators at the same window. In case of failure, the replay of
>>> tuples will ensure that the property change gets reapplied at the correct
>>> window.
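
To make the in-band idea concrete, here is a small Java sketch of a property change tuple travelling through a linear chain of operators. It is an illustration under assumptions, not Apex code: PropertyChangeTuple, OperatorStub and run are hypothetical names, the DAG is reduced to a simple chain, and the tuple is delivered just before the window it is keyed on begins.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PropertyChangeTupleSketch {

    /** Hypothetical control tuple: operator name -> (property -> new value). */
    static final class PropertyChangeTuple {
        final Map<String, Map<String, String>> changes;

        PropertyChangeTuple(Map<String, Map<String, String>> changes) {
            this.changes = changes;
        }
    }

    /** Hypothetical operator that records which properties each window saw. */
    static final class OperatorStub {
        final String name;
        final Map<String, String> properties = new HashMap<>();
        final Map<Long, Map<String, String>> seen = new TreeMap<>();

        OperatorStub(String name) {
            this.name = name;
        }

        // Apply only the changes addressed to this operator.
        void onPropertyChange(PropertyChangeTuple tuple) {
            Map<String, String> mine = tuple.changes.get(name);
            if (mine != null) {
                properties.putAll(mine);
            }
        }

        // Snapshot the properties in effect while this window is processed.
        void processWindow(long windowId) {
            seen.put(windowId, new HashMap<>(properties));
        }
    }

    /**
     * Plays windows first..last through the chain. A tuple keyed on window w
     * reaches each operator after end window of w-1 and before begin window
     * of w, so every operator switches at the same window.
     */
    static void run(List<OperatorStub> chain, long first, long last,
                    Map<Long, PropertyChangeTuple> controlByWindow) {
        for (long w = first; w <= last; w++) {
            PropertyChangeTuple tuple = controlByWindow.get(w);
            for (OperatorStub op : chain) {
                if (tuple != null) {
                    op.onPropertyChange(tuple);
                }
                op.processWindow(w);
            }
        }
    }
}
```

Since the tuple is part of the stream, replaying the stream from an earlier window after a failure delivers it again at the same window, which is how the change would get reapplied without any out-of-band bookkeeping.
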
>>> 
>>> Please give your feedback and input on what you think about this proposal.
>>> 
>>> Thanks
>>> 
>> 
>> 
