Hi Denny,

Yes, I agree. I cannot use a restrictive commit policy if the sinks run at
different speeds. That's why I defined two flexible commit policies.
For example, the coordinator could commit the transaction once M (0 <= M <= N)
fast sinks have acknowledged success to it.
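
To make the two policies concrete, here is a minimal sketch in Java. It is only an illustration under my own assumptions, not the patch itself: the class and method names (CommitPolicy, enoughAcks, importantAcked) are hypothetical and do not exist in Flume. The first check covers "M of N sinks succeed", the second covers "the specified important sinks succeed".

```java
import java.util.Set;

// Hypothetical helper for illustration only; none of these names are Flume APIs.
final class CommitPolicy {
    private CommitPolicy() {}

    // Policy 1: commit once at least m of the n sinks have acknowledged success.
    static boolean enoughAcks(int acked, int m, int n) {
        if (m < 0 || m > n) {
            throw new IllegalArgumentException("m must satisfy 0 <= m <= n");
        }
        return acked >= m;
    }

    // Policy 2: commit once every sink marked as important has acknowledged success.
    static boolean importantAcked(Set<String> ackedSinks, Set<String> importantSinks) {
        return ackedSinks.containsAll(importantSinks);
    }
}
```

If neither check passes for the chosen policy, the coordinator would roll back all sinks for the current event.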

Regards,
Yongkun Wang


On 12/08/10 15:58, "Denny Ye" <[email protected]> wrote:

>hi Yongkun,
>    OK, so your baseline assumes a similar consumption rate for each
>Sink. In practice that rarely holds, and the slowest Sink will be the
>bottleneck in your design. If the case in my first question cannot
>occur, then I think you are providing a simplified rollback model. Do
>you agree?
>
>-Regards
>Denny Ye
>
>2012/8/10 Wang, Yongkun | Yongkun | BDD <[email protected]>
>
>> Hi Denny,
>>
>> Thanks for the questions. Answers inline.
>>
>> On 12/08/10 15:09, "Denny Ye" <[email protected]> wrote:
>>
>> >Yongkun,
>> >    Now I understand your design. Thanks for the explanation.
>> >    I have two questions; could you please explain? Thanks!
>> >    1. Two Sinks have different consumption rates. Suppose the Channel
>> >holds 1000 events, sinkA has consumed 800 and sinkB has consumed 100.
>> >When do we remove the fully consumed events from the Channel?
>>
>> In my design I try to avoid this case: in replicating mode, SinkA and
>> SinkB are synchronized and both get all 1000 events. Events are not
>> removed by the Sink itself (by calling channel.take() in the sink's
>> process()). Instead they are removed by a higher-level sink processor,
>> which removes an event once the sinks satisfy the transaction
>> requirements.
>>
>> >    2. An exception happens at one Sink. Suppose each Sink retrieves 100
>> >events from the Channel and an exception occurs at sinkA, so sinkA
>> >should roll back. What exactly happens in your design?
>>
>> Yes, transaction control across multiple sinks is more complicated. In
>> my design there are two policies for committing a multi-sink
>> transaction (suppose we have N sinks):
>>
>> - Commit when M (0 <= M <= N) sinks succeed; e.g. values for M: ANY, ONE,
>> QUORUM, ALL.
>> - Commit when M (0 < M <= N) specified (important) sinks succeed.
>>
>> Otherwise, roll back all sinks for the current event.
>>
>>
>> Regards,
>> Yongkun
>>
>> >
>> >-Regards
>> >Denny Ye
>> >
>> >2012/8/10 Wang, Yongkun | Yongkun | BDD <[email protected]>
>> >
>> >> Hi Denny,
>> >>
>> >> I am working on the patch now; it's not difficult. I have listed the
>> >> changes in that JIRA.
>> >> I think you misunderstood my design: I don't maintain the order of
>> >> the events. Instead I make sure that each sink gets the same events
>> >> (or different events, as specified by a selector).
>> >>
>> >> Suppose Channel (mc) contains the following events: 4,3,2,1
>> >>
>> >> If we simply enable it by configuration, it may work like this:
>> >> Sink "hsa" may get 1,3;
>> >> Sink "hsb" may get 2,4;
>> >> So different sinks get different data. Is this what the user wants?
>> >>
>> >>
>> >> In my design, "hsa" and "hsb" will both get "4,3,2,1". This is a
>> >> typical case when the user wants to fan out the data to two places
>> >> (e.g. one for batch and another for real-time analysis).
>> >>
>> >> Regards,
>> >> Yongkun Wang
>> >>
>> >>
>> >> On 12/08/10 14:29, "Denny Ye" <[email protected]> wrote:
>> >>
>> >> >hi Yongkun,
>> >> >
>> >> >   JIRA can be accessed now.
>> >> >
>> >> >   I think the order of events in your proposal might be difficult to
>> >> >understand. If we don't care about the order, we can discuss the
>> >> >value and feasibility. In my opinion, the data ingest flow is
>> >> >order-unaware, or at least order is not that important for us. You can
>> >> >try to verify your proposal and share the results. Keeping a
>> >> >transaction across several Sinks may be difficult.
>> >> >
>> >> >-Regards
>> >> >Denny Ye
>> >> >
>> >> >
>> >> >2012/8/10 Wang, Yongkun | Yongkun | BDD <[email protected]>
>> >> >
>> >> >> JIRA is down again? I cannot connect to it and comment there.
>> >> >>
>> >> >> I have a proposal in "Transactional Multiplex (fan out) Sink":
>> >> >> https://issues.apache.org/jira/browse/FLUME-1435
>> >> >> which contains the design for one channel feeding multiple sinks.
>> >> >>
>> >> >> You can search the email since JIRA cannot be accessed.
>> >> >>
>> >> >> I think this is more than a configuration issue. If we simply enable
>> >> >> several sinks on the same channel, they will take events either in
>> >> >> round-robin mode or, if the sinks run at different speeds, in an
>> >> >> unpredictable order.
>> >> >>
>> >> >> So it's better to have an even higher-level transaction control
>> >> >> instead of the transaction in the process() of each sink, as I
>> >> >> describe in FLUME-1435.
>> >> >>
>> >> >> Regards,
>> >> >> Yongkun Wang
>> >> >>
>> >> >>
>> >> >> On 12/08/10 12:30, "Denny Ye (JIRA)" <[email protected]> wrote:
>> >> >>
>> >> >> >Denny Ye created FLUME-1479:
>> >> >> >-------------------------------
>> >> >> >
>> >> >> >             Summary: Multiple Sinks can connect to single Channel
>> >> >> >                 Key: FLUME-1479
>> >> >> >                 URL: https://issues.apache.org/jira/browse/FLUME-1479
>> >> >> >             Project: Flume
>> >> >> >          Issue Type: Bug
>> >> >> >          Components: Configuration
>> >> >> >    Affects Versions: v1.2.0
>> >> >> >            Reporter: Denny Ye
>> >> >> >            Assignee: Denny Ye
>> >> >> >             Fix For: v1.3.0
>> >> >> >
>> >> >> >
>> >> >> >If we have one Channel (mc) and two Sinks (hsa, hsb), they can be
>> >> >> >connected to each other with a configuration such as
>> >> >> >{quote}
>> >> >> >agent.sinks.hsa.channel = mc
>> >> >> >agent.sinks.hsb.channel = mc
>> >> >> >{quote}
>> >> >> >This means that multiple Sinks can connect to a single Channel.
>> >> >> >Normally, one Sink only can connect to unified Channel
>> >> >> >
>> >> >> >--
>> >> >> >This message is automatically generated by JIRA.
>> >> >> >If you think it was sent incorrectly, please contact your JIRA
>> >> >> >administrators:
>> >> >> >https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
>> >> >> >For more information on JIRA, see: http://www.atlassian.com/software/jira
>> >> >> >

