Hi Denny,

Yes, I agree. I cannot use a restrictive commit policy if the sinks run at different speeds. That's why I defined two flexible commit policies. For example, the coordinator could commit the transaction once M (0 <= M <= N) fast sinks acknowledge success to it.

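A minimal sketch of the two commit policies, to make the idea concrete. This is not code from the FLUME-1435 patch: the class, enum, and method names are hypothetical, and the numeric meaning of ANY (0) and QUORUM (simple majority) is an assumption.

```java
import java.util.Set;

// Hypothetical sketch; none of these names come from Flume or the FLUME-1435 patch.
final class CommitPolicySketch {

    // Named values for M. The mapping to numbers is an assumption:
    // ANY = 0, ONE = 1, QUORUM = simple majority, ALL = N.
    enum Level { ANY, ONE, QUORUM, ALL }

    // Returns the number of acknowledgements M required for n sinks.
    static int threshold(Level level, int n) {
        switch (level) {
            case ANY:    return 0;
            case ONE:    return 1;
            case QUORUM: return n / 2 + 1;
            default:     return n; // ALL
        }
    }

    // Policy 1: commit once at least M of the N sinks have acknowledged success.
    static boolean commitByCount(int acked, Level level, int n) {
        return acked >= threshold(level, n);
    }

    // Policy 2: commit once every sink marked as important has acknowledged success.
    static boolean commitByImportant(Set<String> acked, Set<String> important) {
        return acked.containsAll(important);
    }
}
```

With N = 3 sinks and QUORUM, the coordinator would commit after two acknowledgements; if the chosen policy is not satisfied, all sinks roll back the current event.
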
Regards,
Yongkun Wang

On 12/08/10 15:58, "Denny Ye" <[email protected]> wrote:

>hi Yongkun,
>   OK, you have chosen the most important baseline: a similar consuming
>rate for each Sink. In practice, that is hardly possible. The slowest Sink
>will be the limitation or bottleneck in your design. If the case in my
>first question cannot happen, I think you provide a simplified rollback
>model. Do you agree with me?
>
>-Regards
>Denny Ye
>
>2012/8/10 Wang, Yongkun | Yongkun | BDD <[email protected]>
>
>> Hi Denny,
>>
>> Thanks for the questions. Answers inline.
>>
>> On 12/08/10 15:09, "Denny Ye" <[email protected]> wrote:
>>
>> >Yongkun,
>> >   Now I understand your design. Thanks for your explanation.
>> >   I have two questions, please help to explain, thanks!
>> >   1. Two Sinks have different consuming rates. If the Channel has 1000
>> >events, and sinkA has consumed 800 events and sinkB 100 events, when do
>> >we remove the fully consumed events from the Channel?
>>
>> In my design, I try to avoid this case, which means SinkA and SinkB will
>> be synchronized and both get 1000 events if the mode is replicating. In
>> my design, the event is not removed by the Sink (by calling
>> channel.take() in the sink's process()); instead, events are removed by
>> a high-level sink processor, which removes the event once the sinks
>> satisfy the transaction requirements.
>>
>> >   2. An exception happens at one Sink. Each Sink retrieves 100 events
>> >from the Channel, and an exception happens at sinkA. sinkA should roll
>> >back. What is the detailed behavior in your design?
>>
>> Yes, transaction control on multiple sinks is more complicated. In my
>> design, I have two policies to commit a multi-sink transaction (suppose
>> we have N sinks):
>>
>> - When M (0 <= M <= N) Sinks succeed, commit; e.g. values for M: ANY,
>>   ONE, QUORUM, ALL
>> - When the specified M (0 < M <= N) Sinks (important sinks) succeed, commit;
>> - otherwise, roll back all sinks for the current event.
>>
>>
>> Regards,
>> Yongkun
>>
>> >
>> >-Regards
>> >Denny Ye
>> >
>> >2012/8/10 Wang, Yongkun | Yongkun | BDD <[email protected]>
>> >
>> >> Hi Denny,
>> >>
>> >> I am working on the patch now; it's not difficult. I have listed the
>> >> changes in that JIRA.
>> >> I think you misunderstand my design: I don't maintain the order of the
>> >> events. Instead, I make sure that each sink gets the same events (or
>> >> different events, as specified by the selector).
>> >>
>> >> Suppose Channel (mc) contains the following events: 4,3,2,1
>> >>
>> >> If we simply enable it by configuration, it may work like this:
>> >> Sink "hsa" may get 1,3;
>> >> Sink "hsb" may get 2,4;
>> >> So different sinks will get different data. Is this what the user wants?
>> >>
>> >>
>> >> In my design, "hsa" and "hsb" will both get "4,3,2,1". This is a typical
>> >> case when the user wants to fan out the data into two places (e.g. one
>> >> for batch and another for real-time analysis).
>> >>
>> >> Regards,
>> >> Yongkun Wang
>> >>
>> >>
>> >> On 12/08/10 14:29, "Denny Ye" <[email protected]> wrote:
>> >>
>> >> >hi Yongkun,
>> >> >
>> >> >   JIRA can be accessed now.
>> >> >
>> >> >   I think it might be difficult to reason about the order of events in
>> >> >your proposal. If we don't care about the order, we can discuss the value
>> >> >and feasibility. In my opinion, the data ingest flow is order-unaware, or
>> >> >at least order is not that important for us. You can try to verify your
>> >> >proposal and give us the result. There may be some difficulties in
>> >> >keeping a transaction across several Sinks.
>> >> >
>> >> >-Regards
>> >> >Denny Ye
>> >> >
>> >> >
>> >> >2012/8/10 Wang, Yongkun | Yongkun | BDD <[email protected]>
>> >> >
>> >> >> JIRA is down again? I cannot connect to it and comment there.
>> >> >>
>> >> >> I have a proposal in "Transactional Multiplex (fan out) Sink":
>> >> >> https://issues.apache.org/jira/browse/FLUME-1435
>> >> >> which contains the design of one channel to multiple sinks.
>> >> >>
>> >> >> You can search the email archive, since JIRA cannot be accessed.
>> >> >>
>> >> >> I think this is more than a configuration issue. If we simply enable
>> >> >> several sinks on the same channel, they will take events either in a
>> >> >> round-robin mode or in an unpredictable mode if the speeds of the sinks
>> >> >> are different.
>> >> >>
>> >> >> So it's better to have an even higher-level transaction control instead
>> >> >> of the transaction in the process() of each sink, as I describe in
>> >> >> FLUME-1435.
>> >> >>
>> >> >> Regards,
>> >> >> Yongkun Wang
>> >> >>
>> >> >>
>> >> >> On 12/08/10 12:30, "Denny Ye (JIRA)" <[email protected]> wrote:
>> >> >>
>> >> >> >Denny Ye created FLUME-1479:
>> >> >> >-------------------------------
>> >> >> >
>> >> >> >             Summary: Multiple Sinks can connect to single Channel
>> >> >> >                 Key: FLUME-1479
>> >> >> >                 URL: https://issues.apache.org/jira/browse/FLUME-1479
>> >> >> >             Project: Flume
>> >> >> >          Issue Type: Bug
>> >> >> >          Components: Configuration
>> >> >> >    Affects Versions: v1.2.0
>> >> >> >            Reporter: Denny Ye
>> >> >> >            Assignee: Denny Ye
>> >> >> >             Fix For: v1.3.0
>> >> >> >
>> >> >> >
>> >> >> >If we have one Channel (mc) and two Sinks (hsa, hsb), they can be
>> >> >> >connected to each other with a configuration such as
>> >> >> >{quote}
>> >> >> >agent.sinks.hsa.channel = mc
>> >> >> >agent.sinks.hsb.channel = mc
>> >> >> >{quote}
>> >> >> >This means that multiple Sinks can connect to a single Channel.
>> >> >> >Normally, one Sink can only connect to a unified Channel
>> >> >> >
>> >> >> >--
>> >> >> >This message is automatically generated by JIRA.
