Forgot about sink processors; yes, it will work. The trick with this method is that you use a separate sink for each endpoint, whereas the RpcClient (if it were exposed) would handle it all itself. Your configuration will need to look something like this:
-----------------
<sources>

a1.channels = c1
<channel setup>

a1.sinks = k1 k2

a1.sinks.k1.type = AVRO
< set up centralFlumeE connection >
a1.sinks.k1.channel = c1

a1.sinks.k2.type = AVRO
< set up centralFlumeF connection >
a1.sinks.k2.channel = c1

a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.backoff = true
a1.sinkgroups.g1.processor.selector = round_robin
-----------------

Here is the relevant link for the load balancing processor:
http://flume.apache.org/FlumeUserGuide.html#load-balancing-sink-processor

Remember that all sinks in a sink group must share the same channel. This is load balancing, which is what you are seeking in your scenario; the load balancer is not for failover (a setup with primary and backup servers), although there is a FailoverSinkProcessor if that's needed.

- Connor

On Wed, Jan 9, 2013 at 11:55 PM, Denny Ye <[email protected]> wrote:

> hi Hari,
>     I cannot judge the situation that using method you raised. I would
> like to explain my case and need your comments. Thanks a lot!
>     What I need is load balancing while event transferring. Assume that I
> have single local Flume server (located with application) named
> 'localFlumeA', configured with single AvroSink and Channel. Meanwhile, two
> central Flume servers (collectors) named 'centralFlumeE' and
> 'centralFlumeF'. Under this case, I would like to configure load balancing
> between 'centralFlumeE' and 'centralFlumeF' for events coming from
> 'localFlumeA', and load can be dispatched averagely for that two central
> Flume servers.
>     Can it be configured by LoadBalancingSinkProcessor in your mind? Wish
> your advice
>
> -Regards
> Denny Ye
>
>
> 2013/1/10 Hari Shreedharan <[email protected]>
>
>> The LoadBalancing capability similar to the LoadBalancingRpcClient can
>> be configured for multiple Avro Sinks using a LoadBalancingSinkProcessor,
>> if you are looking for that functionality.
>>
>>
>> Hari
>>
>> --
>> Hari Shreedharan
>>
>> On Wednesday, January 9, 2013 at 11:05 PM, Connor Woodson wrote:
>>
>> Short answer: there is no way in the current AvroSink to configure the
>> RpcClient, limiting you to just a single host connection (I'm not sure how
>> well it recovers if that host goes down).
>>
>> The AvroSink is incredibly simplified from what the RPCClient can do and
>> exposes none of the background functionality. Right now, the only way
>> around that is to create a custom sink based off of the AvroSink source
>> code and instead of setting the RPCClient up the way it currently is, you
>> pass into the RPCClient.getInstance() a set of user supplied properties. To
>> implement this in an unsafe way (not checking any of the user's values)
>> would only take a couple lines of code I believe. It is a work around, but
>> it will enable all of the various RPCClient capabilities such as failover
>> or loadbalancing mode and allow it to connect to multiple hosts.
>>
>> This is something that (I think) there is a JIRA filed for; but if not,
>> it would be very helpful for this to be implemented into the actual
>> AvroSink (and something that should be linked to that is
>> RPCClient.getInstance accepting a Context object, simply for ease of use).
>>
>> - Connor
>>
>>
>> On Wed, Jan 9, 2013 at 10:55 PM, Denny Ye <[email protected]> wrote:
>>
>> hi all,
>>    I didn't find the relationship between AvroSink and other types of
>> RpcClient, including LoadBalancingRpcClient. In my opinion, user can set
>> the specified RpcClient type from AvroSink with several strategies and host
>> selectors. Also, I cannot get information from source code and user guide.
>>    Did I miss something about this?
>>    Wish someone can support, thanks!
>>
>> -Regards
>> Denny Ye
>>
>>
>>
>>
>
