Beautiful. Will try 4 channels in one Agent first. Thanks!
On Tue, Mar 12, 2013 at 4:35 PM, Roshan Naik wrote:
> Even 16 on a single channel might be on the higher side IMHO.
>
> Try instead splitting into four channels with 4 sinks each, or even
> four agents with one channel and 4 sinks each. It will reduce
> contention. Be careful to ensure the capacity of each channel is not
> too high, since you now have many channels.
> -roshan
>
> On Tue, Mar 12, 2013 at 2:24 PM, Chris Neal wrote:
> > Thanks for the reply. You're definitely on to something with the
> > ever-increasing number of sinks. :)
> >
> > I scaled it back to 16 AvroSinks, and used a
> > MemoryChannel.transactionCapacity of 1000 and an AvroSink.batch-size of
> > 1000. My ExecSource.batchSize is 100 (I chose this smaller number because
> > there are so many of them (124); I didn't want 10s of thousands of events
> > getting dropped on the MemoryChannel at once, rather just 1000s). With
> > those settings, things are keeping the MemoryChannel drained. Finally
> > getting somewhere! :)
> >
> > I much appreciate the prompt response. If anything else comes to mind,
> > please do let me know.
> >
> > Thanks again.
> > Chris
> >
> > On Tue, Mar 12, 2013 at 4:12 PM, Roshan Naik wrote:
> >>
> >> I meant 640,000, not 64,000.
> >>
> >> On Tue, Mar 12, 2013 at 2:10 PM, Roshan Naik wrote:
> >> > Beyond a certain number of sinks, adding more won't help. My suspicion
> >> > is you may have gone way overboard.
> >> >
> >> > If your sink-side batch size is that large and you have 64 sinks in
> >> > the round-robin, it will take a lot of events (64,000) to be pumped
> >> > in by the source before the first event can start trickling out
> >> > of any sink. Also, memory consumption will be quite high: each sink
> >> > will open a transaction and hold on to 10000 events. This is the cause
> >> > of the MemoryChannel filling up. Until the sink-side transaction is
> >> > committed (i.e. 10k events are pulled), the memory reservation on the
> >> > channel is not relinquished. So your memory channel size will have to
> >> > be really high to support so many sinks, each with such a big batch
> >> > size.
> >> >
> >> > My gut feel is that your source-side batch size is not much of an
> >> > issue and can be smaller. Increasing the number of sinks will only
> >> > help if the sink is indeed the bottleneck.
> >> >
> >> > On Tue, Mar 12, 2013 at 1:43 PM, Chris Neal wrote:
> >> >> Hi all.
> >> >>
> >> >> I've been working on this for quite some time, and need some advice
> >> >> from the experts. I have a two-tiered Flume architecture:
> >> >>
> >> >> App Tier (all on one server):
> >> >> 124 ExecSources -> MemoryChannel -> AvroSinks
> >> >>
> >> >> HDFS Tier (on two servers):
> >> >> AvroSource -> FileChannel -> HDFSSinks
> >> >>
> >> >> When I run the agents, the HDFS tier is keeping up fine with the App
> >> >> Tier. Queue sizes stay between 0-10000 (I have a batch size of 10000).
> >> >> All is good.
> >> >>
> >> >> On the App Tier, when I view the JMX data through jconsole, I watch
> >> >> the size of the MemoryChannel grow steadily until it reaches the max,
> >> >> then it starts throwing exceptions about not being able to put the
> >> >> batch on the channel as expected.
> >> >>
> >> >> There seem to be two basic ways to increase the throughput of the App
> >> >> Tier:
> >> >> 1. Increase the MemoryChannel's transactionCapacity and the
> >> >> corresponding AvroSink's batch-size. Both are set to 10000 for me.
> >> >> 2. Increase the number of AvroSinks to drain the MemoryChannel. I'm
> >> >> up to 64 Sinks now, which round-robin between the two Flume Agents on
> >> >> the HDFS tier.
> >> >>
> >> >> Both of those values seem quite high to me (batch size and number of
> >> >> sinks).
> >> >>
> >> >> Am I missing something as far as tuning?
> >> >> Which would allow for a greater increase in throughput, more Sinks or
> >> >> a larger batch size?
> >> >>
> >> >> I'm stumped here. I still think I can get this to work. :)
> >> >>
> >> >> Any suggestions are most welcome.
> >> >> Thanks for your time.
> >> >> Chris
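
For anyone reading the thread later, the arithmetic behind Roshan's point can be sketched roughly as below. This is a back-of-the-envelope model, not Flume code: the function names are made up for illustration, and it assumes the worst case where every sink holds a full batch in an open transaction at the same time. The "per channel" line also assumes the batch size of 1000 is kept after splitting into four channels with 4 sinks each.

```python
def in_flight_events(num_sinks: int, sink_batch_size: int) -> int:
    """Events held in open sink-side transactions if every sink has a full batch.

    Hypothetical helper: a worst-case estimate, not a Flume API.
    """
    return num_sinks * sink_batch_size


def source_burst(num_sources: int, source_batch_size: int) -> int:
    """Events put on the channel if every source commits one batch at once.

    Hypothetical helper: a worst-case estimate, not a Flume API.
    """
    return num_sources * source_batch_size


# Numbers taken from the thread; the last entry assumes batch 1000 is kept.
setups = {
    "original: 64 sinks, batch 10000": in_flight_events(64, 10_000),
    "scaled back: 16 sinks, batch 1000": in_flight_events(16, 1_000),
    "per channel: 4 sinks, batch 1000": in_flight_events(4, 1_000),
}

for name, events in setups.items():
    print(f"{name}: {events:,} events in flight")

# 124 ExecSources with batchSize 100, all committing at the same moment.
print(f"source burst: {source_burst(124, 100):,} events")
```

Under these assumptions the original setup could tie up 640,000 events' worth of channel reservation, which matches the corrected figure in the thread, while the scaled-back setup needs at most 16,000.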
