Hi all. I need some thoughts on sizing/tuning of the above (common) route in FlumeNG to maximize throughput. Here is my setup:
*Source JVM (ExecSource/MemoryChannel/AvroSink):*
-Xmx4g -Xms4g -XX:MaxDirectMemorySize=256m
Number of ExecSources in config: 124 (yes, it's a ton. Can't do anything about it :)
The write rate to the source files is fairly fast and bursty.
ExecSource.batchSize = 1000 (so, when all 124 tail -F instances get 1000 events, they all dump to the memory channel)
MemoryChannel.capacity = 1000000
MemoryChannel.transactionCapacity = 1000 (somewhat unclear on what this is. Docs say "The number of events stored in the channel per transaction", but what is a "transaction" to a MemoryChannel?)
AvroSink.batchSize = 1000

*Destination JVM (AvroSource/FileChannel/HDFSSink)*
(Cluster of two JVMs on two servers, each configured the same, as below)
-Xms2g -Xmx2g
-XX:MaxDirectMemorySize is not defined, so whatever the default is
AvroSource.threads = 64
FileChannel.transactionCapacity = 1000
FileChannel.capacity = 32000000
HDFSSink.batchSize = 1000
HDFSSink.threadPoolSize = 64

With this configuration, after about 5 minutes I get the common Exception: "Space for commit to queue couldn't be acquired Sinks are likely not keeping up with sources, or the buffer size is too tight" on the Source JVM. Heap usage there is nowhere near the 4g max; it is only at about 2.5g.

I'm wondering about the logic of setting all the batch sizes/transaction sizes to 1000. My thought was that this would avoid fragmenting the transfer of data, but maybe that's flawed? Should the sizes be different?

I'm also curious whether increasing the MaxDirectMemorySize to something larger than 256MB would help. I tried removing it altogether in my Source JVM (which makes the size unbounded), but that didn't seem to make a difference.

I'm having some trouble figuring out where the backup is happening, and how to open up the gates. :)

Thanks in advance for any suggestions.

Chris
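
For reference, here is a minimal sketch of how the Source JVM settings listed above might be written in a Flume properties file. It is an illustration, not the actual config: the agent and component names (a1, r1, c1, k1), the log path, and the host and port are all placeholders, and only one of the 124 exec sources is shown.

```properties
# Hypothetical names: agent "a1", source "r1", channel "c1", sink "k1".
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# One of the 124 exec sources; the log path is a placeholder.
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /path/to/app.log
a1.sources.r1.batchSize = 1000
a1.sources.r1.channels = c1

# Memory channel sized as described in the post.
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000000
a1.channels.c1.transactionCapacity = 1000

# Host and port are placeholders for one of the two destination servers.
# The Flume user guide spells the Avro sink batch property "batch-size".
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = collector.example.com
a1.sinks.k1.port = 4141
a1.sinks.k1.batch-size = 1000
a1.sinks.k1.channel = c1
```

Laid out this way, it is easier to see that the source batch, the channel transaction capacity, and the sink batch are three separate knobs that all happen to be set to 1000.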