Hi, what version of NG are you running? Comments inline below.
On Tue, Nov 6, 2012 at 8:10 PM, Cameron Gandevia <[email protected]> wrote:

> Hi
>
> I am trying to transition some flume nodes running FlumeOG to FlumeNG but
> am running into a few difficulties. We are writing around 16,000 events/s
> from a bunch of FlumeOG agents to a FlumeNG agent, but we can't seem to get
> the FlumeNG agent to drain the memory channel fast enough. At first I
> thought maybe we were reaching the limit of a single Flume agent, but I get
> similar performance using a file channel, which doesn't make sense.
>
> I have tried configuring anywhere from a single HDFS sink up to twenty of
> them, and I have also tried changing the batch sizes from 1,000 up to
> 100,000, but no matter what I do the channel fills fairly quickly.
>
> I am running a single flow using the configuration below:
>
> ${FLUME_COLLECTOR_ID}.channels.hdfs-memoryChannel.type = memory
> ${FLUME_COLLECTOR_ID}.channels.hdfs-memoryChannel.capacity = 1000000
> ${FLUME_COLLECTOR_ID}.channels.hdfs-memoryChannel.transactionCapacity = 100000
>
> ${FLUME_COLLECTOR_ID}.sources.perf_legacysource.type = org.apache.flume.source.thriftLegacy.ThriftLegacySource
> ${FLUME_COLLECTOR_ID}.sources.perf_legacysource.host = 0.0.0.0
> ${FLUME_COLLECTOR_ID}.sources.perf_legacysource.port = 36892
> ${FLUME_COLLECTOR_ID}.sources.perf_legacysource.channels = hdfs-memoryChannel
> ${FLUME_COLLECTOR_ID}.sources.perf_legacysource.selector.type = replicating
>
> ${FLUME_COLLECTOR_ID}.sinks.hdfs-sink.type = hdfs
> ${FLUME_COLLECTOR_ID}.sinks.hdfs-sink.hdfs.path = hdfs://${HADOOP_NAMENODE}:8020/rawLogs/%Y-%m-%d/%H00
> ${FLUME_COLLECTOR_ID}.sinks.hdfs-sink.hdfs.codeC = com.hadoop.compression.lzo.LzopCodec
> ${FLUME_COLLECTOR_ID}.sinks.hdfs-sink.hdfs.fileType = CompressedStream
> ${FLUME_COLLECTOR_ID}.sinks.hdfs-sink.hdfs.rollInterval = 300
> ${FLUME_COLLECTOR_ID}.sinks.hdfs-sink.hdfs.rollSize = 0
> ${FLUME_COLLECTOR_ID}.sinks.hdfs-sink.hdfs.threadsPoolSize = 10
> ${FLUME_COLLECTOR_ID}.sinks.hdfs-sink.hdfs.rollCount = 0
> ${FLUME_COLLECTOR_ID}.sinks.hdfs-sink.hdfs.batchSize = 50000
> ${FLUME_COLLECTOR_ID}.sinks.hdfs-sink.hdfs.callTimeout = 120000
> ${FLUME_COLLECTOR_ID}.sinks.hdfs-sink.hdfs.filePrefix = ${FLUME_COLLECTOR_ID}_1
> ${FLUME_COLLECTOR_ID}.sinks.hdfs-sink.txnEventMax = 1000

I think this should be:

${FLUME_COLLECTOR_ID}.sinks.hdfs-sink.hdfs.txnEventMax = 50000

It is spelled wrong (it is missing the "hdfs." prefix), and it should be equal to your batch size. I believe we removed that parameter in trunk.

> ${FLUME_COLLECTOR_ID}.sinks.hdfs-sink.serializer = text
> ${FLUME_COLLECTOR_ID}.sinks.hdfs-sink.channel = hdfs-memoryChannel
>
> Thanks
>
> Cameron Gandevia

--
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
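P.S. To make that concrete, here is a sketch of the two sink lines once the key is fixed, assuming you stay on a release that still honors hdfs.txnEventMax (the 50000 is simply chosen to match your hdfs.batchSize):

${FLUME_COLLECTOR_ID}.sinks.hdfs-sink.hdfs.batchSize = 50000
${FLUME_COLLECTOR_ID}.sinks.hdfs-sink.hdfs.txnEventMax = 50000

If I remember right, with the unprefixed key the sink never sees the setting and falls back to its much smaller default transaction size, so each take against the memory channel moves far fewer than 50,000 events, which would line up with the channel filling faster than the sinks can drain it.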
