In our testing we have seen up to 60 MB/s to HDFS when running 8 to 10 sinks
per agent, using a file channel with a single dataDir.
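
For illustration, a setup of that shape might look roughly like the sketch
below. It is not the configuration that was tested: the names fileChannel,
hdfsSink1 and hdfsSink2 and the /flume/... directories are hypothetical, only
two of the 8 to 10 sinks are shown, and the source definition is omitted.

```properties
# Illustrative sketch only; component names and local paths are hypothetical.
agent.channels = fileChannel
agent.sinks = hdfsSink1 hdfsSink2

# File channel with a single data directory
agent.channels.fileChannel.type = file
agent.channels.fileChannel.checkpointDir = /flume/checkpoint
agent.channels.fileChannel.dataDirs = /flume/data

# Several HDFS sinks draining the same channel in parallel
agent.sinks.hdfsSink1.type = hdfs
agent.sinks.hdfsSink1.channel = fileChannel
agent.sinks.hdfsSink1.hdfs.path = /tmp/lohit
# A distinct file prefix per sink keeps the sinks from colliding on file names
agent.sinks.hdfsSink1.hdfs.filePrefix = sink1

agent.sinks.hdfsSink2.type = hdfs
agent.sinks.hdfsSink2.channel = fileChannel
agent.sinks.hdfsSink2.hdfs.path = /tmp/lohit
agent.sinks.hdfsSink2.hdfs.filePrefix = sink2
```

Each sink runs in its own thread, so adding sinks on one channel is the usual
way to get more parallel writers; the remaining sinks would repeat the same
pattern with their own names and prefixes.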


From: lohit [mailto:[email protected]]
Sent: Wednesday, July 15, 2015 11:11 AM
To: [email protected]
Subject: HDFS Sink performance

Hello,

Does anyone have numbers they can share on HDFS sink performance? In our
testing, a single sink writing to HDFS (CompressedStream) and reading from a
MemoryChannel can only do about 35,000 events per second (each event is about
1 KB in size). After compression this works out to a ~10 MB/s write stream to
the HDFS file, which is pretty low. Our configuration looks like this:

agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.channel = memoryChannel
agent.sinks.hdfsSink.hdfs.path = /tmp/lohit
agent.sinks.hdfsSink.hdfs.codeC = lzo
agent.sinks.hdfsSink.hdfs.fileType = CompressedStream
agent.sinks.hdfsSink.hdfs.writeFormat = Writable
agent.sinks.hdfsSink.hdfs.rollInterval = 3600
agent.sinks.hdfsSink.hdfs.rollSize = 1073741824
agent.sinks.hdfsSink.hdfs.rollCount = 0
agent.sinks.hdfsSink.hdfs.batchSize = 10000
agent.sinks.hdfsSink.hdfs.txnEventMax = 10000

agent.channels.memoryChannel.type = memory

agent.channels.memoryChannel.capacity = 3000000
agent.channels.memoryChannel.transactionCapacity = 10000

--
Have a Nice Day!
Lohit

