Flume version: 1.4 (compiled with Hadoop 2)
HDFS version:

I have the following agent config:
- 1 Avro source (threads = 24, deflate compression)
- 1 file channel
- 4 HDFS sinks (thread pool size 2, write to a new HDFS directory every 5 min, bzip2 compression)
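Since attachments are often stripped on the list, here is a minimal sketch of the topology described above. Agent and component names, the bind address, port, and all paths are placeholders, and only one of the four identical HDFS sinks is shown:

    agent1.sources = avroSrc
    agent1.channels = fileCh
    agent1.sinks = hdfsSink1 hdfsSink2 hdfsSink3 hdfsSink4

    # Avro source: 24 worker threads, deflate-compressed RPC
    agent1.sources.avroSrc.type = avro
    agent1.sources.avroSrc.bind = 0.0.0.0
    agent1.sources.avroSrc.port = 4141
    agent1.sources.avroSrc.threads = 24
    agent1.sources.avroSrc.compression-type = deflate
    agent1.sources.avroSrc.channels = fileCh

    # File channel (checkpoint/data dirs are placeholders)
    agent1.channels.fileCh.type = file
    agent1.channels.fileCh.checkpointDir = /flume/checkpoint
    agent1.channels.fileCh.dataDirs = /flume/data

    # One of four identical HDFS sinks; hdfsSink2-4 differ only in name.
    # Path escapes plus round/roundValue/roundUnit give a new directory every 5 min.
    agent1.sinks.hdfsSink1.type = hdfs
    agent1.sinks.hdfsSink1.channel = fileCh
    agent1.sinks.hdfsSink1.hdfs.path = hdfs://namenode/flume/events/%Y%m%d/%H%M
    agent1.sinks.hdfsSink1.hdfs.round = true
    agent1.sinks.hdfsSink1.hdfs.roundValue = 5
    agent1.sinks.hdfsSink1.hdfs.roundUnit = minute
    agent1.sinks.hdfsSink1.hdfs.codeC = bzip2
    agent1.sinks.hdfsSink1.hdfs.fileType = CompressedStream
    agent1.sinks.hdfsSink1.hdfs.threadsSize = 2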
Event size: ~500 bytes. Physical RAM: 64 GB.

The Java max heap is capped at 8 GB, and actual heap consumption on the running instance is well below that (a few hundred MB). However, in the 'top' output I see the total virtual memory size and resident set size steadily increasing over time, well beyond 8 GB. Once the resident set size gets close to the size of physical RAM (Flume consuming ~95% of it), the operating system kills the Flume process, presumably via the kernel OOM killer, leaving no trace of this death in the Flume logs.

Here is a first sample of the top output:

    PID   USER  PR  NI  VIRT   RES  SHR   S  %CPU  %MEM  TIME+      COMMAND
    15625 root  20   0  30.1g  10g  53m   S   2.0  16.0   40:45.93  java

and one from a few hours later:

    PID   USER  PR  NI  VIRT   RES  SHR   S  %CPU  %MEM  TIME+      COMMAND
    15625 root  20   0  61.3g  36g  1072  S  12.0  57.4  126:10.98  java

When I replace the HDFS sinks with a null sink, the problem goes away and the process remains very stable, so the file channel does not seem to be the culprit. A sample config is attached.
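A note on the missing death record: when the kernel OOM killer terminates a process, it writes to the kernel log rather than the application's own log, so Flume's logs staying silent is expected. Something like the following (a sketch; exact message wording varies by kernel version, and the PID is taken from the top output above) should confirm the kill and show which memory mappings account for the resident growth outside the 8 GB heap:

    # Confirm the kernel OOM killer terminated the agent
    dmesg | grep -iE 'killed process|out of memory'

    # Snapshot the largest resident mappings of the Flume JVM
    # (column 3 of pmap -x output is RSS in KB)
    pmap -x 15625 | sort -n -k3 | tail -20

Since the heap itself stays small and the problem disappears with the null sink, the growth presumably lives in native allocations made on the HDFS write path (e.g. compression codecs), which pmap should make visible as large anonymous mappings.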
