Flume version: 1.4 (compiled with Hadoop 2)
HDFS version:

I have the following agent config (a rough sketch follows below):
 - 1 Avro source (threads = 24, deflate compression)
 - 1 file channel
 - 4 HDFS sinks (thread pool size 2, each writing to a new HDFS directory
   every 5 min, bzip2 compression)

Event size: ~500 bytes.
Physical RAM: 64 GB
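
Here is a minimal sketch of that topology in Flume properties form (the
agent name, port, and paths are made up for illustration, and only one of
the four identical HDFS sinks is shown; the actual config is attached):

agent.sources = avroSrc
agent.channels = fileCh
agent.sinks = hdfsSink1 hdfsSink2 hdfsSink3 hdfsSink4

agent.sources.avroSrc.type = avro
agent.sources.avroSrc.bind = 0.0.0.0
agent.sources.avroSrc.port = 4141
agent.sources.avroSrc.threads = 24
agent.sources.avroSrc.compression-type = deflate
agent.sources.avroSrc.channels = fileCh

agent.channels.fileCh.type = file
agent.channels.fileCh.checkpointDir = /flume/checkpoint
agent.channels.fileCh.dataDirs = /flume/data

# one of the four identical sinks; hdfsSink2..4 differ only in path
agent.sinks.hdfsSink1.type = hdfs
agent.sinks.hdfsSink1.channel = fileCh
agent.sinks.hdfsSink1.hdfs.path = hdfs://namenode/flume/events/%Y%m%d/%H%M
agent.sinks.hdfsSink1.hdfs.round = true
agent.sinks.hdfsSink1.hdfs.roundValue = 5
agent.sinks.hdfsSink1.hdfs.roundUnit = minute
agent.sinks.hdfsSink1.hdfs.fileType = CompressedStream
agent.sinks.hdfsSink1.hdfs.codeC = bzip2
agent.sinks.hdfsSink1.hdfs.threadsPoolSize = 2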


The Java max heap size is capped at 8 GB, and the actual Java heap
consumption on the running instance is well below that (a few hundred MB).
However, I am noticing in the 'top' output that the total virtual memory
size and resident set size keep steadily increasing over time (well beyond
8 GB). Once the total resident set size of all processes comes close to
the size of physical RAM (with Flume consuming ~95% of it), the operating
system (presumably the kernel OOM killer) kills the Flume process, leaving
no trace of this death in the Flume logs.
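
Since the heap itself stays small, the extra growth must be off-heap
(native allocations, e.g. from compression codecs or direct byte buffers).
Assuming Linux, one way to confirm where it grows is to snapshot the
process mappings periodically and compare, e.g.:

pmap -x 15625 | sort -n -k3 | tail -20

(15625 is the Flume PID from the top samples below; anon mappings whose
RSS keeps climbing between snapshots are native memory, not Java heap.)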

Here is the first sample of top output:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
15625 root      20   0 30.1g  10g  53m S  2.0 16.0  40:45.93 java


Here is another from a few hours later:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
15625 root      20   0 61.3g  36g 1072 S 12.0 57.4 126:10.98 java



When I replace the HDFS sinks with a null sink, the problem goes away and
the process remains very stable, so the file channel does not seem to be
the culprit.
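
For reference, that null-sink test just replaces the four HDFS sink
definitions with something like this (the sink name is illustrative,
matching the sketch above):

agent.sinks = nullSink
agent.sinks.nullSink.type = null
agent.sinks.nullSink.channel = fileCh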

A sample config is attached.
