Roshan,

These parameters are hdfs.idleTimeout and hdfs.maxOpenFiles. Thanks to some historical configuration formatting, the hdfs. prefix is part of the key, so you need to set them on the sink as, for example, agent.sinks.hdfsSink.hdfs.idleTimeout.
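
As a rough sketch of what that could look like in the agent properties file: the agent name (agent) and sink name (hdfsSink) are placeholders, so substitute the names from your own config, and the values are illustrative assumptions rather than tuned recommendations.

```properties
# Placeholder agent/sink names; values are illustrative, not recommendations.
# Close a bucket writer after 60 seconds without writes
# (the default of 0 never closes files on idle).
agent.sinks.hdfsSink.hdfs.idleTimeout = 60
# Cap the number of files the sink keeps open at once (default is 5000).
agent.sinks.hdfsSink.hdfs.maxOpenFiles = 50
```

With four HDFS sinks, the same two keys would need to be repeated under each sink's name.
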
Thanks,
Hari

On Tuesday, December 10, 2013 at 11:58 AM, Hari Shreedharan wrote:
> Flume config - these are parameters for the HDFS sink.
>
> Thanks,
> Hari
>
> On Tuesday, December 10, 2013 at 11:54 AM, Steve Morin wrote:
> > Hari, Hadoop config or in the Flume config?
> >
> > On Dec 10, 2013, at 11:30, Hari Shreedharan <[email protected]> wrote:
> > > The reason for this is the direct memory allocations by HDFS codecs.
> > > Reduce your maxOpenFiles and idleTimeout to have the bucket writers
> > > garbage collected regularly.
> > >
> > > Thanks,
> > > Hari
> > >
> > > On Tuesday, December 10, 2013 at 11:19 AM, Roshan Naik wrote:
> > > > Flume version: 1.4 (compiled with hadoop 2)
> > > > HDFS version:
> > > >
> > > > I have the following agent config:
> > > > - 1 avro source (threads = 24, deflate compression)
> > > > - 1 file channel
> > > > - 4 hdfs sinks (thread pool size 2, write to a new hdfs directory every
> > > >   5 min, bzip2)
> > > >
> > > > Event size: ~500 bytes
> > > > Physical RAM: 64 GB
> > > >
> > > > The java max heap size is capped at 8 GB and the actual java heap
> > > > consumption on the running instance is well below that (a few hundred MB).
> > > > However, I am noticing in the 'top' output that the total virtual memory
> > > > size and resident set size keep steadily increasing over time (well
> > > > beyond 8 GB). Once the total resident set size of all the processes comes
> > > > close to the size of physical RAM (flume consuming 95% of it), the
> > > > operating system nukes the flume process, leaving no trace of this death
> > > > in the flume logs.
> > > >
> > > > Here is the first sample top output.
> > > >
> > > >   PID USER  PR  NI   VIRT  RES   SHR S  %CPU %MEM      TIME+ COMMAND
> > > > 15625 root  20   0  30.1g  10g   53m S   2.0 16.0   40:45.93 java
> > > >
> > > > Here is one after a few hours:
> > > >
> > > >   PID USER  PR  NI   VIRT  RES   SHR S  %CPU %MEM      TIME+ COMMAND
> > > > 15625 root  20   0  61.3g  36g  1072 S  12.0 57.4  126:10.98 java
> > > >
> > > > When I replace the HDFS sink with a null sink, the problem goes away and
> > > > the process remains very stable. So the file channel does not seem to be
> > > > the culprit.
> > > >
> > > > Sample config is attached.
