Roshan, 

The parameters are hdfs.idleTimeout and hdfs.maxOpenFiles. Because of some 
historical configuration formatting, you need to set them with the full sink 
prefix, e.g. agent.sinks.hdfsSink.hdfs.idleTimeout. 
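
For illustration, here is a minimal sketch of how the two settings might look 
in the agent's properties file. The agent name "agent" and sink name 
"hdfsSink" are placeholders, and the values are arbitrary examples rather than 
recommendations; tune them for your workload.

```properties
# Hypothetical names: agent "agent", sink "hdfsSink". Values are illustrative only.

# Close a file after 60 seconds with no writes (the default, 0, disables this).
agent.sinks.hdfsSink.hdfs.idleTimeout = 60

# Allow at most 50 files open at once; when exceeded, the oldest file is closed.
agent.sinks.hdfsSink.hdfs.maxOpenFiles = 50
```

With four HDFS sinks, as in your setup, each sink would need its own pair of 
lines under its own sink name.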


Thanks,
Hari


On Tuesday, December 10, 2013 at 11:58 AM, Hari Shreedharan wrote:

> Flume config - these are parameters for the HDFS sink. 
> 
> 
> Thanks,
> Hari
> 
> 
> On Tuesday, December 10, 2013 at 11:54 AM, Steve Morin wrote:
> 
> > Hari, is that in the Hadoop config or in the Flume config?
> > 
> > > On Dec 10, 2013, at 11:30, Hari Shreedharan <[email protected]> wrote:
> > > 
> > > The reason for this is the direct memory allocations by HDFS codecs. 
> > > Reduce your maxOpenFiles and idleTimeout to have the bucket writers 
> > > garbage collected regularly. 
> > > 
> > > 
> > > Thanks,
> > > Hari
> > > 
> > > 
> > > > On Tuesday, December 10, 2013 at 11:19 AM, Roshan Naik wrote:
> > > > 
> > > > Flume version: 1.4 (compiled with hadoop 2)
> > > > HDFS version:
> > > > 
> > > > I have the following agent config:
> > > > - 1 avro source, (threads = 24, deflate compression)
> > > > - 1 file channel
> > > > - 4 hdfs sinks (thread pool size 2, write to a new hdfs directory every 
> > > > 5 min, bzip2)
> > > > 
> > > > Event size: ~500 bytes
> > > > Physical RAM: 64 GB
> > > > 
> > > > 
> > > > The Java max heap size is capped at 8 GB, and the actual Java heap 
> > > > consumption on the running instance is well below that (a few hundred MB).
> > > > However, I am noticing in the 'top' output that the total virtual memory 
> > > > size and resident set size keep increasing steadily over time (well 
> > > > beyond 8 GB). Once the resident set size of the process comes close to 
> > > > the size of physical RAM (Flume consuming about 95% of it), the 
> > > > operating system kills the Flume process, leaving no trace of its death 
> > > > in the Flume logs.
> > > > 
> > > > Here is the first sample top output.
> > > > 
> > > >   PID USER PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
> > > > 15625 root 20  0 30.1g  10g  53m S  2.0 16.0  40:45.93 java
> > > > 
> > > > Here is one after a few hours
> > > > 
> > > >   PID USER PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
> > > > 15625 root 20  0 61.3g  36g 1072 S 12.0 57.4 126:10.98 java
> > > > 
> > > > 
> > > > 
> > > > 
> > > > When I replace the HDFS sink with a null sink, the problem goes away and 
> > > > the process remains very stable. So the file channel does not seem to be 
> > > > the culprit. 
> > > > 
> > > > A sample config is attached.
