I know about idleTimeout, rollSize, and rollCount (which are about rolling over the file being written).
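For clarity, this is how those settings would look if set explicitly in my sink config. This is only a sketch: the values are the defaults as I read them from the Flume User Guide, not values I have tested.

```
# Sketch: explicit timeout/retry settings for the hdfs2 sink, using the
# defaults as documented in the Flume User Guide (untested, illustration only).
# callTimeout is in milliseconds; close() timed out after this in my logs.
hadoop2.sinks.hdfs2.hdfs.callTimeout = 10000
# 0 means keep retrying close until it succeeds
hadoop2.sinks.hdfs2.hdfs.closeTries = 0
# seconds between close retries
hadoop2.sinks.hdfs2.hdfs.retryInterval = 180
```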
I didn't set callTimeout, so the default 10s will be applied. closeTries and retryInterval haven't been set either. So I think that even if a close fails once, the close should be retried after 180s (the default retryInterval). But as you can see in the logs above, the close retry never happens. Am I wrong?

2016-07-20 17:25 GMT+09:00 Chris Horrocks <[email protected]>:

> You could look at tuning either hdfs.idleTimeout, hdfs.callTimeout, or
> hdfs.retryInterval, which can all be found at:
> http://flume.apache.org/FlumeUserGuide.html#hdfs-sink
>
> --
> Chris Horrocks
>
> On Wed, Jul 20, 2016 at 9:01 am, no jihun <'[email protected]'> wrote:
>
>> @chris If you meant hdfs.callTimeout, I am running a test on that now.
>> I can increase the value.
>> When a timeout occurs during close, will the close never be retried
>> (as in the logs above)?
>>
>> 2016-07-20 16:50 GMT+09:00 Chris Horrocks <[email protected]>:
>>
>>> Have you tried increasing the HDFS sink timeouts?
>>>
>>> --
>>> Chris Horrocks
>>>
>>> On Wed, Jul 20, 2016 at 8:03 am, no jihun <'[email protected]'> wrote:
>>>
>>>> Hi.
>>>>
>>>> I found some files on HDFS left in OPEN_FOR_WRITE state.
>>>>
>>>> *This is flume's log about the file:*
>>>>
>>>> 18 7 2016 16:12:02,765 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor]
>>>>   (org.apache.flume.sink.hdfs.BucketWriter.open:234)
>>>>   - Creating 1468825922758.avro.tmp
>>>> 18 7 2016 16:22:39,812 INFO [hdfs-hdfs2-roll-timer-0]
>>>>   (org.apache.flume.sink.hdfs.BucketWriter$5.call:429)
>>>>   - Closing idle bucketWriter 1468825922758.avro.tmp at 1468826559812
>>>> 18 7 2016 16:22:39,812 INFO [hdfs-hdfs2-roll-timer-0]
>>>>   (org.apache.flume.sink.hdfs.BucketWriter.close:363)
>>>>   - Closing 1468825922758.avro.tmp
>>>> 18 7 2016 16:22:49,813 WARN [hdfs-hdfs2-roll-timer-0]
>>>>   (org.apache.flume.sink.hdfs.BucketWriter.close:370)
>>>>   - failed to close() HDFSWriter for file (1468825922758.avro.tmp).
>>>>     Exception follows.
>>>> java.io.IOException: Callable timed out after 10000 ms on file:
>>>>   1468825922758.avro.tmp
>>>> 18 7 2016 16:22:49,816 INFO [hdfs-hdfs2-call-runner-7]
>>>>   (org.apache.flume.sink.hdfs.BucketWriter$8.call:629)
>>>>   - Renaming 1468825922758.avro.tmp to 1468825922758.avro
>>>>
>>>> - It seems the close was never retried.
>>>> - Flume just renamed the file, which was still open.
>>>>
>>>> *2 days later I found that file with this command:*
>>>>
>>>> hdfs fsck /data/flume -openforwrite | grep "OPENFORWRITE" \
>>>>   | grep "2016/07/18" | grep -v ".avro.tmp" \
>>>>   | sed -n 's|.*\(/data/flume/.*avro\).*|\1|p'
>>>>
>>>> *So, I recoverLease-ed:*
>>>>
>>>> hdfs debug recoverLease -path 1468825922758.avro -retries 3
>>>> recoverLease returned false.
>>>> Retrying in 5000 ms...
>>>> Retry #1
>>>> recoverLease SUCCEEDED on 1468825922758.avro
>>>>
>>>> *My hdfs sink configuration:*
>>>>
>>>> hadoop2.sinks.hdfs2.type = hdfs
>>>> hadoop2.sinks.hdfs2.channel = fileCh1
>>>> hadoop2.sinks.hdfs2.hdfs.fileType = DataStream
>>>> hadoop2.sinks.hdfs2.serializer = ....
>>>> hadoop2.sinks.hdfs2.serializer.compressionCodec = snappy
>>>> hadoop2.sinks.hdfs2.hdfs.filePrefix = %{type}_%Y-%m-%d_%{host}
>>>> hadoop2.sinks.hdfs2.hdfs.fileSuffix = .avro
>>>> hadoop2.sinks.hdfs2.hdfs.rollInterval = 3700
>>>> #hadoop2.sinks.hdfs2.hdfs.rollSize = 67000000
>>>> hadoop2.sinks.hdfs2.hdfs.rollSize = 800000000
>>>> hadoop2.sinks.hdfs2.hdfs.rollCount = 0
>>>> hadoop2.sinks.hdfs2.hdfs.batchSize = 10000
>>>> hadoop2.sinks.hdfs2.hdfs.idleTimeout = 300
>>>>
>>>> hdfs.closeTries and retryInterval are both not set.
>>>>
>>>> *My question is:*
>>>> Why was '1468825922758.avro' left OPEN_FOR_WRITE even though it was
>>>> renamed to .avro successfully?
>>>> Is this expected behavior? If so, what should I do to eliminate these
>>>> anomalous OPENFORWRITE files?
>>>>
>>>> Regards,
>>>> Jihun.
--
----------------------------------------------
Jihun No ( 노지훈 )
----------------------------------------------
Twitter : @nozisim
Facebook : nozisim
Website : http://jeesim2.godohosting.com
---------------------------------------------------------------------------------
Market Apps : android market products.
<https://market.android.com/developer?pub=%EB%85%B8%EC%A7%80%ED%9B%88>
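P.S. The detect-and-recover steps quoted above can be combined into one small script. This is only a sketch of what I run by hand: it assumes the `hdfs` CLI is on PATH, that the files live under /data/flume, and that fsck prints the absolute path at the start of each OPENFORWRITE line (the exact fsck output format may differ between Hadoop versions).

```shell
#!/bin/sh
# Sketch: find .avro files still OPENFORWRITE under /data/flume and
# recover their leases. Skips in-flight .avro.tmp files on purpose.
hdfs fsck /data/flume -openforwrite 2>/dev/null \
  | grep 'OPENFORWRITE' \
  | grep -v '\.avro\.tmp' \
  | sed -n 's|^\(/data/flume/[^ ]*\.avro\).*|\1|p' \
  | while read -r path; do
      # recoverLease forces the NameNode to close the file's open lease
      hdfs debug recoverLease -path "$path" -retries 3
    done
```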
