[ 
https://issues.apache.org/jira/browse/FLUME-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070311#comment-14070311
 ] 

chenshangan commented on FLUME-2429:
------------------------------------

testAgent.sinks.testSink.hdfs.callTimeout = 15000
The callTimeout here (15000 ms) is too short for an HDFS operation; I use 
180000 in my production environment. Keep in mind that HDFS operations 
sometimes take a long time and errors can happen, so you should handle these 
exceptions. Sometimes blocks of a file are lost, and the file can then never 
be closed. In flume-1.5, there is a parameter that controls how many times 
the sink tries to close a file.
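
As a rough sketch of the change suggested above, the sink section could look 
something like the fragment below. The comment does not name the flume-1.5 
parameter; I am assuming it means hdfs.closeTries (with hdfs.retryInterval 
setting the pause between attempts), so please check both against the HDFS 
sink documentation for your version. The values 5 and 180 are only 
illustrative, not recommendations.

```properties
# Raise the per-call HDFS timeout from 15000 ms to the value quoted above.
testAgent.sinks.testSink.hdfs.callTimeout = 180000

# Assumption: these are the flume-1.5 close/retry settings the comment refers to.
# closeTries: number of close attempts (illustrative value).
testAgent.sinks.testSink.hdfs.closeTries = 5
# retryInterval: seconds between consecutive close attempts (illustrative value).
testAgent.sinks.testSink.hdfs.retryInterval = 180
```

A longer callTimeout only gives slow HDFS calls more room to finish; it does 
not explain why the calls are slow in the first place.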

> Callable timed out in HDFS sink
> -------------------------------
>
>                 Key: FLUME-2429
>                 URL: https://issues.apache.org/jira/browse/FLUME-2429
>             Project: Flume
>          Issue Type: Bug
>    Affects Versions: v1.4.0
>            Reporter: Jay
>
> Hi.
> I got a warning msg using HDFS sink.
> AVRO source > Memory (or File) channel > HDFS sink
> Switching channel type didn't solve the problem.
> Error occurs once a day or several days.
> Any Solution?
> Here is my configuration.
> --------------------------------------------------------------------
> testAgent.sources = testSrc
> testAgent.channels = testChannel
> testAgent.sinks = testSink
> testAgent.sources.testSrc.type = avro
> testAgent.sources.testSrc.channels = testChannel
> testAgent.channels.testChannel.type = memory
> testAgent.sources.testSrc.bind = 0.0.0.0
> testAgent.sources.testSrc.port = 4141
> testAgent.sinks.testSink.type = hdfs
> testAgent.sinks.testSink.channel = testChannel
> testAgent.sources.testSrc.interceptors = testInterceptor
> testAgent.sources.testSrc.interceptors.testInterceptor.type = static
> testAgent.sources.testSrc.interceptors.testInterceptor.preserveExisting = true
> testAgent.sources.testSrc.interceptors.testInterceptor.key = testKey
> testAgent.sources.testSrc.interceptors.testInterceptor.value = .testfile
> testAgent.sinks.testSink.hdfs.path = hdfs://hadoop-cluster:8020/flume/%Y%m%d
> testAgent.sinks.testSink.hdfs.filePrefix = %Y%m%d%H%M
> testAgent.sinks.testSink.hdfs.fileSuffix = .testfile
> testAgent.sinks.testSink.hdfs.fileType = DataStream
> testAgent.sinks.testSink.hdfs.rollInterval = 1
> testAgent.sinks.testSink.hdfs.rollCount = 0
> testAgent.sinks.testSink.hdfs.rollSize = 0
> testAgent.sinks.testSink.hdfs.batchSize = 150000
> testAgent.sinks.testSink.hdfs.callTimeout = 15000
> testAgent.sinks.testSink.hdfs.useLocalTimeStamp = true
> testAgent.sinks.testSink.serializer = text
> testAgent.sinks.testSink.serializer.appendNewline = false
> testAgent.channels.testChannel.keep-alive = 1
> testAgent.channels.testChannel.write-timeout = 1
> testAgent.channels.testChannel.transactionCapacity = 150000
> testAgent.channels.testChannel.capacity = 18000000
> #testAgent.channels.testChannel.checkpointDir = /data/flumedata/checkpoint
> #testAgent.channels.testChannel.useDualCheckpoints = true
> #testAgent.channels.testChannel.backupCheckpointDir = /data/flumedata_backup/checkpoint
> #testAgent.channels.testChannel.dataDirs = /data/flumedata/data
> testAgent.channels.testChannel.byteCapacityBufferPercentage = 20
> testAgent.channels.testChannel.byteCapacity = 1000000000
> --------------------------------------------------------------------
> I sometimes get a warning message in a flume log.
> --------------------------------------------------------------------
> 2014-07-22 16:28:20,186 (SinkRunner-PollingRunner-DefaultSinkProcessor) [WARN 
> - org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:477)] 
> Caught IOException writing to HDFSWriter (Callable timed out after 15000 ms 
> on file: 
> hdfs://hadoop-cluster:8020/flume/20140722/201407221628.1406014084417.testfile.tmp).
>  Closing file 
> (hdfs://hadoop-cluster:8020/flume/20140722/201407221628.1406014084417.testfile.tmp)
>  and rethrowing exception.
> 2014-07-22 16:28:35,187 (SinkRunner-PollingRunner-DefaultSinkProcessor) [WARN 
> - org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:483)] 
> Caught IOException while closing file 
> (hdfs://hadoop-cluster:8020/flume/20140722/201407221628.1406014084417.testfile.tmp).
>  Exception follows.
> java.io.IOException: Callable timed out after 15000 ms on file: 
> hdfs://search-hdanal-cluster:8020/flume/20140722/201407221628.1406014084417.testfile.tmp
>         at 
> org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:603)
>         at 
> org.apache.flume.sink.hdfs.BucketWriter.doFlush(BucketWriter.java:381)
>         at 
> org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:343)
>         at 
> org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:292)
>         at 
> org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:481)
>         at 
> org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:401)
>         at 
> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>         at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.util.concurrent.TimeoutException
>         at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:91)
>         at 
> org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:596)
>         ... 8 more
> 2014-07-22 16:28:35,187 (SinkRunner-PollingRunner-DefaultSinkProcessor) [WARN 
> - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:438)] 
> HDFS IO error
> java.io.IOException: Callable timed out after 15000 ms on file: 
> hdfs://hadoop-cluster:8020/flume/20140722/201407221628.1406014084417.testfile.tmp
>         at 
> org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:603)
>         at 
> org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:469)
>         at 
> org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:401)
>         at 
> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>         at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.util.concurrent.TimeoutException
>         at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:91)
>         at 
> org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:596)
>         ... 5 more
> --------------------------------------------------------------------



--
This message was sent by Atlassian JIRA
(v6.2#6252)
