[ https://issues.apache.org/jira/browse/FLUME-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14342228#comment-14342228 ]

Johny Rufus commented on FLUME-2429:
------------------------------------

[~jaoo62], a short callTimeout leads to a scenario where the HDFS cluster does 
not complete the call within the time the HDFS sink in Flume waits for it. In 
that case Flume retries the entire transaction, and events that were already 
written as part of the failed transaction are written to HDFS again as part of 
the retried transaction. That is why the timeout value should be large enough 
to accommodate the current performance limits of your HDFS cluster.
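
For example, raising the sink timeout in the reporter's configuration would 
look like the sketch below (60000 ms is only an illustrative value, not a 
tested recommendation; pick a value that matches how long your cluster 
actually needs to complete a flush/close):

    testAgent.sinks.testSink.hdfs.callTimeout = 60000

With hdfs.batchSize = 150000 and hdfs.rollInterval = 1 as in the configuration 
quoted below, a single flush or close can plausibly take longer than 15 
seconds on a loaded cluster, so a larger callTimeout (or a smaller batchSize) 
reduces the chance of hitting the retry-and-duplicate scenario described above.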

> Callable timed out in HDFS sink
> -------------------------------
>
>                 Key: FLUME-2429
>                 URL: https://issues.apache.org/jira/browse/FLUME-2429
>             Project: Flume
>          Issue Type: Bug
>    Affects Versions: v1.4.0
>            Reporter: Jay
>
> Hi.
> I am getting a warning message when using the HDFS sink.
> Avro source > Memory (or File) channel > HDFS sink
> Switching the channel type didn't solve the problem.
> The error occurs once every day or every few days.
> Any solution?
> Here is my configuration.
> --------------------------------------------------------------------
> testAgent.sources = testSrc
> testAgent.channels = testChannel
> testAgent.sinks = testSink
> testAgent.sources.testSrc.type = avro
> testAgent.sources.testSrc.channels = testChannel
> testAgent.channels.testChannel.type = memory
> testAgent.sources.testSrc.bind = 0.0.0.0
> testAgent.sources.testSrc.port = 4141
> testAgent.sinks.testSink.type = hdfs
> testAgent.sinks.testSink.channel = testChannel
> testAgent.sources.testSrc.interceptors = testInterceptor
> testAgent.sources.testSrc.interceptors.testInterceptor.type = static
> testAgent.sources.testSrc.interceptors.testInterceptor.preserveExisting = true
> testAgent.sources.testSrc.interceptors.testInterceptor.key = testKey
> testAgent.sources.testSrc.interceptors.testInterceptor.value = .testfile
> testAgent.sinks.testSink.hdfs.path = hdfs://hadoop-cluster:8020/flume/%Y%m%d
> testAgent.sinks.testSink.hdfs.filePrefix = %Y%m%d%H%M
> testAgent.sinks.testSink.hdfs.fileSuffix = .testfile
> testAgent.sinks.testSink.hdfs.fileType = DataStream
> testAgent.sinks.testSink.hdfs.rollInterval = 1
> testAgent.sinks.testSink.hdfs.rollCount = 0
> testAgent.sinks.testSink.hdfs.rollSize = 0
> testAgent.sinks.testSink.hdfs.batchSize = 150000
> testAgent.sinks.testSink.hdfs.callTimeout = 15000
> testAgent.sinks.testSink.hdfs.useLocalTimeStamp = true
> testAgent.sinks.testSink.serializer = text
> testAgent.sinks.testSink.serializer.appendNewline = false
> testAgent.channels.testChannel.keep-alive = 1
> testAgent.channels.testChannel.write-timeout = 1
> testAgent.channels.testChannel.transactionCapacity = 150000
> testAgent.channels.testChannel.capacity = 18000000
> #testAgent.channels.testChannel.checkpointDir = /data/flumedata/checkpoint
> #testAgent.channels.testChannel.useDualCheckpoints = true
> #testAgent.channels.testChannel.backupCheckpointDir = /data/flumedata_backup/checkpoint
> #testAgent.channels.testChannel.dataDirs = /data/flumedata/data
> testAgent.channels.testChannel.byteCapacityBufferPercentage = 20
> testAgent.channels.testChannel.byteCapacity = 1000000000
> --------------------------------------------------------------------
> I sometimes get a warning message in a flume log.
> --------------------------------------------------------------------
> 2014-07-22 16:28:20,186 (SinkRunner-PollingRunner-DefaultSinkProcessor) [WARN - org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:477)] Caught IOException writing to HDFSWriter (Callable timed out after 15000 ms on file: hdfs://hadoop-cluster:8020/flume/20140722/201407221628.1406014084417.testfile.tmp). Closing file (hdfs://hadoop-cluster:8020/flume/20140722/201407221628.1406014084417.testfile.tmp) and rethrowing exception.
> 2014-07-22 16:28:35,187 (SinkRunner-PollingRunner-DefaultSinkProcessor) [WARN - org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:483)] Caught IOException while closing file (hdfs://hadoop-cluster:8020/flume/20140722/201407221628.1406014084417.testfile.tmp). Exception follows.
> java.io.IOException: Callable timed out after 15000 ms on file: hdfs://search-hdanal-cluster:8020/flume/20140722/201407221628.1406014084417.testfile.tmp
>         at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:603)
>         at org.apache.flume.sink.hdfs.BucketWriter.doFlush(BucketWriter.java:381)
>         at org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:343)
>         at org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:292)
>         at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:481)
>         at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:401)
>         at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>         at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.util.concurrent.TimeoutException
>         at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:91)
>         at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:596)
>         ... 8 more
> 2014-07-22 16:28:35,187 (SinkRunner-PollingRunner-DefaultSinkProcessor) [WARN - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:438)] HDFS IO error
> java.io.IOException: Callable timed out after 15000 ms on file: hdfs://hadoop-cluster:8020/flume/20140722/201407221628.1406014084417.testfile.tmp
>         at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:603)
>         at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:469)
>         at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:401)
>         at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>         at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.util.concurrent.TimeoutException
>         at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:91)
>         at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:596)
>         ... 5 more
> --------------------------------------------------------------------



