I have been using 1.5.0 all along. I still end up with a 0-length file, which is a little concerning. Not to mention that the timeout adds 10 seconds to the overall transfer. Is this normal, or is there something I can do to prevent the timeout?
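[Editor's note: the 10-second delay matches the HDFS sink's `hdfs.callTimeout` setting, which defaults to 10000 ms. A minimal sketch of raising it, assuming the agent and sink names `a1`/`k1` from the configuration quoted later in this thread:]

```
# Give slow HDFS open/append/close calls more time before the sink
# gives up and retries (default is 10000 ms).
a1.sinks.k1.hdfs.callTimeout = 60000
```

This does not fix whatever is making the HDFS call slow in the first place, but it reduces spurious timeouts and the abandoned .tmp files they leave behind.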
Thanks,
Ed

Sent from my iPhone

> On Oct 30, 2014, at 5:58 PM, Asim Zafir <[email protected]> wrote:
>
> Ed,
>
> Are you saying you resolved the problem with 1.5.0, or do you still have an issue?
>
> Thanks,
>
> Asim Zafir
>
>> On Thu, Oct 30, 2014 at 1:47 PM, Ed Judge <[email protected]> wrote:
>> Thanks for the replies. We are using 1.5.0.
>> My observation is that Flume retries automatically (without my intervention)
>> and that no data is lost.
>> The impact is a) a delay of 10 seconds due to the timeout and b) a zero
>> length file.
>>
>> -Ed
>>
>>> On Oct 30, 2014, at 3:46 PM, Asim Zafir <[email protected]> wrote:
>>>
>>> Please check that the HDFS data nodes receiving your sink's writes do not
>>> have any bad blocks. Secondly, I think you should also set the HDFS roll
>>> interval or roll size to a higher value. This problem happens because the
>>> Flume sink is not able to write to the data pipeline that was initially
>>> presented by HDFS. The solution in this case should be for HDFS to
>>> initialize a new pipeline and present it to Flume. The current hack is to
>>> restart the Flume process, which initializes a new HDFS pipeline, enabling
>>> the sink to push the backlogged events. There is a fix for this
>>> incorporated in Flume 1.5 (I haven't tested it yet), but if you are on
>>> anything older, the only way to make this work is to restart the Flume
>>> process.
>>>
>>>> On Oct 30, 2014 11:54 AM, "Ed Judge" <[email protected]> wrote:
>>>> I am running into the following problem.
>>>>
>>>> 30 Oct 2014 18:43:26,375 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:463) - HDFS IO error
>>>> java.io.IOException: Callable timed out after 10000 ms on file: hdfs://localhost:9000/tmp/dm/dm-1-19.1414694596209.ds.tmp
>>>>     at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:732)
>>>>     at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:262)
>>>>     at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:554)
>>>>     at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:426)
>>>>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>>>>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>>>>     at java.lang.Thread.run(Thread.java:745)
>>>> Caused by: java.util.concurrent.TimeoutException
>>>>     at java.util.concurrent.FutureTask.get(FutureTask.java:201)
>>>>     at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:725)
>>>>     ... 6 more
>>>> 30 Oct 2014 18:43:27,717 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.open:261) - Creating hdfs://localhost:9000/tmp/dm/dm-1-19.1414694596210.ds.tmp
>>>> 30 Oct 2014 18:43:46,971 INFO [agent-shutdown-hook] (org.apache.flume.lifecycle.LifecycleSupervisor.stop:79) - Stopping lifecycle supervisor 10
>>>>
>>>> The following is my configuration. The source is just a script running a
>>>> curl command and downloading files from S3.
>>>>
>>>> # Name the components on this agent
>>>> a1.sources = r1
>>>> a1.sinks = k1
>>>> a1.channels = c1
>>>>
>>>> # Configure the source: STACK_S3
>>>> a1.sources.r1.type = exec
>>>> a1.sources.r1.command = ./conf/FlumeAgent.1.sh
>>>> a1.sources.r1.channels = c1
>>>>
>>>> # Use a channel which buffers events in memory
>>>> a1.channels.c1.type = memory
>>>> a1.channels.c1.capacity = 1000000
>>>> a1.channels.c1.transactionCapacity = 100
>>>>
>>>> # Describe the sink
>>>> a1.sinks.k1.type = hdfs
>>>> a1.sinks.k1.hdfs.path = hdfs://localhost:9000/tmp/dm
>>>> a1.sinks.k1.hdfs.filePrefix = dm-1-20
>>>> a1.sinks.k1.hdfs.fileSuffix = .ds
>>>> a1.sinks.k1.hdfs.rollInterval = 0
>>>> a1.sinks.k1.hdfs.rollSize = 0
>>>> a1.sinks.k1.hdfs.rollCount = 0
>>>> a1.sinks.k1.hdfs.fileType = DataStream
>>>> a1.sinks.k1.serializer = TEXT
>>>> a1.sinks.k1.channel = c1
>>>> a1.sinks.k1.hdfs.minBlockReplicas = 1
>>>> a1.sinks.k1.hdfs.batchSize = 10
>>>>
>>>> I had the HDFS batch size at the default (100), but this issue was still
>>>> happening. Does anyone know what parameters I should change to make this
>>>> error go away? No data is lost, but I end up with a 0-byte file.
>>>>
>>>> Thanks,
>>>> Ed
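[Editor's note: for readers hitting the same symptoms, a hedged sketch of the sink settings most relevant to this thread. With all three roll triggers set to 0 the sink never rolls files, so an interrupted write leaves an empty .tmp file behind indefinitely. `hdfs.idleTimeout` (seconds; default 0, i.e. disabled) tells the sink to close files that have received no events, and `hdfs.callTimeout` governs the 10000 ms timeout seen in the stack trace above. The exact values here are illustrative, not recommendations from the original posters:]

```
# Close a file after 60 s with no incoming events, so stale
# zero-byte .tmp files get finalized instead of lingering.
a1.sinks.k1.hdfs.idleTimeout = 60

# Allow slow HDFS operations up to 60 s before timing out
# (default 10000 ms, which produced the delay reported above).
a1.sinks.k1.hdfs.callTimeout = 60000

# Alternatively, re-enable rolling so files are closed periodically
# rather than held open forever (e.g. roll every 10 minutes).
a1.sinks.k1.hdfs.rollInterval = 600
```

Closing files on idle or on a roll interval also narrows the window in which a failed HDFS write pipeline can strand an open file, which is the failure mode Asim describes earlier in the thread.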
