[ 
https://issues.apache.org/jira/browse/FLUME-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735207#comment-13735207
 ] 

Jonathan Cooper-Ellis commented on FLUME-2147:
----------------------------------------------

This is actually a bigger issue than I initially thought: when the sink fails 
to process the event that is missing the header, it retries the whole batch 
and duplicates in HDFS whatever data preceded the bad event in that batch.
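
To make the duplication concrete, here is a minimal, self-contained simulation 
of the behaviour described above. It is only a sketch and does not use Flume 
itself: the class BatchRetryDemo, its methods, the "id" header and the sample 
timestamps are all made up for illustration, and it assumes that data written 
before the failure is not undone when the batch is rolled back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative simulation only; none of these names are Flume APIs.
class BatchRetryDemo {

    // Builds the headers of a fake event; a null timestamp means "header missing".
    static Map<String, String> event(String id, String timestamp) {
        Map<String, String> headers = new HashMap<>();
        headers.put("id", id);
        if (timestamp != null) {
            headers.put("timestamp", timestamp);
        }
        return headers;
    }

    // Simulates one sink transaction over a batch. Returns true on commit.
    static boolean processBatch(List<Map<String, String>> batch, List<String> hdfs) {
        for (Map<String, String> headers : batch) {
            String ts = headers.get("timestamp");
            if (ts == null) {
                // Rollback: the whole batch stays in the channel, but the
                // writes made earlier in this attempt are assumed to remain.
                return false;
            }
            hdfs.add(headers.get("id") + "@" + ts);
        }
        return true;
    }

    // Retries the same batch up to maxAttempts times; returns what was written.
    static List<String> run(int maxAttempts) {
        List<Map<String, String>> batch = Arrays.asList(
                event("e1", "1000"),
                event("e2", "1001"),
                event("e3", null),      // the bad event
                event("e4", "1003"));   // never delivered while e3 is stuck
        List<String> hdfs = new ArrayList<>();
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (processBatch(batch, hdfs)) {
                break;
            }
        }
        return hdfs;
    }

    public static void main(String[] args) {
        System.out.println(run(3));
    }
}
```

With three attempts, e1 and e2 each appear three times in the simulated output 
and e4 never appears, which matches the combination of duplicates and stuck 
events seen here.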
                
> Missing headers cause events to become stuck in channel
> -------------------------------------------------------
>
>                 Key: FLUME-2147
>                 URL: https://issues.apache.org/jira/browse/FLUME-2147
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>            Reporter: Jonathan Cooper-Ellis
>
> If a sink expects a header but does not find it, events become stuck in the 
> channel and Flume logs NullPointerException and EventDeliveryException 
> errors. With a memory channel, restarting Flume clears the stuck events. With 
> a file channel, the events survive a restart and remain stuck.
>
> 05 Aug 2013 12:21:09,424 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.SinkRunner$PollingRunner.run:160)  - Unable to deliver event. Exception follows.
> org.apache.flume.EventDeliveryException: java.lang.NullPointerException: Expected timestamp in the Flume event headers, but it was null
>         at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:426)
>         at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>         at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.NullPointerException: Expected timestamp in the Flume event headers, but it was null
>         at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
>         at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:200)
>         at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:396)
>         at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:356)
>         ... 3 more
> 05 Aug 2013 12:21:09,424 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:422)  - process failed
> java.lang.NullPointerException: Expected timestamp in the Flume event headers, but it was null
>         at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
>         at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:200)
>         at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:396)
>         at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:356)
>         at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>         at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>         at java.lang.Thread.run(Thread.java:662)
>
> I was using RegexExtractorInterceptor to extract the timestamp used for 
> partitioning with the HDFS sink.
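
One possible way to sidestep the failure, sketched under assumptions, is to 
make sure every event carries a timestamp header before it reaches the sink, 
for example by falling back to a default value when the regex did not match. 
The helper below is hypothetical and is not part of Flume; it only shows the 
idea on a plain header map. Flume's own timestamp interceptor, or the HDFS 
sink's hdfs.useLocalTimeStamp setting, may serve the same purpose, but check 
the user guide for your version before relying on either.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not a Flume class: guarantees a "timestamp" header.
class TimestampFallback {

    // Returns a copy of the headers in which "timestamp" is always present.
    // An existing non-empty value is kept; otherwise fallbackMillis is used.
    static Map<String, String> ensureTimestamp(Map<String, String> headers,
                                               long fallbackMillis) {
        Map<String, String> out = new HashMap<>(headers);
        String ts = out.get("timestamp");
        if (ts == null || ts.isEmpty()) {
            out.put("timestamp", Long.toString(fallbackMillis));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        headers.put("host", "web01"); // made-up header for the demo
        System.out.println(ensureTimestamp(headers, System.currentTimeMillis()));
    }
}
```

Whether a fallback timestamp is acceptable depends on the data: events that 
fail the regex would land in the partition for the fallback time rather than 
the time they actually describe.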

