Mike P.

I encountered the same issue when trying out the HDFS sink.

It's tricky to add a timestamp explicitly at the source. Flume NG should add some default event headers, such as a timestamp, before the event gets consumed. I think Flume OG can do this.

Filed: https://issues.apache.org/jira/browse/FLUME-1214

Thanks,
Mingjie

On 5/15/12 3:22 PM, Mike Percy wrote:
Hi Mike,
Actually your timing is pretty good. Just today we have committed two
major features to Flume 1.x (NG) that can enable your use case:

https://issues.apache.org/jira/browse/FLUME-1157 - Support for
Interceptors, which are plugins that can write e.g. timestamp headers
https://issues.apache.org/jira/browse/FLUME-1183 - HBase sink

These two features are now available on Flume trunk.
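
As a minimal sketch of how the timestamp interceptor from FLUME-1157 might be wired into an exec source, assuming the interceptor configuration syntax found in later Flume 1.x releases (it may differ on trunk at the time of this message): the agent name tail-nginx and the sink name HDFS are taken from the original report below, while the source name "tail" and the interceptor name "ts" are hypothetical and chosen only for illustration.

```
# Hypothetical source name "tail"; the command is the one from the original report.
tail-nginx.sources.tail.type = exec
tail-nginx.sources.tail.command = tail -F /var/log/nginx/localhost.access.log

# Attach an interceptor (hypothetical name "ts") that writes a "timestamp"
# header on each event before it reaches the channel.
tail-nginx.sources.tail.interceptors = ts
tail-nginx.sources.tail.interceptors.ts.type = timestamp

# With the header present, the date escapes in the sink path can be resolved.
tail-nginx.sinks.HDFS.hdfs.path = hdfs://localhost:8020/flume/%Y/%m/%d/
```

This is only a fragment: channel definitions and the source/sink-to-channel bindings are omitted and would need to match the existing agent configuration.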

I encourage you to adopt Flume 1.x in favor of Flume 0.9.x … there are
some serious issues with 0.9.x that no one seems to be working on, and
the momentum behind Flume 1.x is growing day by day.

Mike

On Tuesday, May 15, 2012 at 1:54 PM, Mike M wrote:

I appreciate the response, Mike. It wasn't clear from the docs that
having a timestamp header was required. Is it possible to make the
exec source provide one (I'm trying to use Flume to `tail -F` a
logfile)?

If not, would Flume OG provide the functionality I'm looking for? It
already has an HBase sink, which is tempting for my use case...

Thanks!

Mike

On Tue, May 15, 2012 at 3:16 PM, Mike Percy wrote:
Hi Mike, you have to have events each containing a "timestamp" header for
that functionality to work.

Mike

On Tuesday, May 15, 2012 at 8:37 AM, Mike M wrote:

I'm running into an issue where trying to use an escape sequence with
an HDFS sink causes an exception. For example, if I have the sink
configured like:

tail-nginx.sinks.HDFS.hdfs.path = hdfs://localhost:8020/flume/%Y/%m/%d/

I see the following exception showing up thousands of times:

2012-05-14 16:20:15,730 INFO source.ExecSource: Exec source starting with command:tail -F /var/log/nginx/localhost.access.log
2012-05-14 16:20:15,743 ERROR hdfs.HDFSEventSink: process failed
java.lang.NumberFormatException: null
    at java.lang.Long.parseLong(Long.java:404)
    at java.lang.Long.valueOf(Long.java:540)
    at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:185)
    at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:219)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:344)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:722)
2012-05-14 16:20:15,745 ERROR flume.SinkRunner: Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: java.lang.NumberFormatException: null
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:407)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.NumberFormatException: null
    at java.lang.Long.parseLong(Long.java:404)
    at java.lang.Long.valueOf(Long.java:540)
    at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:185)
    at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:219)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:344)
    ... 3 more

I've tried various methods of quoting the escape sequences, to no
avail. When the escape sequences are removed, Flume works as
expected. Is this just a configuration issue on my side, or is there
possibly a bug? I'm running Flume-NG 1.1.0 packaged with CDH3u4.

Thanks!

Mike
