George,

Just to get things working, you can use the UUID Interceptor:
http://flume.apache.org/FlumeUserGuide.html#uuid-interceptor

Set its headerName property to rowKey and the code should work. I have not
used this interceptor myself, so if it still doesn't work, let us know and I
will quickly hack out a working example.

On Thu, Oct 31, 2013 at 1:22 AM, George Pang <[email protected]> wrote:

> Thank you, but I am not sure I can insert the header using the example in
> this blog. I am missing a part of the whole picture.
>
> George
>
> On Wed, Oct 30, 2013 at 6:56 AM, Brock Noland <[email protected]> wrote:
>
>> I just googled and found this. Not sure if there is a better one.
>>
>> http://blog.cloudera.com/blog/2013/07/morphlines-the-easy-way-to-build-and-integrate-etl-apps-for-apache-hadoop/
>>
>> On Wed, Oct 30, 2013 at 12:34 AM, George Pang <[email protected]> wrote:
>>
>>> Is there a tutorial for this topic out there?
>>>
>>> Thanks,
>>>
>>> George
>>>
>>> On Tue, Oct 29, 2013 at 6:50 PM, George Pang <[email protected]> wrote:
>>>
>>>> Hi Brock,
>>>>
>>>> The morphline command addValue looks like the one I need, but how can I
>>>> add the event header key-value pair?
>>>>
>>>> Thank you,
>>>>
>>>> George
>>>>
>>>> On Tue, Oct 29, 2013 at 1:02 PM, George Pang <[email protected]> wrote:
>>>>
>>>>> Hi Brock,
>>>>>
>>>>> Yes, I think the morphline interceptor should be what I am looking
>>>>> for. I am studying it now.
>>>>>
>>>>> Thank you,
>>>>>
>>>>> George
>>>>>
>>>>> On Tue, Oct 29, 2013 at 12:56 PM, Brock Noland <[email protected]> wrote:
>>>>>
>>>>>> In a very simple demo you could use the static interceptor:
>>>>>> http://flume.apache.org/FlumeUserGuide.html#static-interceptor
>>>>>>
>>>>>> but you probably want to use the morphline interceptor or a custom
>>>>>> interceptor:
>>>>>> http://flume.apache.org/FlumeUserGuide.html#morphline-interceptor
>>>>>>
>>>>>> On Tue, Oct 29, 2013 at 2:52 PM, Hari Shreedharan <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Nope. You need to insert it at some other location.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Hari
>>>>>>>
>>>>>>> On Tuesday, October 29, 2013 at 12:48 PM, George Pang wrote:
>>>>>>>
>>>>>>> Hi Hari,
>>>>>>>
>>>>>>> Is it (inserting a rowKey header into the event) something I can do in
>>>>>>> flume.conf? I tried to do that, but I am new to Flume.
>>>>>>>
>>>>>>> Thank you,
>>>>>>>
>>>>>>> George
>>>>>>>
>>>>>>> On Tue, Oct 29, 2013 at 12:40 PM, Hari Shreedharan <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>> Did you insert a rowKey header into the event? If the header is
>>>>>>> not there, you are obviously going to get null returned from
>>>>>>> currentEvent.getHeaders().get("rowKey"). You need to insert the header
>>>>>>> into the event at some point.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Hari
>>>>>>>
>>>>>>> On Tuesday, October 29, 2013 at 12:30 PM, George Pang wrote:
>>>>>>>
>>>>>>> Hi Ashish,
>>>>>>>
>>>>>>> Actually, it starts with the headers. The example code has
>>>>>>> "String rowKeyStr = currentEvent.getHeaders().get("rowKey");" but no
>>>>>>> such header is found. If I remove this line, the rest complains that it
>>>>>>> is unable to deliver the event. But I checked the event, and it is not
>>>>>>> null.
>>>>>>>
>>>>>>> I am trying to use Flume to save to HBase, and I use the example at
>>>>>>> http://blog.cloudera.com/blog/2012/11/streaming-data-into-apache-hbase-using-apache-flume/
>>>>>>> for a customized serializer.
>>>>>>>
>>>>>>> flume.conf:
>>>>>>>
>>>>>>> logger-agent.sources = Syslog-UDP
>>>>>>> logger-agent.sinks = Syslog-HBase
>>>>>>> logger-agent.channels = Syslog-HBase-Channel
>>>>>>>
>>>>>>> logger-agent.sources.Syslog-UDP.channels = Syslog-HBase-Channel
>>>>>>> logger-agent.sinks.Syslog-HBase.channel = Syslog-HBase-Channel
>>>>>>>
>>>>>>> logger-agent.sources.Syslog-UDP.type = syslogudp
>>>>>>> logger-agent.sources.Syslog-UDP.port = 5140
>>>>>>> logger-agent.sources.Syslog-UDP.host = localhost
>>>>>>>
>>>>>>> logger-agent.sinks.Syslog-HBase.type = org.apache.flume.sink.hbase.AsyncHBaseSink
>>>>>>> logger-agent.sinks.Syslog-HBase.table = syslog2
>>>>>>> logger-agent.sinks.Syslog-HBase.columnFamily = cluster
>>>>>>> logger-agent.sinks.Syslog-HBase.serializer.payloadColumn = dev
>>>>>>> logger-agent.sinks.Syslog-HBase.serializer.incrementColumn = icol
>>>>>>> logger-agent.sinks.Syslog-HBase.serializer.columns = forum,inbound,outbound
>>>>>>> logger-agent.sinks.Syslog-HBase.batchSize = 5000
>>>>>>> logger-agent.sinks.Syslog-HBase.serializer = org.apache.flume.sink.hbase.SimpleAsyncHbaseEventSerializer
>>>>>>>
>>>>>>> logger-agent.channels.Syslog-HBase-Channel.type = memory
>>>>>>>
>>>>>>> Flume version: 1.4
>>>>>>>
>>>>>>> org.apache.flume.FlumeException: No row key found in headers!
>>>>>>>     at com.ib.SplittingSerializer.setEvent(SplittingSerializer.java:43)
>>>>>>>     at org.apache.flume.sink.hbase.AsyncHBaseSink.process(AsyncHBaseSink.java:184)
>>>>>>>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>>>>>>>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>>>>>>>     at java.lang.Thread.run(Thread.java:662)
>>>>>>>
>>>>>>> Thank you,
>>>>>>>
>>>>>>> George
>>>>>>>
>>>>>>> On Tue, Oct 29, 2013 at 2:29 AM, Ashish <[email protected]> wrote:
>>>>>>>
>>>>>>> George,
>>>>>>>
>>>>>>> Can you share more details about what you are trying to achieve? If
>>>>>>> possible, please share the Flume version, agent configuration, and
>>>>>>> exception stack trace.
>>>>>>> You may also look at the HBase Sink section for more info:
>>>>>>> http://flume.apache.org/FlumeUserGuide.html#hbasesinks
>>>>>>>
>>>>>>> On Tue, Oct 29, 2013 at 2:50 PM, George Pang <[email protected]> wrote:
>>>>>>>
>>>>>>> I use the serializer example in this blog post:
>>>>>>> http://blog.cloudera.com/blog/2012/11/streaming-data-into-apache-hbase-using-apache-flume/
>>>>>>>
>>>>>>> but got "Unable to deliver event. Exception follows.
>>>>>>> java.lang.NullPointerException". From looking it up in forums, I think
>>>>>>> it may be caused by an empty header. If so, how is a timestamp header
>>>>>>> added? If not, what causes the event delivery to fail?
>>>>>>>
>>>>>>> Thank you,
>>>>>>>
>>>>>>> George
>>>>>>>
>>>>>>> --
>>>>>>> thanks
>>>>>>> ashish
>>>>>>>
>>>>>>> Blog: http://www.ashishpaliwal.com/blog
>>>>>>> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>>>>>>
>>>>>> --
>>>>>> Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
>>
>> --
>> Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org

--
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal
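
For reference, the UUID Interceptor suggestion at the top of this thread might look roughly like the following addition to the flume.conf quoted earlier. This is an untested sketch, not a confirmed fix: the interceptor name "uuid" is an arbitrary label chosen here, the agent and source names are copied from George's configuration, and the type string and headerName property are the ones described in the UUID Interceptor section of the Flume User Guide. It assumes the serializer reads its row key from a header named rowKey, as the blog example does.

```properties
# Attach one interceptor, labelled "uuid" (hypothetical name), to the syslog source.
logger-agent.sources.Syslog-UDP.interceptors = uuid

# UUID interceptor as documented in the Flume User Guide.
logger-agent.sources.Syslog-UDP.interceptors.uuid.type = org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder

# Write the generated UUID into the header the serializer looks up
# (the documented default header name is "id").
logger-agent.sources.Syslog-UDP.interceptors.uuid.headerName = rowKey
```

With this in place every event should reach the sink carrying a unique rowKey header, so currentEvent.getHeaders().get("rowKey") would no longer return null. The static interceptor mentioned earlier in the thread would set the same value on every event, which is probably only useful for a quick demo since all events would then share one row key.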
