Here is some material to get started with morphlines: http://flume.apache.org/FlumeUserGuide.html#morphline-interceptor
http://cloudera.github.io/cdk/docs/current/cdk-morphlines/index.html
http://cloudera.github.io/cdk/docs/current/cdk-morphlines/morphlinesReferenceGuide.html
http://cloudera.github.io/cdk/docs/current/cdk-morphlines/morphlinesReferenceGuide.html#/addValues
http://cloudera.github.io/cdk/docs/current/cdk-morphlines/morphlinesReferenceGuide.html#/generateUUID

Wolfgang.

On Oct 30, 2013, at 6:53 PM, Ashish wrote:

> George,
>
> Just to get things working, you can use UUID Interceptor
> http://flume.apache.org/FlumeUserGuide.html#uuid-interceptor
>
> Put the headerName field value as rowKey and the code should work. I have not
> used this, but if it still doesn't work let us know. I will quickly hack out
> a working example.
>
>
> On Thu, Oct 31, 2013 at 1:22 AM, George Pang <[email protected]> wrote:
> Thank you, but I am not so sure I can insert a header with the example in this
> blog. I am missing a piece of the whole picture.
>
> George
>
>
> On Wed, Oct 30, 2013 at 6:56 AM, Brock Noland <[email protected]> wrote:
> I just googled and found this. Not sure if there is a better one.
>
> http://blog.cloudera.com/blog/2013/07/morphlines-the-easy-way-to-build-and-integrate-etl-apps-for-apache-hadoop/
>
>
> On Wed, Oct 30, 2013 at 12:34 AM, George Pang <[email protected]> wrote:
> Is there a tutorial for this topic out there?
>
> Thanks,
>
> George
>
>
> On Tue, Oct 29, 2013 at 6:50 PM, George Pang <[email protected]> wrote:
> Hi Brock,
>
> The morphline command addValues looks like the one I need, but how can I add
> the event header key-value pair?
>
> Thank you,
>
> George
>
>
> On Tue, Oct 29, 2013 at 1:02 PM, George Pang <[email protected]> wrote:
> Hi Brock,
>
> Yes, I think morphline interceptor should be something I am looking for. I am
> studying it now.
>
> Thank you,
>
> George
>
>
> On Tue, Oct 29, 2013 at 12:56 PM, Brock Noland <[email protected]> wrote:
> In a very simple demo you could use the static interceptor:
> http://flume.apache.org/FlumeUserGuide.html#static-interceptor
>
> but you probably want to use the morphline interceptor or a custom interceptor:
> http://flume.apache.org/FlumeUserGuide.html#morphline-interceptor
>
>
> On Tue, Oct 29, 2013 at 2:52 PM, Hari Shreedharan <[email protected]>
> wrote:
> Nope. You need to insert it at some other location.
>
>
> Thanks,
> Hari
>
> On Tuesday, October 29, 2013 at 12:48 PM, George Pang wrote:
>
>> Hi Hari,
>>
>> Is it (inserting a rowKey header into the event) something I can do in
>> flume.conf? I tried to do that but I am new to flume.
>>
>> Thank you,
>>
>> George
>>
>>
>> On Tue, Oct 29, 2013 at 12:40 PM, Hari Shreedharan
>> <[email protected]> wrote:
>>> Did you insert a rowKey header into the event? If the header is not there,
>>> you are obviously going to get null returned from
>>> currentEvent.getHeaders().get("rowKey"). You need to insert the header into
>>> the event at some point.
>>>
>>>
>>> Thanks,
>>> Hari
>>>
>>> On Tuesday, October 29, 2013 at 12:30 PM, George Pang wrote:
>>>
>>>> Hi Ashish,
>>>>
>>>> Actually it starts with headers. The example code has "String
>>>> rowKeyStr = currentEvent.getHeaders().get("rowKey");" but there is no such
>>>> header found. If I get rid of this line, the rest complains that it is
>>>> unable to deliver the event. But I checked the event, and it's not null.
>>>>
>>>> I am trying to use flume to save to hbase, and use the example
>>>> http://blog.cloudera.com/blog/2012/11/streaming-data-into-apache-hbase-using-apache-flume/
>>>> for a customized serializer.
>>>>
>>>> flume.conf:
>>>>
>>>> logger-agent.sources = Syslog-UDP
>>>> logger-agent.sinks = Syslog-HBase
>>>> logger-agent.channels = Syslog-HBase-Channel
>>>>
>>>> logger-agent.sources.Syslog-UDP.channels = Syslog-HBase-Channel
>>>> logger-agent.sinks.Syslog-HBase.channel = Syslog-HBase-Channel
>>>>
>>>> logger-agent.sources.Syslog-UDP.type = syslogudp
>>>> logger-agent.sources.Syslog-UDP.port = 5140
>>>> logger-agent.sources.Syslog-UDP.host = localhost
>>>>
>>>> logger-agent.sinks.Syslog-HBase.type = org.apache.flume.sink.hbase.AsyncHBaseSink
>>>> logger-agent.sinks.Syslog-HBase.table = syslog2
>>>> logger-agent.sinks.Syslog-HBase.columnFamily = cluster
>>>> logger-agent.sinks.Syslog-HBase.serializer.payloadColumn = dev
>>>> logger-agent.sinks.Syslog-HBase.serializer.incrementColumn = icol
>>>> logger-agent.sinks.Syslog-HBase.serializer.columns = forum,inbound,outbound
>>>> logger-agent.sinks.Syslog-HBase.batchSize = 5000
>>>> logger-agent.sinks.Syslog-HBase.serializer = org.apache.flume.sink.hbase.SimpleAsyncHbaseEventSerializer
>>>>
>>>> logger-agent.channels.Syslog-HBase-Channel.type = memory
>>>>
>>>>
>>>> Flume version: 1.4
>>>>
>>>> org.apache.flume.FlumeException: No row key found in headers!
>>>>     at com.ib.SplittingSerializer.setEvent(SplittingSerializer.java:43)
>>>>     at org.apache.flume.sink.hbase.AsyncHBaseSink.process(AsyncHBaseSink.java:184)
>>>>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>>>>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>>>>     at java.lang.Thread.run(Thread.java:662)
>>>>
>>>> Thank you,
>>>>
>>>> George
>>>>
>>>>
>>>>
>>>> On Tue, Oct 29, 2013 at 2:29 AM, Ashish <[email protected]> wrote:
>>>>> George,
>>>>>
>>>>> Can you share more details about what you are trying to achieve? If
>>>>> possible, please share Flume version, Agent configuration and exception
>>>>> stacktrace.
>>>>> You may also look at HBase Sink for more info
>>>>> http://flume.apache.org/FlumeUserGuide.html#hbasesinks
>>>>>
>>>>>
>>>>> On Tue, Oct 29, 2013 at 2:50 PM, George Pang <[email protected]> wrote:
>>>>>> I use the serializer example in this blog post:
>>>>>> http://blog.cloudera.com/blog/2012/11/streaming-data-into-apache-hbase-using-apache-flume/
>>>>>>
>>>>>> but got "Unable to deliver event. Exception follows.
>>>>>> java.lang.NullPointerException". From looking it up in forums, I think
>>>>>> it may be caused by an empty header. If so, how is a timestamp header
>>>>>> added? If not, what causes the event delivery to fail?
>>>>>>
>>>>>> Thank you,
>>>>>>
>>>>>> George
>>>>>
>>>>>
>>>>> --
>>>>> thanks
>>>>> ashish
>>>>>
>>>>> Blog: http://www.ashishpaliwal.com/blog
>>>>> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>>>>
>>>
>>
>
> --
> Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
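
For reference, Ashish's suggestion in this thread (a UUID interceptor whose headerName is rowKey) might look roughly like this when added to the flume.conf quoted above. This is an untested sketch: the interceptor name "uuid" is made up for illustration, and the type string is the one the Flume User Guide gives for the UUID interceptor. That class ships with the morphline Solr sink module, so it is assumed here that the module and its dependencies are on the agent's classpath.

```properties
# Attach one interceptor to the existing syslog source.
# "uuid" is an arbitrary, hypothetical interceptor name.
logger-agent.sources.Syslog-UDP.interceptors = uuid
logger-agent.sources.Syslog-UDP.interceptors.uuid.type = org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder
# Store the generated UUID in the header the serializer looks up.
logger-agent.sources.Syslog-UDP.interceptors.uuid.headerName = rowKey
```

With this in place, each event should reach the serializer with a rowKey header, so currentEvent.getHeaders().get("rowKey") would no longer return null. A random UUID is only a stand-in row key; a key derived from the event content would need the morphline interceptor or a custom interceptor instead.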
