Hi,
Since users generally want to write data in a format their processing code
understands, Flume does not impose a format on the data it writes. Instead,
Flume allows you to plug in your own code to serialize each event. By default,
the serializer used is BodyTextEventSerializer, which simply converts the
body to a string and writes it out, dropping the headers. Flume also has an
Avro serializer, which writes both the headers and the body in Avro format.
If you need the data in any other format, you would need to write a serializer
that produces it.
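For example, on an HDFS sink the serializer is selected with the sink's
serializer property. A minimal sketch (the sink name hdfs-write matches the
config quoted below; hdfs.fileType = DataStream is needed so the serializer's
output is written as-is rather than wrapped in a SequenceFile):

```
# In the agent's .properties file (sink name is illustrative):
agent.sinks.hdfs-write.type = hdfs
agent.sinks.hdfs-write.hdfs.fileType = DataStream
# Built-in choices: TEXT (body only, the default) or avro_event
# (headers + body in Avro container format):
agent.sinks.hdfs-write.serializer = avro_event
```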
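A custom serializer is a class implementing
org.apache.flume.serialization.EventSerializer, with a nested Builder that
Flume instantiates; you point the sink at the Builder by fully qualified
class name. The class name below is hypothetical, just to show the wiring:

```
# com.example.MyEventSerializer is a hypothetical class you would write
# and package into a jar on the Flume agent's classpath:
agent.sinks.hdfs-write.serializer = com.example.MyEventSerializer$Builder
```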
Thanks
Hari
--
Hari Shreedharan
On Saturday, September 1, 2012 at 2:32 AM, Kevin Lee wrote:
> Hi all,
> When I use Flume-OG, the hdfs sink writes the full event, like the following:
> head -1 varnishncsa.log | python -mjson.tool
> {
>     "body": "xxxxxxxxx",
>     "fields": {
>         "rolltag": "20120427-234205010-0400.418723114906325.00004966",
>         "tailSrcFile": "varnishncsa.log"
>     },
>     "host": "ip-10-170-147-62.compute.internal",
>     "nanos": 5164871621338860,
>     "pri": "INFO",
>     "timestamp": 1335584717134
> }
> But when I use Flume-NG with a logger sink, it gives me the full
> event, like below.
> collector config:
> agent.sinks.hdfs-write.type = logger
> agent config:
> agent.sources.exec-agent.interceptors.hostname.type =
> org.apache.flume.interceptor.HostInterceptor$Builder
> agent.sources.exec-agent.interceptors.hostname.preserveExisting = true
> agent.sources.exec-agent.interceptors.hostname.useIP = false
> >
> > 12/09/01 07:51:41 INFO sink.LoggerSink: Event: {
> > headers:{timestamp=1346485898919, host=ip-10-34-4-55.ec2.internal} body: 61
> > 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 aaaaaaaaaaaaaaaa }
>
> But when I use the hdfs sink, I can't get the full event. Thanks in advance
> for any help.
> Thanks,
> - Kevin