[ 
https://issues.apache.org/jira/browse/FLUME-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15046941#comment-15046941
 ] 

Tristan Stevens commented on FLUME-2852:
----------------------------------------

Yes, this would absolutely be the AvroFlumeEvent, and it would be serialized 
without the schema, just as in the Avro Sink / Avro Source pairing.

Here is the code from the Kafka Channel:

        if (parseAsFlumeEvent) {
          // Lazily create the reusable output stream and Avro writer
          if (!tempOutStream.isPresent()) {
            tempOutStream = Optional.of(new ByteArrayOutputStream());
          }
          if (!writer.isPresent()) {
            writer = Optional.of(new
              SpecificDatumWriter<AvroFlumeEvent>(AvroFlumeEvent.class));
          }
          tempOutStream.get().reset();
          // Wrap the headers and body in an AvroFlumeEvent and serialize it
          // with a raw binary encoder, i.e. without embedding the schema
          AvroFlumeEvent e = new AvroFlumeEvent(
            toCharSeqMap(event.getHeaders()),
            ByteBuffer.wrap(event.getBody()));
          encoder = EncoderFactory.get()
            .directBinaryEncoder(tempOutStream.get(), encoder);
          writer.get().write(e, encoder);
          // Not really possible to avoid this copy :(
          serializedEvents.get().add(tempOutStream.get().toByteArray());
        } else {
          serializedEvents.get().add(event.getBody());
        }
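To make the "without the schema" point concrete, here is a minimal JDK-only sketch of the Avro binary layout that SpecificDatumWriter produces for the AvroFlumeEvent schema ({ headers: map<string,string>, body: bytes }). The class and method names below are illustrative, not Flume or Avro APIs:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class AvroFlumeEventWireFormat {

    // Avro encodes longs as zig-zag values in variable-length base-128.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);           // zig-zag encode
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80)); // continuation bit set
            z >>>= 7;
        }
        out.write((int) z);
    }

    // Avro bytes: long length, then the raw bytes.
    static void writeBytes(ByteArrayOutputStream out, byte[] b) {
        writeLong(out, b.length);
        out.write(b, 0, b.length);
    }

    // Avro string: UTF-8 bytes with a length prefix.
    static void writeString(ByteArrayOutputStream out, String s) {
        writeBytes(out, s.getBytes(StandardCharsets.UTF_8));
    }

    // Fields in schema order: headers (map<string,string>), then body (bytes).
    // No schema, no magic bytes, no container-file framing.
    static byte[] encode(Map<String, String> headers, byte[] body) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!headers.isEmpty()) {
            writeLong(out, headers.size());      // map block: entry count
            for (Map.Entry<String, String> e : headers.entrySet()) {
                writeString(out, e.getKey());
                writeString(out, e.getValue());
            }
        }
        writeLong(out, 0);                       // zero-count block ends the map
        writeBytes(out, body);                   // event body
        return out.toByteArray();
    }
}
```

So a consumer on the other side of the Kafka broker can only decode these messages if it already knows the AvroFlumeEvent schema, which is exactly why the sink and source need a matching parseAsFlumeEvent-style switch.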

The flow in this case is Syslog Source -> Memory Channel -> Kafka Sink -> Kafka 
Broker -> Kafka Source -> Memory Channel -> HDFS Sink

Although in the future I'd like to make it: Syslog Source -> Kafka Channel -> 
Kafka Sink -> Kafka Broker -> Kafka Source -> Kafka Channel -> HDFS Sink
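For the future topology, the Kafka Channel end already exposes this behaviour via parseAsFlumeEvent. A minimal Flume 1.6-style channel definition (agent and channel names here are hypothetical) would look like:

```properties
# Hypothetical agent "tier1" with a Kafka Channel "kc"
tier1.channels.kc.type = org.apache.flume.channel.kafka.KafkaChannel
tier1.channels.kc.brokerList = broker1:9092
tier1.channels.kc.zookeeperConnect = zk1:2181
tier1.channels.kc.topic = flume-events
# true: wrap headers+body as an AvroFlumeEvent; false: write the raw body only
tier1.channels.kc.parseAsFlumeEvent = true
```

The proposed sink/source parameter would let the intermediate Kafka Sink / Kafka Source pair speak the same wire format.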

N.B. The three tiers run at different sites. 

> Kafka Source/Sink should optionally read/write Avro Datums
> ----------------------------------------------------------
>
>                 Key: FLUME-2852
>                 URL: https://issues.apache.org/jira/browse/FLUME-2852
>             Project: Flume
>          Issue Type: Improvement
>          Components: Sinks+Sources
>    Affects Versions: v1.6.0
>            Reporter: Tristan Stevens
>            Assignee: Jeff Holoman
>
> Currently the Kafka Sink writes only the event body to Kafka rather than an 
> Avro Datum. This works fine when paired with a Kafka Source or with the 
> Kafka Channel; however, it means that any Flume headers are lost when 
> events are transported via Kafka.
> The request is to implement an equivalent of the Kafka Channel's 
> parseAsFlumeEvent parameter on the sink/source so that Avro Datums can 
> optionally be read from and written to Kafka.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)