[ 
https://issues.apache.org/jira/browse/FLUME-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated FLUME-2010:
-----------------------------

    Attachment: FLUME-2010.patch

Here's another revision of the patch that reuses the Avro encoder and supports 
schema IDs rather than sending the schema in a Flume header.

The idea behind schema IDs is that if you set the schema's ID in the Log4j MDC, 
it will be used in the Flume header. (If you don't set it, everything still 
works; the appender just has to send the full schema in a header for every 
message.) The HDFS sink then retrieves the schema by looking the ID up in its 
configuration file, which has to include the ID -> schema mapping.
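To make the header behaviour concrete, here is a minimal stdlib-only sketch of the decision the appender makes. The header names (`flume.avro.schema.id`, `flume.avro.schema.literal`) and the method name are assumptions for illustration; the actual names in the patch may differ.

```java
import java.util.HashMap;
import java.util.Map;

public class SchemaHeaderSketch {
    // Hypothetical header names; the real ones in the patch may differ.
    static final String SCHEMA_ID_HEADER = "flume.avro.schema.id";
    static final String SCHEMA_HEADER = "flume.avro.schema.literal";

    // Decide which header the appender would set for one event.
    static Map<String, String> headersFor(String mdcSchemaId, String schemaJson) {
        Map<String, String> headers = new HashMap<>();
        if (mdcSchemaId != null) {
            // ID was set in the Log4j MDC: send only the small ID.
            headers.put(SCHEMA_ID_HEADER, mdcSchemaId);
        } else {
            // No ID set: fall back to sending the full schema every time.
            headers.put(SCHEMA_HEADER, schemaJson);
        }
        return headers;
    }

    public static void main(String[] args) {
        String schema = "{\"type\":\"record\",\"name\":\"Event\",\"fields\":[]}";
        System.out.println(headersFor("1", schema));
        System.out.println(headersFor(null, schema));
    }
}
```

Either way the event carries enough information for the sink to recover the writer schema; the ID path just keeps the per-event overhead small.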

When AVRO-1124 is done we could use that for the schema repository.


An alternative way of doing this now would be to have a schema catalog 
properties file with the ID -> schema mapping, and have both the Log4jAppender 
and the HDFS sink use it; that way we could avoid the MDC part.
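A sketch of what that shared catalog lookup could look like, assuming a plain Java properties file with one `id=schemaJson` entry per line (the format and file name here are assumptions, not what the patch specifies):

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class SchemaCatalogSketch {
    // Resolve a schema ID against a properties-style catalog.
    // Both the Log4jAppender and the HDFS sink could share this lookup.
    static String lookup(String catalog, String id) throws IOException {
        Properties schemas = new Properties();
        schemas.load(new StringReader(catalog));
        return schemas.getProperty(id);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical catalog contents, e.g. loaded from schema-catalog.properties.
        String catalog =
            "1={\"type\":\"record\",\"name\":\"Event\",\"fields\":[]}\n";
        System.out.println(lookup(catalog, "1"));
    }
}
```

Sharing one catalog keeps the appender and the sink agreeing on IDs without any per-event coordination, at the cost of having to distribute the file to both sides.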
                
> Support Avro records in Log4jAppender and the HDFS Sink
> -------------------------------------------------------
>
>                 Key: FLUME-2010
>                 URL: https://issues.apache.org/jira/browse/FLUME-2010
>             Project: Flume
>          Issue Type: New Feature
>          Components: Client SDK, Sinks+Sources
>            Reporter: Tom White
>            Assignee: Tom White
>             Fix For: v1.4.0
>
>         Attachments: FLUME-2010.patch, FLUME-2010.patch
>
>
> It would be nice to support logging arbitrary Avro records via the Log4j 
> Flume logger, and have them written to HDFS in Avro data files (using an 
> appropriately configured HDFS sink).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira