When you use a sequence file, you can write your own serializer to decide what the key and value look like. You implement this interface: http://flume.apache.org/releases/content/1.5.0/apidocs/org/apache/flume/sink/hdfs/SequenceFileSerializer.html
and set that class’s FQCN as the value of your hdfs.writeFormat parameter.

Thanks,
Hari

On Thu, Sep 18, 2014 at 1:55 AM, Blade Liu <[email protected]> wrote:

> Hi,
>
> The scenario is that a machine dynamically generates data, which consists
> of sections of binary data. We use the Flume SDK to collect the data, and
> the sink is HDFS (SequenceFile).
>
> I'm curious what is in the sequence file, since Flume is unaware of the
> schema, i.e., how do Flume and Avro do serialization without a schema?
> (Directly writing raw bytes to a disk file may cause alignment issues.)
> http://stackoverflow.com/questions/18001818/avro-schema-storage is similar
> to my question.
>
> Also, how is the key determined in the sequence file? If my understanding
> is not correct, please indicate the correct usage of Flume with Avro.
>
> Thank you for your clarification.
>
> Cheers,
> Blade
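
For reference, a minimal sketch of what the sink configuration described in the reply might look like. The agent name (a1), sink name (k1), HDFS path, and serializer class (com.example.flume.MySerializer) are all hypothetical placeholders. As far as I can tell, Flume loads the class named in hdfs.writeFormat as a SequenceFileSerializer.Builder, so the value likely needs to point at the nested Builder class rather than the serializer itself; please verify this against your Flume version.

```properties
# Hypothetical agent "a1" with HDFS sink "k1"; all names and paths are placeholders.
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode/flume/events

# Write SequenceFiles rather than a plain data stream.
a1.sinks.k1.hdfs.fileType = SequenceFile

# Custom serializer: FQCN of the (assumed) nested Builder class.
# The built-in alternatives are Text and Writable.
a1.sinks.k1.hdfs.writeFormat = com.example.flume.MySerializer$Builder
```

The jar containing the custom class would also need to be on the Flume agent's classpath for the sink to load it.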
