The default file type for the HDFS sink is SequenceFile. To get Avro output
from the avro_event serializer, set hdfs.fileType to DataStream. See details here:
http://flume.apache.org/FlumeUserGuide.html#hdfs-sink
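
As a rough sketch of what that could look like in the agent configuration: the agent, sink, and channel names (a1, k1, c1) and the HDFS path are placeholders I made up for illustration, so substitute your own. The property names are the ones described in the user guide section linked above.

```properties
# Hypothetical agent "a1" with sink "k1" attached to channel "c1" (names are assumptions)
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
# Placeholder output directory, adjust to your cluster
a1.sinks.k1.hdfs.path = /flume/events
# Write the serializer's output as-is instead of wrapping it in a SequenceFile
a1.sinks.k1.hdfs.fileType = DataStream
# Serialize each event (headers + body) with the built-in Avro event serializer
a1.sinks.k1.serializer = avro_event
# Optional: give the output files an .avro suffix
a1.sinks.k1.hdfs.fileSuffix = .avro
```

Note that the serializer is set on the sink itself (serializer), not under the hdfs. prefix.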
Thanks,
Hari
On Friday, October 4, 2013 at 6:52 AM, Deepak Subhramanian wrote:
> I tried using the HDFS sink to generate an Avro file by setting the serializer
> to avro_event, but it generates a sequence file instead of an Avro file. Isn't
> it supposed to generate an Avro file with the default schema? Or do I have to
> generate the Avro data from text in my HTTPHandler source?
>
> "{ \"type\":\"record\", \"name\": \"Event\", \"fields\": [" +
> " {\"name\": \"headers\", \"type\": { \"type\": \"map\", \"values\": \"string\" } }, " +
> " {\"name\": \"body\", \"type\": \"bytes\" } ] }");
>
> On Thu, Oct 3, 2013 at 3:36 PM, Deepak Subhramanian
> <[email protected]> wrote:
> > Hi,
> >
> > I want to convert XML files in text form to an Avro file and store it in HDFS.
> > I get the XML files as POST requests. I extended the HTTPHandler to
> > process the XML POST request. Do I have to convert the text data to Avro
> > in the HTTPHandler, or does the Avro sink or HDFS sink convert it directly to
> > Avro with some configuration details? I want to store the entire XML string
> > in a single Avro field.
> >
> > Thanks in advance for any inputs.
> > Deepak Subhramanian
> >
>
> --
> Deepak Subhramanian