The default file type for the HDFS Sink is SequenceFile. To get output from the
avro_event serializer, set hdfs.fileType to DataStream. See the details here:
http://flume.apache.org/FlumeUserGuide.html#hdfs-sink
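
As a minimal sketch of that setting, the sink section might look like the fragment below. The agent name a1, the sink name k1, and the path are hypothetical placeholders, not taken from your setup:

```properties
# Hypothetical agent (a1) and sink (k1) names; adjust to your own config.
a1.sinks.k1.type = hdfs
# Example path only; point this at your own HDFS directory.
a1.sinks.k1.hdfs.path = /flume/events
# Write a raw data stream instead of the default SequenceFile.
a1.sinks.k1.hdfs.fileType = DataStream
# Serialize each event (headers + body) with the default Avro event schema.
a1.sinks.k1.serializer = avro_event
```

With DataStream, the sink writes events through the configured serializer instead of wrapping them in a SequenceFile, so avro_event should then produce an Avro container file.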


Thanks,
Hari


On Friday, October 4, 2013 at 6:52 AM, Deepak Subhramanian wrote:

> I tried using the HDFS Sink to generate an Avro file by setting the serializer
> to avro_event, but it generates a sequence file instead of an Avro file. Is it
> not supposed to generate an Avro file with the default schema? Or do I have to
> generate the Avro data from the text in my HTTPHandler source?
> 
>  "{ \"type\":\"record\", \"name\": \"Event\", \"fields\": [" +
>       " {\"name\": \"headers\", \"type\": { \"type\": \"map\", \"values\": \"string\" } }, " +
>       " {\"name\": \"body\", \"type\": \"bytes\" } ] }");
> 
> 
> 
> On Thu, Oct 3, 2013 at 3:36 PM, Deepak Subhramanian
> <[email protected]> wrote:
> > Hi,
> > 
> > I want to convert XML files in text form to an Avro file and store it in
> > HDFS. I receive the XML files as POST requests, and I extended the
> > HTTPHandler to process them. Do I have to convert the text data to Avro in
> > the HTTPHandler, or can the Avro Sink or HDFS Sink convert it directly to
> > Avro with some configuration? I want to store the entire XML string in an
> > Avro field.
> > 
> > Thanks in advance for any inputs. 
> > Deepak Subhramanian 
> > 
> > 
> 
> 
> 
> 
> 
> -- 
> Deepak Subhramanian 