Thanks Hari. I specified the fileType. This is what I have. I will try again and let you know.

tier1.sources = httpsrc1
tier1.channels = c1
tier1.sinks = sink1

tier1.sources.httpsrc1.bind = 127.0.0.1
tier1.sources.httpsrc1.type = http
tier1.sources.httpsrc1.port = 9999
tier1.sources.httpsrc1.channels = c1
tier1.sources.httpsrc1.handler = spikes.flume.XMLHandler
tier1.sources.httpsrc1.handler.nickname = HTTPTesting

tier1.channels.c1.type = memory
tier1.channels.c1.capacity = 100

#tier1.sinks.sink1.type = logger
tier1.sinks.sink1.channel = c1
tier1.sinks.sink1.type = hdfs
tier1.sinks.sink1.hdfs.path = /tmp/flumecollector
tier1.sinks.sink1.hdfs.filePrefix = access_log
tier1.sinks.sink1.hdfs.fileSuffix = .avro
tier1.sinks.sink1.hdfs.fileType = DataStream
tier1.sinks.sink1.hdfs.serializer = avro_event

I also added this later.

tier1.sinks.sink1.hdfs.serializer.appendNewline = true
tier1.sinks.sink1.hdfs.serializer.compressionCodec = snappy

On Fri, Oct 4, 2013 at 4:56 PM, Hari Shreedharan <[email protected]> wrote:

> The default data type for HDFS Sink is Sequence file. Set the
> hdfs.fileType to DataStream. See details here:
> http://flume.apache.org/FlumeUserGuide.html#hdfs-sink
>
> Thanks,
> Hari
>
> On Friday, October 4, 2013 at 6:52 AM, Deepak Subhramanian wrote:
>
> > I tried using the HDFS Sink to generate the avro file by using the
> > serializer as avro_event. But it is not generating an avro file, only a
> > sequence file. Is it not supposed to generate an avro file with the default
> > schema? Or do I have to generate the avro data from text in my
> > HTTPHandler source?
> >
> > "{ \"type\":\"record\", \"name\": \"Event\", \"fields\": [" +
> > " {\"name\": \"headers\", \"type\": { \"type\": \"map\", \"values\": \"string\" } }, " +
> > " {\"name\": \"body\", \"type\": \"bytes\" } ] }");
> >
> > On Thu, Oct 3, 2013 at 3:36 PM, Deepak Subhramanian <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > I want to convert xml files in text to an avro file and store it in hdfs.
> > > I get the xml files as a post request. I extended the HTTPHandler to
> > > process the XML post request. Do I have to convert the data in text to avro
> > > in HTTPHandler, or does the Avro Sink or HDFSSink convert it directly to
> > > avro with some configuration details? I want to store the entire xml string
> > > in an avro variable.
> > >
> > > Thanks in advance for any inputs.
> > > Deepak Subhramanian
> >
> > --
> > Deepak Subhramanian

--
Deepak Subhramanian
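
Note on the sink section: a minimal sketch, assuming the HDFS sink property table in the Flume User Guide applies to the version in use. That table lists the serializer keys directly under the sink name (serializer, serializer.*) rather than under the hdfs. prefix, so the hdfs.serializer lines in the configuration above may simply be ignored. The agent, sink, path, and file names are copied from that configuration, and whether this alone produces the expected avro file is untested here.

```properties
tier1.sinks.sink1.type = hdfs
tier1.sinks.sink1.channel = c1
tier1.sinks.sink1.hdfs.path = /tmp/flumecollector
tier1.sinks.sink1.hdfs.filePrefix = access_log
tier1.sinks.sink1.hdfs.fileSuffix = .avro
# Write the serializer's output as-is instead of wrapping it in a SequenceFile
tier1.sinks.sink1.hdfs.fileType = DataStream
# Assumption: serializer keys sit at the sink level, without the hdfs. prefix
tier1.sinks.sink1.serializer = avro_event
tier1.sinks.sink1.serializer.compressionCodec = snappy
```

The appendNewline option is left out of this sketch because the user guide documents it for the text serializer, not for avro_event.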
