It is because you are using SequenceFile as the output format for the HDFS 
sink. The property name is wrong: change a1.sinks.k1.hdfs.file.Type=DataStream 
to a1.sinks.k1.hdfs.fileType=DataStream. Also, the log4j appender does not 
support layout patterns yet (it will in the next release of Flume; until then, 
build trunk from source after applying the patch attached to 
https://issues.apache.org/jira/browse/FLUME-1818). The log4j appender puts the 
severity and logger information into the Flume event headers, so without that 
patch you need to write your own serializer, or use the 
HeaderAndBodyTextSerializer, which is not in any release yet but is in trunk, 
so it will be in the next release.
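To illustrate the fix, a corrected sink section might look like the following. This is only a sketch: the host, port, path, and roll settings are copied from the configuration quoted later in this thread, and hdfs.writeFormat is dropped because it applies to SequenceFile output, not DataStream.

```properties
# Describe the sink: write plain text instead of SequenceFiles
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://172.20.104.226:8020/flumeinput/%{host}
# note: the property is hdfs.fileType, not hdfs.file.Type
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.rollCount = 10000
# TEXT writes only the event body; headers (severity, logger) are dropped
a1.sinks.k1.serializer = TEXT
```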


Hari  

--  
Hari Shreedharan


On Wednesday, January 9, 2013 at 2:12 AM, Chhaya Vishwakarma wrote:

> The expected output I pasted is from the log file, where I can see it 
> correctly; but when writing to HDFS it produces junk values, and I cannot 
> see the timestamp or other log information.
>   
> From: Bertrand Dechoux [mailto:[email protected]]  
> Sent: Wednesday, January 09, 2013 3:39 PM
> To: [email protected]
> Subject: Re: flume to HDFS log event write
>   
> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/io/SequenceFile.html
>  
> is a binary format. You may want to make Flume output to a file or the console 
> first, and then compare what you are expecting versus what you are getting.
>  
> Regards
>  
> Bertrand
> On Wed, Jan 9, 2013 at 11:02 AM, Chhaya Vishwakarma 
> <[email protected]> wrote:
> Hi,
>   
> I am using the Flume log4j appender to write log events to HDFS, but the 
> output contains junk values and I cannot see anything other than the log 
> message, with no timestamp.
>   
> Here is my configuration
> Log4j.properties
>   
> log4j.logger.log4jExample= DEBUG,out2
> log4j.appender.out2 = org.apache.flume.clients.log4jappender.Log4jAppender
> log4j.appender.out2.Port = 41414
> log4j.appender.out2.Hostname = 172.20.104.223
>   
> here is agent configuration
> a1.sources = r1
> a1.sinks = k1
> a1.channels = c1
>   
> #sources
> a1.sources.r1.type = avro
> a1.sources.r1.bind =172.20.104.226
> a1.sources.r1.port= 41414
> a1.sources.r1.restart =true
> a1.sources.r1.batchsize=10000
>   
> # Describe the sink
> a1.sinks.k1.type = hdfs
> a1.sinks.k1.hdfs.path=hdfs://172.20.104.226:8020/flumeinput/%{host}
> a1.sinks.k1.hdfs.file.Type=DataStream
> a1.sinks.k1.hdfs.writeFormat=Writable
> a1.sinks.k1.hdfs.rollCount=10000
> a1.sinks.k1.serializer=TEXT
>   
> # Use a channel which buffers events in memory
> a1.channels.c1.type = file
> a1.channels.c1.capacity = 10000
> a1.channels.c1.transactionCapacity = 10000
>   
> # Bind the source and sink to the channel
> a1.sources.r1.channels = c1
> a1.sinks.k1.channel = c1
>   
> Expected output
> [2013-01-09 15:15:45,457] - [main] DEBUG log4jExample Current data 
> unavailalbe, using cached values
> [2013-01-09 15:15:45,458] - [main] INFO  log4jExample Hello this is an info 
> message
> [2013-01-09 15:15:45,460] - [main] ERROR log4jExample Dabase unavaliable, 
> connetion lost
> [2013-01-09 15:15:45,461] - [main] WARN  log4jExample Attention!! Application 
> running in debugmode
> [2013-01-09 15:15:45,463] - [main] DEBUG log4jExample Current data 
> unavailalbe, using cached values
> [2013-01-09 15:15:45,465] - [main] INFO  log4jExample Hello this is an info 
> message
> [2013-01-09 15:15:45,467] - [main] ERROR log4jExample Dabase unavaliable, 
> connetion lost
> [2013-01-09 15:15:45,468] - [main] WARN  log4jExample Attention!! Application 
> running in debugmode
> [2013-01-09 15:15:45,470] - [main] DEBUG log4jExample Current data 
> unavailalbe, using cached values
>   
> But I am getting this output on HDFS:
> SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable<binary>Current
> data unavailalbe, using cached values<binary>Hello this is an info
> message<binary>Dabase unavaliable, connetion lost<binary>Attention!!
> Application running in debugmode
> [... the same four messages repeat, interleaved with binary record markers;
> no timestamps or severity levels appear ...]
> --  
> Bertrand Dechoux  
