Just a warning: if you are using the Text output format, you will have a hard 
time with "\n" characters inside your log records, stack traces for example.
Also, a text file will be either uncompressed or, if compressed, non-splittable.
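
One possible workaround for the "\n" problem, as a minimal sketch: escape line 
breaks before a record is written as a line of text, and reverse the escaping 
when reading it back. The class and method names below (LogLineEscaper, escape, 
unescape) are hypothetical and are not part of Chukwa or Hadoop.

```java
public class LogLineEscaper {

    // Hypothetical helper: turn a multi-line record into a single line.
    // Backslashes are escaped first so the transformation stays reversible.
    static String escape(String record) {
        return record.replace("\\", "\\\\")
                     .replace("\r", "\\r")
                     .replace("\n", "\\n");
    }

    // Reverse of escape(): rebuild the original multi-line record.
    static String unescape(String line) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\\' && i + 1 < line.length()) {
                char next = line.charAt(++i);
                switch (next) {
                    case 'n': out.append('\n'); break;
                    case 'r': out.append('\r'); break;
                    default:  out.append(next);  // covers the escaped backslash
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String trace = "java.lang.IllegalStateException: boom\n"
                     + "\tat Foo.bar(Foo.java:42)";
        String oneLine = escape(trace);
        System.out.println(oneLine);                          // a single line of text
        System.out.println(unescape(oneLine).equals(trace));  // round trip check
    }
}
```

Whatever reads the text output then has to call the matching unescape step, 
which is one more reason the sequence file format is easier to work with.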

/Jerome.

On 11/19/10 9:30 AM, "Eric Yang" <[email protected]> wrote:




On 11/19/10 12:37 AM, "Ying Tang" <[email protected]> wrote:

Hi all,
    1.   I have installed a 2-node Chukwa setup for testing: one agent and one 
collector. I also have an HDFS cluster. I found that the logs collected by the 
collector are stored in HDFS under a file name of the form
          time+logsourcehost+java.rmi.server.UID()
          The time format is yyyyddHHmmssSSS, so there is no month? And this is 
hard-coded.
          I need the month, so must I change the code and recompile it?
    2.   Another question: in the log file (in HDFS), the metadata shows up as 
garbled characters, while the log content from the agent is fine.
          My adaptor uses UTF-8. How can I solve this?


 1.  Looks like a mistake in the temp filename.  Please open a JIRA and we will 
fix it.
 2.  The data is recorded in sequence file format to make it easier to 
process with MapReduce.  If you are expecting plain text of the log content, 
you will need to write a map/reduce job that uses the text output format 
and channels the log file types accordingly.
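
On point 1, the effect of the missing month can be reproduced with plain 
java.text.SimpleDateFormat. This is only a sketch: the pattern yyyyddHHmmssSSS 
is the one quoted in the question, while yyyyMMddHHmmssSSS is an assumption 
about what the corrected pattern might look like, not the actual fix. The class 
name SinkFileTimestamp is hypothetical.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

public class SinkFileTimestamp {

    // Format a date with the given pattern, pinned to UTC and Locale.ROOT
    // so the result does not depend on the machine's settings.
    static String format(String pattern, Date when) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(when);
    }

    public static void main(String[] args) {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();
        cal.set(2010, Calendar.NOVEMBER, 19, 0, 37, 5);
        Date nov19 = cal.getTime();
        cal.set(Calendar.MONTH, Calendar.DECEMBER);
        Date dec19 = cal.getTime();

        // Pattern from the question: year, day of month, time, but no month.
        System.out.println(format("yyyyddHHmmssSSS", nov19));   // 201019003705000
        System.out.println(format("yyyyddHHmmssSSS", dec19));   // 201019003705000

        // Assumed corrected pattern with MM (month) added.
        System.out.println(format("yyyyMMddHHmmssSSS", nov19)); // 20101119003705000
        System.out.println(format("yyyyMMddHHmmssSSS", dec19)); // 20101219003705000
    }
}
```

Without the month, the same day and time in two different months produce the 
same prefix, so the file name no longer sorts or identifies the date 
unambiguously.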

Regards,
Eric
