So does that mean we will always have several lines of log data in the 
<body> tag of a Chukwa record?
Can you please tell me where the agent code that defines this is?

I have read these ChukwaRecords through MapReduce and can read the original log 
lines. :)

Stuti

From: Gerrit Jansen van Vuuren [mailto:gvanvuu...@specificmedia.com]
Sent: Tuesday, June 08, 2010 5:53 PM
To: chukwa-user@hadoop.apache.org
Subject: RE: Problem in ChukwaRecord file contents

Each chukwa record will contain several lines of log data (depending on how the 
agent defines lines :) ).

You can use MapReduce jobs, HDFS or Pig to read these files. You might 
need to do some coding, though.

I use Pig to read the Chukwa files, and then to get the original log lines I 
output the data column (i.e. these original records) using Pig's BinStorage.
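
As a rough illustration of that last step, here is a minimal Java sketch that 
splits the body of one record back into individual log lines. BodyLines and 
split are hypothetical names, not part of Chukwa, and the sketch assumes the 
body holds the original lines separated by newline characters, as in the 
record quoted further down in this thread. It uses only the JDK, so the same 
logic could be called from a mapper or any other reader.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Chukwa: turns the multi-line "body"
// value of a single record into a list of individual log lines.
final class BodyLines {
    private BodyLines() {}

    static List<String> split(String body) {
        List<String> lines = new ArrayList<>();
        if (body == null) {
            return lines; // no body: return an empty list
        }
        // Assumption: lines are separated by "\n" or "\r\n".
        for (String line : body.split("\r?\n")) {
            if (!line.isEmpty()) { // skip blank lines
                lines.add(line);
            }
        }
        return lines;
    }
}
```

Calling BodyLines.split(body) on the body of the first record would return one 
list entry per syslog line, which can then be emitted one line per output 
record.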

Have a look at com.specificmedia.hadoop.logimport.demux.chukwa.ChukwaArchive() 
and the other chukwa-core classes.

Hope this helps.


Cheers,



From: Stuti Awasthi [mailto:stuti_awas...@persistent.co.in]
Sent: Tuesday, June 08, 2010 12:51 PM
To: chukwa-user@hadoop.apache.org
Subject: Problem in ChukwaRecord file contents

Hi All,

I gave my log file as input to Chukwa and converted it to an .evt file, i.e. a 
ChukwaRecord file.
I checked the ChukwaRecord file, which is a sequence file with ChukwaRecordKey 
and ChukwaRecord.
I saw that the ChukwaRecord contains Timestamp and some other fields. One of 
them is the "body" field.
However, this body field of each record contains a bunch of lines (from my 
original log file).

Contents of Original log file:

May 29 13:00:16 ps3156 syslogd 1.5.0#5ubuntu3: restart.
May 29 13:00:16 ps3156 anacron[4148]: Job `cron.daily' terminated
May 29 13:00:16 ps3156 anacron[4148]: Normal exit (1 job run)
May 29 13:09:02 ps3156 /USR/SBIN/CRON[19815]: (root) CMD (  [ -x 
/usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ 
-type f -cmin +$(/usr/lib/php5/maxlifetime) -print0 | xargs -n 200 -r -0 rm)
May 29 13:09:02 ps3156 /USR/SBIN/CRON[19815]: (root) CMD (  [ -x 
/usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ 
-type f -cmin +$(/usr/lib/php5/maxlifetime) -print0 | xargs -n 200 -r -0 rm)


Contents of ChukwaRecord file:

{"DataType": "SysLog", "Key": 
"1275118200000/ps3156.persistent.co.in/1275118216000", "Timestamp": 
1275118216000, "mapFields": {"csource": "ps3156.persistent.co.in", "capp": 
"/home/hadoop/Test/syslog_test", "ctags": " cluster="chukwa"", "body": "May 29 
13:00:16 ps3156 syslogd 1.5.0#5ubuntu3: restart.
May 29 13:00:16 ps3156 anacron[4148]: Job `cron.daily' terminated
May 29 13:00:16 ps3156 anacron[4148]: Normal exit (1 job run)
May 29 13:09:02 ps3156 /USR/SBIN/CRON[19815]: (root) CMD (  [ -x 
/usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ 
-type f -cmin +$(/usr/lib/php5/maxlifetime) -print0 | xargs -n 200 -r -0 rm)
May 29 13:09:02 ps3156 /USR/SBIN/CRON[19815]: (root) CMD (  [ -x 
/usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ 
-type f -cmin +$(/usr/lib/php5/maxlifetime) -print0 | xargs -n 200 -r -0 rm) } }

I can see that each record in the ChukwaRecord file contains a chunk of lines 
from the original log file. Is this behavior correct?
According to my understanding, each record in the ChukwaRecord file should 
contain only one line from the original log file.
Is it possible to create such a ChukwaRecord file?
Please suggest.


Stuti

DISCLAIMER ========== This e-mail may contain privileged and confidential 
information which is the property of Persistent Systems Ltd. It is intended 
only for the use of the individual or entity to which it is addressed. If you 
are not the intended recipient, you are not authorized to read, retain, copy, 
print, distribute or use this message. If you have received this communication 
in error, please notify the sender and delete all copies of this message. 
Persistent Systems Ltd. does not accept any liability for virus infected mails.

