On 5/18/09 7:14 AM, "Jiaqi Tan" <[email protected]> wrote:
> Hi Eric,
>
>> Chukwa has a special log4j appender which escapes return character. The
>> multi-lines exception will be stored as a single chunk, and processed as a
>> single chukwa record after Demux.
>
> In this case, I suppose I would need to configure the monitored Hadoop
> cluster to actually use the Chukwa log4j appender? Would I also need
> to recompile the Hadoop of the monitored cluster to include the Chukwa
> code then?
There is no need to recompile Hadoop. Chukwa ships two jar files,
chukwa-hadoop-*-client.jar and json.jar. Drop those two jars into the lib
directory of the Hadoop cluster, and configure log4j.properties in the
Hadoop conf directory.
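A minimal shell sketch of that install step. The CHUKWA_HOME/HADOOP_HOME
locations and the jar version number are placeholders, not official defaults;
here they point at scratch directories (with fake jars) so the commands run
standalone -- substitute your real install paths:

```shell
#!/bin/sh
# Placeholders: substitute your real install paths for these scratch dirs.
CHUKWA_HOME=$(mktemp -d)    # stand-in for the Chukwa install
HADOOP_HOME=$(mktemp -d)    # stand-in for the monitored Hadoop install
mkdir -p "$CHUKWA_HOME/lib" "$HADOOP_HOME/lib" "$HADOOP_HOME/conf"
# Fake jars so the copy below is demonstrable; the version is hypothetical.
touch "$CHUKWA_HOME/chukwa-hadoop-0.1.2-client.jar" "$CHUKWA_HOME/lib/json.jar"

# The actual step: drop the two jars into Hadoop's lib directory.
cp "$CHUKWA_HOME"/chukwa-hadoop-*-client.jar "$HADOOP_HOME/lib/"
cp "$CHUKWA_HOME/lib/json.jar" "$HADOOP_HOME/lib/"

ls "$HADOOP_HOME/lib"
```

After this, editing log4j.properties in $HADOOP_HOME/conf (as shown further
down in this thread) switches the appender over to Chukwa.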
>
> Where are these record types defined, and how do they map to the
> processors? Is it a direct <record type name>Processor mapping that's
> automatically done by the Demux?
Record types are defined in log4j.properties. For example, Hadoop has an
appender called DRFA, and the Chukwa-enabled appender would look like:
log4j.appender.DRFA=org.apache.hadoop.chukwa.inputtools.log4j.ChukwaDailyRollingFileAppender
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
log4j.appender.DRFA.recordType=HadoopLog
log4j.appender.DRFA.chukwaClientHostname=localhost
log4j.appender.DRFA.chukwaClientPortNum=9093
The association between the HadoopLog record type and its Demux processor
class is defined in chukwa-demux-conf.xml.
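A sketch of what that mapping entry looks like, using the Hadoop-style
name/value property format. The processor class name here is an assumption
based on the stock Chukwa distribution; check the classes actually present in
org.apache.hadoop.chukwa.extraction.demux.processor.mapper for your release:

```xml
<!-- Hypothetical sketch: maps chunks tagged recordType=HadoopLog to a
     Demux processor class; verify the class name against your release. -->
<property>
  <name>HadoopLog</name>
  <value>org.apache.hadoop.chukwa.extraction.demux.processor.mapper.HadoopLogProcessor</value>
  <description>Demux parser for chunks with record type HadoopLog</description>
</property>
```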
Regards,
Eric
>
>>
>> On 5/17/09 6:42 PM, "Jiaqi Tan" <[email protected]> wrote:
>>
>>> Hi Ariel,
>>>
>>> So with the CharFileTailingAdaptorUTF8NewLineEscaped, if I have a log
>>> file entry with a multi-line entry, e.g. if there was a Java exception
>>> logged, would each line be separated into a different chunk? If that's
>>> the case, are there any adaptors that would coalesce multi-line log
>>> entries into a single chunk?
>>>
>>> Also, does the data type get resolved by Demux to one of the classes
>>> in org.apache.hadoop.chukwa.extraction.demux.processor.mapper? i.e. if
>>> I wanted to implement my own custom datatype, I should create a Demux
>>> processor and stick it in as one of the classes in that package?
>>>
>>> Thanks,
>>> Jiaqi
>>>
>>> On Sun, May 17, 2009 at 6:19 PM, Ariel Rabkin <[email protected]> wrote:
>>>> It's worth distinguishing two different things.
>>>>
>>>> The adaptor (as in CharFileTailingAdaptorUTF8) is responsible for
>>>> deciding how to break the data into chunks, and how to tag the chunks.
>>>> Probably CharFileTailingAdaptorUTF8NewLineEscaped is right for you.
>>>> (We should really rename that to something shorter!)
>>>>
>>>> The type, like SysLog or NameNodeLog, is stored by the adaptor, and
>>>> passed through as Chunk metadata. It's used to tell the Demux how to
>>>> process that data. The demux-conf has the mapping from datatype to
>>>> processor. For logs, you should be fine just picking a datatype. If
>>>> you aren't using Demux to process the logs, you don't even need to
>>>> write a processor.
>>>>
>>>> --Ari
>>>>
>>>> On Sun, May 17, 2009 at 6:15 PM, Jiaqi Tan <[email protected]> wrote:
>>>>> Hi,
>>>>>
>>>>> Which adaptor should I use if I want to process log entries from the
>>>>> TaskTracker and DataNode logs? Should I just use one of the
>>>>> FileTailer adaptors already available (CharFileTailingAdaptorUTF8), or
>>>>> is there a custom type such as the one for SysLog or NameNodeLog when
>>>>> using the CharFileTailingAdaptorUTF8NewLineEscaped adaptor?
>>>>>
>>>>> Is there any documentation available on what the "type" (e.g. SysLog
>>>>> or NameNodeLog) means and how to use it/how it works?
>>>>>
>>>>> Thanks,
>>>>> Jiaqi
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Ari Rabkin [email protected]
>>>> UC Berkeley Computer Science Department
>>>>
>>
>>