Chukwa - Add duplicate detection, and implement virtual offset of the log file
to checkpoint file
-------------------------------------------------------------------------------------------------
Key: HADOOP-4710
URL: https://issues.apache.org/jira/browse/HADOOP-4710
Project: Hadoop Core
Issue Type: Bug
Environment: Redhat EL 4.5, Java 1.6, Hadoop Trunk
Reporter: Eric Yang
Each data stream is sent to Chukwa with a sequence id, and this sequence id
is used to detect duplicate chunk data in Chukwa. However, the checkpoint
file does not include the virtual offset. This means that when the collector
crashes, the sequence id is reset to zero. The Chukwa Agent needs to keep
track of the sequence id in the checkpoint file in order to recover from a
crash.
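
The idea described above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the Chukwa implementation: the
class name CheckpointSketch, the tab-separated checkpoint layout, and the
DuplicateFilter helper are all hypothetical. It assumes the sequence id of a
chunk is the virtual offset of the byte just past the end of that chunk, so a
chunk whose sequence id is not greater than the highest one already accepted
for its stream can be treated as a duplicate.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; none of these names come from the Chukwa code base. */
final class CheckpointSketch {

    /** Serializes one line per stream: streamName TAB virtualOffset (assumed layout). */
    static String writeCheckpoint(Map<String, Long> offsets) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : offsets.entrySet()) {
            sb.append(e.getKey()).append('\t').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    /** Restores the per-stream virtual offsets after a restart. */
    static Map<String, Long> readCheckpoint(String text) {
        Map<String, Long> offsets = new LinkedHashMap<>();
        for (String line : text.split("\n")) {
            if (line.isEmpty()) {
                continue; // skip blank lines, including an empty checkpoint
            }
            int tab = line.lastIndexOf('\t');
            offsets.put(line.substring(0, tab),
                        Long.parseLong(line.substring(tab + 1)));
        }
        return offsets;
    }

    /** Receiver side: remembers the highest sequence id accepted per stream. */
    static final class DuplicateFilter {
        private final Map<String, Long> highWater = new HashMap<>();

        /** Returns true if the chunk is new, false if it was already seen. */
        boolean accept(String stream, long seqId) {
            Long seen = highWater.get(stream);
            if (seen != null && seqId <= seen) {
                return false; // sequence id did not advance: duplicate chunk
            }
            highWater.put(stream, seqId);
            return true;
        }
    }
}
```

With a layout like this, an agent that restarts would resume each stream from
the offset read back by readCheckpoint instead of starting again at zero, and
any chunks resent after a crash would be rejected by the filter.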