[ 
https://issues.apache.org/jira/browse/CHUKWA-305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720827#action_12720827
 ] 

Jiaqi Tan commented on CHUKWA-305:
----------------------------------

It's not obvious whether this is a Chukwa bug or a Hadoop issue, but if Hadoop 
is emitting logs without timezones, then there's nothing Chukwa can do about 
it.

To be more specific, I ran into this problem when I generated data using a 
cluster in EDT, and then processed the data on a system in PDT. The times in 
the Job History logs are correct since they are stored in UTC, but the 
log-processing on the machine in PDT reads the EDT time strings and assumes 
they are in PDT, resulting in the data from the text-based log sources being 3 
hours ahead of the data in the Job History logs.
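
A minimal sketch of the arithmetic behind that 3-hour skew, not Chukwa's actual 
parsing code. The sample timestamp and the helper name to_epoch are 
hypothetical, and EDT and PDT are modelled as fixed offsets (UTC-4 and UTC-7):

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for the two daylight-saving zones involved (assumption:
# no DST transition happens between writing and processing the logs).
EDT = timezone(timedelta(hours=-4), "EDT")
PDT = timezone(timedelta(hours=-7), "PDT")


def to_epoch(naive_string, assumed_zone):
    """Parse a zone-less timestamp string and interpret it in assumed_zone.

    Hypothetical helper: returns seconds since the Epoch (UTC).
    """
    naive = datetime.strptime(naive_string, "%Y-%m-%d %H:%M:%S,%f")
    return naive.replace(tzinfo=assumed_zone).timestamp()


# Hypothetical daemon-log timestamp, written on the EDT cluster with no zone.
logged = "2009-06-17 12:00:00,000"

actual = to_epoch(logged, EDT)   # the instant the cluster actually meant
assumed = to_epoch(logged, PDT)  # the instant the PDT machine infers

# Positive value: the text-based data lands later than the UTC-based data.
skew_hours = (assumed - actual) / 3600
print(skew_hours)
```

The Job History times need no such guess because they are already UTC, which 
is why only the text-based sources shift.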

> Inconsistent time inputs
> ------------------------
>
>                 Key: CHUKWA-305
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-305
>             Project: Hadoop Chukwa
>          Issue Type: Bug
>          Components: data collection, Data Processors
>            Reporter: Jiaqi Tan
>
> Times in Job History logs are stored as UTC seconds since the Epoch, but 
> times in log-based sources (e.g. clienttrace and the DataNode, TaskTracker, 
> JobTracker, and NameNode daemon logs) are ISO-8601 strings in the local 
> timezone, and do not record which timezone that is. This leads to 
> inconsistencies when trying to correlate data in time between log-based 
> sources and Job History data, because the processing machine has to guess 
> the timezone of every human-readable time string.

