[
https://issues.apache.org/jira/browse/CHUKWA-305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720827#action_12720827
]
Jiaqi Tan commented on CHUKWA-305:
----------------------------------
It's not obvious whether this is a Chukwa bug or a Hadoop issue, but if Hadoop
is emitting logs whose timestamps carry no timezone, then there's nothing
Chukwa can do about it.
To be more specific, I ran into this problem when I generated data using a
cluster in EDT, and then processed the data on a system in PDT. The times in
the Job History logs are correct since they are stored in UTC, but the
log-processing on the machine in PDT reads the EDT time strings and assumes
they are in PDT, resulting in the data from the text-based log sources being 3
hours ahead of the data in the Job History logs.
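To illustrate the arithmetic, here is a minimal Python sketch of the mismatch. The timestamp value, the log4j-style format string, the fixed UTC offsets and the helper name parse_log_time are all assumptions made for the example; none of this is Chukwa's actual parsing code.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets chosen for illustration (daylight-saving time in effect).
EDT = timezone(timedelta(hours=-4), "EDT")
PDT = timezone(timedelta(hours=-7), "PDT")


def parse_log_time(text, assumed_tz):
    """Parse a zone-less timestamp and attach the zone the reader assumes.

    Hypothetical helper; the format string is an assumed log4j-style layout.
    """
    naive = datetime.strptime(text, "%Y-%m-%d %H:%M:%S,%f")
    return naive.replace(tzinfo=assumed_tz)


# Hypothetical timestamp written by a daemon on the EDT cluster.
log_line_time = "2009-06-17 12:00:00,000"

actual = parse_log_time(log_line_time, EDT)   # what the writer meant
misread = parse_log_time(log_line_time, PDT)  # what the PDT machine assumes

# Subtracting aware datetimes compares the underlying UTC instants.
skew = misread - actual
print(skew)  # 3:00:00
```

The same string maps to 16:00 UTC when read as EDT and 19:00 UTC when read as PDT, so every record from the text-based logs lands 3 hours later than the matching Job History record.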
> Inconsistent time inputs
> ------------------------
>
> Key: CHUKWA-305
> URL: https://issues.apache.org/jira/browse/CHUKWA-305
> Project: Hadoop Chukwa
> Issue Type: Bug
> Components: data collection, Data Processors
> Reporter: Jiaqi Tan
>
> Times in Job History logs are stored as UTC seconds since the Epoch, but times
> in log-based sources, e.g. clienttrace and the daemon (DataNode, TaskTracker,
> JobTracker, NameNode) logs, are ISO-8601 strings in the local timezone that
> do not record which timezone they were written in. This leads to
> inconsistencies when trying to correlate data in time across log-based
> sources and Job History data.