[ 
https://issues.apache.org/jira/browse/HADOOP-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622431#action_12622431
 ] 

Amar Kamat commented on HADOOP-3937:
------------------------------------

bq. _job-history-start-time___job-id___job-name___user-name_
I missed the jobtracker's hostname in this. So the history filename looks like 
_job-history-start-time___jobtracker-hostname___job-id___job-name___user-name_

A job's id is unique within a jobtracker but not across jobtrackers, although 
a clash is unlikely because it would need two jobtrackers to start at the same 
time (the job id is derived from the jobtracker's start time). Since running 
two jobtrackers on the same node is even less likely, I think it's safe to 
assume that _jobtracker-hostname___job-id_ is unique across clusters. One 
simple way to get short, unique history filenames would be to use something 
like _job-id___*f*(jobtracker-hostname)_, where _*f*_(s) is something like a 
hash. The mapping from _*f*_(s) to _jobtracker-hostname_ can be kept in an 
_index_ file along with the username and jobname information. Thoughts?
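
To make the idea concrete, here is a rough sketch of how such names could be 
built. Everything in it is an assumption for illustration only: the class 
HistoryFileNames, its methods, the choice of MD5 truncated to 8 hex characters 
for _*f*_, and the in-memory map standing in for the _index_ file. None of it 
is taken from JobHistory or from the attached patch.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: builds job-id_f(hostname) names
// and remembers what each name stands for.
final class HistoryFileNames {

    // One index entry: the information that no longer fits in the filename.
    static final class Entry {
        final String hostname;
        final String userName;
        final String jobName;

        Entry(String hostname, String userName, String jobName) {
            this.hostname = hostname;
            this.userName = userName;
            this.jobName = jobName;
        }
    }

    // In-memory stand-in for the index file: filename -> entry.
    private final Map<String, Entry> index = new LinkedHashMap<>();

    // f(s): first 4 bytes of the MD5 digest as 8 lowercase hex characters.
    // Assumed choice; a truncated hash can collide, so a real implementation
    // would have to detect that case.
    static String hashHost(String hostname) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(hostname.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 4; i++) {
                hex.append(String.format("%02x", digest[i] & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // Returns the short filename and records the details in the index.
    String register(String jobId, String hostname, String userName, String jobName) {
        String fileName = jobId + "_" + hashHost(hostname);
        index.put(fileName, new Entry(hostname, userName, jobName));
        return fileName;
    }

    // Looks up the hostname, user and job name behind a filename, or null.
    Entry lookup(String fileName) {
        return index.get(fileName);
    }
}
```

With this, the filename length depends only on the job id plus a fixed 9 
characters, no matter how long the job name or hostname is. The cost is that 
the filename alone is no longer self-describing, so anything that reads the 
history would need the index.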


> Job history may get disabled due to overly long job names
> ---------------------------------------------------------
>
>                 Key: HADOOP-3937
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3937
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.17.0, 0.17.1, 0.18.0, 0.19.0
>            Reporter: Matei Zaharia
>         Attachments: HADOOP-3937.patch
>
>
> Since Hadoop 0.17, the job history logs include the job's name in the 
> filename. However, this can lead to overly long filenames, because job names 
> may be arbitrarily long. When a filename is too long for the underlying OS, 
> file creation fails and the JobHistory class silently disables history from 
> that point on. This can lead to days of lost history until somebody notices 
> the error in the log.
> Proposed solution: Trim the job name to a reasonable length when selecting a 
> filename for the history file.

