+1 for dates in Owen's suggested format, so that the job ids will be easily sortable.

On Jul 10, 2007, at 1:11 AM, Enis Soztutar (JIRA) wrote:


[ https://issues.apache.org/jira/browse/HADOOP-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511351 ]

Enis Soztutar commented on HADOOP-1473:
---------------------------------------

It looks much cleaner, and it is also easy to grep the logs for jobs that ran on a given day or month.
The date in the job id is not the date the job was run but the date the JT was started.

+1 for date/times, they're generally easier to remember than random strings.
You do not have to remember the dates unless you're dealing with jobs which ran on a (now) stopped JT.

IMO, as far as "look at job 75" is concerned, I think either method would make no difference.
       look at job 75 => find {{job_200706081450_00075}}
or     look at job 75 => find {{job_jkx3y7_00075}}

My vote is for a 4-6 digit hash of the JT start time:
       look at job 75 => find {{job_4390_00075}}

But now it is harder to explain to newcomers what 4390 is.


Make jobids unique across jobtracker restarts
---------------------------------------------

                Key: HADOOP-1473
                URL: https://issues.apache.org/jira/browse/HADOOP-1473
            Project: Hadoop
         Issue Type: Improvement
         Components: mapred
   Affects Versions: 0.12.3
           Reporter: Owen O'Malley
           Assignee: Owen O'Malley
            Fix For: 0.14.0

        Attachments: new-job-id.patch


I'll make the job ids unique across JobTracker restarts by adding the startup time of the JobTracker, so if the JobTracker started at 8 Jun 2007 14:50, the first job would be called:
job_200706081450_00001
the second job would be:
job_200706081450_00002
and so on...
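
As a rough illustration of the scheme (not the code in new-job-id.patch), the id could be built along these lines. The helper name make_job_id and the second start time below are hypothetical; only the job_<yyyyMMddHHmm>_<5-digit counter> layout comes from the examples above.

```python
from datetime import datetime


def make_job_id(jt_start: datetime, job_number: int) -> str:
    """Hypothetical helper: JobTracker start time (to the minute) plus a
    zero-padded per-JobTracker job counter."""
    return f"job_{jt_start:%Y%m%d%H%M}_{job_number:05d}"


# JobTracker started at 8 Jun 2007 14:50, as in the description above.
jt_start = datetime(2007, 6, 8, 14, 50)
first = make_job_id(jt_start, 1)
second = make_job_id(jt_start, 2)

# Assumed later restart, used only to show that the ids stay distinct.
after_restart = make_job_id(datetime(2007, 7, 10, 1, 11), 1)

ids = [after_restart, second, first]
print(sorted(ids))
```

Because both fields are fixed-width and zero-padded, a plain string sort puts the ids in chronological order, at least while the job counter fits in five digits.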

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

