[
https://issues.apache.org/jira/browse/MAPREDUCE-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873160#action_12873160
]
Dick King commented on MAPREDUCE-323:
-------------------------------------
Okay...
1: I will have to fix rumen to recursively descend into a directory of
directories to make it capable of swallowing a history directory.
1a: I would still like to process the job IDs in lexicographical order [which
is almost always chronological order] for compatibility with applications that
expect approximately chronological order.
1b: Sorting this way creates a memory footprint of about 200 bytes/entry
[roughly 200 MB at one million jobs], which may impose a limit of one million
jobs or so.
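As a rough illustration of points 1 and 1a, the sketch below walks a nested history directory and returns the job IDs it finds in lexicographical order. It is written in Python rather than rumen's Java, and the `job_<timestamp>_<index>` file-name pattern and the `collect_job_ids` name are assumptions made for this example, not rumen's actual code.

```python
import os
import re

# Assumed file-name shape: job_<timestamp>_<index>, possibly followed by
# other text. This pattern is illustrative, not rumen's real parser.
JOB_ID = re.compile(r"job_\d+_\d+")


def collect_job_ids(root):
    """Walk `root` recursively; return (job_id, path) pairs sorted by job ID."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            match = JOB_ID.search(name)
            if match:
                found.append((match.group(0), os.path.join(dirpath, name)))
    # Plain string comparison gives the lexicographical order from 1a.
    found.sort(key=lambda pair: pair[0])
    return found
```

Every pair is held in the list before sorting, so memory grows linearly with the number of jobs, which is the footprint concern raised in 1b.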
2: I will make the directories configurable. How about the following controls?
||locution||meaning||
|{{%y}} |year [four digits] [The Y10K problem will be someone else's problem :-) ]|
|{{%m}} |month [two digits, leading zeros present]|
|{{%d}} |day [two digits, leading zeros present]|
|{{%h}} |hour [two digits, leading zeros present]|
|{{%i}} |mInute [two digits, leading zeros present]|
|{{%u}} |user|
|{{%xi-j}} |the digits of the jobID index whose positions run from {{i}} _down_ through {{j}}, numbered _from the right, 1-based_. Any position you select that doesn't exist contributes no characters to the output. {{%x9-3}} will give you directories holding logs for at most 100 jobs each, unless you omit the timestamp selection controls.|
|{{/}} |directory component separator [even on platforms with a different separator character]. Two or more slashes in a row collapse to one, and there is an implicit leading and trailing separator character.|
|any other character |itself|
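To make the proposed controls concrete, here is a minimal sketch of how a template built from the table above might be expanded into a directory path. It is Python, not Hadoop code. The function names `expand` and `index_digits`, the choice of which timestamp is passed as `when`, and the assumption that `i` and `j` in `%xi-j` are plain decimal numbers are all hypothetical choices for this example.

```python
import re
from datetime import datetime

# Matches either %x<i>-<j> or one of the single-letter controls.
TOKEN = re.compile(r"%x(\d+)-(\d+)|%([ymdhiu])")


def index_digits(index, hi, lo):
    """Digits of `index` at positions hi down to lo, 1-based from the right."""
    out = []
    for pos in range(hi, lo - 1, -1):
        if 1 <= pos <= len(index):  # positions that don't exist yield nothing
            out.append(index[-pos])
    return "".join(out)


def expand(template, when, user, job_index):
    """Expand the controls in `template`; every other character is kept as-is."""
    fields = {
        "y": f"{when.year:04d}",
        "m": f"{when.month:02d}",
        "d": f"{when.day:02d}",
        "h": f"{when.hour:02d}",
        "i": f"{when.minute:02d}",
        "u": user,
    }

    def substitute(match):
        if match.group(3):
            return fields[match.group(3)]
        return index_digits(job_index, int(match.group(1)), int(match.group(2)))

    path = TOKEN.sub(substitute, template)
    # Collapse runs of slashes, then add the implicit leading/trailing separator.
    parts = [part for part in path.split("/") if part]
    return "/" + "/".join(parts) + "/" if parts else "/"
```

For example, `expand("%y/%m/%d/%u/%x9-3", datetime(2010, 6, 1, 13, 5), "amar", "0042")` returns `/2010/06/01/amar/00/`: positions 9 through 5 of the index `0042` don't exist, so only the digits at positions 4 and 3 appear, and the last two digits are left to vary within the directory.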
Did I leave anything out?
> Improve the way job history files are managed
> ---------------------------------------------
>
> Key: MAPREDUCE-323
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-323
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobtracker
> Affects Versions: 0.21.0, 0.22.0
> Reporter: Amar Kamat
> Assignee: Dick King
> Priority: Critical
>
> Today all the job history files are dumped in one _job-history_ folder. This
> can cause problems when there is a need to search the history folder
> (job-recovery etc). It would be nice if we grouped all the jobs under a _user_
> folder, so all the jobs for user _amar_ would go in _history-folder/amar/_.
> Jobs can be categorized using various features like _jobid, date, jobname_
> etc., but using _username_ will make the search much more efficient and also
> will not result in a namespace explosion.