[jira] Commented: (MAPREDUCE-2151) [rumen] Add a map of jobconf key-value pairs in LoggedJob

Hong Tang (JIRA) Tue, 26 Oct 2010 10:58:45 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925063#action_12925063
 ]


Hong Tang commented on MAPREDUCE-2151:
--------------------------------------

bq. I think Hong is referring to configuration parameters that are likely to 
modify the behaviour of the job and tasks (e.g. mapred.child.*, mapreduce.job.*, 
etc.).

No, that is not what this jira intends to solve, though it could potentially 
help. Currently Rumen extracts from jobconf.xml some key-values specific to the 
map-reduce layer and converts them to regular primitive types. I think the 
extraction of mapred.child.*, mapreduce.job.*, etc. should continue along this 
path.

However, we have started to think about using Rumen output to analyze the 
performance of frameworks on top of map-reduce. One example is Pig. Pig will 
add more information to jobconf.xml to describe the features being used and 
compile-time statistics. We need a mechanism in Rumen to retain such 
information in an extensible way, and that is the primary purpose of this jira.

bq. Also *-default.xml might not be available for reference comparison. 
Correct. That is the main reason we have to make each parsed LoggedJob instance 
self-contained.

bq. Hmm. But I guess we need to bring in more and more configuration properties 
soon.
Yes, the set will grow, but not without bound. I think we can support 
extraction of properties based on exact match or prefixes.
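
A minimal sketch of what exact-match or prefix-based extraction could look 
like. Everything here is hypothetical: JobConfFilter, its methods, and the 
sample keys (including the pig.* one) are illustrative names, not existing 
Rumen or Pig identifiers. Values are kept as plain strings, with no type 
conversion.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper, not part of Rumen: keeps only the jobconf entries
// whose key is listed exactly or starts with one of the given prefixes.
final class JobConfFilter {
  private final Set<String> exactKeys;
  private final List<String> prefixes;

  JobConfFilter(Set<String> exactKeys, List<String> prefixes) {
    this.exactKeys = exactKeys;
    this.prefixes = prefixes;
  }

  // True if the key matches exactly or begins with a configured prefix.
  boolean accepts(String key) {
    if (exactKeys.contains(key)) {
      return true;
    }
    for (String prefix : prefixes) {
      if (key.startsWith(prefix)) {
        return true;
      }
    }
    return false;
  }

  // Returns the accepted entries, sorted by key, with values untouched.
  Map<String, String> retain(Map<String, String> jobConf) {
    Map<String, String> kept = new TreeMap<String, String>();
    for (Map.Entry<String, String> e : jobConf.entrySet()) {
      if (accepts(e.getKey())) {
        kept.put(e.getKey(), e.getValue());
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    // Sample keys and values are made up for illustration.
    Map<String, String> jobConf = new HashMap<String, String>();
    jobConf.put("pig.example.features", "JOIN");
    jobConf.put("mapred.job.name", "wordcount");
    jobConf.put("user.name", "alice");

    JobConfFilter filter = new JobConfFilter(
        new HashSet<String>(Arrays.asList("mapred.job.name")),
        Arrays.asList("pig."));
    System.out.println(filter.retain(jobConf));
  }
}
```

Because the set of exact keys and prefixes is fixed up front, the number of 
retained properties stays bounded by what the job actually sets under those 
names.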

bq. Created MAPREDUCE-2153 to get other needed configuration properties in to 
the trace file. 
This seems to be in addition to MAPREDUCE-1658. I suggest you roll the two 
jiras into one (closing MR-1658 and rolling the work into MR-2153).

bq. Also created MAPREDUCE-2152 for avoiding TraceBuilder's own handling of 
deprecated configuration properties in favour of Configuration object.
The purpose of this jira is to extend the set of key-values extracted by the 
jobconf parser and retain them as-is in the LoggedJob object. So I believe your 
point is relatively orthogonal to this jira. FWIW, I am a bit concerned about 
introducing this dependency between Rumen and MapReduce, because I think the 
handling of deprecated conf parameters is not really a core part of the 
MapReduce API and could be dropped in the future (which would lead us to move 
the code into Rumen, similar to the case of Pre21JobHistoryConstants).

> [rumen] Add a map of jobconf key-value pairs in LoggedJob
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-2151
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2151
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: tools/rumen
>            Reporter: Hong Tang
>
> It'd be useful to retain application level configuration settings in 
> LoggedJob.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
