[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

Chris Douglas (JIRA) Fri, 18 Mar 2011 14:56:55 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008648#comment-13008648
 ]


Chris Douglas commented on MAPREDUCE-279:
-----------------------------------------

bq. Why not contain a ContainerLaunchContext to specify the container in which 
to run the AM? Seems like lots of duplicated fields.
Agreed. Fixing this also addresses the concern that the URL record is 
insufficient for describing resources. The \_todo form was introduced to effect 
this, and remains in progress.

bq. how does one access stderr/stdout contents? both while they're being 
written and after a container has terminated? (maybe I just haven't gotten to 
that bit yet somewhere else)
This is still a TODO (working on it now). In the short term, something similar 
to what the TaskTracker (TT) does is probably sufficient.

bq. Did you consider making the ids all strings instead of ints? The pro would 
be that there could be canonical formats, like "AM-<hex id>" for app masters vs 
"C-<hex id>" for containers.
Some of the implementation ended up relying on a consistent mapping of int ids 
to strings, so going all the way could make sense. On the other hand, parsing 
strings to determine relationships between containers and applications is 
regrettable.
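
One way to get canonical string forms without ever parsing them back is to keep the ids as structured records and treat the string as a rendering only. The following is a minimal Python sketch under that assumption. The class names, field names, and the "AM-"/"C-" prefixes (borrowed from the suggestion quoted above) are hypothetical and are not the actual MAPREDUCE-279 records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApplicationId:
    # Hypothetical record: the id is assumed to be a plain int.
    id: int

    def __str__(self) -> str:
        # Canonical rendering in the suggested "AM-<hex id>" style.
        return f"AM-{self.id:x}"


@dataclass(frozen=True)
class ContainerId:
    # Hypothetical record: the owning application is carried as a field,
    # so the container-to-application relationship needs no string parsing.
    app: ApplicationId
    id: int

    def __str__(self) -> str:
        # Canonical rendering in the suggested "C-<hex id>" style.
        return f"C-{self.id:x}"


app = ApplicationId(26)
container = ContainerId(app, 255)
print(app, container, container.app == app)
```

With this shape, the strings are only for logs and directory names; any code that needs the relationship reads `container.app` directly.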

bq. the URL record is missing user/password used for http basic auth or s3n auth
Agreed, full URIs should be supported, though pushing that all the way through 
FileContext and FileSystem could be painful.

bq. just to clarify, APPLICATION visibility means "only to this application 
submitted by this user". ie if joe and bob both submit MapReduce 2.x.y jobs 
with identical jars, it still won't share, even if sha1s match?
Right. The target layout for the NodeManager looks roughly like this:
{noformat}
for x in localdir:
$x/filecache # public cache
$x/usercache
$x/usercache/$user
$x/usercache/$user/filecache # private cache
$x/usercache/$user/appcache
$x/usercache/$user/appcache/$appid
$x/usercache/$user/appcache/$appid/filecache # application cache
$x/usercache/$user/appcache/$appid/$containerid
$x/usercache/$user/appcache/$appid/output # output retained after container exits, i.e. intermediate data
{noformat}
So cleanup at the end of a container or an application can just delete the corresponding subdirs. 
Matching a job jar between invocations would require one to register that 
resource as PUBLIC/PRIVATE. The APPLICATION scope is more for job.xml and the 
like.
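
The mapping from visibility to directory can be sketched as below. This is an illustration in Python, not NodeManager code: the function names and the string visibility values are hypothetical, and it assumes the private cache sits under the per-user directory.

```python
from pathlib import PurePosixPath


def cache_dir(local_dir, visibility, user=None, app_id=None):
    """Return the filecache directory for one local dir (hypothetical helper)."""
    root = PurePosixPath(local_dir)
    if visibility == "PUBLIC":
        # Shared by all users on the node.
        return root / "filecache"
    if user is None:
        raise ValueError("PRIVATE and APPLICATION resources need a user")
    if visibility == "PRIVATE":
        # Assumption: shared across all applications of one user.
        return root / "usercache" / user / "filecache"
    if visibility == "APPLICATION":
        if app_id is None:
            raise ValueError("APPLICATION resources need an app id")
        # Scoped to a single application of a single user.
        return root / "usercache" / user / "appcache" / app_id / "filecache"
    raise ValueError(f"unknown visibility: {visibility}")


def container_dir(local_dir, user, app_id, container_id):
    """Return the working directory of one container (hypothetical helper)."""
    return (PurePosixPath(local_dir) / "usercache" / user
            / "appcache" / app_id / container_id)


# Two users submitting identical jars as APPLICATION resources get
# different directories, so nothing is shared between them.
joe = cache_dir("/data/1", "APPLICATION", user="joe", app_id="app_1")
bob = cache_dir("/data/1", "APPLICATION", user="bob", app_id="app_2")
print(joe)
print(bob)
```

Because every application-scoped path is nested under `appcache/$appid`, removing that one directory at application end also removes its filecache, container directories, and retained output.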

> Map-Reduce 2.0
> --------------
>
>                 Key: MAPREDUCE-279
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker, tasktracker
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>             Fix For: 0.23.0
>
>         Attachments: MR-279.patch, MR-279.patch, MR-279.sh, 
> MR-279_MR_files_to_move.txt
>
>
> Re-factor MapReduce into a generic resource scheduler and a per-job, 
> user-defined component that manages the application execution. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
