[jira] Commented: (HADOOP-4931) Document TaskTracker's memory management functionality and CapacityScheduler's memory based scheduling.

Hemanth Yamijala (JIRA) Wed, 21 Jan 2009 11:00:25 -0800

    [ 
https://issues.apache.org/jira/browse/HADOOP-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665913#action_12665913
 ]


Hemanth Yamijala commented on HADOOP-4931:
------------------------------------------

bq. The cluster setup seems to be for setup required to make sure the cluster 
works correctly - basic, core settings. At least that's my take on it. The TT 
monitoring stuff is an optional feature, likely to be used only by power users.

The cluster setup guide reads:
bq. This document describes how to install, configure and manage non-trivial 
Hadoop clusters ranging from a few nodes to extremely large clusters with 
thousands of nodes.
By non-trivial, I am assuming it is more than the basic settings. Look at the 
'Real world Cluster Configurations' section. So, I think that it is the right 
place for advanced configuration.

bq. Splitting the memory features across three guides (cluster setup, M/R 
tutorial, and Capacity Scheduler) seems excessive.

But we are documenting different things. The cluster admin is going to deep 
dive into the details of configuring memory parameters, while skimming through how 
users can specify memory limits in their job conf. The user reading the map/reduce 
tutorial only needs to know how to set up the memory configuration for a job. 
Details of how memory management is configured on the TT, while useful, are not 
relevant to that user.

bq. The MR tutorial already had a section on 'Memory management', so it seemed 
like a logical place to place our documentation.

That section talks about the ulimit option and the Java VM child options, which 
users can tweak when submitting a job. So it still makes sense in the 
map/reduce tutorial. Note that the ulimit option is also configured in the 
Cluster setup guide, where the admin can tweak it.

bq. You can argue that there are plenty of parameters described in the MR 
tutorial that are also 'admin features'.
I wasn't able to find any. Can you please give me an example? All parameters 
mentioned in the MR tutorial seem to be ones which users can configure as part 
of job configuration.

From the very nature of the guides, I think M/R tutorial is meant to help 
people wanting to write jobs and cluster setup is for people who want to 
configure clusters. The feature description should be split in that way. I am 
completely OK with describing how the feature works in one place rather than 
three and linking them from other places for completeness.

> Document TaskTracker's memory management functionality and 
> CapacityScheduler's memory based scheduling.
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4931
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4931
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched, mapred
>            Reporter: Vinod K V
>            Assignee: Vivek Ratan
>         Attachments: 4931.1.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
