[ https://issues.apache.org/jira/browse/HADOOP-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665913#action_12665913 ]
Hemanth Yamijala commented on HADOOP-4931:
------------------------------------------
bq. The cluster setup seems to be for setup required to make sure the cluster
works correctly - basic, core settings. At least that's my take on it. The TT
monitoring stuff is an optional feature, likely to be used only by power users.
The cluster setup guide reads:
bq. This document describes how to install, configure and manage non-trivial
Hadoop clusters ranging from a few nodes to extremely large clusters with
thousands of nodes.
By non-trivial, I am assuming it is more than the basic settings. Look at the
'Real world Cluster Configurations' section. So, I think that it is the right
place for advanced configuration.
bq. Splitting the memory features across three guides (cluster setup, M/R
tutorial, and Capacity Scheduler) seems excessive.
But we are documenting different things. The cluster admin is going to dive
deep into the details of configuring memory parameters, while skimming through
how users can specify memory limits in their job conf. The user reading the
map/reduce tutorial only needs to know how he can set up memory configuration
for his job. Details of how memory management is configured on the TT, while
useful, are not relevant for him.
bq. The MR tutorial already had a section on 'Memory management', so it seemed
like a logical place to place our documentation.
That section talks about the ulimit option and the Java VM child options, which
users can tweak when submitting a job. So, it still makes sense in the
map/reduce tutorial. Note that the ulimit option is also described in the
cluster setup guide, where the admin can tweak it.
bq. You can argue that there are plenty of parameters described in the MR
tutorial that are also 'admin features'.
I wasn't able to find any. Can you please give me an example? All parameters
mentioned in the MR tutorial seem to be ones which users can configure as part
of job configuration.
From the very nature of the guides, I think the M/R tutorial is meant to help
people wanting to write jobs, and cluster setup is for people who want to
configure clusters. The feature description should be split in that way. I am
completely OK with describing how the feature works in one place rather than
three and linking to it from other places for completeness.

> Document TaskTracker's memory management functionality and
> CapacityScheduler's memory based scheduling.
> -------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-4931
> URL: https://issues.apache.org/jira/browse/HADOOP-4931
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/capacity-sched, mapred
> Reporter: Vinod K V
> Assignee: Vivek Ratan
> Attachments: 4931.1.patch
>
>