[ https://issues.apache.org/jira/browse/HADOOP-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666083#action_12666083 ]
Vinod K V commented on HADOOP-4931:
-----------------------------------
Some comments:
- The words 'absence' and 'set' are used frequently in relation to
configuration parameters. We can define upfront that a value of -1 means the
parameter is disabled/missing/unset and that any positive value means it is
set/enabled.
- The purpose of the offset/reserved memory is never stated. We should state
that this reserved memory/offset is for system usage, such as the OS, system
daemons and the Hadoop daemons themselves, so that it's clear to the admins.
- All the memory values are in bytes. Sorry Vivek for misinforming you earlier.
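To make the first point above concrete, here is a minimal sketch of the -1
convention as the docs could describe it. This is only an illustration, not
the TaskTracker code: the class name, the helper methods and the key
"example.reserved.vmem" are all hypothetical, and treating 0 as "not set" is
my assumption.

```java
import java.util.HashMap;
import java.util.Map;

public class MemoryConfigSketch {
    /** Sentinel for "disabled / missing / unset", per the proposed convention. */
    public static final long DISABLED = -1L;

    /** Returns the value in bytes, or DISABLED when the key is absent. */
    public static long getBytes(Map<String, String> conf, String key) {
        String raw = conf.get(key);
        return raw == null ? DISABLED : Long.parseLong(raw.trim());
    }

    /** Only a positive value counts as "set" (assumption: 0 is treated as unset). */
    public static boolean isSet(long bytes) {
        return bytes > 0;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Hypothetical key; all memory values are in bytes.
        conf.put("example.reserved.vmem", String.valueOf(2L * 1024 * 1024 * 1024));

        long reserved = getBytes(conf, "example.reserved.vmem");
        long missing = getBytes(conf, "example.not.configured");

        System.out.println("reserved set? " + isSet(reserved));
        System.out.println("missing set?  " + isSet(missing));
    }
}
```

Stating the rule once like this would let the rest of the text say "if X is
set" without re-explaining what absence means each time.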
mapred_tutorial.xml
- VM is used to stand for virtual machine at one other place in
mapred_tutorial. Maybe we should use VMEM and PMEM to be clear.
- I think the pmem-related parameters should not be discussed in the monitoring
section. They can go in a separate section, say scheduling-related
configuration.
- Hemanth> Can we add a note that in monitoring, when a task is killed, a
message is logged so users can see it in the daemon logs.
We also give this information, i.e. why the task was killed, in the tasks'
diagnostic messages. We can say that in the documentation.
- As for the overall documentation's organization, I too feel that we should
separate cluster setup related information from user parameters.
capacity_scheduler.xml
- Maybe we can give an example of how scheduling based on memory is done,
citing real numbers and memory values. But I am not sure, as the expected
audience of this document isn't very clear to me.
- That brings me to another point. Maybe we should separate
capacity_scheduler.xml into different guides, or at a minimum different
sections - for administrators, for users and a general configuration glossary -
in the same vein as HOD's guides. Another point is that the scheduling steps
are described in good detail here. Only part of that is actually
needed/useful for users. Once we have different guides/sections we can
organize the documentation for memory-based scheduling properly. Thoughts?
> Document TaskTracker's memory management functionality and
> CapacityScheduler's memory based scheduling.
> -------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-4931
> URL: https://issues.apache.org/jira/browse/HADOOP-4931
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/capacity-sched, mapred
> Reporter: Vinod K V
> Assignee: Vivek Ratan
> Attachments: 4931.1.patch
>
>