[jira] Commented: (HADOOP-4931) Document TaskTracker's memory management functionality and CapacityScheduler's memory based scheduling.

Vinod K V (JIRA) Thu, 22 Jan 2009 00:24:23 -0800

    [ https://issues.apache.org/jira/browse/HADOOP-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666083#action_12666083 ]


Vinod K V commented on HADOOP-4931:
-----------------------------------

Some comments:
 - The words 'absence' and 'set' are used frequently in relation to 
configuration parameters. We can define upfront that a value of -1 means 
disabled/missing/unset and that any positive value means set/enabled.
 - The purpose of the offset/reserved memory is never stated. We should state 
that this reserved memory is for system usage, such as the OS, system daemons 
and the Hadoop daemons themselves, so that it is clear to admins.
 - All the memory values are in bytes. Sorry Vivek for misinforming you earlier.
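
To illustrate the first two points, here is a minimal sketch of the 
convention. The class and method names are hypothetical and are not taken 
from the patch or from Hadoop itself. It assumes that only -1 is the 
documented "unset" value, and it additionally treats 0 and other negative 
values as unset, which is my assumption. All values are in bytes.

```java
public final class MemoryConfigSketch {
    /** Sentinel proposed above: -1 means disabled/missing/unset. */
    public static final long DISABLED = -1L;

    private MemoryConfigSketch() {}

    /**
     * A parameter counts as set only when it holds a positive value.
     * Treating 0 and other negatives as unset is an assumption of this sketch.
     */
    public static boolean isSet(long valueInBytes) {
        return valueInBytes > 0;
    }

    /**
     * Memory left for tasks once the reserved amount (kept aside for the OS,
     * system daemons and Hadoop daemons) is subtracted from the total.
     * Returns DISABLED if either input is unset.
     */
    public static long availableForTasks(long totalBytes, long reservedBytes) {
        if (!isSet(totalBytes) || !isSet(reservedBytes)) {
            return DISABLED;
        }
        // Clamp at 0 so an oversized reservation never looks like the sentinel.
        return Math.max(0L, totalBytes - reservedBytes);
    }

    public static void main(String[] args) {
        long total = 8L * 1024 * 1024 * 1024;    // 8 GiB, expressed in bytes
        long reserved = 2L * 1024 * 1024 * 1024; // 2 GiB, expressed in bytes
        System.out.println(availableForTasks(total, reserved));
        System.out.println(availableForTasks(total, DISABLED));
    }
}
```

Something like this near the top of the docs would let us say "set" and 
"unset" without re-explaining them for every parameter.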

mapred_tutorial.xml
 - VM is already used to stand for virtual machine at one other place in 
mapred_tutorial. Maybe we should use VMEM and PMEM to be clear.
 - I think pmem-related parameters should not be discussed in the monitoring 
section. They can go in a separate section, say scheduling-related 
configuration.
 - Hemanth> Can we add a note that in monitoring, when a task is killed, a 
message is logged so users can see it in the daemon logs.
   We also report why the task was killed in the task's diagnostic messages. 
We can say that in the documentation.
 - As for the overall documentation's organization, I too feel that we should 
separate cluster setup related information from user parameters.

capacity_scheduler.xml
 - Maybe we can give an example of how memory-based scheduling is done, 
citing real numbers and memory values. But I am not sure, as the expected 
audience of this document is not very clear to me.
 - That brings me to another point. Maybe we should separate 
capacity_scheduler.xml into different guides, or at the minimum different 
sections (for administrators, for users, and a general configuration 
glossary), in the same vein as HOD's guides. Another point is that the 
scheduling steps are described in good detail here, but only part of that is 
actually needed or useful for users. Once we have different guides/sections 
we can organize the documentation for memory-based scheduling properly. 
Thoughts?

> Document TaskTracker's memory management functionality and 
> CapacityScheduler's memory based scheduling.
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4931
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4931
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched, mapred
>            Reporter: Vinod K V
>            Assignee: Vivek Ratan
>         Attachments: 4931.1.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
