[ 
https://issues.apache.org/jira/browse/TAJO-317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13833504#comment-13833504
 ] 

Jihoon Son commented on TAJO-317:
---------------------------------

Thanks for your contribution.
There are a couple of things that need to be discussed.
* As you mentioned, 'tajo.worker.parallel-execution.max-num' is no longer used. 
Please remove it.
* The name ResourceRequestType does not seem appropriate for its purpose. 
As described in this issue, the main purpose of the request type is to 
represent the priority, so I think ResourceRequestPriority is more 
suitable.
* I am not sure why a special resource request type for the query master, that 
is, ResourceRequestType.QUERY_MASTER, is required. I think it can be 
handled as a kind of MEMORY type.
* In TajoWorkerResourceManager.chooseWorker(), the worker resource is not 
locked when the resource type is QUERY_MASTER or DISK. This can lead to 
unexpected behavior.
* As described in this issue, when the resource request priority is MEMORY (or 
DISK), the required disk (or memory) resources should be deducted as well as 
the memory (or disk) resources. However, I can't find any code for this.
* As described in this issue, resource requests should contain both min and max 
values. However, I can't find these changes.
* Since YarnTajoResourceManager does not work properly, it would be better to 
throw an UnimplementedException() whenever one of its functions is called.
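
To illustrate the locking and deduction points above, here is a minimal sketch. 
It is not taken from the patch: WorkerResourceSketch, tryAllocate(), and the 
field names are hypothetical, and ResourceRequestPriority is only the name 
suggested above. It also assumes that the non-priority resource may drop below 
zero, since the issue only says that it is "just reduced".

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical enum, using the name suggested in this comment. */
enum ResourceRequestPriority { MEMORY, DISK }

/** Hypothetical stand-in for the per-worker resource bookkeeping. */
class WorkerResourceSketch {
  private final ReentrantLock lock = new ReentrantLock();
  private int availableMemoryMB;
  private float availableDiskSlots;

  WorkerResourceSketch(int memoryMB, float diskSlots) {
    this.availableMemoryMB = memoryMB;
    this.availableDiskSlots = diskSlots;
  }

  /**
   * Every priority takes the same lock, so no request type bypasses it.
   * Only the priority resource is checked; both resources are deducted.
   */
  boolean tryAllocate(ResourceRequestPriority priority, int memoryMB, float diskSlots) {
    lock.lock();
    try {
      boolean enough = (priority == ResourceRequestPriority.MEMORY)
          ? availableMemoryMB >= memoryMB
          : availableDiskSlots >= diskSlots;
      if (!enough) {
        return false;
      }
      // Assumption: the non-priority resource is allowed to go negative.
      availableMemoryMB -= memoryMB;
      availableDiskSlots -= diskSlots;
      return true;
    } finally {
      lock.unlock();
    }
  }

  int getAvailableMemoryMB() {
    lock.lock();
    try {
      return availableMemoryMB;
    } finally {
      lock.unlock();
    }
  }

  float getAvailableDiskSlots() {
    lock.lock();
    try {
      return availableDiskSlots;
    } finally {
      lock.unlock();
    }
  }
}
```

With this shape, a DISK-priority request against a worker that is short on 
memory still succeeds, but the memory counter reflects the extra load.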

> Improve TajoResourceManager to support more elaborate resource management
> -------------------------------------------------------------------------
>
>                 Key: TAJO-317
>                 URL: https://issues.apache.org/jira/browse/TAJO-317
>             Project: Tajo
>          Issue Type: Improvement
>          Components: resource manager
>            Reporter: Hyunsik Choi
>            Assignee: Keuntae Park
>             Fix For: 0.8-incubating
>
>         Attachments: TAJO-317.patch
>
>
> h3. Status of the current Tajo Resource Manager (RM)
>  * Tajo RM manages CPU and disk resources incompletely; it only provides 
> resource management through memory allocations. 
>  * In addition, Tajo RM treats the memory resource as a fixed number of 
> slots.
> h3. Problem
> In many cases, workloads can be categorized into I/O-intensive jobs and 
> CPU- and memory-consuming jobs. For example, scan and hash partition or 
> INSERT OVERWRITE may belong to the I/O-intensive category. In general, 
> aggregation belongs to the CPU-memory-consuming category. The current RM is 
> not suited to selectively supporting I/O-intensive or CPU-memory-consuming 
> jobs because it provides only memory slots. We need a more elaborate resource 
> management mechanism.
> In addition, in most resource management systems, when the remaining resource 
> is less than the required resource, nothing is allocated in response to a 
> resource request. This prevents the cluster resources from being fully 
> utilized. In order to mitigate this problem, we need to add flexibility to 
> the allocation mechanism. For example, a min-max request would be useful for 
> this.
> h3. Proposal
>  * Tajo RM should provide resource management for disk and cpu-memory.
>  ** Tajo RM should provide an allocation request call with min and max memory 
> requests, and min and max disk requests.
>  *** A min-max request will be useful to fully utilize the remaining cluster 
> resources.
>  * Each resource request should have a priority. The priority can be disk or 
> memory.
>   ** If the priority is disk
>   *** disk allocation will be limited depending on the remaining disk resource
>   *** memory allocation will not be limited by the remaining memory 
> resource; it will just reduce the remaining memory resource.
>   ** If the priority is memory
>    *** memory allocation will be limited depending on the remaining memory 
> resource
>    *** disk allocation will not be limited by the remaining disk 
> resource; it will just reduce the remaining disk resource.
>  * The disk resource in each worker is represented as a float value.
>  ** The initial disk resource will be the number of disks that participate 
> in the HDFS data directory.
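
The min-max rule in the proposal above can be sketched as a pair of small 
functions. This is only an illustration under stated assumptions: the class and 
method names are hypothetical, a request is refused when the remaining priority 
resource is below min, and the non-priority resource is granted its requested 
amount unconditionally (the issue does not say whether that amount is min or 
max).

```java
import java.util.OptionalDouble;

/** Hypothetical helper; not part of the Tajo code base. */
final class MinMaxAllocationSketch {
  private MinMaxAllocationSketch() {}

  /**
   * Grant for the resource that carries the priority.
   * Returns empty when even the minimum cannot be satisfied;
   * otherwise grants as much as possible, up to max.
   */
  static OptionalDouble grantLimited(double min, double max, double remaining) {
    if (min > max) {
      throw new IllegalArgumentException("min must not exceed max");
    }
    if (remaining < min) {
      return OptionalDouble.empty();
    }
    return OptionalDouble.of(Math.min(max, remaining));
  }

  /**
   * Remaining amount of the non-priority resource after it is reduced
   * without any limit check. Assumption: the result may be negative.
   */
  static double remainingAfterUnlimited(double requested, double remaining) {
    return remaining - requested;
  }
}
```

For example, a disk-priority request with min 1.0 and max 4.0 against a worker 
with 2.5 disk units left would be granted 2.5 under these assumptions, instead 
of being rejected for not reaching the maximum.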



--
This message was sent by Atlassian JIRA
(v6.1#6144)
