[ https://issues.apache.org/jira/browse/TAJO-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963899#comment-13963899 ]
Jinho Kim commented on TAJO-743:
--------------------------------
Created a review request against branch master in reviewboard
https://reviews.apache.org/r/20149/
> Change the default resource allocation policy of leaf tasks
> -----------------------------------------------------------
>
> Key: TAJO-743
> URL: https://issues.apache.org/jira/browse/TAJO-743
> Project: Tajo
> Issue Type: Improvement
> Components: resource manager
> Affects Versions: 0.8-incubating, 1.0-incubating
> Reporter: Jinho Kim
> Assignee: Jinho Kim
> Attachments: TAJO-743.patch, TAJO-743_branch-0.8.0.patch
>
>
> Currently, resource allocation is calculated based on memory. If a machine
> has a large amount of memory, the default settings lead to high task
> concurrency, which usually causes heavy I/O on each disk. This is likely to
> be problematic.
> When I tested with the leaf task scan concurrency set to 2 (for SATA disks),
> the performance was better. If you have SAS storage or SSDs, you can increase
> the disk concurrency. This patch changes the default resource allocation
> policy to use the disk resource.
> The following configs have been available so far:
> * tajo.worker.resource.disks - the available disk resource of each worker
> * tajo.task.disk-slot.default - how much disk resource each task consumes
> The following config is newly introduced in this patch:
> * tajo.worker.resource.dfs-dir-aware - either true or false. If it is true,
> each worker uses the number of the HDFS datanode's data dirs on that worker
> as its disk resource, and tajo.worker.resource.disks is ignored.
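
To illustrate how the three settings above might interact, here is a minimal sketch in Python. It is not the Tajo implementation: it assumes the scheduler simply divides a worker's disk resource by the per-task disk slot, and the function names and the sample values (4 data dirs, 0.5 slot per task) are hypothetical.

```python
def effective_disk_resource(configured_disks, dfs_data_dirs, dfs_dir_aware):
    """Disk resource a worker would advertise (hypothetical helper).

    configured_disks -- value of tajo.worker.resource.disks
    dfs_data_dirs    -- the HDFS datanode's data dirs on this worker
    dfs_dir_aware    -- value of tajo.worker.resource.dfs-dir-aware
    """
    if dfs_dir_aware:
        # When dfs-dir-aware is true, the configured disk value is ignored.
        return float(len(dfs_data_dirs))
    return float(configured_disks)


def max_concurrent_leaf_tasks(disk_resource, disk_slot_per_task):
    """Leaf tasks that fit at once, assuming a plain division (assumption).

    disk_slot_per_task -- value of tajo.task.disk-slot.default
    """
    if disk_slot_per_task <= 0:
        raise ValueError("disk_slot_per_task must be positive")
    return int(disk_resource / disk_slot_per_task)


# Sample values, chosen only for illustration.
data_dirs = ["/data1", "/data2", "/data3", "/data4"]
aware = effective_disk_resource(2.0, data_dirs, dfs_dir_aware=True)
fixed = effective_disk_resource(2.0, data_dirs, dfs_dir_aware=False)
print(max_concurrent_leaf_tasks(aware, 0.5))
print(max_concurrent_leaf_tasks(fixed, 0.5))
```

Under these assumptions, a per-task slot of 0.5 corresponds to two concurrent leaf tasks per disk, which matches the SATA concurrency mentioned in the description.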