Jinho Kim created TAJO-743:
------------------------------

             Summary: Change the default resource allocation policy of leaf 
tasks
                 Key: TAJO-743
                 URL: https://issues.apache.org/jira/browse/TAJO-743
             Project: Tajo
          Issue Type: Improvement
          Components: resource manager
    Affects Versions: 0.8-incubating, 1.0-incubating
            Reporter: Jinho Kim
            Assignee: Jinho Kim


Currently, resource allocation is calculated on a memory basis. If a machine has 
a large amount of memory, the default settings usually result in high task 
concurrency, which causes heavy I/O on each disk. This is likely to be 
problematic.

When I tested leaf task scans with a concurrency of 2 per SATA disk, the 
performance was better. If you have SAS storage or SSDs, you can increase the 
disk concurrency. This patch changes the default resource allocation policy to 
use disk resources.

The following configs have been available so far:
 * tajo.worker.resource.disks - available disk resource of each worker
 * tajo.task.disk-slot.default - how much disk resource each task consumes
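
To illustrate why the two policies behave differently, here is a rough sketch 
of the slot arithmetic. It is not the Tajo scheduler code. The class and method 
names are hypothetical, and the numbers (64 GB of worker memory, 512 MB per 
task, 4 disks, 0.5 disk slot per task) are assumed values chosen only to match 
the "2 tasks per SATA disk" case described above.

```java
public class LeafTaskSlotSketch {

    // Memory-based policy: concurrency is bounded only by worker memory.
    // Both arguments are hypothetical values in megabytes.
    static int memoryBasedSlots(int workerMemoryMb, int taskMemoryMb) {
        if (taskMemoryMb <= 0) {
            throw new IllegalArgumentException("taskMemoryMb must be positive");
        }
        return workerMemoryMb / taskMemoryMb;
    }

    // Disk-based policy: concurrency is bounded by the disk resource.
    // workerDisks stands in for tajo.worker.resource.disks and
    // diskSlotPerTask for tajo.task.disk-slot.default.
    static int diskBasedSlots(float workerDisks, float diskSlotPerTask) {
        if (diskSlotPerTask <= 0f) {
            throw new IllegalArgumentException("diskSlotPerTask must be positive");
        }
        return (int) Math.floor(workerDisks / diskSlotPerTask);
    }

    public static void main(String[] args) {
        // Assumed worker: 64 GB memory, 4 SATA disks.
        System.out.println("memory-based: " + memoryBasedSlots(65536, 512));
        System.out.println("disk-based:   " + diskBasedSlots(4.0f, 0.5f));
    }
}
```

With these assumed numbers the memory-based rule admits 128 concurrent leaf 
tasks on 4 disks (32 per disk), while the disk-based rule admits 8 (2 per disk).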

The config below is newly introduced in this patch:
 * tajo.worker.resource.dfs-dir-aware - can be true or false. If it is true, 
each worker uses the number of data dirs of the HDFS datanode on that worker as 
the disk resource, and tajo.worker.resource.disks is ignored.
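
The intended precedence can be sketched as follows. This is only an 
illustration, not the code in the patch: the class and method names are 
hypothetical, and it assumes the data dirs are available as a comma-separated 
string, as in the HDFS dfs.datanode.data.dir setting.

```java
public class DfsDirAwareSketch {

    // Counts the non-empty entries of a comma-separated data dir list.
    static int countDataDirs(String dataDirs) {
        int count = 0;
        for (String dir : dataDirs.split(",")) {
            if (!dir.trim().isEmpty()) {
                count++;
            }
        }
        return count;
    }

    // If dfsDirAware is true, the number of data dirs is used and the
    // configured value (tajo.worker.resource.disks) is ignored.
    static float resolveDiskResource(boolean dfsDirAware,
                                     float configuredDisks,
                                     String dataDirs) {
        if (dfsDirAware) {
            return (float) countDataDirs(dataDirs);
        }
        return configuredDisks;
    }

    public static void main(String[] args) {
        // Hypothetical datanode with three data dirs.
        String dirs = "/data1/dfs,/data2/dfs,/data3/dfs";
        System.out.println(resolveDiskResource(true, 2.0f, dirs));
        System.out.println(resolveDiskResource(false, 2.0f, dirs));
    }
}
```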

--
This message was sent by Atlassian JIRA
(v6.2#6252)
