[ 
https://issues.apache.org/jira/browse/TAJO-986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108897#comment-14108897
 ] 

Mai Hai Thanh commented on TAJO-986:
------------------------------------

Hi [~jihoonson], thanks for your explanation. As you said, the task size should 
be the length of a single fragment rather than the total fragment length. 
However, a SubQuery contains only one TaskSchedulerContext object, which in turn 
holds only one taskSize value. Do you think it is necessary to change this so 
that a SubQuery or a TaskSchedulerContext object can hold multiple taskSize 
values? Alternatively, how about setting the task size to the largest (or 
smallest) length among all fragments? Also, how important is the task size? Will 
it be used as the amount of memory requested when Tajo requests a YARN container?
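
To make the largest (or smallest) fragment length option concrete, here is a 
minimal sketch. It assumes the fragment lengths are already available as plain 
byte counts; {{TaskSizeSketch}} and its methods are hypothetical names for 
illustration only, not existing Tajo classes or APIs.

```java
// Hypothetical helper, not part of Tajo: reduces per-fragment lengths (bytes)
// to a single representative value that could be stored as the task size.
class TaskSizeSketch {

    /** Returns the largest fragment length, or 0 if there are no fragments. */
    static long maxFragmentLength(long[] fragmentLengths) {
        long max = 0L;
        for (long len : fragmentLengths) {
            max = Math.max(max, len);
        }
        return max;
    }

    /** Returns the smallest fragment length, or 0 if there are no fragments. */
    static long minFragmentLength(long[] fragmentLengths) {
        if (fragmentLengths.length == 0) {
            return 0L;
        }
        long min = Long.MAX_VALUE;
        for (long len : fragmentLengths) {
            min = Math.min(min, len);
        }
        return min;
    }

    public static void main(String[] args) {
        // Three fragments of 64 MB, 128 MB and 32 MB (assumed example values).
        long[] lengths = {64L << 20, 128L << 20, 32L << 20};
        System.out.println("max = " + maxFragmentLength(lengths));
        System.out.println("min = " + minFragmentLength(lengths));
    }
}
```

Either value would fit into the single taskSize field, so no change to 
TaskSchedulerContext would be needed. Using the maximum would be the 
conservative choice if the value is ever used to size a resource request.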

Besides, I found that, in the case of DefaultTaskScheduler, the task size is not 
used anywhere after 
{{subQuery.schedulerContext.setTaskSize(fragments.size());}}. So this bug 
currently does not cause any problem. Do you plan to extend DefaultTaskScheduler 
to use the task size information in the future?

> Task scheduler gets incorrect task size
> ---------------------------------------
>
>                 Key: TAJO-986
>                 URL: https://issues.apache.org/jira/browse/TAJO-986
>             Project: Tajo
>          Issue Type: Bug
>            Reporter: Mai Hai Thanh
>            Assignee: Mai Hai Thanh
>         Attachments: TAJO-986.140812.patch.txt
>
>
> In the function {{scheduleFragmentsForLeafQuery}} in SubQuery.java, the 
> following two lines appear:
> {code}
> subQuery.schedulerContext.setTaskSize(fragments.size());
> ...
> subQuery.schedulerContext.setTaskSize(conf.getIntVar(ConfVars.TASK_DEFAULT_SIZE)
>  * 1024 * 1024);
> {code}
> It is very likely that one of them is not correct.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
