[
https://issues.apache.org/jira/browse/TAJO-986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108897#comment-14108897
]
Mai Hai Thanh commented on TAJO-986:
------------------------------------
Hi [~jihoonson], thanks for your explanation. As you said, the task size should
be the length of a single fragment rather than the total fragment length.
However, a SubQuery contains only one TaskSchedulerContext object, which in
turn holds only one taskSize value. Do you think it is necessary to change this
so that a SubQuery or a TaskSchedulerContext object can hold multiple taskSize
values? How about setting the task size to the biggest (or smallest) length
among all fragments' lengths? Moreover, how important is the task size? Will it
be used as the amount of requested memory when Tajo requests a YARN container?
Besides, I found that, in the case of DefaultTaskScheduler, the task size is
not used anywhere after it is set by
{{subQuery.schedulerContext.setTaskSize(fragments.size());}}, so this bug does
not currently cause any problem. Do you plan to extend DefaultTaskScheduler to
use the task size information in the future?
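To make the option of using the biggest or smallest fragment length concrete, here is a minimal sketch. It assumes the fragment lengths are already available as a list of long values; the class {{FragmentTaskSize}} and its methods are hypothetical names used only for illustration and are not part of Tajo.

```java
import java.util.List;

// Hypothetical helper, not part of Tajo: reduces per-fragment lengths
// (in bytes) to a single value that could be stored as the task size.
final class FragmentTaskSize {
    private FragmentTaskSize() {}

    // Returns the length of the biggest fragment.
    static long largest(List<Long> fragmentLengths) {
        requireNonEmpty(fragmentLengths);
        long max = Long.MIN_VALUE;
        for (long length : fragmentLengths) {
            max = Math.max(max, length);
        }
        return max;
    }

    // Returns the length of the smallest fragment.
    static long smallest(List<Long> fragmentLengths) {
        requireNonEmpty(fragmentLengths);
        long min = Long.MAX_VALUE;
        for (long length : fragmentLengths) {
            min = Math.min(min, length);
        }
        return min;
    }

    // A task size is undefined when there are no fragments at all.
    private static void requireNonEmpty(List<Long> fragmentLengths) {
        if (fragmentLengths == null || fragmentLengths.isEmpty()) {
            throw new IllegalArgumentException("at least one fragment is required");
        }
    }
}
```

Either result could then be passed to a single {{setTaskSize}} call, which would keep one taskSize value per TaskSchedulerContext. Whether the biggest or the smallest length is the better choice depends on how the scheduler would use the value later.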
> Task scheduler gets incorrect task size
> ---------------------------------------
>
> Key: TAJO-986
> URL: https://issues.apache.org/jira/browse/TAJO-986
> Project: Tajo
> Issue Type: Bug
> Reporter: Mai Hai Thanh
> Assignee: Mai Hai Thanh
> Attachments: TAJO-986.140812.patch.txt
>
>
> In the function {{scheduleFragmentsForLeafQuery}} in SubQuery.java, the
> following two lines exist:
> {code}
> subQuery.schedulerContext.setTaskSize(fragments.size());
> ...
> subQuery.schedulerContext.setTaskSize(conf.getIntVar(ConfVars.TASK_DEFAULT_SIZE)
> * 1024 * 1024);
> {code}
> It is very likely that one of them is not correct.