[ https://issues.apache.org/jira/browse/TAJO-292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838475#comment-13838475 ]
Jihoon Son commented on TAJO-292:
---------------------------------
I'm sorry that I misunderstood this issue.
The main purpose of this issue is to determine an appropriate number of partitions.
Because each task processes one partition, I made the suggestion above.
However, the task size should be handled elsewhere, at the point where each
task is created.
So, I agree with this implementation.
I'll review the remaining part of the patch.
> Too many intermediate partition files
> -------------------------------------
>
> Key: TAJO-292
> URL: https://issues.apache.org/jira/browse/TAJO-292
> Project: Tajo
> Issue Type: Bug
> Components: repartitioning
> Affects Versions: 0.2-incubating
> Reporter: Hyunsik Choi
> Assignee: Jinho Kim
> Priority: Critical
> Fix For: 0.8-incubating
>
> Attachments: TAJO-292.patch
>
>
> Unlike before, the number of partitions is currently determined by the
> volume size and the number of distinct keys. This can cause unnecessary
> overhead. We need to improve the partition number determiner so that it
> considers the number of cluster nodes.
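
As a rough illustration of the idea in the description, the following is a minimal sketch of a partition-count heuristic that starts from the volume size and the number of distinct keys and then caps the result by the number of cluster nodes. It is not the code in TAJO-292.patch. The class name, the method name, and the parameters bytesPerPartition and maxPartitionsPerNode are all hypothetical and chosen only for this example.

```java
public final class PartitionCountSketch {

    /**
     * Hypothetical heuristic, not Tajo's actual implementation.
     * Picks a partition count from the data volume and the number of
     * distinct keys, then caps it by the cluster size.
     */
    static int determinePartitionCount(long volumeBytes,
                                       long distinctKeys,
                                       int clusterNodes,
                                       long bytesPerPartition,
                                       int maxPartitionsPerNode) {
        if (clusterNodes < 1 || bytesPerPartition < 1 || maxPartitionsPerNode < 1) {
            throw new IllegalArgumentException("sizes and counts must be positive");
        }
        // Partitions needed so that each holds at most bytesPerPartition (ceiling division).
        long byVolume = Math.max(1L, (volumeBytes + bytesPerPartition - 1) / bytesPerPartition);
        // More partitions than distinct keys would leave some partitions empty.
        long byKeys = Math.max(1L, distinctKeys);
        // Upper bound derived from the cluster size (assumed cap per node).
        long byCluster = (long) clusterNodes * maxPartitionsPerNode;

        long result = Math.min(Math.min(byVolume, byKeys), byCluster);
        return (int) Math.min(result, (long) Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        // 10 GiB of data, one million keys, 4 nodes, 128 MiB per partition, 2 partitions per node.
        System.out.println(determinePartitionCount(10L * 1024L * mib, 1_000_000L, 4, 128L * mib, 2));
    }
}
```

With only the volume and key count, the example input would yield 80 partitions; the cluster-size cap reduces that to 8, which is the kind of reduction in intermediate files the issue asks for.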