[ https://issues.apache.org/jira/browse/TAJO-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100390#comment-14100390 ]
ASF GitHub Bot commented on TAJO-992:
-------------------------------------
Github user babokim commented on a diff in the pull request:
https://github.com/apache/tajo/pull/115#discussion_r16340021
--- Diff: tajo-yarn-pullserver/src/main/java/org/apache/tajo/pullserver/TajoPullServerService.java ---
@@ -207,10 +208,13 @@ public void init(Configuration conf) {
selector =
RpcChannelFactory.createServerChannelFactory("PullServerAuxService", workerNum);
localFS = new LocalFileSystem();
- super.init(new Configuration(conf));
+ //super.init(new Configuration(conf));
--- End diff --
Yes
> Reduce number of hash shuffle output file.
> ------------------------------------------
>
> Key: TAJO-992
> URL: https://issues.apache.org/jira/browse/TAJO-992
> Project: Tajo
> Issue Type: Sub-task
> Components: data shuffle
> Reporter: Hyoungjun Kim
> Assignee: Hyoungjun Kim
>
> Currently, Tajo creates too many intermediate files during hash shuffle.
> An execution block (SubQuery) on a TajoWorker creates intermediate files
> according to the following rule:
> # intermediate files in a worker = # tasks / # workers * # partitions
> This can cause a 'too many open files' error and makes it difficult to
> scale out. To solve this problem, we should reduce the number of hash
> shuffle output files.
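
As a rough illustration of the rule quoted above, the following sketch computes the per-worker file count for a sample workload. The class name, method name, and sample numbers are hypothetical and are not taken from the Tajo code base. Rounding tasks per worker up is also an assumption, since the issue does not say how uneven task counts are handled.

```java
// Minimal sketch of the rule from the issue description:
//   # intermediate files in a worker = # tasks / # workers * # partitions
// All names here are hypothetical; this is not Tajo code.
class ShuffleFileEstimate {

    // Estimates how many intermediate files one worker creates.
    // Assumption: tasks are spread evenly, and tasks per worker is rounded up.
    static long intermediateFilesPerWorker(long tasks, long workers, long partitions) {
        if (tasks < 0 || workers <= 0 || partitions < 0) {
            throw new IllegalArgumentException("tasks and partitions must be >= 0, workers > 0");
        }
        long tasksPerWorker = (tasks + workers - 1) / workers; // ceiling division
        return tasksPerWorker * partitions;
    }

    public static void main(String[] args) {
        // Sample values chosen only for illustration.
        long files = intermediateFilesPerWorker(1000, 10, 32);
        System.out.println("intermediate files per worker: " + files);
    }
}
```

With these sample values each worker runs 100 tasks and writes 100 * 32 = 3200 files, which shows how quickly the count can approach a typical per-process open-file limit.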