[ https://issues.apache.org/jira/browse/TAJO-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100390#comment-14100390 ]

ASF GitHub Bot commented on TAJO-992:
-------------------------------------

Github user babokim commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/115#discussion_r16340021
  
    --- Diff: tajo-yarn-pullserver/src/main/java/org/apache/tajo/pullserver/TajoPullServerService.java ---
    @@ -207,10 +208,13 @@ public void init(Configuration conf) {
           selector = RpcChannelFactory.createServerChannelFactory("PullServerAuxService", workerNum);
     
           localFS = new LocalFileSystem();
    -      super.init(new Configuration(conf));
    +      //super.init(new Configuration(conf));
    --- End diff --
    
    Yes


> Reduce number of hash shuffle output file.
> ------------------------------------------
>
>                 Key: TAJO-992
>                 URL: https://issues.apache.org/jira/browse/TAJO-992
>             Project: Tajo
>          Issue Type: Sub-task
>          Components: data shuffle
>            Reporter: Hyoungjun Kim
>            Assignee: Hyoungjun Kim
>
> Currently Tajo creates too many intermediate files in the case of hash 
> shuffle. An execution block (SubQuery) on a TajoWorker creates intermediate 
> files according to the following rule:
>   # intermediate files in a worker = # tasks / # workers * # partitions 
> This may cause a 'too many open files' error and makes it difficult to scale 
> out. To solve this problem, we should reduce the number of hash shuffle 
> output files.
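
To give a sense of scale, the rule quoted above can be worked through as a
small sketch. The class and method names below are hypothetical and are not
part of Tajo; rounding the tasks-per-worker quotient up is also an assumption,
since the issue text does not say how uneven task counts are handled.

```java
public class ShuffleFileEstimate {

    /**
     * Hypothetical helper (not a Tajo API): applies the rule from the issue,
     *   # intermediate files in a worker = # tasks / # workers * # partitions
     */
    public static long intermediateFilesPerWorker(long tasks, long workers, long partitions) {
        if (tasks < 0 || workers <= 0 || partitions < 0) {
            throw new IllegalArgumentException("tasks and partitions must be >= 0, workers must be > 0");
        }
        // Assumption: tasks are spread evenly, and a partial share counts as one more task.
        long tasksPerWorker = (tasks + workers - 1) / workers;
        // Under the rule, each task on the worker writes one file per partition.
        return tasksPerWorker * partitions;
    }

    public static void main(String[] args) {
        // Example figures chosen for illustration only: 1000 tasks, 10 workers, 32 partitions.
        System.out.println(intermediateFilesPerWorker(1000, 10, 32));
    }
}
```

With these illustrative figures each worker would hold 100 * 32 = 3200
intermediate files. Whether that exceeds the open-file limit depends on how
the system is configured.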



--
This message was sent by Atlassian JIRA
(v6.2#6252)
