[ 
https://issues.apache.org/jira/browse/TAJO-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852967#comment-13852967
 ] 

Hyunsik Choi commented on TAJO-9:
---------------------------------

This issue is a duplicate of TAJO-435.

> Change the default intermediate data file format for hash repartitioning
> ------------------------------------------------------------------------
>
>                 Key: TAJO-9
>                 URL: https://issues.apache.org/jira/browse/TAJO-9
>             Project: Tajo
>          Issue Type: Improvement
>          Components: data shuffle
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>             Fix For: 0.8-incubating
>
>
> For easy debugging, hash repartitioning has used CSV as the default
> intermediate data format. The CSV file format may cause parsing overhead, and
> it may cause relatively large intermediate data to be transmitted over the
> network. We need to change it to RawFile or another efficient file format.
> Digging into the PartitionedStoredExec class is a good starting point for
> this issue.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
