[ https://issues.apache.org/jira/browse/TAJO-9?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyunsik Choi resolved TAJO-9.
-----------------------------

       Resolution: Duplicate
    Fix Version/s: 0.8-incubating

> Change the default intermediate data file format for hash repartitioning
> ------------------------------------------------------------------------
>
>                 Key: TAJO-9
>                 URL: https://issues.apache.org/jira/browse/TAJO-9
>             Project: Tajo
>          Issue Type: Improvement
>          Components: data shuffle
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>             Fix For: 0.8-incubating
>
>
> For easier debugging, hash repartitioning has used CSV as the default
> intermediate data format. The CSV file format may incur parsing overhead, and
> it may cause relatively large intermediate data to be transmitted over the
> network. We need to change it to RawFile or another efficient file format.
> Digging into the PartitionedStoredExec class is a good starting point for
> this issue.
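
To make the size concern above concrete, here is a minimal sketch that encodes
the same rows once as CSV text and once as fixed-width binary. It does not use
any Tajo API: the class name, the two-column row layout (a long and a double),
and the 16-byte binary layout are assumptions made for illustration only and
are not the RawFile format. The size difference depends on the values; short
values can be smaller as text, while wide values take more bytes as text and
must also be parsed back from strings.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not Tajo's tuple or RawFile layout.
public class IntermediateFormatSketch {

    // Encode rows as CSV text: "id,score\n" per row.
    static byte[] encodeCsv(long[] ids, double[] scores) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < ids.length; i++) {
            sb.append(ids[i]).append(',').append(scores[i]).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Encode rows as fixed-width binary: 8 bytes for the long, 8 for the double.
    static byte[] encodeBinary(long[] ids, double[] scores) {
        ByteBuffer buf = ByteBuffer.allocate(ids.length * 16);
        for (int i = 0; i < ids.length; i++) {
            buf.putLong(ids[i]);
            buf.putDouble(scores[i]);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        // Wide values: many digits per field when written as text.
        long[] ids = {1234567890123456789L, 1234567890123456790L};
        double[] scores = {0.1234567890123, 0.9876543210987};

        System.out.println("CSV bytes:    " + encodeCsv(ids, scores).length);
        System.out.println("binary bytes: " + encodeBinary(ids, scores).length);
    }
}
```

The binary form also lets a reader fetch each field with a fixed-offset read
instead of scanning for delimiters and parsing digits, which is the parsing
overhead the description refers to.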



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
