[ https://issues.apache.org/jira/browse/TAJO-931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105098#comment-14105098 ]

Hudson commented on TAJO-931:
-----------------------------

SUCCESS: Integrated in Tajo-master-build #343 (See [https://builds.apache.org/job/Tajo-master-build/343/])
TAJO-931: Output file can be punctuated depending on the file size. (hyunsik) 
(hyunsik: rev a1711d16be579082fb57e5abb43ff1872d424451)
* tajo-core/src/main/java/org/apache/tajo/engine/planner/PhysicalPlannerImpl.java
* tajo-storage/src/main/java/org/apache/tajo/storage/thirdparty/parquet/ParquetWriter.java
* tajo-catalog/tajo-catalog-common/src/test/java/org/apache/tajo/catalog/TestKeyValueSet.java
* tajo-storage/src/main/java/org/apache/tajo/storage/FileAppender.java
* CHANGES
* tajo-storage/src/main/java/org/apache/tajo/storage/thirdparty/parquet/InternalParquetRecordWriter.java
* tajo-storage/src/main/java/org/apache/tajo/storage/Appender.java
* tajo-core/src/main/java/org/apache/tajo/engine/planner/logical/PersistentStoreNode.java
* tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/HashBasedColPartitionStoreExec.java
* tajo-storage/src/test/java/org/apache/tajo/storage/TestCompressionStorages.java
* tajo-client/src/main/java/org/apache/tajo/client/TajoGetConf.java
* tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/ColPartitionStoreExec.java
* tajo-storage/src/main/java/org/apache/tajo/storage/parquet/ParquetAppender.java
* tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/PhysicalPlanUtil.java
* tajo-common/src/main/java/org/apache/tajo/util/KeyValueSet.java
* tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/SortBasedColPartitionStoreExec.java
* tajo-common/src/main/java/org/apache/tajo/util/BitArray.java
* tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/StoreTableExec.java
* tajo-common/src/main/java/org/apache/tajo/OverridableConf.java
* tajo-storage/src/main/java/org/apache/tajo/storage/HashShuffleAppender.java
* tajo-core/src/test/java/org/apache/tajo/engine/planner/physical/TestPhysicalPlanner.java
* tajo-core/src/main/java/org/apache/tajo/master/session/Session.java


> Output file can be punctuated depending on the file size.
> ---------------------------------------------------------
>
>                 Key: TAJO-931
>                 URL: https://issues.apache.org/jira/browse/TAJO-931
>             Project: Tajo
>          Issue Type: Improvement
>          Components: physical operator
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>             Fix For: 0.9.0
>
>
> Some file formats (e.g., Parquet) are not splittable. If one such file is 
> very large, it spans multiple HDFS blocks but still has to be read as a 
> whole by a single task. This causes remote HDFS access and limits the 
> degree of parallelism, resulting in significant performance degradation.
> We can solve this problem if StoreTableExec or 
> {Col|SortBased}PartitionStoreExec can punctuate (split) the final output 
> into multiple files according to the written size.
> In addition, we need a session variable that determines the per-file size 
> of the final output files, so TAJO-928 blocks this issue.
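
For illustration only, the size-based punctuation described above could be sketched roughly as follows. This is not the code from the commit: the class RollingOutputSketch, its methods, and the file naming scheme are all hypothetical, and the sketch only counts bytes in memory instead of using Tajo's Appender or HDFS. The per-file limit stands in for the value that the session variable would supply.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of size-based output rolling; not Tajo's actual API. */
class RollingOutputSketch {
    /** One logical output file: its name and the number of bytes written so far. */
    static final class OutputFile {
        final String name;
        long bytes;

        OutputFile(String name) {
            this.name = name;
        }
    }

    final String baseName;
    final long maxBytesPerFile; // assumed to come from a session variable
    final List<OutputFile> files = new ArrayList<>();
    OutputFile current;

    RollingOutputSketch(String baseName, long maxBytesPerFile) {
        if (maxBytesPerFile <= 0) {
            throw new IllegalArgumentException("maxBytesPerFile must be positive");
        }
        this.baseName = baseName;
        this.maxBytesPerFile = maxBytesPerFile;
    }

    /** Accounts for one row, starting a new file once the current one has reached the limit. */
    void write(byte[] row) {
        // The check runs before the write, so a file may exceed the limit by at most one row.
        if (current == null || current.bytes >= maxBytesPerFile) {
            current = new OutputFile(String.format("%s-%03d", baseName, files.size()));
            files.add(current);
        }
        current.bytes += row.length;
    }

    List<OutputFile> files() {
        return files;
    }

    public static void main(String[] args) {
        RollingOutputSketch out = new RollingOutputSketch("part", 256);
        for (int i = 0; i < 10; i++) {
            out.write(new byte[100]); // ten rows of 100 bytes each
        }
        for (OutputFile f : out.files()) {
            System.out.println(f.name + " " + f.bytes);
        }
    }
}
```

The limit is checked at row boundaries because a row cannot be divided between two files; a real writer for a format such as Parquet would more likely compare the limit against the size the appender reports after flushing.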



--
This message was sent by Atlassian JIRA
(v6.2#6252)
