[ https://issues.apache.org/jira/browse/TAJO-931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099728#comment-14099728 ]
ASF GitHub Bot commented on TAJO-931:
-------------------------------------
GitHub user hyunsik opened a pull request:
https://github.com/apache/tajo/pull/119
TAJO-931: Output file can be punctuated depending on the file size.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/hyunsik/tajo TAJO-931
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/tajo/pull/119.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #119
----
commit a3b78642abb6c160b147eae2f29a10e362c14cac
Author: Hyunsik Choi <[email protected]>
Date: 2014-07-08T08:47:42Z
Improve session variables to affect the query config.
commit 0a0035d9b259a1a05ba790b7a778a745251d27bd
Author: Hyunsik Choi <[email protected]>
Date: 2014-07-08T12:54:32Z
Fixed.
commit 3fb54a6dde89d2d8e972253c1eccd17f334180d4
Author: Hyunsik Choi <[email protected]>
Date: 2014-07-09T02:23:28Z
Completed output file rotating.
commit 8028f5f876af2050bb602e277026e76ca802619a
Author: Hyunsik Choi <[email protected]>
Date: 2014-07-15T03:57:29Z
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into
OUTPUT_ROTATING
Conflicts:
tajo-core/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java
tajo-core/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java
tajo-core/src/main/java/org/apache/tajo/master/querymaster/SubQuery.java
commit 50f6af418b42704ba14a4c7a084372f80c7ce1ec
Author: Hyunsik Choi <[email protected]>
Date: 2014-07-15T06:25:09Z
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into
OUTPUT_ROTATING
commit 4d0abc0dfbf6c5898bce6bd0e1ecd4c995108571
Author: Hyunsik Choi <[email protected]>
Date: 2014-07-15T11:13:55Z
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into
OUTPUT_ROTATING
Conflicts:
tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/HashBasedColPartitionStoreExec.java
tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/SortBasedColPartitionStoreExec.java
commit dd79f666d81875bf6a547478b76fc55b60f37d09
Author: Hyunsik Choi <[email protected]>
Date: 2014-07-15T12:31:11Z
Added estimatedwrittensize.
commit da231ca89e5cf3638ea16faad281f8296854a9dd
Author: Hyunsik Choi <[email protected]>
Date: 2014-07-17T03:03:37Z
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into
OUTPUT_ROTATING
commit c006382a3b16973872d753c9a0e0150da1c0f687
Author: Hyunsik Choi <[email protected]>
Date: 2014-07-17T03:10:20Z
Reflect session variables to GlobalPlanner, Repartitioner, and
PhysicalPlannerImpl.
commit 681aa25916f8de8a45f2b953215de76b023393a0
Author: Hyunsik Choi <[email protected]>
Date: 2014-08-11T07:56:37Z
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into
OUTPUT_ROTATING
Conflicts:
tajo-core/src/main/java/org/apache/tajo/engine/planner/PhysicalPlannerImpl.java
tajo-core/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java
tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/SortBasedColPartitionStoreExec.java
tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/StoreTableExec.java
tajo-core/src/main/java/org/apache/tajo/engine/query/QueryContext.java
tajo-core/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java
tajo-core/src/main/java/org/apache/tajo/worker/TaskAttemptContext.java
tajo-storage/src/main/java/org/apache/tajo/storage/Appender.java
commit b7a73bb22df1198010e2b18f3e67aaeeec30f52f
Author: Hyunsik Choi <[email protected]>
Date: 2014-08-11T08:33:51Z
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into
OUTPUT_ROTATING
Conflicts:
tajo-core/src/main/java/org/apache/tajo/engine/planner/PhysicalPlannerImpl.java
tajo-core/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java
tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/SortBasedColPartitionStoreExec.java
tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/StoreTableExec.java
tajo-core/src/main/java/org/apache/tajo/engine/query/QueryContext.java
tajo-core/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java
tajo-core/src/main/java/org/apache/tajo/worker/TaskAttemptContext.java
tajo-storage/src/main/java/org/apache/tajo/storage/Appender.java
commit 803fb6a677b6831faf5e602bf77961b31128b7cd
Author: Hyunsik Choi <[email protected]>
Date: 2014-08-15T14:20:55Z
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into
TAJO-931
commit 8a6f782bb3c423ee7ef17f8522fd9206803da0cb
Author: Hyunsik Choi <[email protected]>
Date: 2014-08-16T18:05:20Z
TAJO-931: Output file can be punctuated depending on the file size.
----
> Output file can be punctuated depending on the file size.
> ---------------------------------------------------------
>
> Key: TAJO-931
> URL: https://issues.apache.org/jira/browse/TAJO-931
> Project: Tajo
> Issue Type: Improvement
> Components: physical operator
> Reporter: Hyunsik Choi
> Assignee: Hyunsik Choi
> Fix For: 0.9.0
>
>
> Some file formats (e.g., Parquet) are not splittable. A very large file in
> such a format usually spans multiple HDFS blocks. This causes remote HDFS
> access and limits the degree of parallelism, resulting in significant
> performance degradation.
> We can solve this problem if StoreTableExec or
> {Col|SortBased}PartitionStoreExec punctuates (splits) the final output file
> according to the written size.
> In addition, we need a session variable that determines the per-file
> size of the final output files. So, TAJO-928 blocks this issue.
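
The size-based punctuation described in the issue can be illustrated with a minimal sketch. This is not the code from the pull request and does not use Tajo's Appender or StoreTableExec API; the class RotatingFileWriter, its methods, and the file naming scheme are hypothetical, and the maxBytesPerFile parameter merely stands in for the per-file size that the proposed session variable would supply. The writer starts a new output file whenever the next record would push the current file past the limit, so no record is ever split across files.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of size-based output rotation; not Tajo's actual API. */
public class RotatingFileWriter implements AutoCloseable {
  private final Path dir;
  private final String prefix;
  private final long maxBytesPerFile; // assumption: would come from a session variable
  private final List<Path> files = new ArrayList<>();
  private OutputStream out;
  private long writtenInCurrent;

  public RotatingFileWriter(Path dir, String prefix, long maxBytesPerFile) {
    if (maxBytesPerFile <= 0) {
      throw new IllegalArgumentException("maxBytesPerFile must be positive");
    }
    this.dir = dir;
    this.prefix = prefix;
    this.maxBytesPerFile = maxBytesPerFile;
  }

  /** Writes one record, opening a new file first if the record would not fit. */
  public void write(byte[] record) throws IOException {
    boolean wouldOverflow =
        writtenInCurrent > 0 && writtenInCurrent + record.length > maxBytesPerFile;
    if (out == null || wouldOverflow) {
      rotate();
    }
    // A record larger than the limit still goes into a file of its own.
    out.write(record);
    writtenInCurrent += record.length;
  }

  /** Closes the current file (if any) and opens the next numbered file. */
  private void rotate() throws IOException {
    if (out != null) {
      out.close();
    }
    Path next = dir.resolve(String.format("%s-%05d", prefix, files.size()));
    out = Files.newOutputStream(next);
    files.add(next);
    writtenInCurrent = 0;
  }

  /** Returns the files created so far, in creation order. */
  public List<Path> files() {
    return files;
  }

  @Override
  public void close() throws IOException {
    if (out != null) {
      out.close();
      out = null;
    }
  }
}
```

With a limit close to the HDFS block size, each output file of a non-splittable format would fit in roughly one block, which is the effect the issue is after. A real implementation would track the size reported by the storage appender rather than raw record lengths.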