Repository: spark
Updated Branches:
  refs/heads/master ad853c567 -> f6255d7b7
[MINOR][SQL] Add disable bucketedRead workaround when throw RuntimeException

## What changes were proposed in this pull request?

Reading from a bucketed table with large bucket files (about 1.7 GB per bucket file) can throw a `RuntimeException` when the vectorized reader cannot reserve enough contiguous bytes. (Screenshots comparing the default behavior, with bucketed reads enabled, against bucketed reads disabled were attached to the PR.)

The root cause is that each bucket file is too big. A workaround is to disable bucketed reads, and this PR adds that workaround to the error message.

## How was this patch tested?

Manual tests.

Closes #23014 from wangyum/anotherWorkaround.

Authored-by: Yuming Wang <[email protected]>
Signed-off-by: hyukjinkwon <[email protected]>

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f6255d7b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f6255d7b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f6255d7b

Branch: refs/heads/master
Commit: f6255d7b7cc4cc5d1f4fe0e5e493a1efee22f38f
Parents: ad853c5
Author: Yuming Wang <[email protected]>
Authored: Thu Nov 15 08:33:06 2018 +0800
Committer: hyukjinkwon <[email protected]>
Committed: Thu Nov 15 08:33:06 2018 +0800

----------------------------------------------------------------------
 .../spark/sql/execution/vectorized/WritableColumnVector.java | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/f6255d7b/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java
----------------------------------------------------------------------
diff --git a/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java b/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java
index b0e119d..4f5e72c 100644
--- a/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java
+++ b/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java
@@ 
-101,10 +101,11 @@ public abstract class WritableColumnVector extends ColumnVector {
     String message = "Cannot reserve additional contiguous bytes in the vectorized reader (" +
         (requiredCapacity >= 0 ? "requested " + requiredCapacity + " bytes" : "integer overflow") +
         "). As a workaround, you can reduce the vectorized reader batch size, or disable the " +
-        "vectorized reader. For parquet file format, refer to " +
+        "vectorized reader, or disable " + SQLConf.BUCKETING_ENABLED().key() + " if you read " +
+        "from bucket table. For Parquet file format, refer to " +
         SQLConf.PARQUET_VECTORIZED_READER_BATCH_SIZE().key() + " (default " +
         SQLConf.PARQUET_VECTORIZED_READER_BATCH_SIZE().defaultValueString() +
-        ") and " + SQLConf.PARQUET_VECTORIZED_READER_ENABLED().key() + "; for orc file format, " +
+        ") and " + SQLConf.PARQUET_VECTORIZED_READER_ENABLED().key() + "; for ORC file format, " +
         "refer to " + SQLConf.ORC_VECTORIZED_READER_BATCH_SIZE().key() + " (default " +
         SQLConf.ORC_VECTORIZED_READER_BATCH_SIZE().defaultValueString() + ") and " +
         SQLConf.ORC_VECTORIZED_READER_ENABLED().key() + ".";
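For reference, the workarounds listed in the improved error message can be applied from Spark SQL directly. A minimal sketch, assuming Spark 2.4-era config keys (these are the keys behind `SQLConf.BUCKETING_ENABLED`, `SQLConf.PARQUET_VECTORIZED_READER_BATCH_SIZE`, and `SQLConf.PARQUET_VECTORIZED_READER_ENABLED`; the table name is hypothetical):

```sql
-- Option 1: disable bucketed reads (the workaround this PR adds to the message).
SET spark.sql.sources.bucketing.enabled=false;

-- Option 2: shrink the vectorized reader batch size (Parquet shown; default 4096).
SET spark.sql.parquet.columnarReaderBatchSize=1024;

-- Option 3: disable the vectorized Parquet reader entirely.
SET spark.sql.parquet.enableVectorizedReader=false;

SELECT * FROM bucketed_table;  -- hypothetical bucketed table
```

For ORC sources the analogous keys are `spark.sql.orc.columnarReaderBatchSize` and `spark.sql.orc.enableVectorizedReader`, as the error message notes. Any one of the three options avoids the oversized contiguous allocation; disabling bucketing also gives up the shuffle-avoidance benefit of bucketed joins, so the batch-size knob is usually the gentler first try.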
