GitHub user jiangxb1987 opened a pull request:
https://github.com/apache/spark/pull/20361
[SPARK-23188] [SQL] Make vectorized columnar reader batch size configurable
## What changes were proposed in this pull request?
This PR includes the following changes:
- Make the capacity of `VectorizedParquetRecordReader` configurable;
- Make the capacity of `OrcColumnarBatchReader` configurable;
- Update the error message raised when the required capacity of a writable
columnar vector cannot be fulfilled.
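
As a rough illustration of the changes listed above, the following self-contained sketch shows one way a reader's batch capacity could be read from configuration, and how a capacity failure could produce an error message that points back at that setting. It is not the code in this PR: the class `BatchCapacitySketch`, the key `example.reader.batchSize`, the default of 4096, and the `maxCapacity` parameter are all hypothetical names and values chosen for the example.

```java
import java.util.Map;

/** Hypothetical sketch; not Spark code and not the implementation in this PR. */
final class BatchCapacitySketch {
    // Assumed config key and default, for illustration only.
    static final String CAPACITY_KEY = "example.reader.batchSize";
    static final int DEFAULT_CAPACITY = 4096;

    /** Reads the batch capacity from a config map, falling back to the default. */
    static int capacityFrom(Map<String, String> conf) {
        String raw = conf.get(CAPACITY_KEY);
        int capacity = (raw == null) ? DEFAULT_CAPACITY : Integer.parseInt(raw.trim());
        if (capacity <= 0) {
            throw new IllegalArgumentException(
                CAPACITY_KEY + " must be positive, got " + capacity);
        }
        return capacity;
    }

    /**
     * Returns the capacity a vector should have after a reserve request.
     * The vector keeps its size if it is already large enough; otherwise it
     * grows to twice the requested size, capped at maxCapacity.
     */
    static int reserve(int currentCapacity, int requiredCapacity, int maxCapacity) {
        if (requiredCapacity <= currentCapacity) {
            return currentCapacity;
        }
        if (requiredCapacity > maxCapacity) {
            // The message names the setting the user can change.
            throw new IllegalStateException(
                "Cannot reserve capacity for " + requiredCapacity
                + " rows (maximum is " + maxCapacity + "). Consider lowering "
                + CAPACITY_KEY + ".");
        }
        return (int) Math.min((long) maxCapacity, requiredCapacity * 2L);
    }
}
```

In this sketch a smaller configured batch size lowers the capacity each vector must reserve, which is why the error message suggests reducing it.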
## How was this patch tested?
N/A
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jiangxb1987/spark vectorCapacity
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20361.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20361
----
commit 927c6b4d16b5a4c6457a190f3c1b2b8a5e439f2a
Author: Xingbo Jiang <xingbo.jiang@...>
Date: 2018-01-23T08:14:33Z
make vector batch size configurable.
----