Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/10593#discussion_r49821548
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/UnsafeRowParquetRecordReader.java
---
@@ -103,6 +105,25 @@
private static final int DEFAULT_VAR_LEN_SIZE = 32;
/**
+ * columnarBatch object that is used for batch decoding. This is created on first use and
+ * triggers batched decoding. It is not valid to interleave calls to the batched interface
+ * with the row-by-row RecordReader APIs.
+ * This is only enabled with additional flags for development. It is still a work in
+ * progress, and currently unsupported cases will fail with potentially hard-to-diagnose
+ * errors. It should only be turned on for development work on this feature.
+ *
+ * TODOs:
+ *  - Implement all the encodings to support vectorized decoding.
+ *  - Implement the v2 page formats (just make sure we create the correct decoders).
+ */
+ private ColumnarBatch columnarBatch;
+
+ /**
+ * The default config on whether columnarBatch should be off-heap.
+ */
+ private static final boolean DEFAULT_OFFHEAP = false;
--- End diff --
this naming is fairly confusing.
---
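For context, the contract the comment describes ("not valid to interleave calls to the batched interface with the row-by-row RecordReader APIs") can be sketched as follows. This is a hypothetical, self-contained illustration of that pattern, not Spark's actual reader: the class name, methods, and the latched `batchedMode` flag are all assumptions made for the example.

```java
// Hypothetical sketch of a reader with both a row-by-row and a batched
// interface, where the first batched call latches the reader into batched
// mode and later row-by-row calls are rejected (as the Javadoc describes).
final class SketchRecordReader {
    private boolean batchedMode = false; // latched on first nextBatch() call
    private int remaining = 3;           // pretend three rows are left

    /** Row-by-row API; invalid once batched decoding has started. */
    int nextRow() {
        if (batchedMode) {
            throw new IllegalStateException(
                "cannot interleave row-by-row calls with the batched interface");
        }
        return remaining-- > 0 ? 1 : -1; // 1 = got a row, -1 = exhausted
    }

    /** Batched API; the first call switches the reader into batched mode. */
    int nextBatch() {
        batchedMode = true;              // batch decoding enabled on first use
        int n = remaining;
        remaining = 0;
        return n;                        // rows decoded in this batch
    }
}
```

The point of latching the mode rather than supporting both concurrently is that the two code paths can keep independent internal cursors; mixing them mid-stream would corrupt or skip rows.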