Github user henryr commented on a diff in the pull request:
https://github.com/apache/spark/pull/19769#discussion_r151830543
--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java ---
@@ -298,7 +304,10 @@ private void decodeDictionaryIds(
         // TODO: Convert dictionary of Binaries to dictionary of Longs
         if (!column.isNullAt(i)) {
           Binary v = dictionary.decodeToBinary(dictionaryIds.getDictId(i));
-          column.putLong(i, ParquetRowConverter.binaryToSQLTimestamp(v));
+          long rawTime = ParquetRowConverter.binaryToSQLTimestamp(v);
+          long adjTime =
--- End diff ---
Is it practical to consider caching the decoded and converted dictionary
values? Since many rows can share the same dictionary id, the Binary decode
and timestamp conversion here are repeated for every row even though there
are only a few distinct dictionary entries (see the sketch below).
---