Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/19769#discussion_r152049670
--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java ---
@@ -298,7 +304,10 @@ private void decodeDictionaryIds(
          // TODO: Convert dictionary of Binaries to dictionary of Longs
          if (!column.isNullAt(i)) {
            Binary v = dictionary.decodeToBinary(dictionaryIds.getDictId(i));
-           column.putLong(i, ParquetRowConverter.binaryToSQLTimestamp(v));
+           long rawTime = ParquetRowConverter.binaryToSQLTimestamp(v);
+           long adjTime =
--- End diff ---
Oh, excellent point. We'd just need to store an additional `int -> long` map, but given that we've already got the dictionary, this seems reasonable.
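The idea could be sketched roughly as follows. This is a hypothetical, self-contained illustration of memoizing the decoded timestamp per dictionary id, not Spark's actual `VectorizedColumnReader` code; the class and the `decoder` parameter are invented for the example (the decoder stands in for the `Binary`-to-long conversion done by `ParquetRowConverter.binaryToSQLTimestamp`).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntToLongFunction;

// Hypothetical sketch: decode each distinct dictionary id once and cache
// the resulting long, instead of re-decoding the Binary for every row.
public class DictionaryTimestampCache {
    // The int -> long map discussed above, keyed by dictionary id.
    private final Map<Integer, Long> cache = new HashMap<>();

    // Stand-in for the Binary decode + timestamp conversion step.
    private final IntToLongFunction decoder;

    public DictionaryTimestampCache(IntToLongFunction decoder) {
        this.decoder = decoder;
    }

    public long decode(int dictId) {
        // computeIfAbsent invokes the decoder only on the first lookup
        // of each distinct dictionary id.
        return cache.computeIfAbsent(dictId, id -> decoder.applyAsLong(id));
    }
}
```

Since a dictionary-encoded column typically has far fewer distinct values than rows, each expensive conversion runs once per distinct value rather than once per row.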
---