Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/21372#discussion_r190042175
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/orc/OrcColumnVector.java
---
@@ -136,7 +136,7 @@ public int getInt(int rowId) {
public long getLong(int rowId) {
int index = getRowIndex(rowId);
if (isTimestamp) {
-      return timestampData.time[index] * 1000 + timestampData.nanos[index] / 1000;
+      return timestampData.time[index] * 1000 + timestampData.nanos[index] / 1000 % 1000;
--- End diff --
No, what I mean is, with ORC-306 and this fix, there is no external impact
outside Spark. More specifically, outside
`OrcColumnVector`/`OrcColumnarBatchReader`. In other words, ORC 1.4.4 cannot be
used with Apache Spark without this patch.
Java `Timestamp.getTime` and `Timestamp.getNanos` overlap by definition:
`getTime()` already includes the fractional milliseconds, while `getNanos()`
covers the entire fractional second. Previously, ORC didn't stick to that
definition.
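A minimal sketch of the overlap using plain `java.sql.Timestamp` (the class
name `TimestampOverlap` and the example values are illustrative, not from the
patch): because `getTime()` already carries the milliseconds and `getNanos()`
carries the whole fractional second, a microsecond conversion must keep only
the sub-millisecond part of the nanos, which is what the added `% 1000` does.

```java
import java.sql.Timestamp;

public class TimestampOverlap {
    public static void main(String[] args) {
        // 1.123456789 seconds after the epoch.
        Timestamp ts = new Timestamp(1000L); // 1 second
        ts.setNanos(123456789);              // fractional part: .123456789 s

        // The overlap: getTime() includes the 123 ms, and getNanos() also
        // covers those same 123 ms in its 123456789 ns.
        long millis = ts.getTime();          // 1123
        int nanos = ts.getNanos();           // 123456789

        // Without % 1000 the milliseconds are counted twice.
        long wrongMicros = millis * 1000 + nanos / 1000;   // 1246456

        // With the fix, only the sub-millisecond remainder is added.
        long micros = millis * 1000 + nanos / 1000 % 1000; // 1123456

        System.out.println(wrongMicros + " vs " + micros);
    }
}
```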
---