Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/21372#discussion_r190042175
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/orc/OrcColumnVector.java
---
@@ -136,7 +136,7 @@ public int getInt(int rowId) {
public long getLong(int rowId) {
int index = getRowIndex(rowId);
if (isTimestamp) {
-      return timestampData.time[index] * 1000 + timestampData.nanos[index] / 1000;
+      return timestampData.time[index] * 1000 + timestampData.nanos[index] / 1000 % 1000;
--- End diff --
No, what I mean is, with ORC-306 and this fix, there is no external impact
outside Spark. More specifically, outside
`OrcColumnVector`/`OrcColumnarBatchReader`. In other words, ORC 1.4.4 cannot be
used with Apache Spark without this patch.
Java `Timestamp.getTime` and `Timestamp.getNanos` overlap by definition:
`getTime()` already includes the fractional milliseconds, while `getNanos()`
covers the entire fractional second. Previously, ORC didn't stick to that
definition.
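A minimal sketch of the overlap using plain `java.sql.Timestamp` (the class
name `TimestampOverlap` and the example values are illustrative, not from the
patch): because `getTime()` already carries the milliseconds and `getNanos()`
carries the whole fractional second, a microsecond conversion must keep only
the sub-millisecond part of the nanos, which is what the added `% 1000` does.

```java
import java.sql.Timestamp;

public class TimestampOverlap {
    public static void main(String[] args) {
        // 1.123456789 seconds after the epoch.
        Timestamp ts = new Timestamp(1000L); // 1 second
        ts.setNanos(123456789);              // fractional part: .123456789 s

        // The overlap: getTime() includes the 123 ms, and getNanos() also
        // covers those same 123 ms in its 123456789 ns.
        long millis = ts.getTime();          // 1123
        int nanos = ts.getNanos();           // 123456789

        // Without % 1000 the milliseconds are counted twice.
        long wrongMicros = millis * 1000 + nanos / 1000;   // 1246456

        // With the fix, only the sub-millisecond remainder is added.
        long micros = millis * 1000 + nanos / 1000 % 1000; // 1123456

        System.out.println(wrongMicros + " vs " + micros);
    }
}
```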
---