[ https://issues.apache.org/jira/browse/SPARK-24322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan resolved SPARK-24322.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 2.3.1
                   2.4.0

Issue resolved by pull request 21372
[https://github.com/apache/spark/pull/21372]

> Upgrade Apache ORC to 1.4.4
> ---------------------------
>
>                 Key: SPARK-24322
>                 URL: https://issues.apache.org/jira/browse/SPARK-24322
>             Project: Spark
>          Issue Type: Bug
>          Components: Build
>    Affects Versions: 2.4.0
>            Reporter: Dongjoon Hyun
>            Assignee: Dongjoon Hyun
>            Priority: Major
>              Labels: correctness
>             Fix For: 2.4.0, 2.3.1
>
>
> ORC 1.4.4 includes [nine fixes|https://issues.apache.org/jira/issues/?filter=12342568&jql=project%20%3D%20ORC%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%201.4.4].
> One of them is a `Timestamp` bug (ORC-306) that occurs when the `native` ORC
> vectorized reader reads the `time` and `nanos` sub-vectors of an ORC column
> vector. ORC-306 fixes this according to the [original
> definition|https://github.com/apache/hive/blob/master/storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/TimestampColumnVector.java#L45-L46],
> and the linked PR includes the updated interpretation of ORC column vectors.
> Note that the `hive` ORC reader and the ORC MR reader are not affected.
> {code}
> scala> spark.version
> res0: String = 2.3.0
> scala> spark.sql("set spark.sql.orc.impl=native")
> scala> Seq(java.sql.Timestamp.valueOf("1900-05-05 12:34:56.000789")).toDF().write.orc("/tmp/orc")
> scala> spark.read.orc("/tmp/orc").show(false)
> +--------------------------+
> |value                     |
> +--------------------------+
> |1900-05-05 12:34:55.000789|
> +--------------------------+
> {code}
> This issue aims to update Apache Spark to use ORC 1.4.4.
>
> *FULL LIST*
> || ID || TITLE ||
> | ORC-281 | Fix compiler warnings from clang 5.0 |
> | ORC-301 | `extractFileTail` should open a file in `try` statement |
> | ORC-304 | Fix TestRecordReaderImpl to not fail with new storage-api |
> | ORC-306 | Fix incorrect workaround for bug in java.sql.Timestamp |
> | ORC-324 | Add support for ARM and PPC arch |
> | ORC-330 | Remove unnecessary Hive artifacts from root pom |
> | ORC-332 | Add syntax version to orc_proto.proto |
> | ORC-336 | Remove avro and parquet dependency management entries |
> | ORC-360 | Implement error checking on subtype fields in Java |
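
The description of ORC-306 above turns on how the two timestamp sub-vectors are interpreted. The following is a minimal sketch, not the ORC-306 patch and not Spark's reader code: it assumes the two fields hold the values of `java.sql.Timestamp.getTime()` and `getNanos()` and shows one way to combine them into microseconds since the epoch. The object name `TimestampMicros` and its methods are hypothetical, introduced only for illustration.

```scala
import java.sql.Timestamp

// Hypothetical helper, for illustration only.
object TimestampMicros {

  /** Microseconds since the epoch from a (getTime, getNanos) pair.
    *
    * getTime() is milliseconds since the epoch and already contains the
    * millisecond part of the fraction; getNanos() is the whole fractional
    * second in nanoseconds. Taking only whole seconds from `time` avoids
    * counting the milliseconds twice.
    */
  def toMicros(time: Long, nanos: Int): Long = {
    // floorDiv rounds toward negative infinity, which matters for
    // timestamps before 1970 (negative epoch values).
    val seconds = Math.floorDiv(time, 1000L)
    seconds * 1000000L + nanos / 1000
  }

  /** Convenience wrapper that reads both fields from a Timestamp. */
  def fromTimestamp(ts: Timestamp): Long = toMicros(ts.getTime, ts.getNanos)
}
```

For a pre-1970 value such as `val ts = new Timestamp(-1000L); ts.setNanos(789000)` (one second before the epoch plus a .000789 fraction), `TimestampMicros.fromTimestamp(ts)` yields -999211. The timestamp is built from epoch milliseconds rather than `Timestamp.valueOf` so the result does not depend on the JVM's default time zone.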