Github user omalley commented on the issue:
https://github.com/apache/spark/pull/13257
Ok, I see the problem. Hive's OrcInputFormat has that property, because it
was getting the schema from the ObjectInspector, which only came with the
values. When I get a chance, let me look at
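For context on the point above: the ORC schema is stored in the file footer, so it can be recovered without going through an ObjectInspector at all. A minimal sketch using the orc-core API (assumes orc-core and hadoop-common on the classpath; the path and field names are illustrative, not from the PR):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.orc.OrcFile;
import org.apache.orc.Reader;
import org.apache.orc.TypeDescription;
import org.apache.orc.Writer;

public class SchemaDemo {
    static String footerSchema() throws Exception {
        Configuration conf = new Configuration();
        Path path = new Path(System.getProperty("java.io.tmpdir"), "schema-demo.orc");
        path.getFileSystem(conf).delete(path, false);

        // Write an empty file; the declared schema is still recorded in the footer.
        TypeDescription schema = TypeDescription.fromString("struct<id:bigint,name:string>");
        Writer writer = OrcFile.createWriter(path, OrcFile.writerOptions(conf).setSchema(schema));
        writer.close();

        // Read the schema straight from the footer -- no values, no ObjectInspector.
        Reader reader = OrcFile.createReader(path, OrcFile.readerOptions(conf));
        return reader.getSchema().toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(footerSchema());
    }
}
```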
Github user omalley commented on the issue:
https://github.com/apache/spark/pull/20511
Sorry, I forgot to transition the JIRA issues for the ORC 1.4.3 release, so they
didn't show up in the search used to build the release notes.
The list of jiras closed by the 1.4.3 release is: https://s.apach
Github user omalley commented on a diff in the pull request:
https://github.com/apache/spark/pull/20511#discussion_r167950837
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala
---
@@ -160,6 +160,16 @@ abstract class OrcSuite
Github user omalley commented on the issue:
https://github.com/apache/spark/pull/20511
I'm frustrated with the direction this has gone.
The new reader is much better than the old reader, which uses Hive 1.2. ORC
1.4.3 had a pair of important, but not large or complex
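For readers following along (not part of the comment above): in Spark 2.3 the choice between the new native ORC reader and the old Hive 1.2 based one is controlled by the `spark.sql.orc.impl` configuration:

```sql
-- Select the native ORC reader; "hive" is the legacy Hive 1.2 path.
SET spark.sql.orc.impl=native;
```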
Github user omalley commented on the issue:
https://github.com/apache/spark/pull/18640
@rxin The ORC core library's dependency tree is aggressively kept as small
as possible. I've gone through and excluded unnecessary jars from our
dependencies. I also kick back pull req
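An illustration of the kind of exclusion being described, in Maven's standard form (the excluded artifact here is only an example, not the actual list from the PR):

```xml
<dependency>
  <groupId>org.apache.orc</groupId>
  <artifactId>orc-core</artifactId>
  <exclusions>
    <!-- Illustrative only: drop a transitive jar the build does not need. -->
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```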
Github user omalley commented on the issue:
https://github.com/apache/spark/pull/18640
I would also comment that in the long term, Spark should move to using the
vectorized reader in ORC's core. That would remove the dependence on ORC's
mapreduce module, which provides
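The vectorized read path in ORC's core that the comment refers to can be sketched as follows. This is a hedged example, not Spark's implementation: it assumes orc-core, hive-storage-api, and hadoop-common on the classpath, and the file path and column are illustrative. Note that it uses only `org.apache.orc` classes, with no dependency on the orc-mapreduce module:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
import org.apache.orc.OrcFile;
import org.apache.orc.Reader;
import org.apache.orc.RecordReader;
import org.apache.orc.TypeDescription;
import org.apache.orc.Writer;

public class VectorizedDemo {
    static long sumColumn() throws Exception {
        Configuration conf = new Configuration();
        Path path = new Path(System.getProperty("java.io.tmpdir"), "vector-demo.orc");
        path.getFileSystem(conf).delete(path, false);

        // Write three bigint rows: 0, 10, 20.
        TypeDescription schema = TypeDescription.fromString("struct<x:bigint>");
        Writer writer = OrcFile.createWriter(path, OrcFile.writerOptions(conf).setSchema(schema));
        VectorizedRowBatch out = schema.createRowBatch();
        LongColumnVector xs = (LongColumnVector) out.cols[0];
        for (int i = 0; i < 3; i++) {
            xs.vector[i] = i * 10L;
        }
        out.size = 3;
        writer.addRowBatch(out);
        writer.close();

        // Read back batch-by-batch with the core RecordReader.
        Reader reader = OrcFile.createReader(path, OrcFile.readerOptions(conf));
        RecordReader rows = reader.rows();
        VectorizedRowBatch batch = reader.getSchema().createRowBatch();
        long sum = 0;
        while (rows.nextBatch(batch)) {
            LongColumnVector col = (LongColumnVector) batch.cols[0];
            for (int r = 0; r < batch.size; r++) {
                sum += col.vector[r];
            }
        }
        rows.close();
        return sum;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumColumn());
    }
}
```

Processing a whole `VectorizedRowBatch` per call, rather than one row at a time, is what makes this path attractive compared with the row-oriented mapreduce wrappers.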
Github user omalley commented on a diff in the pull request:
https://github.com/apache/spark/pull/18640#discussion_r133248648
--- Diff: sql/core/pom.xml ---
@@ -87,6 +87,16 @@
+ org.apache.orc
+ orc-core
+ ${orc.classifier