[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user henrify commented on the issue: https://github.com/apache/spark/pull/19943 Great job guys! Also, check through the spam of your public github email address for a small gift @dongjoon-hyun @cloud-fan @viirya @kiszk @HyukjinKwon @mmccline --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19943 Thank you so much, @cloud-fan , @mmccline , @viirya , @henrify , @kiszk , @HyukjinKwon ! I'll proceed to follow-ups.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19943 Thanks, merging to master/2.3! Let's address the comments in a follow-up. BTW @dongjoon-hyun let's keep our discussion on https://github.com/apache/spark/pull/19943#discussion_r160326383
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85845/ Test PASSed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Merged build finished. Test PASSed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85845 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85845/testReport)** for PR 19943 at commit [`2cf98b6`](https://github.com/apache/spark/commit/2cf98b6734c806f66e21df50520a465b03d9f060).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85845 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85845/testReport)** for PR 19943 at commit [`2cf98b6`](https://github.com/apache/spark/commit/2cf98b6734c806f66e21df50520a465b03d9f060).
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19943 retest this please
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85837 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85837/testReport)** for PR 19943 at commit [`2cf98b6`](https://github.com/apache/spark/commit/2cf98b6734c806f66e21df50520a465b03d9f060).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Merged build finished. Test FAILed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85837/ Test FAILed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Merged build finished. Test PASSed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85829/ Test PASSed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85829 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85829/testReport)** for PR 19943 at commit [`db02555`](https://github.com/apache/spark/commit/db025552700f174686ddea9f6ea6f13078a64079).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Merged build finished. Test PASSed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85827/ Test PASSed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85827 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85827/testReport)** for PR 19943 at commit [`91b3d66`](https://github.com/apache/spark/commit/91b3d662fd99ad099b3d1226a8ecb261a6db0ae0).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85837 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85837/testReport)** for PR 19943 at commit [`2cf98b6`](https://github.com/apache/spark/commit/2cf98b6734c806f66e21df50520a465b03d9f060).
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19943 It's strange. The two tests pass on my local laptop, while Jenkins always seems to fail. I'll investigate.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Merged build finished. Test FAILed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85823/ Test FAILed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85823 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85823/testReport)** for PR 19943 at commit [`8fc2162`](https://github.com/apache/spark/commit/8fc2162c3be968324c40a8717e4ddcc5cf173ec9).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85829 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85829/testReport)** for PR 19943 at commit [`db02555`](https://github.com/apache/spark/commit/db025552700f174686ddea9f6ea6f13078a64079).
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19943 Thank you for your help, @henrify . I think the difference is within the margin of deviation. Split methods will be better for maintenance, so I pushed the change.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user henrify commented on the issue: https://github.com/apache/spark/pull/19943 @dongjoon-hyun Thank you for testing the split methods. If anything, the benchmark results look a couple of percent slower now? Oh well, at least it is good to know that your code is as fast as it can be! I have no further ideas on how performance could be improved. Many thanks to you and all the reviewers for your hard work on this PR!
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85827 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85827/testReport)** for PR 19943 at commit [`91b3d66`](https://github.com/apache/spark/commit/91b3d662fd99ad099b3d1226a8ecb261a6db0ae0).
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19943 Oops. Without noticing your comments, I pushed another refactoring that splits the functions.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85823 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85823/testReport)** for PR 19943 at commit [`8fc2162`](https://github.com/apache/spark/commit/8fc2162c3be968324c40a8717e4ddcc5cf173ec9).
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85818/ Test FAILed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Merged build finished. Test FAILed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85818 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85818/testReport)** for PR 19943 at commit [`b623ca4`](https://github.com/apache/spark/commit/b623ca4621d22d294974cb3a7f88260052b1f38c).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19943 LGTM except for some minor comments
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85818 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85818/testReport)** for PR 19943 at commit [`b623ca4`](https://github.com/apache/spark/commit/b623ca4621d22d294974cb3a7f88260052b1f38c).
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user henrify commented on the issue: https://github.com/apache/spark/pull/19943 @dongjoon-hyun nextBatch() is invoked 4096x less often than the main copy loops, so it doesn't matter much.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19943 After minimizing `nextBatch`, it becomes smaller than Parquet's `nextBatch`. However, it is inlined only in some cases and mostly not, so that is not helpful. I'll try the other technique later.
```
org.apache.spark.sql.execution.datasources.orc.OrcColumnarBatchReader::nextBatch (107 bytes) inline (hot)
```
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user henrify commented on the issue: https://github.com/apache/spark/pull/19943 @dongjoon-hyun Thanks. I don't think it matters whether nextBatch() is inlined or not. I think what matters is 1) how the putX() etc. method calls inside the tight loops are inlined and 2) how complex the methods containing the tight loops are.
For example, the toColumn argument is megamorphic and the putX() implementation is bimorphic, and then you have about 10 of these in a single method inside if-else 'instanceof' checks. That's quite complex for the JVM to optimize. If you split the loops so that each loop has its own method with toColumn declared as the exact type (BytesColumnVector etc.), then the argument is monomorphic, putX() is 100% biased bimorphic, and there is only one of these per method. That is a lot easier for the JVM to optimize. Again, I'm not sure whether it makes a difference, but it may, and it is easy to try (e.g. extract the for loops of just one data type into a separate method and benchmark).
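The suggestion in this comment can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `SrcVector`, `LongSrc`, `DoubleSrc`, `Target` and `SplitCopy` are hypothetical stand-ins for the ORC column vectors and the Spark target vector, reduced to plain arrays so the example is self-contained. The idea is that the `instanceof` dispatch runs once per column per batch, while each tight loop sits in its own small method whose parameter has an exact type.

```java
// Hypothetical stand-ins (assumptions for illustration only); the real ORC and
// Spark vector classes carry more state such as null flags and repeat flags.
abstract class SrcVector { }

final class LongSrc extends SrcVector {
    final long[] values;
    LongSrc(long[] values) { this.values = values; }
}

final class DoubleSrc extends SrcVector {
    final double[] values;
    DoubleSrc(double[] values) { this.values = values; }
}

final class Target {
    final long[] longs;
    final double[] doubles;
    Target(int capacity) {
        longs = new long[capacity];
        doubles = new double[capacity];
    }
    void putLong(int row, long v) { longs[row] = v; }
    void putDouble(int row, double v) { doubles[row] = v; }
}

final class SplitCopy {
    // Type dispatch happens once per column per batch, outside the hot loops.
    static void copyColumn(SrcVector from, Target to, int rows) {
        if (from instanceof LongSrc) {
            copyLongs((LongSrc) from, to, rows);
        } else if (from instanceof DoubleSrc) {
            copyDoubles((DoubleSrc) from, to, rows);
        } else {
            throw new IllegalArgumentException("unsupported vector: " + from.getClass());
        }
    }

    // Each tight loop lives in its own small method with an exactly typed source.
    private static void copyLongs(LongSrc from, Target to, int rows) {
        for (int i = 0; i < rows; i++) {
            to.putLong(i, from.values[i]);
        }
    }

    private static void copyDoubles(DoubleSrc from, Target to, int rows) {
        for (int i = 0; i < rows; i++) {
            to.putDouble(i, from.values[i]);
        }
    }
}
```

Whether this layout is actually faster depends on the JIT and the workload, so it only makes sense alongside a benchmark of the two variants.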
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85808/ Test FAILed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Merged build finished. Test FAILed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85808 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85808/testReport)** for PR 19943 at commit [`ba03d20`](https://github.com/apache/spark/commit/ba03d20ac6c826b5f16307884e34c1f4022eb814).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19943 Thanks. I checked the current inlining behavior. As you said, ORC's nextBatch is not inlined so far, while Parquet's nextBatch is. I'll try to optimize that.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85808 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85808/testReport)** for PR 19943 at commit [`ba03d20`](https://github.com/apache/spark/commit/ba03d20ac6c826b5f16307884e34c1f4022eb814).
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user henrify commented on the issue: https://github.com/apache/spark/pull/19943 @dongjoon-hyun OK, thanks. It is a pity that the single buffer cannot be used; it would have reduced the number of arraycopy() calls by 5 orders of magnitude. By the way, have you tested the inlining behaviour or tried extracting the copying loop of one type into a small separate method?
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19943 @henrify , @cloud-fan . I got the answer to @henrify 's question. The answer is negative, consistent with the official documentation. Even on the ORC reader side, the data for a VectorizedRowBatch comes from more than one internal buffer in some cases. So even the ORC reader itself doesn't assume a single buffer.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85794/ Test PASSed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85793/ Test PASSed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Merged build finished. Test PASSed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Merged build finished. Test PASSed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85794 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85794/testReport)** for PR 19943 at commit [`10e5d7a`](https://github.com/apache/spark/commit/10e5d7a4bbac748019508fe3104e48c392696d9f).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85793 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85793/testReport)** for PR 19943 at commit [`15cac9c`](https://github.com/apache/spark/commit/15cac9cf6b99415b03fc818fbb14a16b722c9058).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user henrify commented on the issue: https://github.com/apache/spark/pull/19943 @dongjoon-hyun It is possible that the "multiple byte arrays" case happens only on the write side, when consumer code explicitly does it, and that it is fine to use the single byte array and putArray() on the read side. The doc is not clear. As there seemed to be a clear performance impact, this might be worth checking with the ORC devs or in their source.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85793 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85793/testReport)** for PR 19943 at commit [`15cac9c`](https://github.com/apache/spark/commit/15cac9cf6b99415b03fc818fbb14a16b722c9058).
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85794 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85794/testReport)** for PR 19943 at commit [`10e5d7a`](https://github.com/apache/spark/commit/10e5d7a4bbac748019508fe3104e48c392696d9f).
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Merged build finished. Test PASSed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85792/ Test PASSed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85792 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85792/testReport)** for PR 19943 at commit [`3df7d1e`](https://github.com/apache/spark/commit/3df7d1ee9d0cb9ea25a9d1e0e2db539121ad50de).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19943 Could you be more specific, @henrify ?
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19943 Yes. It really does, @henrify .
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85792 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85792/testReport)** for PR 19943 at commit [`3df7d1e`](https://github.com/apache/spark/commit/3df7d1ee9d0cb9ea25a9d1e0e2db539121ad50de).
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user henrify commented on the issue: https://github.com/apache/spark/pull/19943 @cloud-fan Oh, you are right, it is indeed byte[][]. The BytesColumnVector has separate per-row offset and length arrays, which seemed to indicate that the data would be one contiguous block, with those auxiliary arrays marking the row boundaries. I'm not sure why they need those two additional arrays then.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user henrify commented on the issue: https://github.com/apache/spark/pull/19943 @dongjoon-hyun Great, so it is a bit faster with putX, but not by much. I'm still concerned about how well the big nextBatch() method gets optimized; the JVM can bail out of optimizing complex methods. If you don't want to do the full refactoring of splitting every for loop into a small individual method, you could try extracting the for loop of just one data type and benchmark whether it makes any difference. Also, have you checked how inlining goes with -XX:+UnlockDiagnosticVMOptions and -XX:+PrintInlining ?
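For readers who have not used the two HotSpot flags named in this comment, a tiny stand-alone probe shows what the inlining report looks like. This is only a sketch: `InlineProbe`, `addOne` and `run` are hypothetical names, and the program has nothing to do with the ORC reader itself.

```java
// Hypothetical probe class (not part of Spark or ORC). Compile, then run with
// the flags from the comment above to see the JIT's inlining decisions:
//   javac InlineProbe.java
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining InlineProbe
final class InlineProbe {
    // Small helper called from the hot loop; a candidate for inlining.
    static long addOne(long x) {
        return x + 1;
    }

    // Hot loop: calls addOne() once per iteration.
    static long run(int iterations) {
        long acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc = addOne(acc);
        }
        return acc;
    }

    public static void main(String[] args) {
        // Many iterations so that the loop becomes hot enough to be compiled.
        System.out.println(run(50_000_000));
    }
}
```

The report lists each call site the compiler looked at together with its decision, in the same style as the `nextBatch (107 bytes) inline (hot)` line quoted elsewhere in this thread.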
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19943 BTW @dongjoon-hyun can you also address https://github.com/apache/spark/pull/19943#discussion_r160076528 ?
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19943 @henrify I took a look at the string/binary type of the ORC batch. The data is stored in a `byte[][]`, which is not a contiguous byte array, so we can't do a single copy. For better performance, I think we need to use the low-level ORC reader API; we can consider this in the future.
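The layout described here can be illustrated with a small sketch. `BytesBatch` and `BytesCopy` are hypothetical stand-ins, assuming only what the thread states: one `byte[]` reference per row plus per-row offset and length arrays. Because different rows may point into different backing arrays, or into different slices of the same one, packing them into a contiguous buffer takes one `System.arraycopy` per row rather than a single bulk copy.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the batch layout discussed above (an assumption for
// illustration): a byte[] reference per row, with per-row start and length.
final class BytesBatch {
    final byte[][] vector;
    final int[] start;
    final int[] length;

    BytesBatch(byte[][] vector, int[] start, int[] length) {
        this.vector = vector;
        this.start = start;
        this.length = length;
    }
}

final class BytesCopy {
    // Packs the first `rows` values into one contiguous buffer.
    static byte[] pack(BytesBatch batch, int rows) {
        int total = 0;
        for (int i = 0; i < rows; i++) {
            total += batch.length[i];
        }
        byte[] out = new byte[total];
        int pos = 0;
        for (int i = 0; i < rows; i++) {
            // One copy per row: row i may live in any backing array at any offset.
            System.arraycopy(batch.vector[i], batch.start[i], out, pos, batch.length[i]);
            pos += batch.length[i];
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] shared = "helloworld".getBytes(StandardCharsets.UTF_8);
        byte[] other = "!!".getBytes(StandardCharsets.UTF_8);
        // Rows 0 and 2 are slices of the same array; row 1 lives in another one.
        BytesBatch batch = new BytesBatch(
            new byte[][] {shared, other, shared},
            new int[] {0, 0, 5},
            new int[] {5, 2, 5});
        System.out.println(new String(pack(batch, 3), StandardCharsets.UTF_8));
    }
}
```

The sketch also hints at one reason for the offset and length arrays: several rows can reference slices of a shared buffer without the batch as a whole being a single contiguous block.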
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Merged build finished. Test FAILed.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85788 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85788/testReport)** for PR 19943 at commit [`3a0702a`](https://github.com/apache/spark/commit/3a0702ae0b31f762c9f3da06d267a02ec8d1a23b).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85788/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19943 @henrify and @cloud-fan . I updated the PR to use the put APIs. You can check the benchmark result. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85788 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85788/testReport)** for PR 19943 at commit [`3a0702a`](https://github.com/apache/spark/commit/3a0702ae0b31f762c9f3da06d267a02ec8d1a23b). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85772/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85772 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85772/testReport)** for PR 19943 at commit [`7214ec0`](https://github.com/apache/spark/commit/7214ec03f8e48d51e0fd1f3314a0af6ac8275412). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `public class JavaOrcColumnarBatchReader extends RecordReader` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85772 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85772/testReport)** for PR 19943 at commit [`7214ec0`](https://github.com/apache/spark/commit/7214ec03f8e48d51e0fd1f3314a0af6ac8275412). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19943 @cloud-fan . Following your advice, I added JavaOrcColumnarBatchReader and compared the results. Could you review the PR again? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85763/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85763 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85763/testReport)** for PR 19943 at commit [`0a44d7d`](https://github.com/apache/spark/commit/0a44d7d20e2a0df71fb499db67e0e4779fa46874). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85763 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85763/testReport)** for PR 19943 at commit [`0a44d7d`](https://github.com/apache/spark/commit/0a44d7d20e2a0df71fb499db67e0e4779fa46874). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19943 I'm still testing some other stuff in this PR. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19943 Thank you, @HyukjinKwon . --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85745/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85745 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85745/testReport)** for PR 19943 at commit [`aeb6abd`](https://github.com/apache/spark/commit/aeb6abd66ee3338635edf9dca85894c14a05fb72). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85745 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85745/testReport)** for PR 19943 at commit [`aeb6abd`](https://github.com/apache/spark/commit/aeb6abd66ee3338635edf9dca85894c14a05fb72). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19943 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85743/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85744/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85743 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85743/testReport)** for PR 19943 at commit [`fca6a5f`](https://github.com/apache/spark/commit/fca6a5fe83c46d9cb1c2bdf163920a82fcd0b7a2). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85744 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85744/testReport)** for PR 19943 at commit [`aeb6abd`](https://github.com/apache/spark/commit/aeb6abd66ee3338635edf9dca85894c14a05fb72). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85744 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85744/testReport)** for PR 19943 at commit [`aeb6abd`](https://github.com/apache/spark/commit/aeb6abd66ee3338635edf9dca85894c14a05fb72). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85743 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85743/testReport)** for PR 19943 at commit [`fca6a5f`](https://github.com/apache/spark/commit/fca6a5fe83c46d9cb1c2bdf163920a82fcd0b7a2). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19943 I answered in the comment. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19943 Overall looks good. My major concern is https://github.com/apache/spark/pull/19943/files#r159221758 , do you have an answer? This may be a big drawback compared to the wrapper solution. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19943 @cloud-fan . I rebased based on #20116, could you review this again? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85674/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85674 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85674/testReport)** for PR 19943 at commit [`83cc3b5`](https://github.com/apache/spark/commit/83cc3b5b327faa40ebd114f9c7d8f38326a30b5b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85674 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85674/testReport)** for PR 19943 at commit [`83cc3b5`](https://github.com/apache/spark/commit/83cc3b5b327faa40ebd114f9c7d8f38326a30b5b). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19943 Thank you, @cloud-fan . I'll try to update this after #20116 lands on master branch. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19943 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85534/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org