[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-10-28 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18014 This is an older one. Let me close this. I will then submit another PR very soon to do the same thing in a different way.

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-10-27 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18014 What is the latest status of this PR? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-25 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18014 @cloud-fan Here is an implementation based on Option 2 only for simple data types (e.g. boolean, int, double, and so on). Used bulk-copy for array body in `putArray()`, and used [element-wise copy

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18014 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77385/

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18014 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18014 **[Test build #77385 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77385/testReport)** for PR 18014 at commit

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18014 **[Test build #77385 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77385/testReport)** for PR 18014 at commit

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18014 Merged build finished. Test FAILed.

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18014 **[Test build #77384 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77384/testReport)** for PR 18014 at commit

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18014 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77384/

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18014 **[Test build #77384 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77384/testReport)** for PR 18014 at commit

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-25 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18014 I think option 2 is better. `ColumnVector.getArray()` should be as fast as possible. The caller side may just get an element from this array and then the final projection doesn't need to copy an

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-24 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18014 @cloud-fan When I think about the use case of `ColumnVector.getArray` (i.e. in code generated by whole-stage code generation), I think that it is better to return `UnsafeArrayData` instead of

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-24 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18014 Yes. Let me implement a new `putArray(int rowId, Array array)` that uses `ColumnVector.Array` and stores primitive-type elements into a primitive array (e.g. `intData`).
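
A minimal sketch of the layout this comment describes, written in Python purely for illustration rather than in Spark's Java. The class `IntArrayColumn` and its fields `int_data`, `offsets`, and `lengths` are hypothetical names and are not Spark's `ColumnVector` API. It assumes each row holds an array of ints, that all leaf elements live in one flat primitive buffer, and that rows are appended in order.

```python
from array import array


class IntArrayColumn:
    """Toy column of int arrays: one flat leaf buffer plus per-row (offset, length)."""

    def __init__(self):
        self.int_data = array("i")  # all leaf elements, stored contiguously
        self.offsets = []           # start of each row's array inside int_data
        self.lengths = []           # element count of each row's array

    def put_array(self, row_id, values):
        # Writing copies: the elements are appended to the flat buffer in bulk.
        if row_id != len(self.offsets):
            raise ValueError("rows must be appended in order")
        self.offsets.append(len(self.int_data))
        self.lengths.append(len(values))
        self.int_data.extend(array("i", values))

    def get_array(self, row_id):
        # Reading does not copy: the memoryview slice only references the buffer.
        start = self.offsets[row_id]
        return memoryview(self.int_data)[start:start + self.lengths[row_id]]


col = IntArrayColumn()
col.put_array(0, [1, 2, 3])
col.put_array(1, [])
col.put_array(2, [4, 5])
row2 = col.get_array(2)
print(row2.tolist())  # prints [4, 5]
```

In this sketch the write path pays for one bulk copy per array, while the read path returns a view over the shared buffer. One limitation of the toy: Python refuses to grow `int_data` while a view returned by `get_array` is still alive, so all writes happen before any read here.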

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-24 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18014 I took a look at `ColumnVector.getArray`; it seems to be zero-cost already? Writing still needs some copying, though.

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-23 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18014 @cloud-fan Thank you for your comments. Let me confirm your ideas. 1. Do you want to keep array contents in [a primitive data array (e.g.

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-23 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18014 I think there is a gap between the columnar format and the unsafe row format. The current `ColumnVector` format looks reasonable for the array type, as it puts the leaf elements together, which is better

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-22 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18014 @cloud-fan Could you please let us know your thoughts? Is it better to use a binary type, or to add simple logic for `UnsafeArrayData` and others in `ColumnVector`?

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-19 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18014 @cloud-fan What do you think?

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-17 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18014 I thought that idea is for Apache Arrow. We could use a binary type for `UnsafeArrayData`. However, it involves some complexity to use

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18014 I may be missing something, but can we just treat the array type as a binary type and put it in `ColumnVector`?

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-17 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18014 @hvanhovell @sameeragarwal would it be possible to look at this? cc: @cloud-fan

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18014 Merged build finished. Test PASSed.

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18014 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77024/

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18014 **[Test build #77024 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77024/testReport)** for PR 18014 at commit

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18014 **[Test build #77024 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77024/testReport)** for PR 18014 at commit

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18014 Merged build finished. Test FAILed.

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18014 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77023/

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18014 **[Test build #77023 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77023/testReport)** for PR 18014 at commit

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18014 **[Test build #77023 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77023/testReport)** for PR 18014 at commit

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18014 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77017/

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18014 **[Test build #77017 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77017/testReport)** for PR 18014 at commit

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18014 Merged build finished. Test FAILed.

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18014 **[Test build #77017 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77017/testReport)** for PR 18014 at commit

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-17 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18014 Jenkins, test this please

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-17 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18014 Jenkins, test this please