Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/18014
This is an older one. Let me close this. I will then submit another PR very
soon to do the same thing in a different way.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/18014
What is the latest status of this PR?
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/18014
@cloud-fan Here is an implementation based on Option 2 only for simple data
types (e.g. boolean, int, double, and so on). Used bulk-copy for array body in
`putArray()`, and used [element-wise copy
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18014
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77385/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18014
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18014
**[Test build #77385 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77385/testReport)**
for PR 18014 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18014
**[Test build #77385 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77385/testReport)**
for PR 18014 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18014
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18014
**[Test build #77384 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77384/testReport)**
for PR 18014 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18014
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77384/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18014
**[Test build #77384 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77384/testReport)**
for PR 18014 at commit
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/18014
I think option 2 is better. `ColumnVector.getArray()` should be as fast as
possible. The caller side may just get an element from this array and then the
final projection doesn't need to copy an
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/18014
@cloud-fan When I think about the use case of `ColumnVector.getArray` (i.e.
in code generated by whole-stage code generation), I think that it is
better to return `UnsafeArrayData` instead of
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/18014
Yes. Let me implement a new `putArray(int rowId, Array array)` that uses
`ColumnVector.Array` and stores primitive-type elements in a primitive array
(e.g. `intData`).
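
The idea of keeping primitive elements in one flat primitive array can be sketched roughly as follows. This is only an illustration and not Spark's `ColumnVector` code: the class `IntArrayColumn` and its members (`offsets`, `lengths`, `used`, `arrayLength`, `getInt`, `copyArray`) are hypothetical names, and only the name `intData` is borrowed from the comment above. Each row records an offset and a length into the shared array, so writing a row is a single bulk copy and reading an element needs no copy.

```java
import java.util.Arrays;

/** Hypothetical column of int arrays; an illustration, not Spark's ColumnVector. */
final class IntArrayColumn {
    private final int[] offsets; // start of each row's elements inside intData
    private final int[] lengths; // number of elements in each row
    private int[] intData;       // leaf elements of all rows, stored contiguously
    private int used;            // number of slots of intData currently in use

    IntArrayColumn(int numRows, int initialCapacity) {
        offsets = new int[numRows];
        lengths = new int[numRows];
        intData = new int[Math.max(1, initialCapacity)];
    }

    /** Appends one row's elements to intData with a single bulk copy. */
    void putArray(int rowId, int[] values) {
        int needed = used + values.length;
        if (needed > intData.length) {
            // Grow geometrically so repeated appends stay cheap on average.
            intData = Arrays.copyOf(intData, Math.max(needed, intData.length * 2));
        }
        System.arraycopy(values, 0, intData, used, values.length);
        offsets[rowId] = used;
        lengths[rowId] = values.length;
        used = needed;
    }

    /** Returns the number of elements stored for the given row. */
    int arrayLength(int rowId) {
        return lengths[rowId];
    }

    /** Reads one element in place; nothing is copied. */
    int getInt(int rowId, int index) {
        if (index < 0 || index >= lengths[rowId]) {
            throw new IndexOutOfBoundsException("index " + index + " for row " + rowId);
        }
        return intData[offsets[rowId] + index];
    }

    /** Copies a row out; needed only when the caller wants its own array. */
    int[] copyArray(int rowId) {
        int start = offsets[rowId];
        return Arrays.copyOfRange(intData, start, start + lengths[rowId]);
    }
}
```

In this sketch the cost sits on the write side (`putArray`), while a reader that only needs a single element pays for one indexed load.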
---
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/18014
I took a look at `ColumnVector.getArray`; it seems to be zero-cost already?
The write path needs some copying, though.
---
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/18014
@cloud-fan Thank you for your comments. Let me confirm your ideas.
1. Do you want to keep array contents in [a primitive data array (e.g.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/18014
I think there is a gap between columnar format and the unsafe row format.
The current `ColumnVector` format looks reasonable for array type, as it puts
the leaf elements together, which is better
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/18014
@cloud-fan Could you please let us know your thoughts?
Is it better to use binary type or to add simple logic for
`UnsafeArrayData` and others in `ColumnVector`?
---
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/18014
@cloud-fan What do you think?
---
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/18014
I thought that idea was for Apache Arrow.
We could use binary type for `UnsafeArrayData`. However, it involves some
complexity to use
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/18014
I may be missing something, but can we just treat the array type as a binary
type and put it in `ColumnVector`?
---
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/18014
@hvanhovell @sameeragarwal would it be possible to look at this?
cc: @cloud-fan
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18014
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18014
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77024/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18014
**[Test build #77024 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77024/testReport)**
for PR 18014 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18014
**[Test build #77024 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77024/testReport)**
for PR 18014 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18014
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18014
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77023/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18014
**[Test build #77023 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77023/testReport)**
for PR 18014 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18014
**[Test build #77023 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77023/testReport)**
for PR 18014 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18014
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77017/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18014
**[Test build #77017 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77017/testReport)**
for PR 18014 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18014
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18014
**[Test build #77017 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77017/testReport)**
for PR 18014 at commit
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/18014
Jenkins, test this please
---
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/18014
Jenkins, test this please
---