Github user vamaral1 commented on the issue:
https://github.com/apache/spark/pull/21397
Thanks for the quick responses. I did try to build everything from scratch
and am still getting the error on large datasets. If I run on a few tens of GB,
there's no problem but once it gets to a
Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/21397
@vamaral1, I've seen this error too and I'm trying to remember what the
cause was. I think it can happen when some files get mixed up when
updating/building. If you're building your
Github user icexelloss commented on the issue:
https://github.com/apache/spark/pull/21397
This seems to indicate that the Arrow stream from Java -> Python is closed
prematurely. If you have a way to reproduce, I am happy to take a look.
---
Github user vamaral1 commented on the issue:
https://github.com/apache/spark/pull/21397
Thanks for the fix. I was having the memory leak issue described in
[JIRA](https://issues.apache.org/jira/browse/SPARK-24334) when working with
pandas UDFs but was able to fix it after upgrading
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21397
Thank you for fixing this :-)
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands,
Github user icexelloss commented on the issue:
https://github.com/apache/spark/pull/21397
Thanks all for the review!
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21397
Merged to master and branch-2.3.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21397
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21397
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91201/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21397
**[Test build #91201 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91201/testReport)**
for PR 21397 at commit
Github user icexelloss commented on the issue:
https://github.com/apache/spark/pull/21397
Sure! Added.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/21397
Btw, can you add a short note in PR description for the reason why the test
is just in the PR description? Thanks.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/21397
LGTM too.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21397
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21397
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3619/
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21397
**[Test build #91201 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91201/testReport)**
for PR 21397 at commit
Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/21397
This seems like it should be good to me. It's a little bit different than
the ArrowConverters that also have a listener, because they are iterators and
the cleanup can't be put in a finally.
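
To illustrate the point about iterators: when a consumer pulls values lazily, the producer has no single block of code that is guaranteed to run to the end, so a `finally` cannot own the cleanup and a completion callback is registered instead. The following is a minimal Python sketch of that pattern only. `ToyTaskContext` and `BatchIterator` are hypothetical names invented for this illustration; they are not Spark or Arrow classes.

```python
class ToyTaskContext:
    """Hypothetical stand-in for a task context that runs callbacks at task end."""

    def __init__(self):
        self._listeners = []

    def add_completion_listener(self, callback):
        self._listeners.append(callback)

    def mark_completed(self):
        # Run every registered cleanup callback once the task is over.
        for callback in self._listeners:
            callback()


class BatchIterator:
    """Lazily yields batches; cleanup is delegated to a completion listener."""

    def __init__(self, batches, context):
        self._batches = iter(batches)
        self.closed = False
        # The consumer may stop iterating early, so register cleanup up front.
        context.add_completion_listener(self.close)

    def __iter__(self):
        return self

    def __next__(self):
        try:
            return next(self._batches)
        except StopIteration:
            # Fully consumed: release resources eagerly.
            self.close()
            raise

    def close(self):
        # Idempotent, so it is safe to call from both code paths.
        self.closed = True
```

If the consumer abandons the iterator after the first batch, `closed` stays `False` until `mark_completed()` runs the listener.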
Github user icexelloss commented on the issue:
https://github.com/apache/spark/pull/21397
Hey @BryanCutler, any more thoughts on this?
---
Github user icexelloss commented on the issue:
https://github.com/apache/spark/pull/21397
Only when the UDF raises an error. In the normal case, there is no race because the
writer thread always closes the root and allocator before the task completion
listener runs.
---
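
The race described above involves two places that can end up tearing down the same resources: the writer thread and the task completion listener. One common way to avoid this kind of race is to give cleanup a single owner that first stops the writer and then closes the allocator. The sketch below is a toy Python model of that idea under stated assumptions; `ToyAllocator` and `run_task` are hypothetical names made up for this example and do not reproduce Spark's or Arrow's actual code.

```python
import threading


class ToyAllocator:
    """Hypothetical allocator: closing it with live buffers is reported as a leak."""

    def __init__(self):
        self._lock = threading.Lock()
        self.live_buffers = 0
        self.closed = False

    def allocate(self):
        with self._lock:
            if self.closed:
                raise RuntimeError("allocator already closed")
            self.live_buffers += 1

    def release(self):
        with self._lock:
            self.live_buffers -= 1

    def close(self):
        with self._lock:
            if self.closed:
                return
            if self.live_buffers:
                raise RuntimeError(
                    f"memory leaked: {self.live_buffers} buffer(s) still allocated"
                )
            self.closed = True


def run_task(udf_fails):
    allocator = ToyAllocator()
    stop = threading.Event()
    writer_errors = []

    def writer():
        # The writer only uses buffers; it never closes the allocator itself.
        try:
            while not stop.is_set():
                allocator.allocate()
                allocator.release()
        except RuntimeError as exc:
            writer_errors.append(exc)

    thread = threading.Thread(target=writer)
    thread.start()

    def on_task_completion():
        # Single owner of cleanup: stop the writer, wait for it, then close.
        stop.set()
        thread.join()
        allocator.close()

    try:
        if udf_fails:
            raise ValueError("simulated UDF failure")
    except ValueError:
        pass  # The task fails, but cleanup below still runs exactly once.
    finally:
        on_task_completion()

    return allocator.closed, writer_errors
```

Because `close()` is only called after `join()`, the allocator is never closed while the writer still holds a buffer, whether or not the simulated UDF fails.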
Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/21397
One more question: do you only observe this when the Python UDF raises an
error, or have you seen it in normal runtime operation?
---
Github user icexelloss commented on the issue:
https://github.com/apache/spark/pull/21397
Yeah this is a bit tricky because of the race. The test does fail on my
machine without the fix.
I have been changing the test data size until I can reproduce it, so it's
not great. If
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21397
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90991/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21397
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21397
**[Test build #90991 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90991/testReport)**
for PR 21397 at commit
Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/21397
@icexelloss the test you added passes for me without the fix, but I did see
a message of a suppressed exception in the finally block
"java.lang.IllegalStateException: ArrowBuf[3] refCnt has
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21397
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3475/
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21397
Merged build finished. Test PASSed.
---
Github user icexelloss commented on the issue:
https://github.com/apache/spark/pull/21397
@BryanCutler @HyukjinKwon I am able to reproduce it in a unit test.
Please take a look, thanks! :)
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21397
**[Test build #90991 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90991/testReport)**
for PR 21397 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21397
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21397
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90972/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21397
**[Test build #90972 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90972/testReport)**
for PR 21397 at commit
Github user icexelloss commented on the issue:
https://github.com/apache/spark/pull/21397
So far I have been using a local parquet file to test. Let me see if I can
create one on the fly to reproduce this.
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21397
Yea, I was wondering about it too. It would be nicer if we had some steps
in the PR description.
---
Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/21397
I'm guessing making a reliable test is too difficult? Is it possible to
provide some steps to reproduce?
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21397
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3463/
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21397
Merged build finished. Test PASSed.
---
Github user icexelloss commented on the issue:
https://github.com/apache/spark/pull/21397
cc @BryanCutler
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21397
**[Test build #90972 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90972/testReport)**
for PR 21397 at commit