[GitHub] spark pull request #21397: [SPARK-24334] Fix race condition in ArrowPythonRu...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21397 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21397: [SPARK-24334] Fix race condition in ArrowPythonRu...
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/21397#discussion_r191082371 --- Diff: python/pyspark/sql/tests.py --- @@ -4931,6 +4931,30 @@ def foo3(key, pdf): expected4 = udf3.func((), pdf) self.assertPandasEqual(expected4, result4) +# Regression test for SPARK-24334 +def test_memory_leak(self): --- End diff -- SGTM! Moved to PR description. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21397: [SPARK-24334] Fix race condition in ArrowPythonRu...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21397#discussion_r191070397 --- Diff: python/pyspark/sql/tests.py --- @@ -4931,6 +4931,30 @@ def foo3(key, pdf): expected4 = udf3.func((), pdf) self.assertPandasEqual(expected4, result4) +# Regression test for SPARK-24334 +def test_memory_leak(self): --- End diff -- Yea, I think it's good enough to have it in PR description. Let's just put this in the PR description. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21397: [SPARK-24334] Fix race condition in ArrowPythonRu...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/21397#discussion_r191040434 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowPythonRunner.scala --- @@ -94,8 +88,18 @@ class ArrowPythonRunner( writer.writeBatch() arrowWriter.reset() } -} { writer.end() --- End diff -- maybe add a comment to explain the same - that this should be outside of the finally? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21397: [SPARK-24334] Fix race condition in ArrowPythonRu...
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/21397#discussion_r189996870 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowPythonRunner.scala --- @@ -94,8 +88,18 @@ class ArrowPythonRunner( writer.writeBatch() arrowWriter.reset() } -} { writer.end() --- End diff -- The routine in `TaskCompletionListener` is not really a safe belt, it actually causes the race condition. But you are right, I believe putting the cleanup routine in finally block is enough. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21397: [SPARK-24334] Fix race condition in ArrowPythonRu...
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21397#discussion_r189973566 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowPythonRunner.scala --- @@ -94,8 +88,18 @@ class ArrowPythonRunner( writer.writeBatch() arrowWriter.reset() } -} { writer.end() --- End diff -- So basically your change makes sure that `root` and `allocator` is safely closed in the final block and hence we don't need the safe belt in `TaskCompletionListener`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21397: [SPARK-24334] Fix race condition in ArrowPythonRu...
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/21397#discussion_r189942402 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowPythonRunner.scala --- @@ -94,8 +88,18 @@ class ArrowPythonRunner( writer.writeBatch() arrowWriter.reset() } -} { writer.end() --- End diff -- This is moved out of the finally block because this function will throw exception if the thread is interrupted. writer.end() also doesn't do any clean up - it just writes some final bits to the output channel so we don't need to call it in clean up. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21397: [SPARK-24334] Fix race condition in ArrowPythonRu...
GitHub user icexelloss opened a pull request: https://github.com/apache/spark/pull/21397 [SPARK-24334] Fix race condition in ArrowPythonRunner causes unclean shutdown of Arrow memory allocator ## What changes were proposed in this pull request? There is a race condition of closing Arrow VectorSchemaRoot and Allocator in the writer thread of ArrowPythonRunner. The race results in memory leak exception when closing the allocator. This patch removes the closing routine from the TaskCompletionListener and make the writer thread responsible for cleaning up the Arrow memory. ## How was this patch tested? Because of the race condition, the bug cannot be unit test easily. So far it has only happens on large amount of data. This is currently tested manually. Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/icexelloss/spark SPARK-24334-arrow-memory-leak Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/21397.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21397 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org