bogao007 commented on code in PR #48373:
URL: https://github.com/apache/spark/pull/48373#discussion_r1823513880
##########
python/pyspark/worker.py:
##########
@@ -1890,6 +1890,14 @@ def process():
         try:
             serializer.dump_stream(out_iter, outfile)
         finally:
+            # Sending a signal to TransformWithState UDF to perform proper cleanup steps.
Review Comment:
@HeartSaVioR @jingz-db I made a change to properly call `close()` and run the
other cleanup steps. Could you take a look and see whether this change makes
sense? The change is mainly in this file and `group_ops.py`. I've verified it
with both a manual test and the existing unit tests to confirm that it works as
expected. I'll fix the merge conflict later; I just wanted to get some early
feedback on this specific change. Thanks!
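
For reviewers who want the shape of the idea without reading the full diff, here is a minimal, self-contained sketch of the pattern: the worker's `finally` block makes one extra call into the UDF with a "complete" signal so the processor's `close()` runs even if streaming fails. All names here (`Mode`, `RecordingProcessor`, `make_udf`, `process`) are hypothetical and for illustration only; this is not the actual code in `worker.py` or `group_ops.py`.

```python
from enum import Enum
from typing import Callable, Iterable, Iterator, List, Tuple


class Mode(Enum):
    """Hypothetical signal passed to the UDF wrapper (not a PySpark type)."""
    PROCESS_DATA = 1
    COMPLETE = 2


class RecordingProcessor:
    """Hypothetical stateful processor that records what was called on it."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, object]] = []

    def handle(self, row: int) -> int:
        self.events.append(("handle", row))
        if row < 0:
            raise ValueError("negative row")
        return row * 2

    def close(self) -> None:
        self.events.append(("close", None))


def make_udf(processor: RecordingProcessor) -> Callable[[Mode, Iterable[int]], Iterator[int]]:
    """Wrap the processor so a COMPLETE signal triggers cleanup instead of data processing."""

    def udf(mode: Mode, rows: Iterable[int]) -> Iterator[int]:
        if mode is Mode.COMPLETE:
            processor.close()
            return iter([])
        return (processor.handle(r) for r in rows)

    return udf


def process(udf: Callable[[Mode, Iterable[int]], Iterator[int]],
            rows: Iterable[int], sink: List[int]) -> None:
    """Stream results to the sink, then always send the cleanup signal."""
    try:
        for out in udf(Mode.PROCESS_DATA, rows):
            sink.append(out)
    finally:
        # Runs on success and on failure, so close() is never skipped.
        list(udf(Mode.COMPLETE, []))


# Success path: data is processed, then close() is called last.
ok_processor = RecordingProcessor()
ok_sink: List[int] = []
process(make_udf(ok_processor), [1, 2], ok_sink)

# Failure path: the second row raises, but close() still runs.
bad_processor = RecordingProcessor()
bad_sink: List[int] = []
try:
    process(make_udf(bad_processor), [1, -1], bad_sink)
except ValueError:
    pass
```

The point of the sketch is only that the cleanup call lives in `finally`, so it does not depend on `dump_stream` (here, the loop in `process`) finishing normally.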
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]