xuanyuanking commented on a change in pull request #23470:
[SPARK-26549][PySpark] Fix for python worker reuse take no effect for Python3
URL: https://github.com/apache/spark/pull/23470#discussion_r245575744
##########
File path: python/pyspark/worker.py
##########
@@ -446,7 +446,12 @@ def process():
pickleSer._write_with_length((aid, accum._value), outfile)
# check end of stream
- if read_int(infile) == SpecialLengths.END_OF_STREAM:
+ res = read_int(infile)
+ if sys.version >= '3' and res == SpecialLengths.END_OF_DATA_SECTION:
Review comment:
It is not an 'additional' `SpecialLengths.END_OF_DATA_SECTION`: both
Python 2 and Python 3 receive this -1 value, which is generated by
`PythonRunner.writeIteratorToStream`.
The bug occurs only in Python 3 because Python 2 handles this value while
Python 3 does not, as I described in the JIRA description:
https://issues.apache.org/jira/browse/SPARK-26549.
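
To make the point concrete, here is a minimal, self-contained sketch of the end-of-stream check. It is not the actual `worker.py` code: `can_reuse_worker` and `make_stream` are hypothetical helpers, the streams are simulated with `io.BytesIO`, and the constants are restated on the assumption that they match `pyspark.serializers.SpecialLengths` (-1 for `END_OF_DATA_SECTION`, -4 for `END_OF_STREAM`). It also assumes that the worker is only reused when the check sees `END_OF_STREAM`.

```python
import io
import struct

# Assumed to mirror pyspark.serializers.SpecialLengths; restated here
# so the sketch runs without Spark installed.
END_OF_DATA_SECTION = -1
END_OF_STREAM = -4


def read_int(stream):
    """Read one big-endian 32-bit signed integer from the stream."""
    data = stream.read(4)
    if len(data) < 4:
        raise EOFError("stream ended before a full int could be read")
    return struct.unpack("!i", data)[0]


def make_stream(*values):
    """Hypothetical helper: build an in-memory stream of packed ints."""
    return io.BytesIO(b"".join(struct.pack("!i", v) for v in values))


def can_reuse_worker(infile, skip_leftover_marker=True):
    """Hypothetical helper: True if the stream ends with END_OF_STREAM.

    If the -1 marker written at the end of the data section was not
    consumed earlier, it is still the next int in the stream. With
    skip_leftover_marker=True it is skipped before the check.
    """
    res = read_int(infile)
    if skip_leftover_marker and res == END_OF_DATA_SECTION:
        res = read_int(infile)
    return res == END_OF_STREAM


# Marker already consumed: only END_OF_STREAM is left in the stream.
consumed = can_reuse_worker(make_stream(END_OF_STREAM))

# Marker left unread: without the skip the check sees -1 and fails.
unpatched = can_reuse_worker(
    make_stream(END_OF_DATA_SECTION, END_OF_STREAM),
    skip_leftover_marker=False,
)
patched = can_reuse_worker(make_stream(END_OF_DATA_SECTION, END_OF_STREAM))
```

In this sketch the -1 is the same marker in both cases, not an extra one. The only difference is whether it has already been read by the time the end-of-stream check runs.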