Github user galv commented on a diff in the pull request:
https://github.com/apache/spark/pull/21494#discussion_r193291076
--- Diff: python/pyspark/worker.py ---
@@ -232,6 +236,13 @@ def main(infile, outfile):
         shuffle.DiskBytesSpilled = 0
         _accumulatorRegistry.clear()
+        if (isBarrier):
+            port = 25333 + 2 + 2 * taskContext._partitionId
--- End diff --
I recommend using DEFAULT_PORT and DEFAULT_PYTHON_PROXY_PORT instead of the
literal 25333. They are exposed as part of the public API of py4j:
https://github.com/bartdag/py4j/blob/216432d859de41441f0d1a0d55b31b5d8d09dd28/py4j-python/src/py4j/java_gateway.py#L54
By the way, acquiring ports like this is a little hacky and may require
more thought: a port derived from the partition ID is not guaranteed to be
free on the host.
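
To make the suggestion concrete, here is a rough sketch rather than a proposed patch. The two constants are defined locally so the snippet runs without py4j; they are assumed to match the values py4j exports from `py4j.java_gateway`. `barrier_port` and `os_assigned_port` are hypothetical helper names, and binding to port 0 is just one possible alternative, not something this PR does.

```python
import socket

# Assumption: these mirror py4j.java_gateway.DEFAULT_PORT and
# py4j.java_gateway.DEFAULT_PYTHON_PROXY_PORT. In worker.py they would be
# imported from py4j instead of being redefined here.
DEFAULT_PORT = 25333
DEFAULT_PYTHON_PROXY_PORT = 25334


def barrier_port(partition_id):
    """Same arithmetic as the diff, with the literal 25333 given a name."""
    return DEFAULT_PORT + 2 + 2 * partition_id


def os_assigned_port():
    """Hypothetical alternative: bind to port 0 so the OS picks a free port.

    Returns the bound socket together with the port number; the caller
    owns the socket and must close it.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))
    return sock, sock.getsockname()[1]
```

With an OS-assigned port the worker would have to report the chosen port back to the JVM side, which is part of the "more thought" mentioned above.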