HyukjinKwon commented on a change in pull request #24834: [SPARK-27992][PYTHON] Synchronize with Python connection thread to propagate errors
URL: https://github.com/apache/spark/pull/24834#discussion_r296549441
 
 

 ##########
 File path: python/pyspark/rdd.py
 ##########
 @@ -140,14 +140,29 @@ def _parse_memory(s):
 
 
 def _create_local_socket(sock_info):
-    (sockfile, sock) = local_connect_and_auth(*sock_info)
+    """
+    Create a local socket that can be used to load deserialized data from the JVM
+
+    :param sock_info: Tuple containing port number and authentication secret for a local socket.
+    :return: sockfile file descriptor of the local socket
+    """
+    port = sock_info[0]
+    auth_secret = sock_info[1]
+    sockfile, sock = local_connect_and_auth(port, auth_secret)
     # The RDD materialization time is unpredictable, if we set a timeout for socket reading
     # operation, it will very possibly fail. See SPARK-18281.
     sock.settimeout(None)
     return sockfile
 
 
 def _load_from_socket(sock_info, serializer):
 
 Review comment:
   @BryanCutler, what is `sock_info` expected to be? It seems it can be either a
   2-tuple or a 3-tuple (with server).
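
   To illustrate the ambiguity, here is a minimal sketch of a helper that accepts both shapes and makes the expected layout explicit. It is not code from this PR: `SockInfo` and `normalize_sock_info` are hypothetical names, and the assumption that the optional third element is a server handle comes only from the "(with server)" remark above.

```python
from collections import namedtuple

# Hypothetical container for illustration only; not part of PySpark.
SockInfo = namedtuple("SockInfo", ["port", "auth_secret", "server"])


def normalize_sock_info(sock_info):
    """Return a SockInfo, using server=None when a 2-tuple is passed.

    Assumes the layout (port, auth_secret) or (port, auth_secret, server).
    """
    if len(sock_info) == 2:
        port, auth_secret = sock_info
        return SockInfo(port, auth_secret, None)
    if len(sock_info) == 3:
        port, auth_secret, server = sock_info
        return SockInfo(port, auth_secret, server)
    # Fail loudly on any other shape instead of silently mis-indexing.
    raise ValueError(
        "sock_info must have 2 or 3 elements, got %d" % len(sock_info))
```

   With something like this, a caller that only needs the port and secret could read `info.port` and `info.auth_secret` regardless of which shape was passed, while a caller that needs the server could check `info.server is not None`.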
