[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16692716#comment-16692716 ]
Ruslan Dautkhanov commented on SPARK-26019: ------------------------------------------- [~viirya] exception stack reads that error happened in SocketServer.py, BaseRequestHandler class constructor, excerpt from the full exception stack above : {code:python} ... File "/opt/cloudera/parcels/Anaconda/lib/python2.7/SocketServer.py", line 652, in __init__ self.handle() ... {code} Notice constructor here calls `self.handle()` - https://github.com/python/cpython/blob/2.7/Lib/SocketServer.py#L655 `handle()` is defined in derived class _UpdateRequestHandler here https://github.com/apache/spark/blob/master/python/pyspark/accumulators.py#L232 and expects `auth_token` to be set : https://github.com/apache/spark/blob/master/python/pyspark/accumulators.py#L254 - that's exactly where exception happens. [~irashid] was right - those two lines have to be swapped. [~hyukjin.kwon] that's odd you closed this jira, although I said it always reproduces for me (100 % of times ), and even [posted reproducer here|https://issues.apache.org/jira/secure/EditComment!default.jspa?id=13197858&commentId=16692219]. [~saivarunvishal] also said it happens for him in SPARK-26113 and you closed that jira as well. It seems not in line with https://spark.apache.org/contributing.html - "Contributing Bug Reports". Please let me know what I miss here. I called out [~bersprockets] because we use Cloudera distribution of Spark and Cloudera has a few patches on top of open-source Spark. I wanted to make sure it's not Cloudera distro specific. Also we worked with Bruce on several other Spark issue and noticed here's in watchers list on this jira... Now I see that this issue is not Cloudera specific though. > pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" > in authenticate_and_accum_updates() > ---------------------------------------------------------------------------------------------------------------- > > Key: SPARK-26019 > URL: https://issues.apache.org/jira/browse/SPARK-26019 > Project: Spark > Issue Type: Bug > Components: PySpark > Affects Versions: 2.3.2, 2.4.0 > Reporter: Ruslan Dautkhanov > Priority: Major > > Started happening after 2.3.1 -> 2.3.2 upgrade. > > {code:python} > Exception happened during processing of request from ('127.0.0.1', 43418) > ---------------------------------------- > Traceback (most recent call last): > File "/opt/cloudera/parcels/Anaconda/lib/python2.7/SocketServer.py", line > 290, in _handle_request_noblock > self.process_request(request, client_address) > File "/opt/cloudera/parcels/Anaconda/lib/python2.7/SocketServer.py", line > 318, in process_request > self.finish_request(request, client_address) > File "/opt/cloudera/parcels/Anaconda/lib/python2.7/SocketServer.py", line > 331, in finish_request > self.RequestHandlerClass(request, client_address, self) > File "/opt/cloudera/parcels/Anaconda/lib/python2.7/SocketServer.py", line > 652, in __init__ > self.handle() > File > "/opt/cloudera/parcels/SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179/lib/spark2/python/lib/pyspark.zip/pyspark/accumulators.py", > line 263, in handle > poll(authenticate_and_accum_updates) > File > "/opt/cloudera/parcels/SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179/lib/spark2/python/lib/pyspark.zip/pyspark/accumulators.py", > line 238, in poll > if func(): > File > "/opt/cloudera/parcels/SPARK2-2.3.0.cloudera4-1.cdh5.13.3.p0.611179/lib/spark2/python/lib/pyspark.zip/pyspark/accumulators.py", > line 251, in authenticate_and_accum_updates > received_token = self.rfile.read(len(auth_token)) > TypeError: object of type 'NoneType' has no len() > > {code} > > Error happens here: > https://github.com/apache/spark/blob/cb90617f894fd51a092710271823ec7d1cd3a668/python/pyspark/accumulators.py#L254 > The PySpark code was just running a simple pipeline of > binary_rdd = sc.binaryRecords(full_file_path, record_length).map(lambda .. ) > and then converting it to a dataframe and running a count on it. > It seems error is flaky - on next rerun it didn't happen. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org