[ https://issues.apache.org/jira/browse/SPARK-22792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292162#comment-16292162 ]
Annamalai Venugopal commented on SPARK-22792:
---------------------------------------------

Sorry, I am new to this. I'll change it now.

> PySpark UDF registering issue
> -----------------------------
>
>                 Key: SPARK-22792
>                 URL: https://issues.apache.org/jira/browse/SPARK-22792
>             Project: Spark
>          Issue Type: Question
>          Components: PySpark
>    Affects Versions: 2.2.1
>         Environment: Windows OS, Python PyCharm, Spark
>            Reporter: Annamalai Venugopal
>              Labels: windows
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> I am doing a project with PySpark and I am stuck on the following issue:
> Traceback (most recent call last):
>   File "C:/Users/avenugopal/PycharmProjects/POC_for_vectors/main.py", line 187, in <module>
>     hypernym_extracted_data = result.withColumn("hypernym_extracted_data", hypernym_fn(F.column("token_extracted_data")))
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\sql\functions.py", line 1957, in wrapper
>     return udf_obj(*args)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\sql\functions.py", line 1916, in __call__
>     judf = self._judf
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\sql\functions.py", line 1900, in _judf
>     self._judf_placeholder = self._create_judf()
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\sql\functions.py", line 1909, in _create_judf
>     wrapped_func = _wrap_function(sc, self.func, self.returnType)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\sql\functions.py", line 1866, in _wrap_function
>     pickled_command, broadcast_vars, env, includes = _prepare_for_python_RDD(sc, command)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\rdd.py", line 2374, in _prepare_for_python_RDD
>     pickled_command = ser.dumps(command)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\serializers.py", line 460, in dumps
>     return cloudpickle.dumps(obj, 2)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 704, in dumps
>     cp.dump(obj)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 148, in dump
>     return Pickler.dump(self, obj)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 409, in dump
>     self.save(obj)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 476, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 736, in save_tuple
>     save(element)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 476, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 249, in save_function
>     self.save_function_tuple(obj)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 297, in save_function_tuple
>     save(f_globals)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 476, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 821, in save_dict
>     self._batch_setitems(obj.items())
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 852, in _batch_setitems
>     save(v)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 476, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 249, in save_function
>     self.save_function_tuple(obj)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 297, in save_function_tuple
>     save(f_globals)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 476, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 821, in save_dict
>     self._batch_setitems(obj.items())
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 852, in _batch_setitems
>     save(v)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 521, in save
>     self.save_reduce(obj=obj, *rv)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 565, in save_reduce
>     "args[0] from __newobj__ args has the wrong class")
> _pickle.PicklingError: args[0] from __newobj__ args has the wrong class