[
https://issues.apache.org/jira/browse/SPARK-22792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen resolved SPARK-22792.
-------------------------------
Resolution: Invalid
For JIRAs, you'd need to narrow this down to a specific, clearly described
issue. Like your other JIRA, this is just a paste of your code.
> PySpark UDF registering issue
> -----------------------------
>
> Key: SPARK-22792
> URL: https://issues.apache.org/jira/browse/SPARK-22792
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 2.2.1
> Environment: Windows OS, Python, PyCharm, Spark
> Reporter: Annamalai Venugopal
> Labels: windows
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> I am doing a project with PySpark and I am stuck on an issue:
> Traceback (most recent call last):
>   File "C:/Users/avenugopal/PycharmProjects/POC_for_vectors/main.py", line 187, in <module>
>     hypernym_extracted_data = result.withColumn("hypernym_extracted_data", hypernym_fn(F.column("token_extracted_data")))
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\sql\functions.py", line 1957, in wrapper
>     return udf_obj(*args)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\sql\functions.py", line 1916, in __call__
>     judf = self._judf
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\sql\functions.py", line 1900, in _judf
>     self._judf_placeholder = self._create_judf()
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\sql\functions.py", line 1909, in _create_judf
>     wrapped_func = _wrap_function(sc, self.func, self.returnType)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\sql\functions.py", line 1866, in _wrap_function
>     pickled_command, broadcast_vars, env, includes = _prepare_for_python_RDD(sc, command)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\rdd.py", line 2374, in _prepare_for_python_RDD
>     pickled_command = ser.dumps(command)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\serializers.py", line 460, in dumps
>     return cloudpickle.dumps(obj, 2)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 704, in dumps
>     cp.dump(obj)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 148, in dump
>     return Pickler.dump(self, obj)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 409, in dump
>     self.save(obj)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 476, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 736, in save_tuple
>     save(element)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 476, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 249, in save_function
>     self.save_function_tuple(obj)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 297, in save_function_tuple
>     save(f_globals)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 476, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 821, in save_dict
>     self._batch_setitems(obj.items())
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 852, in _batch_setitems
>     save(v)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 476, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 249, in save_function
>     self.save_function_tuple(obj)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 297, in save_function_tuple
>     save(f_globals)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 476, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 821, in save_dict
>     self._batch_setitems(obj.items())
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 852, in _batch_setitems
>     save(v)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\pickle.py", line 521, in save
>     self.save_reduce(obj=obj, *rv)
>   File "C:\Users\avenugopal\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\cloudpickle.py", line 565, in save_reduce
>     "args[0] from __newobj__ args has the wrong class")
> _pickle.PicklingError: args[0] from __newobj__ args has the wrong class
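
For readers trying to narrow this down: the frames above show cloudpickle
walking the globals referenced by the UDF (save_function_tuple calling
save(f_globals)) and failing in save_reduce on one of those values. The
traceback does not say which object that is. The sketch below is only an
illustration of the failing check, using the standard-library pickle module
rather than PySpark; the names Expected, Mismatched and try_pickle are
hypothetical and do not come from the reporter's code.

```python
import copyreg
import pickle


class Expected:
    """Hypothetical stand-in for the class named in the reduce recipe."""


class Mismatched:
    """Hypothetical object whose reduce recipe names a different class."""

    def __reduce_ex__(self, protocol):
        # copyreg.__newobj__(cls, *args) is expected to receive the object's
        # own class as cls. Naming Expected instead of Mismatched is the
        # inconsistency that pickle rejects.
        return (copyreg.__newobj__, (Expected,))


def try_pickle(obj):
    """Return None if obj pickles, otherwise the PicklingError raised."""
    try:
        pickle.dumps(obj, 2)  # protocol 2, as in the traceback's dumps call
    except pickle.PicklingError as exc:
        return exc
    return None


error = try_pickle(Mismatched())
print(type(error).__name__, error)
```

If the same kind of object is reachable from a UDF's module-level globals,
one commonly suggested approach (an assumption here, not something confirmed
in this issue) is to import or construct that object inside the UDF body so
that it is never part of the pickled closure.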