kehao created SPARK-19627:
-----------------------------
Summary: PySpark fails when calling a user-defined JVM function
Key: SPARK-19627
URL: https://issues.apache.org/jira/browse/SPARK-19627
Project: Spark
Issue Type: Bug
Components: Deploy
Affects Versions: 1.6.1
Reporter: kehao
Fix For: 1.6.1
Hi, I have a question: PySpark fails when it calls a JVM function that I
defined myself. Please see the code below:
from pyspark import SparkConf, SparkContext
from py4j.java_gateway import java_import

if __name__ == "__main__":
    # conf = SparkConf().setAppName("testing")
    # sc = SparkContext(conf=conf)
    sc = SparkContext(appName="Py4jTesting")

    def foo(x):
        java_import(sc._jvm, "Calculate")
        func = sc._jvm.Calculate()
        return func.sqAdd(x)

    rdd = sc.parallelize([1, 2, 3])
    result = rdd.map(foo).collect()
    print("$$$$$$$$$$$$$$$$$$$$$$")
    print(result)
The run fails with the traceback below. Who can help me?
Traceback (most recent call last):
  File "/home/manager/data/software/mytest/kehao/driver.py", line 19, in <module>
    result = rdd.map(foo).collect()
  File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 771, in collect
  File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 2379, in _jrdd
  File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 2299, in _prepare_for_python_RDD
  File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/serializers.py", line 428, in dumps
  File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 646, in dumps
  File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 107, in dump
  File "/usr/lib/python3.4/pickle.py", line 412, in dump
    self.save(obj)
  File "/usr/lib/python3.4/pickle.py", line 479, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python3.4/pickle.py", line 744, in save_tuple
    save(element)
  File "/usr/lib/python3.4/pickle.py", line 479, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 199, in save_function
  File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 236, in save_function_tuple
  File "/usr/lib/python3.4/pickle.py", line 479, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python3.4/pickle.py", line 729, in save_tuple
    save(element)
  File "/usr/lib/python3.4/pickle.py", line 479, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python3.4/pickle.py", line 774, in save_list
    self._batch_appends(obj)
  File "/usr/lib/python3.4/pickle.py", line 801, in _batch_appends
    save(tmp[0])
  File "/usr/lib/python3.4/pickle.py", line 479, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 193, in save_function
  File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 241, in save_function_tuple
  File "/usr/lib/python3.4/pickle.py", line 479, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python3.4/pickle.py", line 814, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.4/pickle.py", line 840, in _batch_setitems
    save(v)
  File "/usr/lib/python3.4/pickle.py", line 499, in save
    rv = reduce(self.proto)
  File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/context.py", line 268, in __getnewargs__
Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063
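
For reference, this is the restriction documented in SPARK-5063 rather than a deploy bug: foo closes over sc through sc._jvm, and cloudpickle refuses to serialize a SparkContext for shipping to the workers. A minimal sketch of a driver-side workaround, assuming the user-supplied JVM class Calculate (with its sqAdd method) is on the driver classpath, e.g. via --driver-class-path:

from pyspark import SparkContext
from py4j.java_gateway import java_import

if __name__ == "__main__":
    sc = SparkContext(appName="Py4jTesting")

    # Keep the closure free of sc / sc._jvm so it can be pickled.
    rdd = sc.parallelize([1, 2, 3])
    values = rdd.collect()  # small data set, safe to bring back to the driver

    # The py4j gateway exists only in the driver process, so call the JVM here.
    java_import(sc._jvm, "Calculate")          # user-supplied class from the report
    calc = sc._jvm.Calculate()
    result = [calc.sqAdd(v) for v in values]   # sqAdd as used in the report
    print(result)

If the JVM computation has to run on the executors instead, the usual route is to implement the transformation on the Scala/Java side and invoke it from the driver through py4j; the Python worker processes have no py4j gateway of their own.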