[
https://issues.apache.org/jira/browse/SPARK-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14368383#comment-14368383
]
Yifan Wang commented on SPARK-6404:
-----------------------------------
I got an error. Is that expected?
{code}
Traceback (most recent call last):
  File "/nail/home/yifan/pg/yifan_spark_mr_steps/spark-1.2.1-bin-hadoop2.4/python/pyspark/streaming/util.py", line 90, in dumps
    return bytearray(self.serializer.dumps((func.func, func.deserializers)))
  File "/nail/home/yifan/pg/yifan_spark_mr_steps/spark-1.2.1-bin-hadoop2.4/python/pyspark/serializers.py", line 405, in dumps
    return cloudpickle.dumps(obj, 2)
  File "/nail/home/yifan/pg/yifan_spark_mr_steps/spark-1.2.1-bin-hadoop2.4/python/pyspark/cloudpickle.py", line 816, in dumps
    cp.dump(obj)
  File "/nail/home/yifan/pg/yifan_spark_mr_steps/spark-1.2.1-bin-hadoop2.4/python/pyspark/cloudpickle.py", line 133, in dump
    return pickle.Pickler.dump(self, obj)
  File "/usr/lib/python2.6/pickle.py", line 224, in dump
    self.save(obj)
  File "/usr/lib/python2.6/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.6/pickle.py", line 548, in save_tuple
    save(element)
  File "/usr/lib/python2.6/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/nail/home/yifan/pg/yifan_spark_mr_steps/spark-1.2.1-bin-hadoop2.4/python/pyspark/cloudpickle.py", line 249, in save_function
    self.save_function_tuple(obj, modList)
  File "/nail/home/yifan/pg/yifan_spark_mr_steps/spark-1.2.1-bin-hadoop2.4/python/pyspark/cloudpickle.py", line 304, in save_function_tuple
    save((code, closure, base_globals))
  File "/usr/lib/python2.6/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.6/pickle.py", line 548, in save_tuple
    save(element)
  File "/usr/lib/python2.6/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.6/pickle.py", line 600, in save_list
    self._batch_appends(iter(obj))
  File "/usr/lib/python2.6/pickle.py", line 636, in _batch_appends
    save(tmp[0])
  File "/usr/lib/python2.6/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/nail/home/yifan/pg/yifan_spark_mr_steps/spark-1.2.1-bin-hadoop2.4/python/pyspark/cloudpickle.py", line 249, in save_function
    self.save_function_tuple(obj, modList)
  File "/nail/home/yifan/pg/yifan_spark_mr_steps/spark-1.2.1-bin-hadoop2.4/python/pyspark/cloudpickle.py", line 309, in save_function_tuple
    save(f_globals)
  File "/usr/lib/python2.6/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/nail/home/yifan/pg/yifan_spark_mr_steps/spark-1.2.1-bin-hadoop2.4/python/pyspark/cloudpickle.py", line 174, in save_dict
    pickle.Pickler.save_dict(self, obj)
  File "/usr/lib/python2.6/pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
  File "/usr/lib/python2.6/pickle.py", line 681, in _batch_setitems
    save(v)
  File "/usr/lib/python2.6/pickle.py", line 306, in save
    rv = reduce(self.proto)
  File "/nail/home/yifan/pg/yifan_spark_mr_steps/spark-1.2.1-bin-hadoop2.4/python/pyspark/context.py", line 236, in __getnewargs__
    "It appears that you are attempting to reference SparkContext from a broadcast "
Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transforamtion. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.
{code}
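
The code that produced this traceback is not shown, so the following is only a guess at the mechanism, not a reproduction. Reading the frames bottom-up, the pickler appears to be saving a function's globals dict (save_function_tuple, then save_dict), reaching a SparkContext value, and calling its __getnewargs__, which raises. The sketch below imitates that with the standard pickle module alone. DriverOnlyContext and try_pickle are hypothetical names made up for illustration; they are not PySpark classes or functions.

```python
import pickle


class DriverOnlyContext(object):
    """Hypothetical stand-in for SparkContext (not the real class).

    It refuses to be pickled by raising from __getnewargs__, which is the
    method named in the last frame of the traceback above.
    """

    def __getnewargs__(self):
        raise Exception("driver-only object referenced from pickled code")


def try_pickle(obj):
    """Return True if obj can be pickled with protocol 2, else False."""
    try:
        pickle.dumps(obj, 2)
        return True
    except Exception:
        return False


sc = DriverOnlyContext()

# Imitates the globals of a function that refers to a module-level `sc`.
globals_with_context = {"sc": sc, "threshold": 10}

# Imitates the globals of a function that only refers to plain data.
globals_without_context = {"threshold": 10}

pickled_with_context = try_pickle(globals_with_context)
pickled_without_context = try_pickle(globals_without_context)
```

If that reading is right, the usual way around it in PySpark is to keep the function passed to the streaming API from referring to a module-level SparkContext, for example by reading the context from the RDD it receives (rdd.context) instead. Whether that applies here depends on the code that was actually run.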
> Call broadcast() in each interval for spark streaming programs.
> ---------------------------------------------------------------
>
> Key: SPARK-6404
> URL: https://issues.apache.org/jira/browse/SPARK-6404
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Reporter: Yifan Wang
>
> If I understand it correctly, Spark’s broadcast() function will be called
> only once at the beginning of the batch. For streaming applications that need
> to run 24/7, it is often necessary to dynamically update variables that are
> shared via broadcast(). It would be ideal if broadcast() could be called at
> the beginning of each interval.