[ 
https://issues.apache.org/jira/browse/SPARK-13634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184597#comment-15184597
 ] 

Chris A. Mattmann commented on SPARK-13634:
-------------------------------------------

Sean, thanks for your reply. We can agree to disagree on the semantics. I've 
been doing open source for a long time, and leaving JIRAs open for longer than 
43 minutes is not damaging by any means. As a former Spark mentor during its 
Incubation, and its Champion, I also disagree; I was involved in Spark from 
its early inception here at the ASF, and I have not always seen this type of 
behavior, which is why it's troubling to me. Your comparison of one end of the 
spectrum (10) to thousands, in size of JIRAs and activity, also leaves a sour 
taste in my mouth. I know Spark gets lots of activity. So do many of the 
projects I've helped start and contribute to (Hadoop, Lucene/Solr, Nutch 
during its heyday, etc.). I left JIRAs open for longer than 43 minutes in 
those projects, as did many others wiser than me who have been around open 
source a lot longer. 

Thanks for taking the time to think through what may be causing it. I'll 
choose to take the positive away from your reply and try to report back more 
on our workarounds in SciSpark and on our project.

--Chris

> Assigning spark context to variable results in serialization error
> ------------------------------------------------------------------
>
>                 Key: SPARK-13634
>                 URL: https://issues.apache.org/jira/browse/SPARK-13634
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell
>            Reporter: Rahul Palamuttam
>            Priority: Minor
>
> The following lines of code cause a task serialization error when executed in 
> the spark-shell. 
> Note that the error does not occur when the code is submitted as a batch job 
> via spark-submit.
> val temp = 10
> val newSC = sc
> val newRDD = newSC.parallelize(0 to 100).map(p => p + temp)
> For some reason, when temp is pulled into the referencing environment of the 
> closure, the SparkContext is pulled in as well. 
> We originally hit this issue in the SciSpark project, when referencing a 
> string variable inside of a lambda expression passed to RDD.map(...).
> Any insight into how this could be resolved would be appreciated.
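> One possible workaround (a sketch only; the @transient annotation is an 
> assumption on our part, not a confirmed fix for this REPL behavior) is to 
> mark the context reference transient so it is skipped during closure 
> serialization instead of being serialized with the REPL's wrapper object:
>
> // Sketch: @transient tells the serializer to drop this reference.
> @transient val newSC = sc
> val temp = 10
> val newRDD = newSC.parallelize(0 to 100).map(p => p + temp)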
> While the above code is trivial, SciSpark uses a wrapper around the 
> SparkContext to read from various file formats. We want to keep this class 
> structure and also use it in notebook and shell environments.
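> A hypothetical sketch of such a wrapper follows (the class and method names 
> are illustrative only, not the actual SciSpark API); keeping the wrapped 
> context @transient means that serializing the wrapper does not drag the 
> SparkContext along with it:
>
> import org.apache.spark.SparkContext
> import org.apache.spark.rdd.RDD
>
> // Illustrative wrapper, not the real SciSpark class.
> class ContextWrapper(@transient val sc: SparkContext) extends Serializable {
>   // Builds RDDs through the wrapped context; only the resulting RDDs and
>   // serializable values end up captured by downstream closures.
>   def readLines(path: String): RDD[String] = sc.textFile(path)
> }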


