[ https://issues.apache.org/jira/browse/SPARK-13634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184563#comment-15184563 ]

Sean Owen commented on SPARK-13634:
-----------------------------------

JIRAs can be reopened, and should be if there's a change: for example, you have a 
pull request to propose, or a different example or more analysis suggesting it's 
not just a Scala REPL issue. People can still comment on JIRAs too.

All else equal, a reply in 43 minutes is a good thing. While I appreciate that, 
ideally, we'd always let the reporter explicitly confirm they're done, that isn't 
feasible in this project. On average a JIRA is opened every _hour_, and many never 
receive any follow-up. Leaving them open is damaging too, since people inevitably 
read that as "legitimate issue I should work on or wait on". If I see a quite 
likely answer, I'd rather reflect it in JIRA and overturn it once in a while, 
since reopening is a normal, lightweight operation the reporter can perform.

Further, the reality is that about half of those JIRAs are not real problems, are 
badly described, or are poorly researched (not this one), and actually _need_ 
rapid pushback with pointers to the contribution guide to discourage more of the 
same.

This is why some things get resolved fast in general: the intent is to put 
limited time to best use for the most people and to give most people some quick 
feedback. I understand that's not how a project with 10 JIRAs a month probably 
operates, but I disagree that my reply was wrong or impolite.

Instead, I'd certainly welcome materially more information and a proposed change 
if you want to pursue and reopen this. For example, off the top of my head: does 
the ClosureCleaner treat {{sc}} specially? It may, because there isn't supposed 
to be a second context in the application.
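
As a rough sketch of the kind of extra analysis I mean (nothing official, just 
something to paste into a spark-shell where {{sc}} is defined): serializing the 
raw closure with plain Java serialization, before the ClosureCleaner is involved 
at all, would show whether the REPL line wrapper, and with it {{newSC}}, is being 
captured in the first place.

{code:scala}
// Rough diagnostic sketch, assuming a running spark-shell where `sc` is defined.
// Serializing the raw closure with java.io shows what it captures before
// ClosureCleaner gets a chance to null anything out.
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

val temp = 10
val newSC = sc
val f = (p: Int) => p + temp   // same closure as in the report

val oos = new ObjectOutputStream(new ByteArrayOutputStream())
try {
  oos.writeObject(f)
  println("raw closure serializes cleanly")
} catch {
  case e: NotSerializableException =>
    println("raw closure captures a non-serializable reference: " + e.getMessage)
}
{code}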

However, if this is your real code, I strongly suspect you have a simple 
workaround in refactoring the third line into a function on an {{object}} (i.e. 
static). That layer of indirection, or something similar, likely avoids tripping 
over this. This is what I've suggested you pursue next; a rough sketch follows 
below. If that works, that's great information to paste here, at least as 
confirmation; if not, add it here anyway to show what else doesn't work.
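
For concreteness, here is a minimal sketch of the kind of refactoring I mean; 
{{RDDBuilder}} and {{build}} are names I'm making up for illustration, not 
anything in SciSpark. The map function lives on an {{object}}, so the closure 
captures only the method's own parameter rather than the surrounding REPL or 
class state.

{code:scala}
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Illustrative workaround sketch: the "third line" moves onto a static-like object.
object RDDBuilder {
  // The lambda here closes over only the local parameter `temp`, an Int,
  // not a REPL line wrapper or any object holding a SparkContext.
  def build(sc: SparkContext, temp: Int): RDD[Int] =
    sc.parallelize(0 to 100).map(p => p + temp)
}

// Usage from the shell or an application:
// val newRDD = RDDBuilder.build(sc, 10)
{code}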

> Assigning spark context to variable results in serialization error
> ------------------------------------------------------------------
>
>                 Key: SPARK-13634
>                 URL: https://issues.apache.org/jira/browse/SPARK-13634
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell
>            Reporter: Rahul Palamuttam
>            Priority: Minor
>
> The following lines of code cause a task serialization error when executed in 
> the spark-shell. 
> Note that the error does not occur when submitting the code as a batch job - 
> via spark-submit.
> val temp = 10
> val newSC = sc
> val newRDD = newSC.parallelize(0 to 100).map(p => p + temp)
> For some reason, when temp is pulled into the referencing environment of the 
> closure, so is the SparkContext. 
> We originally hit this issue in the SciSpark project, when referencing a 
> string variable inside of a lambda expression in RDD.map(...)
> Any insight into how this could be resolved would be appreciated.
> While the above code is trivial, SciSpark uses a wrapper around the 
> SparkContext to read from various file formats. We want to keep this class 
> structure and also use it in notebook and shell environments.


