[
https://issues.apache.org/jira/browse/SPARK-12760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen updated SPARK-12760:
--
Priority: Minor (was: Trivial)
Issue Type: Bug (was: Question)
Summary: inaccurate description for difference between local vs cluster
mode in closure handling (was: inaccurate description for difference between
local vs cluster mode)
I think the example needs an update, but not for this reason. There is no
separate "memory space" in local mode; it is one JVM. However, it is undefined
whether the closure sees the same {{counter}} or a copy of it in this case.
In practice, I find that a copy is serialized with the closure at this point,
so the result is still 0.
I think the explanation should be changed to say that the result is undefined
here (it could be 0 or not) and to explain why. Do you want to try a PR?
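
To make the point concrete, here is a rough plain-Python sketch of the two possible behaviours. It is not Spark code and does not claim to show how Spark ships closures: `Counter`, `AddToCounter`, and `run_task` are hypothetical names, and `copy.deepcopy` is only an assumed stand-in for a serialization round trip.

```python
import copy


class Counter:
    """Hypothetical mutable holder standing in for the driver's counter."""

    def __init__(self):
        self.value = 0


class AddToCounter:
    """Hypothetical stand-in for the closure passed to foreach."""

    def __init__(self, counter):
        self.counter = counter

    def __call__(self, x):
        self.counter.value += x


def run_task(closure, data, copy_closure):
    # copy_closure=True imitates a closure that was serialized and
    # deserialized before the task ran, so it mutates its own copy.
    task_closure = copy.deepcopy(closure) if copy_closure else closure
    for x in data:
        task_closure(x)


data = [1, 2, 3, 4, 5]

# Case 1: the task runs the very same closure object, so the
# caller's counter is updated.
shared = Counter()
run_task(AddToCounter(shared), data, copy_closure=False)

# Case 2: the task runs a copy of the closure, so the caller's
# counter is never touched.
copied = Counter()
run_task(AddToCounter(copied), data, copy_closure=True)

print(shared.value, copied.value)
```

Which of the two cases applies is an implementation detail, which is why the doc should describe the result as undefined rather than promise either value.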
> inaccurate description for difference between local vs cluster mode in
> closure handling
> ---
>
> Key: SPARK-12760
> URL: https://issues.apache.org/jira/browse/SPARK-12760
> Project: Spark
> Issue Type: Bug
> Components: Documentation
> Reporter: Mortada Mehyar
> Priority: Minor
>
> In the Spark documentation there's an example illustrating how `local`
> and `cluster` modes can differ:
> http://spark.apache.org/docs/latest/programming-guide.html#example
> "In local mode with a single JVM, the above code will sum the values within
> the RDD and store it in counter. This is because both the RDD and the
> variable counter are in the same memory space on the driver node."
> However, this doesn't seem to be true. Even in `local` mode it seems like
> the counter value should still be 0, because the variable is summed up
> in the executor memory space, while the value in the driver memory space
> is still 0. I tested this snippet and verified that in `local` mode the value
> is indeed still 0.
> Is the doc wrong, or am I missing something the doc is trying to say?