[
https://issues.apache.org/jira/browse/SPARK-3326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116327#comment-14116327
]
Sean Owen commented on SPARK-3326:
--
The call to Foo.getSome() occurs remotely, in a different JVM with its own
copy of your class. Calling Foo.init() in the driver initializes only the
driver's copy; the copies in the remote workers remain uninitialized.
You can initialize the field in a static block (in Scala, the body of the
object), so that it runs in every JVM that loads the class. Or you can
evaluate Foo.getSome() on the driver, keep the result in a local value, and
reference that value in your map function; it is then serialized in the
closure. All that you send right now is a function that depends on what
Foo.getSome() returns when it's called on the worker, not what it happens to
return on the driver. Consider broadcast variables if the value is large.
If that's what's going on then this is normal behavior.
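
A minimal sketch of the two driver-side options described above, assuming
spark-core is on the classpath. The Bar class, its someDef() body, the
ClosureSketch object and the input strings are hypothetical stand-ins, not the
reporter's code. The "local[2]" master is only there so the sketch runs on one
machine; in local mode everything shares a single JVM, so the original null
would not reproduce there, but the same pattern applies on a cluster.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical stand-in for the reporter's Bar class. It is Serializable so
// that values derived from it could be shipped to executors if needed.
class Bar(val text: String) extends Serializable {
  def someDef(): String = text.toUpperCase
}

// Same shape as the object in the report: state is set by an explicit init().
object Foo {
  private var bar: Bar = null
  def init(bar: Bar): Unit = { this.bar = bar }
  def getSome(): String = bar.someDef()
}

object ClosureSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: local master for illustration; drop setMaster on a cluster.
    val conf = new SparkConf().setAppName("closure-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Runs in the driver JVM only; executor JVMs never see this call.
      Foo.init(new Bar("hello"))

      // Option 1: evaluate on the driver and capture the result in a local
      // value. The closure now carries the String itself, not a call to Foo.
      val some = Foo.getSome()
      val viaClosure = sc.parallelize(Seq("a", "b", "c"))
        .map(line => some)
        .collect()

      // Option 2: for a large value, broadcast it once per executor instead
      // of serializing it into every task's closure.
      val someBc = sc.broadcast(some)
      val viaBroadcast = sc.parallelize(Seq("a", "b", "c"))
        .map(line => someBc.value)
        .collect()

      println(viaClosure.mkString(","))
      println(viaBroadcast.mkString(","))
    } finally {
      sc.stop()
    }
  }
}
```

In both variants the map function no longer touches Foo on the worker, so it
does not matter whether Foo was initialized there.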
can't access a static variable after init in mapper
---
Key: SPARK-3326
URL: https://issues.apache.org/jira/browse/SPARK-3326
Project: Spark
Issue Type: Bug
Environment: CDH 5.1.0, Spark 1.0.0
Reporter: Gavin Zhang
I wrote an object like:
object Foo {
  private var bar: Bar = null
  def init(bar: Bar): Unit = {
    this.bar = bar
  }
  def getSome() = {
    bar.someDef()
  }
}
In the Spark main method, I read some text from HDFS and initialize this
object, and after that I call getSome().
This code works:
sc.textFile(args(0)).take(10).foreach(line => println(Foo.getSome()))
However, when I changed it to write the output to HDFS, I found that the bar
variable in the Foo object is null:
sc.textFile(args(0)).map(line => Foo.getSome()).saveAsTextFile(args(1))
Why?