[jira] [Updated] (SPARK-6834) Failed with error: ‘invalid package name’ Error in as.name(name) : attempt to use zero-length variable name
     [ https://issues.apache.org/jira/browse/SPARK-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-6834:
    Labels: bulk-closed  (was: )

> Failed with error: ‘invalid package name’ Error in as.name(name) : attempt to use zero-length variable name
> ---
>
>                 Key: SPARK-6834
>                 URL: https://issues.apache.org/jira/browse/SPARK-6834
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>    Affects Versions: 1.4.0
>            Reporter: Shivaram Venkataraman
>            Priority: Major
>              Labels: bulk-closed
>
> Context: trying to interface SparkR with foreach. Foreach is an abstraction
> over several parallel backends. This would enable execution on spark cluster
> of > 50 R packages including very popular caret and plyr. Simple foreach
> examples work. caret unfortunately does not.
> I have a repro but it is somewhat complex (it is the main example of model
> fitting in caret on their website though, not something made on purpose to
> make SparkR fail). If I find anything more straightforward, I will comment
> here. Reproduced in an R --vanilla session, but I can uninstall all of my
> packages, so I may have missed some deps.
>
> Reproduce with:
>
> install.packages(c("caret", "foreach", "devtools", "mlbench", "gbm",
>                    "survival", "splines"))
> library(caret)
> library(foreach)
> library(devtools)
> install_github("RevolutionAnalytics/doParallelSpark", subdir = "pkg")
> library(doParallelSpark)
> registerDoParallelSpark()
> library(mlbench)
> data(Sonar)
> set.seed(998)
> inTraining <- createDataPartition(Sonar$Class, p = .75, list = FALSE)
> training <- Sonar[ inTraining,]
> testing <- Sonar[-inTraining,]
> fitControl <- trainControl(## 10-fold CV
>                            method = "repeatedcv",
>                            number = 10,
>                            ## repeated ten times
>                            repeats = 10)
> set.seed(825)
> gbmFit1 <- train(Class ~ ., data = training,
>                  method = "gbm",
>                  trControl = fitControl,
>                  ## This last option is actually one
>                  ## for gbm() that passes through
>                  verbose = FALSE)
>
> Stack trace
>
> Failed with error: ‘invalid package name’
> Failed with error: ‘invalid package name’
> Failed with error: ‘invalid package name’
> Failed with error: ‘invalid package name’
> Error in as.name(name) : attempt to use zero-length variable name
> Calls: source ... withVisible -> eval -> eval -> getNamespace -> as.name
> Execution halted
> 15/03/26 14:32:30 ERROR Executor: Exception in task 0.0 in stage 3.0 (TID 5)
> org.apache.spark.SparkException: R computation failed with
> Failed with error: ‘invalid package name’
> Failed with error: ‘invalid package name’
> Failed with error: ‘invalid package name’
> Failed with error: ‘invalid package name’
> Error in as.name(name) : attempt to use zero-length variable name
> Calls: source ... withVisible -> eval -> eval -> getNamespace -> as.name
> Execution halted
>     at edu.berkeley.cs.amplab.sparkr.BaseRRDD.compute(RRDD.scala:80)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>     at edu.berkeley.cs.amplab.sparkr.BaseRRDD.compute(RRDD.scala:32)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>     at org.apache.spark.scheduler.Task.run(Task.scala:54)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> 15/03/26 14:32:30 WARN TaskSetManager: Lost task 0.0 in stage 3.0 (TID 5,
> localhost): org.apache.spark.SparkException: R computation failed with
> Failed with error: ‘invalid package name’
> Failed with error: ‘invalid package name’
> Failed with error: ‘invalid package name’
> Failed with error: ‘invalid package name’
> Error in as.name(name) : attempt to use zero-length variable name
> Calls: source ... withVisible -> eval -> eval -> getNamespace -> as.name
> Execution halted
>     edu.berkeley.cs.amplab.sparkr.BaseRRDD.compute(RRDD.scala:80)
>     org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>     org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>     edu.berkeley.cs.amplab.sparkr.BaseRRDD.compute(RRDD.scala:32)
>     org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>
[jira] [Updated] (SPARK-6834) Failed with error: ‘invalid package name’ Error in as.name(name) : attempt to use zero-length variable name
     [ https://issues.apache.org/jira/browse/SPARK-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shivaram Venkataraman updated SPARK-6834:
    Target Version/s:   (was: 1.6.0)
[jira] [Updated] (SPARK-6834) Failed with error: ‘invalid package name’ Error in as.name(name) : attempt to use zero-length variable name
     [ https://issues.apache.org/jira/browse/SPARK-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-6834:
    Target Version/s: 1.6.0  (was: 1.5.1, 1.6.0)
[jira] [Updated] (SPARK-6834) Failed with error: ‘invalid package name’ Error in as.name(name) : attempt to use zero-length variable name
     [ https://issues.apache.org/jira/browse/SPARK-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shivaram Venkataraman updated SPARK-6834:
    Target Version/s: 1.6.0, 1.5.1  (was: 1.5.0)
[jira] [Updated] (SPARK-6834) Failed with error: ‘invalid package name’ Error in as.name(name) : attempt to use zero-length variable name
     [ https://issues.apache.org/jira/browse/SPARK-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shivaram Venkataraman updated SPARK-6834:
    Target Version/s: 1.5.0  (was: 1.4.0)