[
https://issues.apache.org/jira/browse/SPARK-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584894#comment-14584894
]
nevi_me commented on SPARK-8340:
--------------------------------
Finally found the issue.
1. The Spark documentation is not great for Windows users. I struggled to start
a context under any master other than local, which meant that everything ran in
local mode.
2. Because of the above, I would only connect to Spark sporadically. I tried
fixing that by adjusting the default config, which seems to be what created the
connection error.
The initial error I was getting was {{returnStatus == 0 is not TRUE}}, and
looking at the SparkR package source, I couldn't figure out what the issue was.
Now that I can connect to a master with workers, I can see the cause of the
returnStatus issue, which is below:
{code}
Job aborted due to stage failure: Serialized task 0:0 was 77821642 bytes, which
exceeds max allowed: spark.akka.frameSize (10485760 bytes) - reserved (204800
bytes). Consider increasing spark.akka.frameSize or using broadcast variables
for large values.
{code}
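For reference, one way to raise the limit from SparkR is to pass the setting
through the {{sparkEnvir}} argument of {{sparkR.init}}. A minimal sketch, with
the assumption that {{spark.akka.frameSize}} takes a value in MB and that 128
is an illustrative number (not from this report):
{code}
library(SparkR)

# Raise spark.akka.frameSize (value in MB) so larger serialized tasks fit.
# 128 here is an arbitrary example value; tune it to your data.
sc <- sparkR.init(appName = "SparkR-DataFrame-example",
                  sparkEnvir = list(spark.akka.frameSize = "128"))
{code}
The error message's other suggestion, broadcast variables, avoids shipping the
large value inside the task itself and is the better fix when the data is
genuinely large.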
What I would suggest for the documentation is to state explicitly that, to
start the master and workers by hand, Windows users should use the following:
{code}
spark-class.cmd org.apache.spark.deploy.master.Master
spark-class.cmd org.apache.spark.deploy.worker.Worker spark://host:port
{code}
It would have saved me half a day and put a smile on my face.
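Once the master and a worker are up, connecting from SparkR is then just a
matter of passing the master URL to {{sparkR.init}}. A sketch, where
{{host:7077}} stands in for whatever host and port the master reports:
{code}
library(SparkR)

# Connect to the standalone master started with spark-class.cmd above.
sc <- sparkR.init(master = "spark://host:7077",
                  appName = "SparkR-DataFrame-example")
sqlContext <- sparkRSQL.init(sc)
{code}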
> Error creating sparkR dataframe: Error in writeJobj(con, object): invalid
> jobj 2
> --------------------------------------------------------------------------------
>
> Key: SPARK-8340
> URL: https://issues.apache.org/jira/browse/SPARK-8340
> Project: Spark
> Issue Type: Bug
> Components: SparkR
> Affects Versions: 1.4.0
> Environment: Windows 8.1, RRO R 3.2.0
> Reporter: nevi_me
>
> I tried executing the following code:
> {code:title="extract from dataframe.R in examples"}
> library(SparkR)
> sc <- sparkR.init(appName="SparkR-DataFrame-example")
> sqlContext <- sparkRSQL.init(sc)
> localDF <- data.frame(name=c("John", "Smith", "Sarah"), age=c(19, 23, 18))
> df <- createDataFrame(sqlContext, localDF)
> {code}
> I get the error:
> {code:title="error"}
> Error in writeJobj(con, object) : invalid jobj 2
> {code}
> I also tried creating dataframes out of existing data.frames in my project,
> but I get the same error.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)