[ 
https://issues.apache.org/jira/browse/SPARK-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584894#comment-14584894
 ] 

nevi_me commented on SPARK-8340:
--------------------------------

Finally found the issue.

1. The Spark documentation is not great for Windows users. I struggled to start 
a context under any master other than local, which meant that everything ran in 
local mode.

2. Because of the above, I could only connect to Spark sporadically. I tried to 
fix that by adjusting the default config, which seems to be what created the 
connection error.

The initial error I was getting was "returnStatus == 0 is not TRUE", and 
looking at the SparkR package source, I couldn't figure out what the issue was. 
Now that I can connect to a master with workers, I can see the cause of the 
returnStatus issue, which is below:

{code}
Job aborted due to stage failure: Serialized task 0:0 was 77821642 bytes, which 
exceeds max allowed: spark.akka.frameSize (10485760 bytes) - reserved (204800 
bytes). Consider increasing spark.akka.frameSize or using broadcast variables 
for large values.
{code}
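For anyone hitting the same frame-size limit, here is a minimal sketch of raising it from SparkR (assuming the 1.4.0 sparkR.init API, where extra Spark properties are passed through sparkEnvir; spark://host:port is a placeholder for your master URL):
{code}
library(SparkR)

# spark.akka.frameSize is specified in MB; 128 comfortably covers the
# ~78 MB serialized task reported in the error above.
sc <- sparkR.init(master = "spark://host:port",
                  appName = "SparkR-DataFrame-example",
                  sparkEnvir = list(spark.akka.frameSize = "128"))
{code}
As the error message itself suggests, broadcast variables would be the better fix for genuinely large values rather than raising the frame size indefinitely.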

What I would suggest for the documentation is to specify that, to start the 
master and workers by hand, Windows users should use the following:
{code}
spark-class.cmd org.apache.spark.deploy.master.Master 

spark-class.cmd org.apache.spark.deploy.worker.Worker spark://host:port
{code}
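Once the master and a worker are running, the SparkR context can then be pointed at the standalone master; a sketch, assuming the spark://host:port URL printed by the master process (also shown on its web UI):
{code}
library(SparkR)

# Connect to the standalone master started with spark-class.cmd above
sc <- sparkR.init(master = "spark://host:port",
                  appName = "SparkR-DataFrame-example")
sqlContext <- sparkRSQL.init(sc)
{code}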

It would have saved me half a day and put a smile on my face.

> Error creating sparkR dataframe: Error in writeJobj(con, object): invalid 
> jobj 2
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-8340
>                 URL: https://issues.apache.org/jira/browse/SPARK-8340
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>    Affects Versions: 1.4.0
>         Environment: Windows 8.1, RRO R 3.2.0
>            Reporter: nevi_me
>
> I tried executing the following code:
> {code:title="extract from dataframe.R in examples"}
>   library(SparkR)
>   sc <- sparkR.init(appName="SparkR-DataFrame-example")
>   sqlContext <- sparkRSQL.init(sc)
>   localDF <- data.frame(name=c("John", "Smith", "Sarah"), age=c(19, 23, 18))
>   df <- createDataFrame(sqlContext, localDF)
> {code}
> I get the error:
> {code:title="error"}
>   Error in writeJobj(con, object) : invalid jobj 2
> {code}
> I also tried creating dataframes out of existing data.frames in my project, 
> but I get the same error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
