[ https://issues.apache.org/jira/browse/SPARK-23632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16394865#comment-16394865 ]

Jaehyeon Kim commented on SPARK-23632:
--------------------------------------

It wouldn't be an issue if the code is run interactively or a Spark session 
has been created previously. In the former case, I can simply repeat 
_sparkR.session()_; in the latter, the packages that have already been 
downloaded will be used.

However, say a new Spark cluster is spun up and the code is run with _Rscript 
app-code.R_: it will fail due to the timeout.
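
Since a later attempt succeeds once the jars are in the local Ivy cache, one 
interim option is to retry inside the script itself. The sketch below is only 
my illustration, assuming the first, timed-out launch keeps resolving packages 
in the background (the Ivy log in the quoted issue suggests it does); 
_connect_with_retry_ and its parameters are made up, not part of SparkR.

{code:java}
library(SparkR, lib.loc = file.path(Sys.getenv("SPARK_HOME"), "R", "lib"))

# Hedged workaround sketch: retry sparkR.session() until the backend is up.
# connect_with_retry() is a hypothetical helper, not an existing SparkR API.
connect_with_retry <- function(max_attempts = 5, wait_secs = 30, ...) {
  for (i in seq_len(max_attempts)) {
    res <- tryCatch(sparkR.session(...), error = function(e) e)
    if (!inherits(res, "error")) return(res)
    message(sprintf("Attempt %d failed: %s", i, conditionMessage(res)))
    Sys.sleep(wait_secs)  # give Ivy time to finish resolving the packages
  }
  stop("JVM still not ready after ", max_attempts, " attempts")
}

connect_with_retry(master = "spark://master:7077",
                   sparkPackages = "org.apache.hadoop:hadoop-aws:2.8.2")
{code}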

> sparkR.session() error with spark packages - JVM is not ready after 10 seconds
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-23632
>                 URL: https://issues.apache.org/jira/browse/SPARK-23632
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>    Affects Versions: 2.2.0, 2.2.1, 2.3.0
>            Reporter: Jaehyeon Kim
>            Priority: Minor
>
> Hi
> When I execute _sparkR.session()_ with _org.apache.hadoop:hadoop-aws:2.8.2_ 
> as follows,
> {code:java}
> library(SparkR, lib.loc=file.path(Sys.getenv('SPARK_HOME'),'R', 'lib'))
> ext_opts <- '-Dhttp.proxyHost=10.74.1.25 -Dhttp.proxyPort=8080 -Dhttps.proxyHost=10.74.1.25 -Dhttps.proxyPort=8080'
> sparkR.session(master = "spark://master:7077",
>                appName = 'ml demo',
>                sparkConfig = list(spark.driver.memory = '2g'), 
>                sparkPackages = 'org.apache.hadoop:hadoop-aws:2.8.2',
>                spark.driver.extraJavaOptions = ext_opts)
> {code}
> I see a *JVM is not ready after 10 seconds* error. Below are some of the log 
> messages.
> {code:java}
> Ivy Default Cache set to: /home/rstudio/.ivy2/cache
> The jars for the packages stored in: /home/rstudio/.ivy2/jars
> :: loading settings :: url = jar:file:/usr/local/spark-2.2.1/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
> org.apache.hadoop#hadoop-aws added as a dependency
> :: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
>       confs: [default]
>       found org.apache.hadoop#hadoop-aws;2.8.2 in central
> ...
> ...
>       found javax.servlet.jsp#jsp-api;2.1 in central
> Error in sparkR.sparkContext(master, appName, sparkHome, sparkConfigMap,  : 
>   JVM is not ready after 10 seconds
> ...
> ...
>       found joda-time#joda-time;2.9.4 in central
> downloading https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.8.2/hadoop-aws-2.8.2.jar ...
> ...
> ...
>       xmlenc#xmlenc;0.52 from central in [default]
>       ---------------------------------------------------------------------
>       |                  |            modules            ||   artifacts   |
>       |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
>       ---------------------------------------------------------------------
>       |      default     |   76  |   76  |   76  |   0   ||   76  |   76  |
>       ---------------------------------------------------------------------
> :: retrieving :: org.apache.spark#spark-submit-parent
>       confs: [default]
>       76 artifacts copied, 0 already retrieved (27334kB/56ms)
> {code}
> It's fine if I re-execute it after the package and its dependencies have 
> been downloaded, so warming the Ivy cache up front also works; see the 
> sketch right after this paragraph.
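>
> As a workaround sketch of my own (not an existing SparkR facility), the 
> dependencies can be resolved once with a throwaway no-op script before the 
> real app starts; _warmup.R_ is a hypothetical file name:
> {code:java}
> # Hedged sketch: run spark-submit once with a no-op R script so that Ivy
> # resolves the packages into ~/.ivy2 before sparkR.session() is called.
> writeLines("invisible(NULL)", "warmup.R")  # hypothetical throwaway script
> system2(file.path(Sys.getenv("SPARK_HOME"), "bin", "spark-submit"),
>         args = c("--packages", "org.apache.hadoop:hadoop-aws:2.8.2",
>                  "warmup.R"))
> {code}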
> I believe it's because of this part - 
> https://github.com/apache/spark/blob/master/R/pkg/R/sparkR.R#L181
> {code:java}
> if (!file.exists(path)) {
>   stop("JVM is not ready after 10 seconds")
> }
> {code}
> I just wonder if it may be possible to update this so that a user can 
> determine how long to wait, perhaps along the lines of the sketch below.
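>
> A minimal sketch of what that could look like, assuming the surrounding 
> wait loop in _sparkR.sparkContext_; the _SPARKR_JVM_WAIT_SECONDS_ 
> environment variable is hypothetical, not an existing option:
> {code:java}
> # Hypothetical: let the user override the 10-second default via an
> # environment variable (SPARKR_JVM_WAIT_SECONDS is made up for illustration).
> timeout <- suppressWarnings(
>   as.numeric(Sys.getenv("SPARKR_JVM_WAIT_SECONDS", "10")))
> if (is.na(timeout)) timeout <- 10
> waited <- 0
> while (!file.exists(path) && waited < timeout) {
>   Sys.sleep(0.1)
>   waited <- waited + 0.1
> }
> if (!file.exists(path)) {
>   stop("JVM is not ready after ", timeout, " seconds")
> }
> {code}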
> Thanks.
> Regards
> Jaehyeon



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
