Jaehyeon Kim created SPARK-23632:
------------------------------------

             Summary: sparkR.session() error with spark packages - JVM is not ready after 10 seconds
                 Key: SPARK-23632
                 URL: https://issues.apache.org/jira/browse/SPARK-23632
             Project: Spark
          Issue Type: Bug
          Components: SparkR
    Affects Versions: 2.3.0, 2.2.1, 2.2.0
            Reporter: Jaehyeon Kim


Hi

When I execute _sparkR.session()_ with _org.apache.hadoop:hadoop-aws:2.8.2_ as follows,

{code:java}
library(SparkR, lib.loc=file.path(Sys.getenv('SPARK_HOME'),'R', 'lib'))

ext_opts <- '-Dhttp.proxyHost=10.74.1.25 -Dhttp.proxyPort=8080 -Dhttps.proxyHost=10.74.1.25 -Dhttps.proxyPort=8080'
sparkR.session(master = "spark://master:7077",
               appName = 'ml demo',
               sparkConfig = list(spark.driver.memory = '2g'), 
               sparkPackages = 'org.apache.hadoop:hadoop-aws:2.8.2',
               spark.driver.extraJavaOptions = ext_opts)
{code}

I see a *JVM is not ready after 10 seconds* error. Below are some of the log messages.

{code:java}
Ivy Default Cache set to: /home/rstudio/.ivy2/cache
The jars for the packages stored in: /home/rstudio/.ivy2/jars
:: loading settings :: url = jar:file:/usr/local/spark-2.2.1/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.apache.hadoop#hadoop-aws added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
        confs: [default]
        found org.apache.hadoop#hadoop-aws;2.8.2 in central
...
...
        found javax.servlet.jsp#jsp-api;2.1 in central
Error in sparkR.sparkContext(master, appName, sparkHome, sparkConfigMap,  : 
  JVM is not ready after 10 seconds
...
...
        found joda-time#joda-time;2.9.4 in central
downloading https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.8.2/hadoop-aws-2.8.2.jar ...
...
...
        xmlenc#xmlenc;0.52 from central in [default]
        ---------------------------------------------------------------------
        |                  |            modules            ||   artifacts   |
        |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
        ---------------------------------------------------------------------
        |      default     |   76  |   76  |   76  |   0   ||   76  |   76  |
        ---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
        confs: [default]
        76 artifacts copied, 0 already retrieved (27334kB/56ms)
{code}

It works fine if I re-execute it after the package and its dependencies have been downloaded.
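
As a stop-gap, something along these lines works around it by retrying (just a sketch - the _start_session_ helper and its _retries_/_pause_secs_ parameters are made up for illustration, on the assumption that the download kicked off by the first attempt eventually finishes):

{code:java}
# Sketch of a stop-gap: retry sparkR.session() a few times, pausing in
# between so the Ivy download started by the first attempt can finish.
# start_session(), retries and pause_secs are made-up names for illustration.
start_session <- function(retries = 5, pause_secs = 30) {
  for (i in seq_len(retries)) {
    result <- tryCatch(
      sparkR.session(master = "spark://master:7077",
                     appName = 'ml demo',
                     sparkConfig = list(spark.driver.memory = '2g'),
                     sparkPackages = 'org.apache.hadoop:hadoop-aws:2.8.2'),
      error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)  # session came up
    }
    message("Attempt ", i, " failed: ", conditionMessage(result))
    Sys.sleep(pause_secs)
  }
  stop("sparkR.session() did not come up after ", retries, " attempts")
}

sc <- start_session()
{code}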

I believe this is caused by this check - https://github.com/apache/spark/blob/master/R/pkg/R/sparkR.R#L181

{code:java}
if (!file.exists(path)) {
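  # "path" is the temp file the launched JVM backend writes its port to;
  # SparkR polls for it and gives up after this fixed 10-second wait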
  stop("JVM is not ready after 10 seconds")
}
{code}
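
For example, the fixed wait could be made configurable along these lines (only a sketch - the _SPARKR_JVM_WAIT_SECONDS_ environment variable is hypothetical, not an existing SparkR option, and _path_ is the backend connection file checked in the snippet above):

{code:java}
# Sketch only: poll for the backend connection file up to a user-configurable
# deadline instead of the hard-coded 10 seconds.
# SPARKR_JVM_WAIT_SECONDS is a hypothetical name, not part of SparkR.
wait_secs <- as.numeric(Sys.getenv("SPARKR_JVM_WAIT_SECONDS", "10"))
deadline <- Sys.time() + wait_secs
while (!file.exists(path) && Sys.time() < deadline) {
  Sys.sleep(0.1)  # poll every 100 ms
}
if (!file.exists(path)) {
  stop("JVM is not ready after ", wait_secs, " seconds")
}
{code}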

I just wonder if it would be possible to make this configurable so that a user can determine how long to wait.

Thanks.

Regards
Jaehyeon


