[ https://issues.apache.org/jira/browse/SPARK-23632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16394865#comment-16394865 ]
Jaehyeon Kim commented on SPARK-23632:
--------------------------------------

It wouldn't be an issue if the code is run interactively or a Spark session was created previously. For the former, I can just repeat _sparkR.session()_ and, for the latter, the packages that were already downloaded will be used. However, say a new Spark cluster is spun up and the code is run with _Rscript app-code.R_: it will fail due to the timeout.

> sparkR.session() error with spark packages - JVM is not ready after 10 seconds
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-23632
>                 URL: https://issues.apache.org/jira/browse/SPARK-23632
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>    Affects Versions: 2.2.0, 2.2.1, 2.3.0
>            Reporter: Jaehyeon Kim
>            Priority: Minor
>
> Hi
> When I execute _sparkR.session()_ with _org.apache.hadoop:hadoop-aws:2.8.2_ as follows,
> {code:java}
> library(SparkR, lib.loc = file.path(Sys.getenv('SPARK_HOME'), 'R', 'lib'))
> ext_opts <- '-Dhttp.proxyHost=10.74.1.25 -Dhttp.proxyPort=8080 -Dhttps.proxyHost=10.74.1.25 -Dhttps.proxyPort=8080'
> sparkR.session(master = "spark://master:7077",
>                appName = 'ml demo',
>                sparkConfig = list(spark.driver.memory = '2g'),
>                sparkPackages = 'org.apache.hadoop:hadoop-aws:2.8.2',
>                spark.driver.extraJavaOptions = ext_opts)
> {code}
> I see a *JVM is not ready after 10 seconds* error. Below are some of the log messages.
> {code:java}
> Ivy Default Cache set to: /home/rstudio/.ivy2/cache
> The jars for the packages stored in: /home/rstudio/.ivy2/jars
> :: loading settings :: url = jar:file:/usr/local/spark-2.2.1/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
> org.apache.hadoop#hadoop-aws added as a dependency
> :: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
> confs: [default]
> found org.apache.hadoop#hadoop-aws;2.8.2 in central
> ...
> ...
> found javax.servlet.jsp#jsp-api;2.1 in central
> Error in sparkR.sparkContext(master, appName, sparkHome, sparkConfigMap, :
>   JVM is not ready after 10 seconds
> ...
> ...
> found joda-time#joda-time;2.9.4 in central
> downloading https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.8.2/hadoop-aws-2.8.2.jar ...
> ...
> ...
> xmlenc#xmlenc;0.52 from central in [default]
> ---------------------------------------------------------------------
> |                  |            modules            ||   artifacts   |
> |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
> ---------------------------------------------------------------------
> |      default     |   76  |   76  |   76  |   0   ||   76  |   76  |
> ---------------------------------------------------------------------
> :: retrieving :: org.apache.spark#spark-submit-parent
> confs: [default]
> 76 artifacts copied, 0 already retrieved (27334kB/56ms)
> {code}
> It's fine if I re-execute it after the package and its dependencies are downloaded.
> I believe it's because of this part - https://github.com/apache/spark/blob/master/R/pkg/R/sparkR.R#L181
> {code:java}
> if (!file.exists(path)) {
>   stop("JVM is not ready after 10 seconds")
> }
> {code}
> I just wonder if it would be possible to update this so that a user can determine how long to wait?
> Thanks.
> Regards
> Jaehyeon

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
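The change the report asks for — polling for the backend port file with a user-configurable timeout instead of the hardcoded 10-second wait in sparkR.R — can be sketched language-agnostically. Below is a minimal Python sketch of that polling logic; the function and parameter names are illustrative only and are not part of SparkR's API:

```python
import os
import tempfile
import time

def wait_for_backend_file(path, timeout=10.0, interval=0.5):
    """Wait up to `timeout` seconds for `path` to appear.

    This mirrors the idea behind the suggested fix: the caller, not a
    hardcoded constant, decides how long to wait for the JVM backend
    to write its port file (useful when package resolution via Ivy
    delays JVM startup on a fresh cluster).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(interval)
    # One final check in case the file appeared right at the deadline.
    return os.path.exists(path)

# Demo: a file that only appears later is still found when the caller
# passes a timeout long enough to cover the delay.
with tempfile.TemporaryDirectory() as d:
    port_file = os.path.join(d, "backend_port")
    assert wait_for_backend_file(port_file, timeout=0.2, interval=0.05) is False
    with open(port_file, "w") as f:
        f.write("12345")
    assert wait_for_backend_file(port_file, timeout=1.0, interval=0.05) is True
```

Until such an option exists, the workaround the comment describes amounts to warming the Ivy cache (e.g. by running the session once interactively) so the download finishes before the timed wait begins.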