[ https://issues.apache.org/jira/browse/SPARK-45557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818942#comment-17818942 ]
Albert Wong commented on SPARK-45557:
-------------------------------------

Related: https://issues.apache.org/jira/browse/SPARK-47105

> Spark Connect can not be started because of missing user home dir in Docker container
> --------------------------------------------------------------------------------------
>
>                 Key: SPARK-45557
>                 URL: https://issues.apache.org/jira/browse/SPARK-45557
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Docker
>    Affects Versions: 3.4.0, 3.4.1, 3.5.0
>            Reporter: Niels Pardon
>            Priority: Minor
>
> I was trying to start Spark Connect within a container using the Spark Docker container images and ran into an issue where Ivy could not pull the Spark Connect JAR, since the user home directory /home/spark does not exist.
>
> Steps to reproduce:
>
> 1. Start the Spark container with /bin/bash as the command:
> {code:java}
> docker run -it --rm apache/spark:3.5.0 /bin/bash {code}
> 2. Try to start Spark Connect within the container:
> {code:java}
> /opt/spark/sbin/start-connect-server.sh --packages org.apache.spark:spark-connect_2.12:3.5.0 {code}
> which leads to this output:
> {code:java}
> starting org.apache.spark.sql.connect.service.SparkConnectServer, logging to /opt/spark/logs/spark--org.apache.spark.sql.connect.service.SparkConnectServer-1-d8470a71dbd7.out
> failed to launch: nice -n 0 bash /opt/spark/bin/spark-submit --class org.apache.spark.sql.connect.service.SparkConnectServer --name Spark Connect server --packages org.apache.spark:spark-connect_2.12:3.5.0
>     at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1535)
>     at org.apache.spark.util.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:185)
>     at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:334)
>     at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:964)
>     at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:194)
>     at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:217)
>     at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
>     at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1120)
>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1129)
>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> full log in /opt/spark/logs/spark--org.apache.spark.sql.connect.service.SparkConnectServer-1-d8470a71dbd7.out
> {code}
> The full log file then looks like this:
> {code:java}
> Spark Command: /opt/java/openjdk/bin/java -cp /opt/spark/conf:/opt/spark/jars/* -Xmx1g -XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED -Djdk.reflect.useDirectMethodHandle=false org.apache.spark.deploy.SparkSubmit --class org.apache.spark.sql.connect.service.SparkConnectServer --name Spark Connect server --packages org.apache.spark:spark-connect_2.12:3.5.0 spark-internal
> ========================================
> :: loading settings :: url = jar:file:/opt/spark/jars/ivy-2.5.1.jar!/org/apache/ivy/core/settings/ivysettings.xml
> Ivy Default Cache set to: /home/spark/.ivy2/cache
> The jars for the packages stored in: /home/spark/.ivy2/jars
> org.apache.spark#spark-connect_2.12 added as a dependency
> :: resolving dependencies :: org.apache.spark#spark-submit-parent-f8a04936-e8af-4f37-bdb0-e4026a8a3be5;1.0
>         confs: [default]
> Exception in thread "main" java.io.FileNotFoundException: /home/spark/.ivy2/cache/resolved-org.apache.spark-spark-submit-parent-f8a04936-e8af-4f37-bdb0-e4026a8a3be5-1.0.xml (No such file or directory)
>     at java.base/java.io.FileOutputStream.open0(Native Method)
>     at java.base/java.io.FileOutputStream.open(Unknown Source)
>     at java.base/java.io.FileOutputStream.<init>(Unknown Source)
>     at java.base/java.io.FileOutputStream.<init>(Unknown Source)
>     at org.apache.ivy.plugins.parser.xml.XmlModuleDescriptorWriter.write(XmlModuleDescriptorWriter.java:71)
>     at org.apache.ivy.plugins.parser.xml.XmlModuleDescriptorWriter.write(XmlModuleDescriptorWriter.java:63)
>     at org.apache.ivy.core.module.descriptor.DefaultModuleDescriptor.toIvyFile(DefaultModuleDescriptor.java:553)
>     at org.apache.ivy.core.cache.DefaultResolutionCacheManager.saveResolvedModuleDescriptor(DefaultResolutionCacheManager.java:184)
>     at org.apache.ivy.core.resolve.ResolveEngine.resolve(ResolveEngine.java:259)
>     at org.apache.ivy.Ivy.resolve(Ivy.java:522)
>     at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1535)
>     at org.apache.spark.util.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:185)
>     at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:334)
>     at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:964)
>     at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:194)
>     at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:217)
>     at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
>     at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1120)
>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1129)
>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) {code}
> The issue is that the user home directory /home/spark does not exist:
> {code:java}
> $ ls -l /home
> total 0
> $ {code}
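> As a stop-gap (not part of the original report, and untested here), the server can likely be started by pointing Ivy at a writable directory via Spark's standard spark.jars.ivy configuration, which sidesteps the missing home directory:
> {code:java}
> # Hedged workaround sketch: redirect the Ivy cache to /tmp, which is
> # writable for the spark user even when /home/spark does not exist.
> /opt/spark/sbin/start-connect-server.sh \
>   --conf spark.jars.ivy=/tmp/.ivy2 \
>   --packages org.apache.spark:spark-connect_2.12:3.5.0 {code}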
> It seems there is an easy fix: simply switching from useradd to adduser in the Dockerfile should get the user home directory created.
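> For illustration, a minimal sketch of both Dockerfile variants; the exact user/group names, UID, and flags in the real apache/spark Dockerfile are assumptions here, not quoted from it:
> {code:java}
> # Assumed current approach: useradd without -m does not create /home/spark
> RUN groupadd --system spark && \
>     useradd --system --gid spark spark
>
> # Fix, option A: keep useradd but have it create the home directory
> RUN groupadd --system spark && \
>     useradd --system --gid spark --create-home --home-dir /home/spark spark
>
> # Fix, option B (as proposed): Debian adduser, with an explicit home directory
> RUN addgroup --system spark && \
>     adduser --system --ingroup spark --home /home/spark spark {code}
> Either variant should leave /home/spark in place so that Ivy can create its default cache under /home/spark/.ivy2.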