[jira] [Commented] (SPARK-5754) Spark AM not launching on Windows
[ https://issues.apache.org/jira/browse/SPARK-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662975#comment-14662975 ]

Carsten Blank commented on SPARK-5754:
--

BTW, the reason why '-Dspark.driver.port=21390' doesn't work is a JVM issue: the JVM takes the enclosed information "as is" when single quotes are used, so it treats the whole token as a Java class name that happens to start with a '-'. I am not sure whether this is specific to the Windows JVM, or whether the same happens on Linux.

> Spark AM not launching on Windows
> -
>
> Key: SPARK-5754
> URL: https://issues.apache.org/jira/browse/SPARK-5754
> Project: Spark
> Issue Type: Bug
> Components: Windows, YARN
> Affects Versions: 1.1.1, 1.2.0
> Environment: Windows Server 2012, Hadoop 2.4.1.
> Reporter: Inigo
>
> I'm trying to run Spark Pi on a YARN cluster running on Windows, and the AM
> container fails to start. The problem seems to be in the generation of the
> YARN command, which adds single quotes (') around some of the Java options.
> In particular, the code adding them is the escapeForShell function in
> YarnSparkHadoopUtil. Apparently, Windows does not like the quotes on these
> options. Here is an example of the command that the container tries to
> execute:
>
> @call %JAVA_HOME%/bin/java -server -Xmx512m -Djava.io.tmpdir=%PWD%/tmp '-Dspark.yarn.secondary.jars=' '-Dspark.app.name=org.apache.spark.examples.SparkPi' '-Dspark.master=yarn-cluster' org.apache.spark.deploy.yarn.ApplicationMaster --class 'org.apache.spark.examples.SparkPi' --jar 'file:/D:/data/spark-1.1.1-bin-hadoop2.4/bin/../lib/spark-examples-1.1.1-hadoop2.4.0.jar' --executor-memory 1024 --executor-cores 1 --num-executors 2
>
> Once I transform it into:
>
> @call %JAVA_HOME%/bin/java -server -Xmx512m -Djava.io.tmpdir=%PWD%/tmp -Dspark.yarn.secondary.jars= -Dspark.app.name=org.apache.spark.examples.SparkPi -Dspark.master=yarn-cluster org.apache.spark.deploy.yarn.ApplicationMaster --class 'org.apache.spark.examples.SparkPi' --jar 'file:/D:/data/spark-1.1.1-bin-hadoop2.4/bin/../lib/spark-examples-1.1.1-hadoop2.4.0.jar' --executor-memory 1024 --executor-cores 1 --num-executors 2
>
> everything seems to start.
>
> How should I deal with this? Creating a separate function like escapeForShell
> for Windows and calling it whenever I detect this is Windows? Or should I
> add some sanity check on YARN?
> I checked a little, and there seem to be people who are able to run Spark on
> YARN on Windows, so it might be something else. I didn't find anything
> related on Jira either.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
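The quoting difference described in the thread (single quotes on a POSIX shell vs. double quotes on Windows cmd) can be sketched roughly as below. This is a hypothetical Python illustration, not Spark's actual escapeForShell (which is Scala); the function name and signature are made up for the example:

```python
import sys

def escape_for_shell(arg, windows=None):
    """Hypothetical platform-aware escapeForShell sketch.

    On Unix, wrap the argument in single quotes (escaping any
    embedded single quote). On Windows cmd, single quotes are not
    recognized as grouping characters, so use double quotes, and
    double any '%' so cmd does not expand it as an environment
    variable reference inside a .cmd script.
    """
    if windows is None:
        windows = sys.platform.startswith("win")
    if windows:
        escaped = arg.replace('"', '\\"').replace("%", "%%")
        return '"%s"' % escaped
    # POSIX shells: close the quote, emit an escaped quote, reopen.
    return "'%s'" % arg.replace("'", "'\\''")
```

For example, `escape_for_shell("-Dspark.driver.port=21390", windows=True)` yields the double-quoted form that cmd accepts, instead of the single-quoted form that the JVM misreads as a class name.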
[jira] [Commented] (SPARK-5754) Spark AM not launching on Windows
[ https://issues.apache.org/jira/browse/SPARK-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662919#comment-14662919 ]

Carsten Blank commented on SPARK-5754:
--

Okay, looking over the case and not making the project admins happy with my solution (and damn rightly so!!), I revisited the problem. It occurs to me that this is *not* really a Windows cmd problem. It rather seems to be a problem with the JVM in combination with cmd. My findings:

Firstly, -XX:OnOutOfMemoryError='kill %p' will never ever work on Windows cmd. So far this is a cmd-specific situation. The reason is that cmd only respects double quotes for grouping arguments that belong together (see http://ss64.com/nt/syntax-esc.html). Anything containing a space must therefore be enclosed in double quotes! That was in essence dharmeshkakadia's suggestion as a PR.

Secondly, -XX:OnOutOfMemoryError='kill %p' will only work if you have 'kill' installed. Since the GNU tools are a must for anyone who needs to build Hadoop from scratch on Windows, this will not really come up as a problem. Still, it should be fixed to -XX:OnOutOfMemoryError="taskkill /F /PID %p"; see e.g. https://www.java.net/node/692850

Thirdly, as written earlier, cmd usually expects that a % introduces an environment variable to be expanded. Although I wasn't able to verify that this is a problem on the command line, it does seem to be an issue in .cmd files. The correct way to avoid any hassle is to use %%. In fact, http://www.oracle.com/technetwork/java/javase/clopts-139448.html#gbmum explains that %p and %%p are treated equally and that both map to the current PID.

Thus, I suggest that the basic escaping logic in Spark distinguish the quoting between Unix (') and Windows ("). Furthermore, all option variables should be written with %%, as the JVM interprets them as a single %, completely platform independent. The last suggested change is to distinguish which "kill" command to use, according to platform.

I will close my PR and open a new one once I have verified in action all that I have written here. I would be very glad about comments and experience.
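The three findings above (double quotes for cmd, taskkill instead of kill, and %%p instead of %p) can be combined into one small sketch. This is an assumed illustration in Python, not Spark's actual code; the helper name is hypothetical:

```python
import platform

def oom_kill_option(windows=None):
    """Sketch: choose the -XX:OnOutOfMemoryError handler per platform.

    Windows has no 'kill', so 'taskkill /F /PID' is used instead, and
    the PID placeholder is written as %%p so that cmd does not try to
    expand %p as an environment variable inside a .cmd script. Per the
    Oracle docs cited above, the JVM treats %p and %%p identically:
    both expand to the current process ID.
    """
    if windows is None:
        windows = platform.system() == "Windows"
    if windows:
        # cmd only groups arguments with double quotes.
        return '-XX:OnOutOfMemoryError="taskkill /F /PID %%p"'
    return "-XX:OnOutOfMemoryError='kill %p'"
```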
[jira] [Commented] (SPARK-5754) Spark AM not launching on Windows
[ https://issues.apache.org/jira/browse/SPARK-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650677#comment-14650677 ]

Carsten Blank commented on SPARK-5754:
--

Ah, I see! Yes, for a single situation such as yours, mine, and the others here who need to use Spark on Hadoop/YARN/Windows, this will work. I actually did something like that too ^^ It seems to me, however, that we should find a good solution to fix this for good. The discussion on dharmeshkakadia's PR is useful.
[jira] [Commented] (SPARK-5754) Spark AM not launching on Windows
[ https://issues.apache.org/jira/browse/SPARK-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650444#comment-14650444 ]

Carsten Blank commented on SPARK-5754:
--

Okay, so I have thought about this more and now have a somewhat "qualified" opinion. I assume that you have fixed this issue for your setup? Did you write a separate escapeForShell for Windows? I have, and I would like to suggest something like that as a PR. How did you solve this?
[jira] [Commented] (SPARK-5754) Spark AM not launching on Windows
[ https://issues.apache.org/jira/browse/SPARK-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649364#comment-14649364 ]

Carsten Blank commented on SPARK-5754:
--

Thanks, I will put in more time and come up with a suggestion. Catch ya later.
[jira] [Commented] (SPARK-5754) Spark AM not launching on Windows
[ https://issues.apache.org/jira/browse/SPARK-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646733#comment-14646733 ]

Carsten Blank commented on SPARK-5754:
--

And in 1.5.0. Actually, I have quickly hacked together a couple of lines to fix this issue. However, I also suggest that a command helper be written, such that we populate a list of CommandEntry objects (oho, creative naming!!) that each have a key, a value, and a keyValueSeparator. We could then assemble the correct command (either as a String or a List of Strings) platform-dependently. That way we could make sure that commands don't break the container, regardless of the platform.

One more thought regarding the above-mentioned persistent problem with -XX:OnOutOfMemoryError='kill %p': I have checked, and it seems to come from the fact that cmd expects an environment variable at %p. Consequently everything breaks. One way to deal with this is using '%%'. Again, this kind of stuff should be in a helper class.

I could contribute if that is okay. However, I never have, so a bit of help would be appreciated!
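The CommandEntry helper proposed above could look something like the following. This is a speculative Python sketch of the suggested design, not an actual Spark class; all names (CommandEntry, render, build_command) are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class CommandEntry:
    """Hypothetical CommandEntry as suggested in the comment:
    a key, an optional value, and a keyValueSeparator, rendered
    per platform when the command line is assembled."""
    key: str
    value: str = ""
    separator: str = "="

    def render(self, windows):
        token = self.key + (self.separator + self.value if self.value else "")
        if windows:
            # cmd only respects double quotes; double any % so it is
            # not expanded as an environment variable in a .cmd file.
            return '"%s"' % token.replace("%", "%%")
        return "'%s'" % token

def build_command(entries, windows):
    """Assemble the full command string platform-dependently."""
    return " ".join(e.render(windows) for e in entries)
```

The point of the design is that quoting and escaping decisions live in one place, so container launch commands cannot break on one platform while working on another.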