[jira] [Commented] (SPARK-5754) Spark AM not launching on Windows

2015-08-08 Thread Carsten Blank (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662975#comment-14662975 ]

Carsten Blank commented on SPARK-5754:
--

BTW, the reason why
'-Dspark.driver.port=21390'
doesn't work is a JVM issue: since cmd passes the single quotes through "as
is", the JVM sees a first argument that is not a recognized option and treats
it as the main class name, one that just happens to start with a '-'. I am not
sure whether this is specific to the JVM on Windows, or whether the same thing
happens on Linux.
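
For illustration, this is easy to reproduce directly (assuming a Windows cmd
prompt; the exact error text may vary by JVM version):

%JAVA_HOME%\bin\java '-Dspark.driver.port=21390' org.apache.spark.deploy.yarn.ApplicationMaster
Error: Could not find or load main class '-Dspark.driver.port=21390'

cmd hands the single quotes to the JVM unchanged, whereas a POSIX shell would
strip them before the JVM ever sees the argument.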

> Spark AM not launching on Windows
> -
>
> Key: SPARK-5754
> URL: https://issues.apache.org/jira/browse/SPARK-5754
> Project: Spark
>  Issue Type: Bug
>  Components: Windows, YARN
>Affects Versions: 1.1.1, 1.2.0
> Environment: Windows Server 2012, Hadoop 2.4.1.
>Reporter: Inigo
>
> I'm trying to run Spark Pi on a YARN cluster running on Windows, and the AM 
> container fails to start. The problem seems to be in the generation of the 
> YARN command, which adds single quotes (') around some of the Java 
> options. In particular, the part of the code that adds those is the 
> escapeForShell function in YarnSparkHadoopUtil. Apparently, Windows does not 
> like the quotes around these options. Here is an example of the command that 
> the container tries to execute:
> @call %JAVA_HOME%/bin/java -server -Xmx512m -Djava.io.tmpdir=%PWD%/tmp 
> '-Dspark.yarn.secondary.jars=' 
> '-Dspark.app.name=org.apache.spark.examples.SparkPi' 
> '-Dspark.master=yarn-cluster' org.apache.spark.deploy.yarn.ApplicationMaster 
> --class 'org.apache.spark.examples.SparkPi' --jar  
> 'file:/D:/data/spark-1.1.1-bin-hadoop2.4/bin/../lib/spark-examples-1.1.1-hadoop2.4.0.jar'
>   --executor-memory 1024 --executor-cores 1 --num-executors 2
> Once I transform it into:
> @call %JAVA_HOME%/bin/java -server -Xmx512m -Djava.io.tmpdir=%PWD%/tmp 
> -Dspark.yarn.secondary.jars= 
> -Dspark.app.name=org.apache.spark.examples.SparkPi 
> -Dspark.master=yarn-cluster org.apache.spark.deploy.yarn.ApplicationMaster 
> --class 'org.apache.spark.examples.SparkPi' --jar  
> 'file:/D:/data/spark-1.1.1-bin-hadoop2.4/bin/../lib/spark-examples-1.1.1-hadoop2.4.0.jar'
>   --executor-memory 1024 --executor-cores 1 --num-executors 2
> Everything seems to start.
> How should I deal with this? Should I create a separate function like 
> escapeForShell for Windows and call it whenever I detect Windows? Or should 
> I add some sanity check on the YARN side?
> I checked a little, and there seem to be people who are able to run Spark 
> on YARN on Windows, so it might be something else. I didn't find anything 
> related on JIRA either.



[jira] [Commented] (SPARK-5754) Spark AM not launching on Windows

2015-08-08 Thread Carsten Blank (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662919#comment-14662919 ]

Carsten Blank commented on SPARK-5754:
--

Okay, having looked over the case again, and since my solution did not make 
the project admins happy (and damn rightly so!!), I revisited the problem.

It occurs to me that this is *not* really a Windows cmd problem. It rather 
seems to be a problem with the JVM in combination with cmd. My findings:

Firstly,
-XX:OnOutOfMemoryError='kill %p'
will never work under Windows cmd. This part is cmd-specific: cmd only 
respects double quotes for grouping arguments that belong together (see 
http://ss64.com/nt/syntax-esc.html), so everything containing a space must be 
enclosed in double quotes! That was, in essence, dharmeshkakadia's suggestion 
in his PR.

Secondly,
-XX:OnOutOfMemoryError='kill %p'
will only work if you got 'kill' installed. As having GNU Tools a must for 
anyone that needs to build hadoop from scratch on Windows it will not really 
come up as a problem. Still, should be fixed to 
-XX:OnOutOfMemoryError="taskkill /F /PID %p"
see e.g. https://www.java.net/node/692850

Thirdly,
as written earlier, when cmd reads a % it usually expects an environment 
variable to follow. Although I wasn't able to really verify that this is a 
problem on the command line itself, it does seem to be an issue in .cmd 
files. The correct way to avoid any hassle is to use %%. In fact, please look 
at
http://www.oracle.com/technetwork/java/javase/clopts-139448.html#gbmum
where it is explained that %p and %%p are treated equally and both map to the 
current PID.
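
Putting the second and third points together, the option as it should end up 
in the generated .cmd launch script would be:

-XX:OnOutOfMemoryError="taskkill /F /PID %%p"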

Thus, I suggest that the basic escaping logic in Spark distinguish the 
quoting between Unix (') and Windows ("). Furthermore, all option variables 
should be written with %%, since the JVM interprets them as a single %, 
completely platform-independently. The last suggested change is to pick the 
"kill" command according to the platform. A sketch of what I mean follows 
below.

I will close my PR and open a new one once I have verified everything I have 
written here in action. I would be very glad to hear comments and experiences.



[jira] [Commented] (SPARK-5754) Spark AM not launching on Windows

2015-08-02 Thread Carsten Blank (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650677#comment-14650677 ]

Carsten Blank commented on SPARK-5754:
--

Ah, I see! Yes, for an individual setup such as yours, mine and those of the 
others here who need to run Spark on Hadoop/YARN/Windows, this will work. I 
actually did something like that too ^^ 

It seems to me, however, that we should find a good solution that fixes this 
for good. The discussion on dharmeshkakadia's PR is useful.



[jira] [Commented] (SPARK-5754) Spark AM not launching on Windows

2015-08-01 Thread Carsten Blank (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650444#comment-14650444 ]

Carsten Blank commented on SPARK-5754:
--

Okay, so I have thought about this some more and now have a somewhat 
"qualified" opinion.

I assume you have fixed this issue for your own setup? Did you write a 
separate escapeForShell for Windows? 

I have, and I would like to suggest something like that as a PR. How did you 
solve this?



[jira] [Commented] (SPARK-5754) Spark AM not launching on Windows

2015-07-31 Thread Carsten Blank (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649364#comment-14649364 ]

Carsten Blank commented on SPARK-5754:
--

Thanks, I will put in more time and come up with a suggestion. 
Catch ya later



[jira] [Commented] (SPARK-5754) Spark AM not launching on Windows

2015-07-29 Thread Carsten Blank (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646733#comment-14646733 ]

Carsten Blank commented on SPARK-5754:
--

And in 1.5.0. Actually, I have quickly hacked together a couple of lines to 
fix this issue. However, I also suggest writing a command helper, such that 
we populate a List of CommandEntry objects (oho, creative naming!!) that each 
have a key, a value and a keyValueSeparator. We could then assemble the 
correct command (either as a String or a List of Strings) platform-dependently. 
That way we could make sure that commands don't break the container, 
regardless of the platform. A sketch of this idea follows below.
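
To make the suggestion concrete, a rough Scala sketch (hypothetical names and 
shape, just reusing the quoting rules discussed elsewhere in this thread):

// Hypothetical helper: key, value and separator are kept apart so that
// quoting and escaping can be decided per platform at assembly time.
case class CommandEntry(key: String, value: String, keyValueSeparator: String = "=") {
  def render(isWindows: Boolean): String = {
    val raw = key + keyValueSeparator + value
    if (isWindows) "\"" + raw.replace("\"", "\\\"").replace("%", "%%") + "\"" // cmd quoting
    else "'" + raw.replace("'", "'\\''") + "'" // POSIX sh quoting
  }
}

// Usage: assemble the container command for the target platform.
val entries = List(
  CommandEntry("-Dspark.app.name", "org.apache.spark.examples.SparkPi"),
  CommandEntry("-XX:OnOutOfMemoryError", "taskkill /F /PID %p")
)
val command = entries.map(_.render(isWindows = true)).mkString(" ")

That way the escaping lives in exactly one place, and the entries themselves 
stay platform-neutral.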

One more thought regarding the above-mentioned persistent problem with 

-XX:OnOutOfMemoryError='kill %p'

I have checked, and it seems to come from the fact that cmd expects an 
environment variable at %p; consequently, everything breaks. One way to deal 
with this is to use '%%'. Again, this kind of thing belongs in a helper 
class. I could contribute that if it is okay. However, I have never done so 
before, so a bit of help would be appreciated!

