GitHub user qualiu opened a pull request:
https://github.com/apache/spark/pull/14807
Remove double quotes in spark/bin batch files to avoid cutting off
arguments that are double quoted because they contain special characters.
## What changes were proposed in this pull request?
Remove the double quotes in 3 cmd files in spark/bin to avoid cutting off
arguments that contain special characters and are double quoted.
## How was this patch tested?
Copy and paste the following command, then execute it in **spark/bin**:
`spark-submit.cmd --jars just-to-start
"jdbc:mysql://localhost:3306/lzdb?user=guest&password=abc123" my_table`
It does not even start, because the MySQL connection string argument is
cut off:
```
bin\spark-submit2.cmd" --jars just-to-start
"jdbc:mysql://localhost:3306/lzdb?user' is not recognized as an internal or
external command,
operable program or batch file.
'password' is not recognized as an internal or external command,
operable program or batch file.
```
This is a conservative fix that keeps the existing use of cmd /V /E /C and
works within the limits of that approach.
Using /V /E /C to avoid polluting the environment is not the best idea,
because it restricts argument passing:
(1) XXX.cmd itself cannot be started if its full path contains white space,
even when quoted:
`cmd /V /E /C "%~dp0XXX.cmd" xxx "%~dp0XXX.cmd" xxx`
(2) Double quoted arguments cannot be passed to spark-submit.cmd (as
mentioned above) under:
`cmd /V /E /C`
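The two limitations above are consistent with the quote handling described in `cmd /?` for the `/C` switch: unless the remainder of the line contains exactly two quote characters (with whitespace and no special characters between them, naming an executable), cmd strips the leading quote and the last quote on the line. The following is a minimal Python sketch of that rule, not the actual cmd.exe implementation. The function names and the `C:\spark\bin` path are hypothetical, the "names an executable file" check is assumed to pass, and `^` escaping is ignored.

```python
from typing import List

# Characters that `cmd /?` lists as special for the /C quote rule.
SPECIAL_CHARS = set("&<>()@^|")


def effective_command(rest: str) -> str:
    """Approximate what cmd.exe runs for the text after /C (no /S switch)."""
    rest = rest.lstrip()
    quote_positions = [i for i, ch in enumerate(rest) if ch == '"']

    # Rule 1: exactly two quotes, whitespace but no special characters
    # between them. The "is an executable file" check is assumed true here.
    if len(quote_positions) == 2:
        inner = rest[quote_positions[0] + 1 : quote_positions[1]]
        has_special = any(ch in SPECIAL_CHARS for ch in inner)
        has_space = any(ch.isspace() for ch in inner)
        if not has_special and has_space:
            return rest

    # Rule 2: if the first character is a quote, drop it and drop the
    # last quote on the line, keeping any text after that last quote.
    if rest.startswith('"'):
        last = quote_positions[-1]
        return rest[1:last] + rest[last + 1 :]
    return rest


def unquoted_ampersands(command: str) -> List[int]:
    """Return positions of '&' that fall outside double-quoted regions."""
    in_quotes = False
    positions = []
    for i, ch in enumerate(command):
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == "&" and not in_quotes:
            positions.append(i)
    return positions


if __name__ == "__main__":
    url = '"jdbc:mysql://localhost:3306/lzdb?user=guest&password=abc123"'
    script = "C:\\spark\\bin\\spark-submit2.cmd"  # hypothetical install path
    for head in ('"' + script + '"', script):
        line = head + " --jars just-to-start " + url + " my_table"
        run = effective_command(line)
        print(run)
        print("  exposed '&' count:", len(unquoted_ampersands(run)))
```

Under these assumptions, the quoted script path makes the line carry four quote characters, so the first and last are stripped and the `&` in the connection string ends up outside any quoted region, where cmd treats it as a command separator. With the quotes around the script path removed, the line does not start with a quote and is left unchanged.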
It may be better to explicitly set the variables that need protection from
pollution, since it is difficult to require users to keep using something
like "SetLocal EnableDelayedExpansion" in the batch files (*.cmd/*.bat).
By the way, I didn't change the pyspark, R, beeline, etc. scripts because
they seem to have worked fine for a long time.
### In addition
A tool to quickly **change/restore** the files in spark/bin, if you like:
https://github.com/qualiu/lzmw
* Remove quotes (deleting the `--nf xxx` option will modify all matched files): <br>
`lzmw -i -t "(^cmd.*?/[VCE])\s+\"+(%~dp0\S+\.cmd)\"+" -o "$1 $2" --nf
"pyspark|sparkR|beeline|example" -p . -R`
* Add/Restore quotes :
`lzmw -it "\"*(%~dp0\S+\.cmd)\"*" -o "\"$1\"" -p . -R`
* Or remove the head :
`lzmw -f "\.cmd$" -it "^cmd /V /E /C " -o "" -p %CD% -R`
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/qualiu/spark remove-quotes-in-bin-cmd
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/14807.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #14807
----
commit 685bd47d924635cdd9dd1048e9a4ba16462be433
Author: qualiu <[email protected]>
Date: 2016-08-25T10:18:26Z
Remove double quotes in spark/bin batch files to avoid cutting off
arguments that are double quoted because they contain special characters.
To validate simply (for example, with a MySQL connection string), the
following cannot even start:
spark-submit.cmd --jars just-to-start
"jdbc:mysql://localhost:3306/lzdb?user=guest&password=abc123" my_table
Using /V /E /C to avoid polluting the environment is not the best idea,
because it restricts argument passing:
(1) XXX.cmd itself cannot be started if its full path contains white space, even when
quoted : cmd /V /E /C "%~dp0XXX.cmd" xxx "%~dp0XXX.cmd" xxx
(2) Double quoted arguments cannot be passed to spark-submit.cmd (as
mentioned above) under : cmd /V /E /C
It may be better to explicitly set the variables that need protection from
pollution, since it is difficult to require users to keep using something
like "SetLocal EnableDelayedExpansion" in the batch files (*.cmd/*.bat).
By the way, I didn't change the pyspark, R, beeline, etc. scripts because
they seem to have worked fine for a long time.
In addition, a tool to quickly change the files in spark/bin, if
you like : https://github.com/qualiu/lzmw
(1) Remove quotes : lzmw -i -t "(^cmd.*?/[VCE])\s+\"+(%~dp0\S+\.cmd)\"+" -o
"$1 $2" --nf "pyspark|sparkR|beeline|example" -p . -R
(2) Add/Restore : lzmw -it "\"*(%~dp0\S+\.cmd)\"*" -o "\"$1\"" -p . -R
(3) Or remove the head : lzmw -f "\.cmd$" -it "^cmd /V /E /C " -o "" -p
%CD% -R
----