This is an automated email from the ASF dual-hosted git repository.
srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 880d9bb3fcb [SPARK-40739][SPARK-40738] Fixes for cygwin/msys2/mingw sbt build and bash scripts
880d9bb3fcb is described below
commit 880d9bb3fcb69001512886496f2988ed17cc4c50
Author: Phil <[email protected]>
AuthorDate: Mon Oct 24 08:28:54 2022 -0500
[SPARK-40739][SPARK-40738] Fixes for cygwin/msys2/mingw sbt build and bash scripts
This fixes two problems that affect development in a Windows shell
environment, such as `cygwin` or `msys2`.
### The fixed build error
Running `./build/sbt packageBin` from a Windows cygwin `bash` session fails. This occurs if `WSL` is installed, because `project\SparkBuild.scala` creates a `bash` process, but `WSL bash` is called even though `cygwin bash` appears earlier in the `PATH`. In addition, file path arguments to bash contain backslashes. The fix is to ensure that the correct `bash` is called, and that arguments passed to `bash` use forward slashes rather than backslashes.
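As background for the error below, here is a minimal standalone sketch (not part of the patch; it assumes a JDK and some `bash` on the `PATH`) showing how bash eats the backslashes in an unquoted Windows path:
```scala
import scala.sys.process._

// bash treats '\' as an escape character, so an unquoted Windows path like
// C:\Users\... reaches the child process as C:Users... with backslashes gone.
object BackslashDemo extends App {
  val winPath = """C:\Users\philwalk\workspace\spark"""
  println(Seq("bash", "-c", s"echo $winPath").!!.trim) // C:Usersphilwalkworkspacespark
  println(winPath.replace('\\', '/'))                  // the fix: forward slashes
}
```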
### The build error message:
```bash
./build/sbt packageBin
```
<pre>
[info] compiling 9 Java sources to C:\Users\philwalk\workspace\spark\common\sketch\target\scala-2.12\classes ...
/bin/bash: C:Usersphilwalkworkspacesparkcore/../build/spark-build-info: No such file or directory
[info] compiling 1 Scala source to C:\Users\philwalk\workspace\spark\tools\target\scala-2.12\classes ...
[info] compiling 5 Scala sources to C:\Users\philwalk\workspace\spark\mllib-local\target\scala-2.12\classes ...
[info] Compiling 5 protobuf files to C:\Users\philwalk\workspace\spark\connector\connect\target\scala-2.12\src_managed\main
[error] stack trace is suppressed; run last core / Compile / managedResources for the full output
[error] (core / Compile / managedResources) Nonzero exit value: 127
[error] Total time: 42 s, completed Oct 8, 2022, 4:49:12 PM
sbt:spark-parent>
sbt:spark-parent> last core /Compile /managedResources
last core /Compile /managedResources
[error] java.lang.RuntimeException: Nonzero exit value: 127
[error]     at scala.sys.package$.error(package.scala:30)
[error]     at scala.sys.process.ProcessBuilderImpl$AbstractBuilder.slurp(ProcessBuilderImpl.scala:138)
[error]     at scala.sys.process.ProcessBuilderImpl$AbstractBuilder.$bang$bang(ProcessBuilderImpl.scala:108)
[error]     at Core$.$anonfun$settings$4(SparkBuild.scala:604)
[error]     at scala.Function1.$anonfun$compose$1(Function1.scala:49)
[error]     at sbt.internal.util.$tilde$greater.$anonfun$$u2219$1(TypeFunctions.scala:62)
[error]     at sbt.std.Transform$$anon$4.work(Transform.scala:68)
[error]     at sbt.Execute.$anonfun$submit$2(Execute.scala:282)
[error]     at sbt.internal.util.ErrorHandling$.wideConvert(ErrorHandling.scala:23)
[error]     at sbt.Execute.work(Execute.scala:291)
[error]     at sbt.Execute.$anonfun$submit$1(Execute.scala:282)
[error]     at sbt.ConcurrentRestrictions$$anon$4.$anonfun$submitValid$1(ConcurrentRestrictions.scala:265)
[error]     at sbt.CompletionService$$anon$2.call(CompletionService.scala:64)
[error]     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[error]     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
[error]     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[error]     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
[error]     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
[error]     at java.base/java.lang.Thread.run(Thread.java:834)
[error] (core / Compile / managedResources) Nonzero exit value: 127
</pre>
(Exit value 127 is the shell's "command not found" status: `bash` received the mangled, backslash-stripped path and could not locate the `spark-build-info` script.)
### bash scripts fail when run from `cygwin` or `msys2`
The other problem fixed by this PR prevents the `bash` scripts (`spark-shell`, `spark-submit`, etc.) from being used in Windows `SHELL` environments. The bash version of `spark-class` fails in a Windows shell environment because `launcher/src/main/java/org/apache/spark/launcher/Main.java` does not follow the output convention expected by `spark-class` and also appends CR to line endings. The resulting error message is not helpful.
There are two parts to this fix (the CR handling is sketched after this list):
1. modify `Main.java` to treat a `SHELL` session on Windows as a `bash` session
2. remove the appended CR character when parsing the output produced by `Main.java`
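A minimal illustrative sketch of the second part, not the actual launcher code: each command string is printed NUL-terminated with any trailing CR stripped, mirroring the `replaceFirst("\r$", "")` change to `Main.java` in the diff below (the example strings are made up):
```scala
// Stand-in for the launcher's output loop. On Windows, strings can carry a
// trailing "\r" that corrupts the NUL-separated protocol spark-class reads,
// so a single trailing CR is stripped from each command string.
object CrStripDemo extends App {
  val bashCmd = Seq("java\r", "-cp", "C:/spark/jars/*\r", "org.apache.spark.deploy.SparkSubmit")
  for (c <- bashCmd) {
    print(c.replaceFirst("\r$", ""))
    print('\u0000') // NUL separator, as in Main.java
  }
}
```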
### Does this PR introduce _any_ user-facing change?
These changes should NOT affect anyone who is not trying to build or run bash scripts from a Windows SHELL environment.
### How was this patch tested?
Manual tests were performed to verify both changes.
### Related JIRA issues
The following two JIRA issues were created. Both are fixed by this PR and both are linked to it.
- Bug SPARK-40739 "sbt packageBin" fails in cygwin or other windows bash session
- Bug SPARK-40738 spark-shell fails with "bad array"
Closes #38228 from philwalk/windows-shell-env-fixes.
Authored-by: Phil <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
---
bin/spark-class | 3 ++-
bin/spark-class2.cmd | 2 ++
build/spark-build-info | 2 +-
launcher/src/main/java/org/apache/spark/launcher/Main.java | 6 ++++--
project/SparkBuild.scala | 9 ++++++++-
5 files changed, 17 insertions(+), 5 deletions(-)
diff --git a/bin/spark-class b/bin/spark-class
index c1461a77122..fc343ca29fd 100755
--- a/bin/spark-class
+++ b/bin/spark-class
@@ -77,7 +77,8 @@ set +o posix
CMD=()
DELIM=$'\n'
CMD_START_FLAG="false"
-while IFS= read -d "$DELIM" -r ARG; do
+while IFS= read -d "$DELIM" -r _ARG; do
+ ARG=${_ARG//$'\r'}
if [ "$CMD_START_FLAG" == "true" ]; then
CMD+=("$ARG")
else
diff --git a/bin/spark-class2.cmd b/bin/spark-class2.cmd
index 68b271d1d05..800ec0c02c2 100755
--- a/bin/spark-class2.cmd
+++ b/bin/spark-class2.cmd
@@ -69,6 +69,8 @@ rem SPARK-28302: %RANDOM% would return the same number if we call it instantly a
rem so we should make it sure to generate unique file to avoid process collision of writing into
rem the same file concurrently.
if exist %LAUNCHER_OUTPUT% goto :gen
+rem unset SHELL to indicate non-bash environment to launcher/Main
+set SHELL=
"%RUNNER%" -Xmx128m -cp "%LAUNCH_CLASSPATH%" org.apache.spark.launcher.Main %*
> %LAUNCHER_OUTPUT%
for /f "tokens=*" %%i in (%LAUNCHER_OUTPUT%) do (
set SPARK_CMD=%%i
diff --git a/build/spark-build-info b/build/spark-build-info
index eb0e3d730e2..26157e8cf8c 100755
--- a/build/spark-build-info
+++ b/build/spark-build-info
@@ -24,7 +24,7 @@
RESOURCE_DIR="$1"
mkdir -p "$RESOURCE_DIR"
-SPARK_BUILD_INFO="${RESOURCE_DIR}"/spark-version-info.properties
+SPARK_BUILD_INFO="${RESOURCE_DIR%/}"/spark-version-info.properties
echo_build_properties() {
echo version=$1
diff --git a/launcher/src/main/java/org/apache/spark/launcher/Main.java b/launcher/src/main/java/org/apache/spark/launcher/Main.java
index e1054c7060f..6501fc1764c 100644
--- a/launcher/src/main/java/org/apache/spark/launcher/Main.java
+++ b/launcher/src/main/java/org/apache/spark/launcher/Main.java
@@ -87,7 +87,9 @@ class Main {
cmd = buildCommand(builder, env, printLaunchCommand);
}
- if (isWindows()) {
+ // test for shell environments, to enable non-Windows treatment of command line prep
+ boolean shellflag = !isEmpty(System.getenv("SHELL"));
+ if (isWindows() && !shellflag) {
System.out.println(prepareWindowsCommand(cmd, env));
} else {
// A sequence of NULL character and newline separates command-strings and others.
@@ -96,7 +98,7 @@ class Main {
// In bash, use NULL as the arg separator since it cannot be used in an
argument.
List<String> bashCmd = prepareBashCommand(cmd, env);
for (String c : bashCmd) {
- System.out.print(c);
+ System.out.print(c.replaceFirst("\r$",""));
System.out.print('\0');
}
}
diff --git a/project/SparkBuild.scala b/project/SparkBuild.scala
index cc103e4ab00..33883a2efaa 100644
--- a/project/SparkBuild.scala
+++ b/project/SparkBuild.scala
@@ -599,11 +599,18 @@ object SparkParallelTestGrouping {
object Core {
import scala.sys.process.Process
+ def buildenv = Process(Seq("uname")).!!.trim.replaceFirst("[^A-Za-z0-9].*", "").toLowerCase
+ def bashpath = Process(Seq("where", "bash")).!!.split("[\r\n]+").head.replace('\\', '/')
lazy val settings = Seq(
(Compile / resourceGenerators) += Def.task {
val buildScript = baseDirectory.value + "/../build/spark-build-info"
val targetDir = baseDirectory.value + "/target/extra-resources/"
- val command = Seq("bash", buildScript, targetDir, version.value)
+ // support Windows build under cygwin/mingw64, etc
+ val bash = buildenv match {
+ case "cygwin" | "msys2" | "mingw64" | "clang64" => bashpath
+ case _ => "bash"
+ }
+ val command = Seq(bash, buildScript, targetDir, version.value)
Process(command).!!
val propsFile = baseDirectory.value / "target" / "extra-resources" / "spark-version-info.properties"
Seq(propsFile)
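For reference, the two helpers added above can be exercised standalone; a quick check, assuming a Windows machine with cygwin or msys2 installed (`where` is a Windows command, and the printed paths are illustrative):
```scala
import scala.sys.process.Process

// Same helpers as in project/SparkBuild.scala: buildenv normalizes `uname`
// output (e.g. "CYGWIN_NT-10.0" -> "cygwin"); bashpath picks the first bash
// that `where` reports and converts backslashes to forward slashes.
object BuildEnvCheck extends App {
  def buildenv = Process(Seq("uname")).!!.trim.replaceFirst("[^A-Za-z0-9].*", "").toLowerCase
  def bashpath = Process(Seq("where", "bash")).!!.split("[\r\n]+").head.replace('\\', '/')
  println(buildenv) // e.g. "cygwin"
  println(bashpath) // e.g. "C:/cygwin64/bin/bash.exe"
}
```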