[
https://issues.apache.org/jira/browse/SPARK-40738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Apache Spark reassigned SPARK-40738:
------------------------------------
Assignee: Apache Spark
> spark-shell fails with "bad array subscript" in cygwin or msys bash session
> ---------------------------------------------------------------------------
>
> Key: SPARK-40738
> URL: https://issues.apache.org/jira/browse/SPARK-40738
> Project: Spark
> Issue Type: Bug
> Components: Spark Shell, Windows
> Affects Versions: 3.3.0
> Environment: The problem occurs in Windows if *_spark-shell_* is
> called from a bash session.
> NOTE: the fix also applies to _*spark-submit*_ and _*beeline*_, since
> they also run through spark-class.
> Reporter: Phil Walker
> Assignee: Apache Spark
> Priority: Major
> Labels: bash, cygwin, mingw, msys2, windows
> Original Estimate: 0h
> Remaining Estimate: 0h
>
> A pull request [spark PR|https://github.com/apache/spark/pull/38167]
> fixes this issue, along with a related build error that occurs in
> _*cygwin*_ and *msys/mingw* bash *sbt* sessions.
> If a Windows user tries to start a *_spark-shell_* session by calling the
> bash script (rather than the *_spark-shell.cmd_* script), it fails with a
> confusing error message. The _*spark-class*_ script calls
> _*launcher/src/main/java/org/apache/spark/launcher/Main.java*_ to generate
> command line arguments, but the launcher produces output in the format
> expected by the *_.cmd_* version of the script rather than the _*bash*_
> version.
> The launcher Main method, when called on operating systems other than
> Windows, interleaves NULL characters between the command line arguments.
> It should also do so on Windows when called from the bash script, but it
> incorrectly assumes that if the OS is Windows, it is being called by the
> .cmd version of the script.
> The resulting error message is unhelpful:
>
> {code:java}
> [lots of ugly stuff omitted]
> /opt/spark/bin/spark-class: line 100: CMD: bad array subscript
> {code}
> The way _*launcher/Main*_ can tell that a request comes from a _*bash*_
> session is that the _*SHELL*_ environment variable is set. It is
> normally set in any of the various bash environments on Windows
> (_*cygwin*_, _*mingw64*_, _*msys2*_, etc.) and is not normally
> set in native Windows (cmd) sessions. In the _*spark-class.cmd*_ script,
> _*SHELL*_ is intentionally unset to avoid problems, and to permit bash users
> to call the _*.cmd*_ scripts if they prefer (they still work as before).
>
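The decision described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from the pull request: the class name LauncherOutputSketch and the helpers calledFromBash, useNulSeparatedOutput and nulSeparated are hypothetical names introduced here, and the sketch assumes that a non-empty SHELL variable is the only signal used to detect a bash caller.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from the Spark source.
final class LauncherOutputSketch {

    // Assumption: a non-empty SHELL variable means the caller is a bash session
    // (cygwin, mingw64, msys2, ...). spark-class.cmd is described as unsetting it.
    static boolean calledFromBash(Map<String, String> env) {
        String shell = env.get("SHELL");
        return shell != null && !shell.isEmpty();
    }

    // Use the NUL-separated format on non-Windows systems, and also on Windows
    // when the request appears to come from the bash script.
    static boolean useNulSeparatedOutput(boolean isWindows, Map<String, String> env) {
        return !isWindows || calledFromBash(env);
    }

    // Interleave NUL characters between the command line arguments.
    static String nulSeparated(List<String> args) {
        return String.join("\0", args);
    }

    public static void main(String[] args) {
        boolean isWindows = System.getProperty("os.name").startsWith("Windows");
        boolean nul = useNulSeparatedOutput(isWindows, System.getenv());
        System.out.println("NUL-separated output expected by caller: " + nul);
    }
}
```

With this rule, a Windows process started from cygwin or msys2 bash (SHELL set) would get the same NUL-separated output as on Linux or macOS, while a process started through spark-class.cmd (SHELL unset) would keep the existing .cmd format.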
--
This message was sent by Atlassian Jira
(v8.20.10#820010)