HyukjinKwon commented on a change in pull request #25647: [SPARK-28946][R][DOCS] Add some more information about building SparkR on Windows URL: https://github.com/apache/spark/pull/25647#discussion_r319803205
########## File path: R/WINDOWS.md ##########
@@ -20,25 +20,28 @@ license: |
 To build SparkR on Windows, the following steps are required
 
-1. Install R (>= 3.1) and [Rtools](https://cloud.r-project.org/bin/windows/Rtools/). Make sure to
+1. Make sure `bash` is available and in `PATH` if you already have a built-in `bash` on Windows. If you do not have, install [Cygwin](https://www.cygwin.com/).
+
+2. Install R (>= 3.1) and [Rtools](https://cloud.r-project.org/bin/windows/Rtools/). Make sure to
 include Rtools and R in `PATH`. Note that support for R prior to version 3.4 is deprecated as of Spark 3.0.0.
 
-2. Install
-[JDK8](https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html) and set
+3. Install [JDK](https://www.oracle.com/technetwork/java/javase/downloads) that SparkR supports - see `R/pkg/DESCRIPTION`, and set
 `JAVA_HOME` in the system environment variables.
 
-3. Download and install [Maven](https://maven.apache.org/download.html). Also include the `bin`
+4. Download and install [Maven](https://maven.apache.org/download.html). Also include the `bin`
 directory in Maven in `PATH`.
 
-4. Set `MAVEN_OPTS` as described in [Building Spark](https://spark.apache.org/docs/latest/building-spark.html).
+5. Set `MAVEN_OPTS` as described in [Building Spark](https://spark.apache.org/docs/latest/building-spark.html).
 
-5. Open a command shell (`cmd`) in the Spark directory and build Spark with [Maven](https://spark.apache.org/docs/latest/building-spark.html#buildmvn) and include the `-Psparkr` profile to build the R package. For example to use the default Hadoop versions you can run
+6. Open a command shell (`cmd`) in the Spark directory and build Spark with [Maven](https://spark.apache.org/docs/latest/building-spark.html#buildmvn) and include the `-Psparkr` profile to build the R package. For example to use the default Hadoop versions you can run
 
     ```bash
     mvn.cmd -DskipTests -Psparkr package
     ```
 
-    `.\build\mvn` is a shell script so `mvn.cmd` should be used directly on Windows.
+    Note that `.\build\mvn` is a shell script so `mvn.cmd` on the system should be used directly on Windows.
+
+Note that it is a workaround for SparkR developers on Windows. Apache Spark does not officially support to _build_ on Windows yet whereas it supports to _run_ on Windows.

Review comment:
   BTW, officially supporting a build on Windows would require considerable changes to convert some scripts from bash to cmd, and that would make our lives more difficult. We don't use Windows for releases anyway, so I think it's fine. As long as we are able to test SparkR on Windows via AppVeyor (which CRAN requires), I guess we're good.
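For readers skimming this thread, the numbered steps in the quoted hunk above can be condensed into a single `cmd` session. The sketch below is illustrative only: the JDK, R, Rtools, Maven, and Spark source paths are placeholders that will differ per machine, and the `MAVEN_OPTS` value is only the kind of memory setting the Building Spark guide suggests, not something fixed by this PR.

```cmd
REM Illustrative sketch only -- every path below is a placeholder; adjust to
REM where the JDK, R, Rtools, Maven, and the Spark sources live on your machine.
set "JAVA_HOME=C:\Program Files\Java\jdk1.8.0_231"
set "PATH=%JAVA_HOME%\bin;C:\Rtools\bin;C:\Program Files\R\R-3.6.1\bin;C:\apache-maven-3.6.1\bin;%PATH%"

REM Give Maven enough memory, along the lines described in Building Spark.
set "MAVEN_OPTS=-Xmx2g -XX:ReservedCodeCacheSize=1g"

REM Build Spark with the SparkR profile from the Spark source directory,
REM calling mvn.cmd directly because .\build\mvn is a bash script.
cd C:\spark
mvn.cmd -DskipTests -Psparkr package
```

Note that `mvn.cmd` resolves here only because the Maven `bin` directory was added to `PATH`, which is what step 4 in the hunk above asks for.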
