GitHub user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/13217#discussion_r64032005
--- Diff: R/WINDOWS.md ---
@@ -11,3 +11,19 @@ include Rtools and R in `PATH`.
directory in Maven in `PATH`.
4. Set `MAVEN_OPTS` as described in [Building
Spark](http://spark.apache.org/docs/latest/building-spark.html).
5. Open a command shell (`cmd`) in the Spark directory and run `mvn
-DskipTests -Psparkr package`
+
+## Unit tests
+
+To run existing unit tests in SparkR on Windows, the following setps are
required (the steps below suppose you are in Spark root directory)
+
+1. Set `HADOOP_HOME`.
+2. Download `winutils.exe` and locate this in `$HADOOP_HOME/bin`.
+
+ It seems not requiring installing Hadoop but only this `winutils.exe`.
It seems not included in Hadoop official binary releases so it should be built
from source but it seems it is able to be downloaded from community (e.g.
[steveloughran/winutils](https://github.com/steveloughran/winutils)).
--- End diff --
I wouldn't recommend putting it under the root of the project, as that only
complicates the source tree and path cleanup; an adjacent directory works. I
also think you may find that `HADOOP.DLL` is needed in places, as there are
some JNI calls related to local file access and permissions/ACLs.

I'd suggest the following text:
----
To run the SparkR unit tests on Windows, the following steps are required,
assuming you are in the Spark root directory and do not have Apache Hadoop
installed already:
1. `cd ..`
1. `mkdir hadoop`
1. Download the relevant Hadoop bin package from
[steveloughran/winutils](https://github.com/steveloughran/winutils). While
these are not official ASF artifacts, they are built from the ASF release git
hashes by a Hadoop PMC member on a dedicated Windows VM.
1. Install the files into `hadoop\bin`; make sure that `winutils.exe` and
`hadoop.dll` are present.
1. Set the environment variable `HADOOP_HOME` to the full path to the newly
created `hadoop` directory.
----
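If it helps, the layout described in the steps above could be sanity-checked before running the tests. The following is only a sketch under stated assumptions: the helper names `missing_hadoop_files` and `check_environment` are hypothetical (they are not part of Spark or Hadoop), and the only things checked are the ones named in the steps, namely that `HADOOP_HOME` is set and that its `bin` directory contains `winutils.exe` and `hadoop.dll`.

```python
import os
from pathlib import Path

# Files the suggested text says must be present under %HADOOP_HOME%\bin.
REQUIRED_FILES = ("winutils.exe", "hadoop.dll")


def missing_hadoop_files(hadoop_home):
    """Return the required files that are absent from <hadoop_home>/bin.

    Hypothetical helper for illustration; not part of Spark or Hadoop.
    """
    bin_dir = Path(hadoop_home) / "bin"
    return [name for name in REQUIRED_FILES if not (bin_dir / name).is_file()]


def check_environment(environ=None):
    """Return a list of human-readable problems; an empty list means OK.

    `environ` defaults to the process environment and can be replaced
    with a plain dict for testing.
    """
    if environ is None:
        environ = os.environ
    hadoop_home = environ.get("HADOOP_HOME")
    if not hadoop_home:
        return ["HADOOP_HOME is not set"]
    return [
        f"{name} not found in {Path(hadoop_home) / 'bin'}"
        for name in missing_hadoop_files(hadoop_home)
    ]


if __name__ == "__main__":
    problems = check_environment()
    for problem in problems:
        print(problem)
    if not problems:
        print("HADOOP_HOME layout looks complete")
```

Running the script from any directory prints one line per problem found. It does not verify that the binaries match the Hadoop version Spark was built against, so that would still need checking by hand.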