GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/14859
[SPARK-17200][PROJECT INFRA][BUILD][SparkR] Automate building and testing
on Windows (currently SparkR only)
## What changes were proposed in this pull request?
This PR adds build automation on Windows with the
[AppVeyor](https://www.appveyor.com/) CI tool.
Currently, this only runs the SparkR tests, as we have had trouble testing
and verifying Windows-specific PRs (e.g.
https://github.com/apache/spark/pull/14743 and
https://github.com/apache/spark/pull/13165).
One concern is that this build depends on
[steveloughran/winutils](https://github.com/steveloughran/winutils)
(maintained by Steve Loughran, a Hadoop PMC member) for the pre-built Hadoop
`bin` package.
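For context, here is a minimal, illustrative R sketch (not code from this PR)
of the environment the Windows build relies on: a Hadoop `bin` package
containing `winutils.exe`, pointed to by `HADOOP_HOME`, before SparkR is
loaded from the built distribution. The paths below are assumptions taken
from the test traces shown later.
```r
# Illustrative sketch only: check the winutils-based Hadoop layout the
# Windows CI build relies on before starting SparkR. Paths are assumptions.
hadoop_home <- Sys.getenv("HADOOP_HOME", unset = "C:/hadoop")
winutils <- file.path(hadoop_home, "bin", "winutils.exe")

if (!file.exists(winutils)) {
  stop("winutils.exe not found under ", hadoop_home, "/bin; download a ",
       "pre-built Hadoop bin package (e.g. from ",
       "https://github.com/steveloughran/winutils) and set HADOOP_HOME")
}

# With the environment in place, load SparkR from the built distribution
# (the lib path matches the one in the test traces) and start a local
# session, as a CI run would before executing the tests.
library(SparkR, lib.loc = "C:/projects/spark/R/lib")
sparkR.session(master = "local[2]", appName = "windows-ci-smoke-check")
```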
## How was this patch tested?
Manually, via
https://ci.appveyor.com/project/HyukjinKwon/spark/build/8-SPARK-17200-build
Some tests are already failing; this was found in
https://github.com/apache/spark/pull/14743#issuecomment-241405287. The
current failures are as below:
```
Skipped
------------------------------------------------------------------------
1. create DataFrame from RDD (@test_sparkSQL.R#200) - Hive is not build
with SparkSQL, skipped
2. test HiveContext (@test_sparkSQL.R#1041) - Hive is not build with
SparkSQL, skipped
3. read/write ORC files (@test_sparkSQL.R#1748) - Hive is not build with
SparkSQL, skipped
4. enableHiveSupport on SparkSession (@test_sparkSQL.R#2480) - Hive is not
build with SparkSQL, skipped
Warnings
-----------------------------------------------------------------------
1. infer types and check types (@test_sparkSQL.R#109) - unable to identify
current timezone 'C':
please set environment variable 'TZ'
Failed
-------------------------------------------------------------------------
1. Error: union on two RDDs (@test_binary_function.R#38)
-----------------------
1: textFile(sc, fileName) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_binary_function.R:38
2: callJMethod(sc, "textFile", path, getMinPartitions(sc, minPartitions))
3: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
4: stop(readString(conn))
2. Error: zipPartitions() on RDDs (@test_binary_function.R#84)
-----------------
1: textFile(sc, fileName, 1) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_binary_function.R:84
2: callJMethod(sc, "textFile", path, getMinPartitions(sc, minPartitions))
3: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
4: stop(readString(conn))
3. Error: saveAsObjectFile()/objectFile() following textFile() works
(@test_binaryFile.R#31)
1: textFile(sc, fileName1, 1) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_binaryFile.R:31
2: callJMethod(sc, "textFile", path, getMinPartitions(sc, minPartitions))
3: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
4: stop(readString(conn))
4. Error: saveAsObjectFile()/objectFile() works on a parallelized list
(@test_binaryFile.R#46)
1: objectFile(sc, fileName) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_binaryFile.R:46
2: callJMethod(sc, "objectFile", path, getMinPartitions(sc, minPartitions))
3: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
4: stop(readString(conn))
5. Error: saveAsObjectFile()/objectFile() following RDD transformations
works (@test_binaryFile.R#57)
1: textFile(sc, fileName1) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_binaryFile.R:57
2: callJMethod(sc, "textFile", path, getMinPartitions(sc, minPartitions))
3: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
4: stop(readString(conn))
6. Error: saveAsObjectFile()/objectFile() works with multiple paths
(@test_binaryFile.R#85)
1: objectFile(sc, c(fileName1, fileName2)) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_binaryFile.R:85
2: callJMethod(sc, "objectFile", path, getMinPartitions(sc, minPartitions))
3: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
4: stop(readString(conn))
7. Error: spark.glm save/load (@test_mllib.R#162)
------------------------------
1: read.ml(modelPath) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:162
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))
8. Error: glm save/load (@test_mllib.R#292)
------------------------------------
1: read.ml(modelPath) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:292
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))
9. Error: spark.kmeans (@test_mllib.R#340)
-------------------------------------
1: read.ml(modelPath) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:340
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))
10. Error: spark.mlp (@test_mllib.R#371)
---------------------------------------
1: read.ml(modelPath) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:371
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))
11. Error: spark.naiveBayes (@test_mllib.R#439)
--------------------------------
1: read.ml(modelPath) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:439
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))
12. Error: spark.survreg (@test_mllib.R#496)
-----------------------------------
1: read.ml(modelPath) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:496
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))
13. Error: spark.isotonicRegression (@test_mllib.R#541)
------------------------
1: read.ml(modelPath) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:541
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))
14. Error: spark.gaussianMixture (@test_mllib.R#603)
---------------------------
1: read.ml(modelPath) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:603
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))
15. Error: spark.lda with libsvm (@test_mllib.R#636)
---------------------------
1: read.ml(modelPath) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:636
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))
DONE
==========================================================================
```
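Every failure above has the same shape: an R API call (`textFile`,
`objectFile`, or `read.ml`) crosses into the JVM through
`callJMethod`/`callJStatic` and `invokeJava`, and the JVM-side exception is
re-thrown in R by `stop(readString(conn))`. Below is a hedged sketch for
reproducing the first pattern interactively on Windows; the `SQLUtils`
accessor, the internal `textFile`/`countRDD` helpers reached via `:::`, and
the temp-file setup are assumptions about this Spark version's internals,
not code taken from the PR.
```r
# Minimal interactive reproduction sketch for failures 1-6 above, assuming a
# built SparkR package under C:/projects/spark/R/lib (the path in the traces).
library(SparkR, lib.loc = "C:/projects/spark/R/lib")
sparkSession <- sparkR.session(master = "local[2]", enableHiveSupport = FALSE)

# The binary-file tests call unexported RDD helpers, hence ":::". Getting the
# JavaSparkContext through SQLUtils mirrors what the test suites of this era
# do; treat this accessor as an assumption for your Spark version.
sc <- SparkR:::callJStatic("org.apache.spark.sql.api.r.SQLUtils",
                           "getJavaSparkContext", sparkSession)

fileName <- tempfile(pattern = "spark-test", fileext = ".txt")
writeLines(c("spark", "sparkr", "windows"), fileName)

# Same chain as in the traces: textFile() -> callJMethod(sc, "textFile", ...)
# -> invokeJava() -> stop(readString(conn)) if the JVM throws (e.g. when
# winutils.exe / HADOOP_HOME is missing on Windows).
rdd <- SparkR:::textFile(sc, fileName)
SparkR:::countRDD(rdd)  # forces evaluation so any JVM-side error surfaces here
```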
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/HyukjinKwon/spark SPARK-17200-build
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/14859.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #14859
----
commit cf9f56c8571f947b5e89e741a2b5f2f5477843b2
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T07:24:18Z
Appveyor SparkR Windows test draft
commit b25eabaa88c4ada63e2d5a1bb802a48662f60d04
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T07:26:18Z
Fix script path for R installation
commit 2772002e832570ccb7ac2120a2775a2daf724832
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T07:27:49Z
Fix the name of script for R installation
commit 6cb4416a5afe66c624602b6ff36c5d23dc32b62f
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T07:31:10Z
Upgrade maven version to 3.3.9
commit a2852a0f1b40d715eaf00947acdb076727e2d42e
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T08:08:11Z
Clean up and fix the path for Hadoop bin package
commit 5aca1045ccfd7eed35bd8fa7721867c363bd88a0
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T08:18:55Z
Merged dependecies installation
commit fbcfe135db8e6515f3dadccce7c2d72f6dec9b91
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T08:19:46Z
Remove R installation script
commit f3eb1636d1d9ec6568e8b048d41b5928698d244c
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T08:26:56Z
Clean up the dependencies installation script
commit fe95491bf7ef28f5ee0d7edb8ec5b14529815bcb
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T08:41:34Z
Fix comment
commit a8e74fc531bb83ceb14d930ca6f03c799fbde384
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T08:43:01Z
Uppercase for Maven in the comment
----