GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/14859
[SPARK-17200][PROJECT INFRA][BUILD][SparkR] Automate building and testing
on Windows (currently SparkR only)
## What changes were proposed in this pull request?
This PR adds build automation on Windows with the
[AppVeyor](https://www.appveyor.com/) CI tool.
Currently, this only runs the SparkR tests, as we have had trouble testing
and verifying Windows-specific PRs (e.g.
https://github.com/apache/spark/pull/14743 and
https://github.com/apache/spark/pull/13165).
One concern is that this build depends on
[steveloughran/winutils](https://github.com/steveloughran/winutils)
(maintained by Steve Loughran, a Hadoop PMC member) for the pre-built Hadoop
`bin` package.
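For context, here is a minimal, illustrative R sketch (not code from this PR)
of the environment the Windows build relies on: a Hadoop `bin` package
containing `winutils.exe`, pointed to by `HADOOP_HOME`, before SparkR is
loaded from the built distribution. The paths below are assumptions taken
from the test traces shown later.
```r
# Illustrative sketch only: check the winutils-based Hadoop layout the
# Windows CI build relies on before starting SparkR. Paths are assumptions.
hadoop_home <- Sys.getenv("HADOOP_HOME", unset = "C:/hadoop")
winutils <- file.path(hadoop_home, "bin", "winutils.exe")

if (!file.exists(winutils)) {
  stop("winutils.exe not found under ", hadoop_home, "/bin; download a ",
       "pre-built Hadoop bin package (e.g. from ",
       "https://github.com/steveloughran/winutils) and set HADOOP_HOME")
}

# With the environment in place, load SparkR from the built distribution
# (the lib path matches the one in the test traces) and start a local
# session, as a CI run would before executing the tests.
library(SparkR, lib.loc = "C:/projects/spark/R/lib")
sparkR.session(master = "local[2]", appName = "windows-ci-smoke-check")
```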
## How was this patch tested?
Manually, via
https://ci.appveyor.com/project/HyukjinKwon/spark/build/8-SPARK-17200-build
Some tests are already failing; this was found in
https://github.com/apache/spark/pull/14743#issuecomment-241405287. The
current failures are as below:
```
Skipped
------------------------------------------------------------------------
1. create DataFrame from RDD (@test_sparkSQL.R#200) - Hive is not build
with SparkSQL, skipped
2. test HiveContext (@test_sparkSQL.R#1041) - Hive is not build with
SparkSQL, skipped
3. read/write ORC files (@test_sparkSQL.R#1748) - Hive is not build with
SparkSQL, skipped
4. enableHiveSupport on SparkSession (@test_sparkSQL.R#2480) - Hive is not
build with SparkSQL, skipped
Warnings
-----------------------------------------------------------------------
1. infer types and check types (@test_sparkSQL.R#109) - unable to identify
current timezone 'C':
please set environment variable 'TZ'
Failed
-------------------------------------------------------------------------
1. Error: union on two RDDs (@test_binary_function.R#38)
-----------------------
1: textFile(sc, fileName) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_binary_function.R:38
2: callJMethod(sc, "textFile", path, getMinPartitions(sc, minPartitions))
3: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
4: stop(readString(conn))
2. Error: zipPartitions() on RDDs (@test_binary_function.R#84)
-----------------
1: textFile(sc, fileName, 1) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_binary_function.R:84
2: callJMethod(sc, "textFile", path, getMinPartitions(sc, minPartitions))
3: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
4: stop(readString(conn))
3. Error: saveAsObjectFile()/objectFile() following textFile() works
(@test_binaryFile.R#31)
1: textFile(sc, fileName1, 1) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_binaryFile.R:31
2: callJMethod(sc, "textFile", path, getMinPartitions(sc, minPartitions))
3: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
4: stop(readString(conn))
4. Error: saveAsObjectFile()/objectFile() works on a parallelized list
(@test_binaryFile.R#46)
1: objectFile(sc, fileName) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_binaryFile.R:46
2: callJMethod(sc, "objectFile", path, getMinPartitions(sc, minPartitions))
3: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
4: stop(readString(conn))
5. Error: saveAsObjectFile()/objectFile() following RDD transformations
works (@test_binaryFile.R#57)
1: textFile(sc, fileName1) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_binaryFile.R:57
2: callJMethod(sc, "textFile", path, getMinPartitions(sc, minPartitions))
3: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
4: stop(readString(conn))
6. Error: saveAsObjectFile()/objectFile() works with multiple paths
(@test_binaryFile.R#85)
1: objectFile(sc, c(fileName1, fileName2)) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_binaryFile.R:85
2: callJMethod(sc, "objectFile", path, getMinPartitions(sc, minPartitions))
3: invokeJava(isStatic = FALSE, objId$id, methodName, ...)
4: stop(readString(conn))
7. Error: spark.glm save/load (@test_mllib.R#162)
------------------------------
1: read.ml(modelPath) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:162
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))
8. Error: glm save/load (@test_mllib.R#292)
------------------------------------
1: read.ml(modelPath) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:292
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))
9. Error: spark.kmeans (@test_mllib.R#340)
-------------------------------------
1: read.ml(modelPath) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:340
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))
10. Error: spark.mlp (@test_mllib.R#371)
---------------------------------------
1: read.ml(modelPath) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:371
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))
11. Error: spark.naiveBayes (@test_mllib.R#439)
--------------------------------
1: read.ml(modelPath) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:439
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))
12. Error: spark.survreg (@test_mllib.R#496)
-----------------------------------
1: read.ml(modelPath) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:496
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))
13. Error: spark.isotonicRegression (@test_mllib.R#541)
------------------------
1: read.ml(modelPath) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:541
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))
14. Error: spark.gaussianMixture (@test_mllib.R#603)
---------------------------
1: read.ml(modelPath) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:603
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))
15. Error: spark.lda with libsvm (@test_mllib.R#636)
---------------------------
1: read.ml(modelPath) at
C:/projects/spark/R/lib/SparkR/tests/testthat/test_mllib.R:636
2: callJStatic("org.apache.spark.ml.r.RWrappers", "load", path)
3: invokeJava(isStatic = TRUE, className, methodName, ...)
4: stop(readString(conn))
DONE
==========================================================================
```
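Every failure above has the same shape: an R API call (`textFile`,
`objectFile`, or `read.ml`) crosses into the JVM through
`callJMethod`/`callJStatic` and `invokeJava`, and the JVM-side exception is
re-thrown in R by `stop(readString(conn))`. Below is a hedged sketch for
reproducing the first pattern interactively on Windows; the `SQLUtils`
accessor, the internal `textFile`/`countRDD` helpers reached via `:::`, and
the temp-file setup are assumptions about this Spark version's internals,
not code taken from the PR.
```r
# Minimal interactive reproduction sketch for failures 1-6 above, assuming a
# built SparkR package under C:/projects/spark/R/lib (the path in the traces).
library(SparkR, lib.loc = "C:/projects/spark/R/lib")
sparkSession <- sparkR.session(master = "local[2]", enableHiveSupport = FALSE)

# The binary-file tests call unexported RDD helpers, hence ":::". Getting the
# JavaSparkContext through SQLUtils mirrors what the test suites of this era
# do; treat this accessor as an assumption for your Spark version.
sc <- SparkR:::callJStatic("org.apache.spark.sql.api.r.SQLUtils",
                           "getJavaSparkContext", sparkSession)

fileName <- tempfile(pattern = "spark-test", fileext = ".txt")
writeLines(c("spark", "sparkr", "windows"), fileName)

# Same chain as in the traces: textFile() -> callJMethod(sc, "textFile", ...)
# -> invokeJava() -> stop(readString(conn)) if the JVM throws (e.g. when
# winutils.exe / HADOOP_HOME is missing on Windows).
rdd <- SparkR:::textFile(sc, fileName)
SparkR:::countRDD(rdd)  # forces evaluation so any JVM-side error surfaces here
```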
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/HyukjinKwon/spark SPARK-17200-build
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/14859.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #14859
----
commit cf9f56c8571f947b5e89e741a2b5f2f5477843b2
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T07:24:18Z
Appveyor SparkR Windows test draft
commit b25eabaa88c4ada63e2d5a1bb802a48662f60d04
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T07:26:18Z
Fix script path for R installation
commit 2772002e832570ccb7ac2120a2775a2daf724832
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T07:27:49Z
Fix the name of script for R installation
commit 6cb4416a5afe66c624602b6ff36c5d23dc32b62f
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T07:31:10Z
Upgrade maven version to 3.3.9
commit a2852a0f1b40d715eaf00947acdb076727e2d42e
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T08:08:11Z
Clean up and fix the path for Hadoop bin package
commit 5aca1045ccfd7eed35bd8fa7721867c363bd88a0
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T08:18:55Z
Merged dependecies installation
commit fbcfe135db8e6515f3dadccce7c2d72f6dec9b91
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T08:19:46Z
Remove R installation script
commit f3eb1636d1d9ec6568e8b048d41b5928698d244c
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T08:26:56Z
Clean up the dependencies installation script
commit fe95491bf7ef28f5ee0d7edb8ec5b14529815bcb
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T08:41:34Z
Fix comment
commit a8e74fc531bb83ceb14d930ca6f03c799fbde384
Author: hyukjinkwon <[email protected]>
Date: 2016-08-29T08:43:01Z
Uppercase for Maven in the comment
----