GitHub user HyukjinKwon opened a pull request:

    https://github.com/apache/spark/pull/16254

    [SPARK-18830][TESTS] Fix tests in PipedRDDSuite to pass on Windows

    ## What changes were proposed in this pull request?
    
    This PR proposes to fix the tests that fail on Windows, shown below:
    
    ```
    [info] - pipe with empty partition *** FAILED *** (672 milliseconds)
    [info]   Set(0, 4, 5) did not equal Set(0, 5, 6) (PipedRDDSuite.scala:145)
    [info]   org.scalatest.exceptions.TestFailedException:
    ...
    ```
    
    Here, `wc -c` counts bytes on both Windows and Linux, but a newline on 
Windows is `\r\n`, which is two characters instead of one. So each non-empty 
count ends up one higher than on Linux.
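    
    The arithmetic can be illustrated with a minimal sketch. The object and 
function names are hypothetical, and the inputs `foo` and `bing` are assumed 
only because they reproduce the counts in the log above; this is not code from 
the suite itself.
    
```scala
import java.nio.charset.StandardCharsets

// Hypothetical helper, for illustration only.
object NewlineByteCountSketch {
  // Number of bytes `wc -c` would see for one line of text that is
  // terminated by the given line separator.
  def pipedByteCount(line: String, lineSeparator: String): Int =
    (line + lineSeparator).getBytes(StandardCharsets.UTF_8).length
}

// Assumed inputs: "foo" and "bing".
// With "\n" the counts are 4 and 5; with "\r\n" they are 5 and 6.
val lfCounts = Seq("foo", "bing").map(NewlineByteCountSketch.pipedByteCount(_, "\n"))
val crlfCounts = Seq("foo", "bing").map(NewlineByteCountSketch.pipedByteCount(_, "\r\n"))
```
    
    An empty partition contributes 0 in both cases, which gives the two sets 
`Set(0, 4, 5)` and `Set(0, 5, 6)` seen in the failure message.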
    
    ```
    [info] - test pipe exports map_input_file *** FAILED *** (62 milliseconds)
    [info]   java.lang.IllegalStateException: Subprocess exited with status 1. Command ran: printenv map_input_file
    [info]   at org.apache.spark.rdd.PipedRDD$$anon$1.hasNext(PipedRDD.scala:178)
    ...
    ```
    
    ```
    [info] - test pipe exports mapreduce_map_input_file *** FAILED *** (172 milliseconds)
    [info]   java.lang.IllegalStateException: Subprocess exited with status 1. Command ran: printenv mapreduce_map_input_file
    [info]   at org.apache.spark.rdd.PipedRDD$$anon$1.hasNext(PipedRDD.scala:178)
    ...
    ```
    
    `printenv` prints environment variables. However, when a variable is set 
through `ProcessBuilder` with a lower-case key, `printenv` on Windows ignores 
it and prints nothing, although the variable is actually set and accessible. 
This was tested 
[here](https://ci.appveyor.com/project/spark-test/spark/build/208-PipedRDDSuite)
 with upper-case keys using this 
[diff](https://github.com/apache/spark/compare/master...spark-test:74d39da) and 
[here](https://ci.appveyor.com/project/spark-test/spark/build/203-PipedRDDSuite)
 with lower-case keys using this 
[diff](https://github.com/apache/spark/compare/master...spark-test:fde5e37f28032c15a8d8693ba033a8a779a26317).
    (Note that environment variables on Windows are case-insensitive.)
    
    This is (I believe) a third-party tool on Windows that resembles `printenv` 
on Linux (installed in the AppVeyor environment or on Windows Server 2012 R2). 
The command does not exist on Windows 7 or 10, at least.
    
    On Windows, the officially supported way to do this is 
`cmd.exe /C set [varname]`. The tests can be fixed by using it to check whether 
the environment variable is set.
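    
    A minimal sketch of how a test could choose the command per platform is 
shown here. `EnvCommandSketch`, `isWindows`, `envCommand`, and `extractValue` 
are hypothetical names for illustration and are not necessarily what the patch 
uses. The sketch also accounts for one difference in output: `set NAME` prints 
`NAME=value`, whereas `printenv NAME` prints only `value`.
    
```scala
import java.util.Locale

// Hypothetical helper, for illustration only.
object EnvCommandSketch {
  // True when the OS name (as reported by the `os.name` system property)
  // looks like Windows.
  def isWindows(osName: String): Boolean =
    osName.toLowerCase(Locale.ROOT).startsWith("windows")

  // Command that prints the environment variable `varName` on the given OS.
  def envCommand(
      varName: String,
      osName: String = System.getProperty("os.name")): Seq[String] =
    if (isWindows(osName)) Seq("cmd.exe", "/C", "set", varName)
    else Seq("printenv", varName)

  // Extracts the value from one line of command output.
  // On Windows the line has the form `NAME=value`, so everything after the
  // first '=' is returned; elsewhere the line is already the value.
  def extractValue(outputLine: String, osName: String): String = {
    val line = outputLine.trim
    if (isWindows(osName)) line.substring(line.indexOf('=') + 1) else line
  }
}
```
    
    Note that `set NAME` lists every variable whose name starts with `NAME`, 
so a test relying on it may need to pick out the exact line it expects.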
    
    ## How was this patch tested?
    
    Manually tested via AppVeyor.
    
    **Before**
    https://ci.appveyor.com/project/spark-test/spark/build/194-PipedRDDSuite
    
    **After**
    https://ci.appveyor.com/project/spark-test/spark/build/226-PipedRDDSuite

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark pipe-errors

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16254.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16254
    
----
commit 1cd63b2dedb5c34f66bb370a7b4ce278d9f4ea70
Author: hyukjinkwon <[email protected]>
Date:   2016-12-12T13:15:01Z

    Fix tests in PipedRDDSuite to pass on Windows

----

