GitHub user HyukjinKwon opened a pull request:

    https://github.com/apache/spark/pull/23203

    [SPARK-26252][PYTHON] Add support to run specific unittests and/or doctests 
in python/run-tests script

    ## What changes were proposed in this pull request?
    
    This PR proposes adding a developer option, `--testnames`, to our testing 
script to allow running a specific set of unittests and doctests.
    
    
    **1. Run unittests in the class**
    
    ```
    ./run-tests --testnames 'pyspark.sql.tests.test_arrow ArrowTests'
    Running PySpark tests. Output is in /.../spark/python/unit-tests.log
    Will test against the following Python executables: ['python2.7', 'pypy']
    Will test the following Python tests: ['pyspark.sql.tests.test_arrow 
ArrowTests']
    Starting test(python2.7): pyspark.sql.tests.test_arrow ArrowTests
    Starting test(pypy): pyspark.sql.tests.test_arrow ArrowTests
    Finished test(python2.7): pyspark.sql.tests.test_arrow ArrowTests (14s)
    Finished test(pypy): pyspark.sql.tests.test_arrow ArrowTests (14s) ... 22 
tests were skipped
    Tests passed in 14 seconds
    
    Skipped tests in pyspark.sql.tests.test_arrow ArrowTests with pypy:
        test_createDataFrame_column_name_encoding 
(pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.19.2 must be 
installed; however, it was not found.'
        test_createDataFrame_does_not_modify_input 
(pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.19.2 must be 
installed; however, it was not found.'
        test_createDataFrame_fallback_disabled 
(pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.19.2 must be 
installed; however, it was not found.'
        test_createDataFrame_fallback_enabled 
(pyspark.sql.tests.test_arrow.ArrowTests) ... skipped
    ...
    ```
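    One way a `module Class` spec like the one above can be resolved is via the 
standard `unittest` loader; the actual `run-tests` implementation may differ. A 
minimal, self-contained sketch (the `SampleTests` class is hypothetical, not 
from the PR):

    ```python
    import unittest

    # Stand-in for a real test class such as ArrowTests (hypothetical example).
    class SampleTests(unittest.TestCase):
        def test_addition(self):
            self.assertEqual(1 + 1, 2)

        def test_upper(self):
            self.assertEqual("a".upper(), "A")

    # Load every test in the class, as a "module Class" spec selects them all.
    suite = unittest.TestLoader().loadTestsFromTestCase(SampleTests)
    print(suite.countTestCases())  # 2
    ```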
    
    **2. Run single unittest in the class.**
    
    ```
    ./run-tests --testnames 'pyspark.sql.tests.test_arrow 
ArrowTests.test_null_conversion'
    Running PySpark tests. Output is in /.../spark/python/unit-tests.log
    Will test against the following Python executables: ['python2.7', 'pypy']
    Will test the following Python tests: ['pyspark.sql.tests.test_arrow 
ArrowTests.test_null_conversion']
    Starting test(pypy): pyspark.sql.tests.test_arrow 
ArrowTests.test_null_conversion
    Starting test(python2.7): pyspark.sql.tests.test_arrow 
ArrowTests.test_null_conversion
    Finished test(pypy): pyspark.sql.tests.test_arrow 
ArrowTests.test_null_conversion (0s) ... 1 tests were skipped
    Finished test(python2.7): pyspark.sql.tests.test_arrow 
ArrowTests.test_null_conversion (8s)
    Tests passed in 8 seconds
    
    Skipped tests in pyspark.sql.tests.test_arrow 
ArrowTests.test_null_conversion with pypy:
        test_null_conversion (pyspark.sql.tests.test_arrow.ArrowTests) ... 
skipped 'Pandas >= 0.19.2 must be installed; however, it was not found.'
    ```
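    A `Class.test_method` spec can similarly be narrowed to a single test with 
`TestLoader.loadTestsFromName` from the standard library (a sketch with a 
hypothetical class; the real script may resolve names differently):

    ```python
    import sys
    import unittest

    # Hypothetical test class for illustration.
    class SampleTests(unittest.TestCase):
        def test_target(self):
            self.assertIsNone(None)

        def test_other(self):
            self.assertTrue(True)

    loader = unittest.TestLoader()
    # Resolve "SampleTests.test_target" relative to this module, mirroring a
    # "module ArrowTests.test_null_conversion"-style spec.
    suite = loader.loadTestsFromName("SampleTests.test_target",
                                     module=sys.modules[__name__])
    print(suite.countTestCases())  # 1
    ```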
    
    **3. Run doctests in single PySpark module.**
    
    ```
    ./run-tests --testnames 'pyspark.sql.dataframe'
    Running PySpark tests. Output is in /.../spark/python/unit-tests.log
    Will test against the following Python executables: ['python2.7', 'pypy']
    Will test the following Python tests: ['pyspark.sql.dataframe']
    Starting test(pypy): pyspark.sql.dataframe
    Starting test(python2.7): pyspark.sql.dataframe
    Finished test(python2.7): pyspark.sql.dataframe (47s)
    Finished test(pypy): pyspark.sql.dataframe (48s)
    Tests passed in 48 seconds
    ```
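    A bare module name runs that module's tests; for a module like 
`pyspark.sql.dataframe` these are doctests. The underlying mechanism can be 
sketched with the standard `doctest` module (the `double` function here is made 
up for illustration):

    ```python
    import doctest

    def double(x):
        """Return twice x.

        >>> double(3)
        6
        """
        return 2 * x

    # Run every doctest defined in this module, much as selecting a module
    # name runs that module's doctests.
    results = doctest.testmod()
    print(results.failed, results.attempted)  # 0 1
    ```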
    
    Of course, you can mix them:
    
    ```
    ./run-tests --testnames 'pyspark.sql.tests.test_arrow 
ArrowTests,pyspark.sql.dataframe'
    Running PySpark tests. Output is in /.../spark/python/unit-tests.log
    Will test against the following Python executables: ['python2.7', 'pypy']
    Will test the following Python tests: ['pyspark.sql.tests.test_arrow 
ArrowTests', 'pyspark.sql.dataframe']
    Starting test(pypy): pyspark.sql.dataframe
    Starting test(pypy): pyspark.sql.tests.test_arrow ArrowTests
    Starting test(python2.7): pyspark.sql.dataframe
    Starting test(python2.7): pyspark.sql.tests.test_arrow ArrowTests
    Finished test(pypy): pyspark.sql.tests.test_arrow ArrowTests (0s) ... 22 
tests were skipped
    Finished test(python2.7): pyspark.sql.tests.test_arrow ArrowTests (18s)
    Finished test(python2.7): pyspark.sql.dataframe (50s)
    Finished test(pypy): pyspark.sql.dataframe (52s)
    Tests passed in 52 seconds
    
    Skipped tests in pyspark.sql.tests.test_arrow ArrowTests with pypy:
        test_createDataFrame_column_name_encoding 
(pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.19.2 must be 
installed; however, it was not found.'
        test_createDataFrame_does_not_modify_input 
(pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.19.2 must be 
installed; however, it was not found.'
        test_createDataFrame_fallback_disabled 
(pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.19.2 must be 
installed; however, it was not found.'
    ```
    
    You can also use all the other options (except `--modules`, which will be 
ignored):
    
    ```
    ./run-tests --testnames 'pyspark.sql.tests.test_arrow 
ArrowTests.test_null_conversion' --python-executables=python
    Running PySpark tests. Output is in /.../spark/python/unit-tests.log
    Will test against the following Python executables: ['python']
    Will test the following Python tests: ['pyspark.sql.tests.test_arrow 
ArrowTests.test_null_conversion']
    Starting test(python): pyspark.sql.tests.test_arrow 
ArrowTests.test_null_conversion
    Finished test(python): pyspark.sql.tests.test_arrow 
ArrowTests.test_null_conversion (12s)
    Tests passed in 12 seconds
    ```
    
    See help below:
    
    ```
     ./run-tests --help
    Usage: run-tests [options]
    
    Options:
    ...
      Developer Options:
        --testnames=TESTNAMES
                            A comma-separated list of specific modules, classes
                            and functions of doctest or unittest to test. For
                            example, 'pyspark.sql.foo' to run the module as
                            unittests or doctests, 'pyspark.sql.tests FooTests' 
to
                            run the specific class of unittests,
                            'pyspark.sql.tests FooTests.test_foo' to run the
                            specific unittest in the class. '--modules' option 
is
                            ignored if they are given.
    ```
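    The comma-separated format described in the help text is simple to parse 
into `(module, test spec)` pairs. A hedged sketch (the `parse_testnames` 
function is mine for illustration, not the script's actual code):

    ```python
    # Split a --testnames value into (module, test spec) pairs, where the
    # test spec is None for a plain module name (doctest/unittest module).
    def parse_testnames(value):
        specs = []
        for item in value.split(","):
            parts = item.strip().split(" ", 1)
            module = parts[0]
            rest = parts[1] if len(parts) > 1 else None
            specs.append((module, rest))
        return specs

    print(parse_testnames(
        "pyspark.sql.tests.test_arrow ArrowTests,pyspark.sql.dataframe"))
    # [('pyspark.sql.tests.test_arrow', 'ArrowTests'), ('pyspark.sql.dataframe', None)]
    ```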
    
    I intentionally made it a developer option to be conservative.
    
    ## How was this patch tested?
    
    Manually tested; negative cases were also verified:
    
    ```
    $ ./run-tests --testnames 'pyspark.sql.tests.test_arrow 
ArrowTests.test_null_conversion1' --python-executables=python
    ...
    AttributeError: type object 'ArrowTests' has no attribute 
'test_null_conversion1'
    ...
    ```
    
    ```
    ./run-tests --testnames 'pyspark.sql.tests.test_arrow ArrowT' 
--python-executables=python
    ...
    AttributeError: 'module' object has no attribute 'ArrowT'
    ...
    ```
    
    ```
     ./run-tests --testnames 'pyspark.sql.tests.test_ar' 
--python-executables=python
    ...
    /.../python2.7: No module named pyspark.sql.tests.test_ar
    ```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark SPARK-26252

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23203.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23203
    
----
commit 44c622bf17ab642ef372d9a534b5bfc18c98a0da
Author: Hyukjin Kwon <gurwls223@...>
Date:   2018-12-03T08:02:35Z

    Add support to run specific unittests and/or doctests in python/run-tests 
script

----


---
