maropu opened a new pull request #23591: 
[SPARK-24740][PYTHON][ML][BACKPORT-2.3] Make PySpark's tests compatible with NumPy 1.14+
URL: https://github.com/apache/spark/pull/23591
 
 
   ## What changes were proposed in this pull request?
   This PR backports SPARK-24740 to branch-2.3.
   It proposes to make PySpark's tests compatible with NumPy 1.14+.
   NumPy 1.14.x introduced rather radical changes to the string representation of its arrays.
   
   For example, the tests below fail:
   
   ```
   **********************************************************************
   File "/.../spark/python/pyspark/ml/linalg/__init__.py", line 895, in __main__.DenseMatrix.__str__
   Failed example:
       print(dm)
   Expected:
       DenseMatrix([[ 0.,  2.],
                    [ 1.,  3.]])
   Got:
       DenseMatrix([[0., 2.],
                    [1., 3.]])
   **********************************************************************
   File "/.../spark/python/pyspark/ml/linalg/__init__.py", line 899, in __main__.DenseMatrix.__str__
   Failed example:
       print(dm)
   Expected:
       DenseMatrix([[ 0.,  1.],
                    [ 2.,  3.]])
   Got:
       DenseMatrix([[0., 1.],
                    [2., 3.]])
   **********************************************************************
   File "/.../spark/python/pyspark/ml/linalg/__init__.py", line 939, in __main__.DenseMatrix.toArray
   Failed example:
       m.toArray()
   Expected:
       array([[ 0.,  2.],
              [ 1.,  3.]])
   Got:
       array([[0., 2.],
              [1., 3.]])
   **********************************************************************
   File "/.../spark/python/pyspark/ml/linalg/__init__.py", line 324, in __main__.DenseVector.dot
   Failed example:
       dense.dot(np.reshape([1., 2., 3., 4.], (2, 2), order='F'))
   Expected:
       array([  5.,  11.])
   Got:
       array([ 5., 11.])
   **********************************************************************
   File "/.../spark/python/pyspark/ml/linalg/__init__.py", line 567, in __main__.SparseVector.dot
   Failed example:
       a.dot(np.array([[1, 1], [2, 2], [3, 3], [4, 4]]))
   Expected:
       array([ 22.,  22.])
   Got:
       array([22., 22.])
   ```
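
These failures arise because doctest compares the expected and actual output as strings. The following is a minimal sketch using only the standard library's `doctest.OutputChecker`; the `want` and `got` strings are abbreviated from the `DenseMatrix.toArray` failure above, and the snippet is an illustration rather than part of the patch. It suggests that even the `NORMALIZE_WHITESPACE` option would not hide the difference, because the old format has a space where the new one has none at all.

```python
import doctest

# Strings abbreviated from the failure log: NumPy < 1.14 padded each
# float with a leading sign space, NumPy 1.14+ does not.
want = "array([ 0.,  2.])\n"   # what the docstring expects
got = "array([0., 2.])\n"      # what NumPy 1.14+ prints

checker = doctest.OutputChecker()

# Default comparison: the strings must match exactly.
exact_match = checker.check_output(want, got, 0)

# NORMALIZE_WHITESPACE collapses runs of whitespace to one space, but it
# cannot equate "[ 0." with "[0." because one side has no whitespace.
normalized_match = checker.check_output(want, got, doctest.NORMALIZE_WHITESPACE)

print("exact match:", exact_match)
print("match with NORMALIZE_WHITESPACE:", normalized_match)
```

Both comparisons report a mismatch, so the expected output or the print format itself has to change.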
   
   See the [release notes](https://docs.scipy.org/doc/numpy-1.14.0/release.html#compatibility-notes).
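
One way to keep such doctests stable across NumPy versions is the `legacy` print option that NumPy 1.14 added to `np.set_printoptions`. The sketch below is offered as an assumption about a workable approach, not as a description of what this patch changes. The helper name `use_legacy_numpy_repr` is hypothetical.

```python
import numpy as np


def use_legacy_numpy_repr():
    """Ask NumPy to format arrays the way it did before 1.14, if possible.

    Hypothetical helper: returns True when the legacy mode was enabled and
    False when the installed NumPy does not know the `legacy` option.
    """
    try:
        np.set_printoptions(legacy="1.13")
        return True
    except TypeError:
        # NumPy older than 1.14 rejects the unknown keyword argument; it
        # already prints in the old style, so nothing needs to change.
        return False


enabled = use_legacy_numpy_repr()

# Same computation as the DenseVector.dot doctest in the failure log,
# written with plain NumPy arrays.
dense = np.array([1.0, 2.0])
result = dense.dot(np.reshape([1.0, 2.0, 3.0, 4.0], (2, 2), order="F"))
print(enabled, repr(result))
```

Calling the helper once before the doctests run should let the existing expected outputs, written in the pre-1.14 format, match on both old and new NumPy.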
   
   ## How was this patch tested?
   
   Manually tested:
   
   ```
   $ ./run-tests --python-executables=python3.6,python2.7 --modules=pyspark-ml,pyspark-mllib
   Running PySpark tests. Output is in /.../spark/python/unit-tests.log
   Will test against the following Python executables: ['python3.6', 'python2.7']
   Will test the following Python modules: ['pyspark-ml', 'pyspark-mllib']
   Starting test(python2.7): pyspark.mllib.tests
   Starting test(python2.7): pyspark.ml.classification
   Starting test(python3.6): pyspark.mllib.tests
   Starting test(python2.7): pyspark.ml.clustering
   Finished test(python2.7): pyspark.ml.clustering (54s)
   Starting test(python2.7): pyspark.ml.evaluation
   Finished test(python2.7): pyspark.ml.classification (74s)
   Starting test(python2.7): pyspark.ml.feature
   Finished test(python2.7): pyspark.ml.evaluation (27s)
   Starting test(python2.7): pyspark.ml.fpm
   Finished test(python2.7): pyspark.ml.fpm (0s)
   Starting test(python2.7): pyspark.ml.image
   Finished test(python2.7): pyspark.ml.image (17s)
   Starting test(python2.7): pyspark.ml.linalg.__init__
   Finished test(python2.7): pyspark.ml.linalg.__init__ (1s)
   Starting test(python2.7): pyspark.ml.recommendation
   Finished test(python2.7): pyspark.ml.feature (76s)
   Starting test(python2.7): pyspark.ml.regression
   Finished test(python2.7): pyspark.ml.recommendation (69s)
   Starting test(python2.7): pyspark.ml.stat
   Finished test(python2.7): pyspark.ml.regression (45s)
   Starting test(python2.7): pyspark.ml.tests
   Finished test(python2.7): pyspark.ml.stat (28s)
   Starting test(python2.7): pyspark.ml.tuning
   Finished test(python2.7): pyspark.ml.tuning (20s)
   Starting test(python2.7): pyspark.mllib.classification
   Finished test(python2.7): pyspark.mllib.classification (31s)
   Starting test(python2.7): pyspark.mllib.clustering
   Finished test(python2.7): pyspark.mllib.tests (260s)
   Starting test(python2.7): pyspark.mllib.evaluation
   Finished test(python3.6): pyspark.mllib.tests (266s)
   Starting test(python2.7): pyspark.mllib.feature
   Finished test(python2.7): pyspark.mllib.evaluation (21s)
   Starting test(python2.7): pyspark.mllib.fpm
   Finished test(python2.7): pyspark.mllib.feature (38s)
   Starting test(python2.7): pyspark.mllib.linalg.__init__
   Finished test(python2.7): pyspark.mllib.linalg.__init__ (1s)
   Starting test(python2.7): pyspark.mllib.linalg.distributed
   Finished test(python2.7): pyspark.mllib.fpm (34s)
   Starting test(python2.7): pyspark.mllib.random
   Finished test(python2.7): pyspark.mllib.clustering (64s)
   Starting test(python2.7): pyspark.mllib.recommendation
   Finished test(python2.7): pyspark.mllib.random (15s)
   Starting test(python2.7): pyspark.mllib.regression
   Finished test(python2.7): pyspark.mllib.linalg.distributed (47s)
   Starting test(python2.7): pyspark.mllib.stat.KernelDensity
   Finished test(python2.7): pyspark.mllib.stat.KernelDensity (0s)
   Starting test(python2.7): pyspark.mllib.stat._statistics
   Finished test(python2.7): pyspark.mllib.recommendation (40s)
   Starting test(python2.7): pyspark.mllib.tree
   Finished test(python2.7): pyspark.mllib.regression (38s)
   Starting test(python2.7): pyspark.mllib.util
   Finished test(python2.7): pyspark.mllib.stat._statistics (19s)
   Starting test(python3.6): pyspark.ml.classification
   Finished test(python2.7): pyspark.mllib.tree (26s)
   Starting test(python3.6): pyspark.ml.clustering
   Finished test(python2.7): pyspark.mllib.util (27s)
   Starting test(python3.6): pyspark.ml.evaluation
   Finished test(python3.6): pyspark.ml.evaluation (30s)
   Starting test(python3.6): pyspark.ml.feature
   Finished test(python2.7): pyspark.ml.tests (234s)
   Starting test(python3.6): pyspark.ml.fpm
   Finished test(python3.6): pyspark.ml.fpm (1s)
   Starting test(python3.6): pyspark.ml.image
   Finished test(python3.6): pyspark.ml.clustering (55s)
   Starting test(python3.6): pyspark.ml.linalg.__init__
   Finished test(python3.6): pyspark.ml.linalg.__init__ (0s)
   Starting test(python3.6): pyspark.ml.recommendation
   Finished test(python3.6): pyspark.ml.classification (71s)
   Starting test(python3.6): pyspark.ml.regression
   Finished test(python3.6): pyspark.ml.image (18s)
   Starting test(python3.6): pyspark.ml.stat
   Finished test(python3.6): pyspark.ml.stat (37s)
   Starting test(python3.6): pyspark.ml.tests
   Finished test(python3.6): pyspark.ml.regression (59s)
   Starting test(python3.6): pyspark.ml.tuning
   Finished test(python3.6): pyspark.ml.feature (93s)
   Starting test(python3.6): pyspark.mllib.classification
   Finished test(python3.6): pyspark.ml.recommendation (83s)
   Starting test(python3.6): pyspark.mllib.clustering
   Finished test(python3.6): pyspark.ml.tuning (29s)
   Starting test(python3.6): pyspark.mllib.evaluation
   Finished test(python3.6): pyspark.mllib.evaluation (26s)
   Starting test(python3.6): pyspark.mllib.feature
   Finished test(python3.6): pyspark.mllib.classification (43s)
   Starting test(python3.6): pyspark.mllib.fpm
   Finished test(python3.6): pyspark.mllib.clustering (81s)
   Starting test(python3.6): pyspark.mllib.linalg.__init__
   Finished test(python3.6): pyspark.mllib.linalg.__init__ (2s)
   Starting test(python3.6): pyspark.mllib.linalg.distributed
   Finished test(python3.6): pyspark.mllib.fpm (48s)
   Starting test(python3.6): pyspark.mllib.random
   Finished test(python3.6): pyspark.mllib.feature (54s)
   Starting test(python3.6): pyspark.mllib.recommendation
   Finished test(python3.6): pyspark.mllib.random (18s)
   Starting test(python3.6): pyspark.mllib.regression
   Finished test(python3.6): pyspark.mllib.linalg.distributed (55s)
   Starting test(python3.6): pyspark.mllib.stat.KernelDensity
   Finished test(python3.6): pyspark.mllib.stat.KernelDensity (1s)
   Starting test(python3.6): pyspark.mllib.stat._statistics
   Finished test(python3.6): pyspark.mllib.recommendation (51s)
   Starting test(python3.6): pyspark.mllib.tree
   Finished test(python3.6): pyspark.mllib.regression (45s)
   Starting test(python3.6): pyspark.mllib.util
   Finished test(python3.6): pyspark.mllib.stat._statistics (21s)
   Finished test(python3.6): pyspark.mllib.tree (27s)
   Finished test(python3.6): pyspark.mllib.util (27s)
   Finished test(python3.6): pyspark.ml.tests (264s)
   ```

