viirya opened a new pull request #23484: [SPARK-26559][ML][PySpark] ML image can't work with numpy versions prior to 1.9
URL: https://github.com/apache/spark/pull/23484
 
 
   ## What changes were proposed in this pull request?
   
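   Because of an [API change](https://github.com/numpy/numpy/pull/4257/files#diff-c39521d89f7e61d6c0c445d93b62f7dc) in numpy 1.9, PySpark's ML image support does not work with numpy versions prior to 1.9: `ImageSchema.toImage` calls `ndarray.tobytes`, which those versions do not provide.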
   
   Running the image tests with a numpy version prior to 1.9 produces the following error:
   ```
   test_read_images (pyspark.ml.tests.test_image.ImageReaderTest) ... ERROR
   test_read_images_multiple_times (pyspark.ml.tests.test_image.ImageReaderTest2) ... ok

   ======================================================================
   ERROR: test_read_images (pyspark.ml.tests.test_image.ImageReaderTest)
   ----------------------------------------------------------------------
   Traceback (most recent call last):
     File "/Users/viirya/docker_tmp/repos/spark-1/python/pyspark/ml/tests/test_image.py", line 36, in test_read_images
       self.assertEqual(ImageSchema.toImage(array, origin=first_row[0]), first_row)
     File "/Users/viirya/docker_tmp/repos/spark-1/python/pyspark/ml/image.py", line 193, in toImage
       data = bytearray(array.astype(dtype=np.uint8).ravel().tobytes())
   AttributeError: 'numpy.ndarray' object has no attribute 'tobytes'

   ----------------------------------------------------------------------
   Ran 2 tests in 29.040s

   FAILED (errors=1)
   ```
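
For illustration only, here is one way the failing conversion could be made tolerant of older numpy releases. This is a minimal sketch, not necessarily the change made in this pull request: the helper name `to_image_bytes` is hypothetical, and it assumes that falling back to the older `ndarray.tostring` method is acceptable when `tobytes` is missing.

```python
import numpy as np


def to_image_bytes(array):
    """Hypothetical helper: flatten `array` to uint8 and return a bytearray."""
    flat = np.asarray(array).astype(np.uint8).ravel()
    if hasattr(flat, "tobytes"):
        # numpy >= 1.9 provides ndarray.tobytes.
        return bytearray(flat.tobytes())
    # Older numpy only has tostring, the earlier name for the same operation.
    return bytearray(flat.tostring())
```

Checking for the attribute rather than comparing version strings keeps the sketch independent of how numpy reports its version; an explicit version comparison would work as well.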
   
   
   ## How was this patch tested?
   
   Manually tested with numpy versions before and after 1.9.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
