Tomas Nykodym created SPARK-22723:
-------------------------------------

             Summary: Add support for other data types and add mode info to 
ImageSchema 
                 Key: SPARK-22723
                 URL: https://issues.apache.org/jira/browse/SPARK-22723
             Project: Spark
          Issue Type: Improvement
          Components: ML
    Affects Versions: 2.3.0
            Reporter: Tomas Nykodym
            Priority: Minor


When working with ImageSchema, I came across two shortcomings that I had to 
work around in our code for spark-deep-learning. I think it would be a good 
idea to add this functionality directly to ImageSchema.

First, ImageSchema currently handles only images stored as uint8. Since we 
produce float-based images in some of our use cases, I had to write 
alternatives to ImageSchema.toImage and ImageSchema.toNDArray.

Second, there is no description of what the OpenCV modes mean. It would be 
useful to have a data structure describing properties such as the number of 
channels and the data type for each mode.

The aim of this ticket is to add support for both to ImageSchema. More 
specifically, I would like to add the following:
1. Support for images stored as floats (CV_32FC* formats).
   ImageSchema.toImage and ImageSchema.toNDArray would need to be updated.
2. A description of the supported OpenCV modes, in particular the number of 
channels and the data type for each mode.
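
As a rough illustration of the two items above, here is a minimal sketch of 
what a mode descriptor and dtype-aware conversions could look like. It is not 
the ImageSchema API or the linked spark-deep-learning code: the names 
ImageMode, SUPPORTED_MODES, to_ndarray and from_ndarray are hypothetical, and 
the sketch assumes pixel data is stored as raw bytes in native byte order. The 
type codes follow the OpenCV convention depth + ((channels - 1) << 3).

```python
from collections import namedtuple

import numpy as np

# Hypothetical descriptor for one OpenCV mode (illustrative names only).
ImageMode = namedtuple("ImageMode", ["name", "code", "n_channels", "dtype"])

# OpenCV depth constants for 8-bit unsigned and 32-bit float data.
CV_8U, CV_32F = 0, 5


def _cv_code(depth, n_channels):
    # OpenCV packs a type code as depth + ((channels - 1) << 3).
    return depth + ((n_channels - 1) << 3)


SUPPORTED_MODES = [
    ImageMode("CV_8UC%d" % c, _cv_code(CV_8U, c), c, "uint8") for c in (1, 3, 4)
] + [
    ImageMode("CV_32FC%d" % c, _cv_code(CV_32F, c), c, "float32") for c in (1, 3, 4)
]

MODE_BY_CODE = {m.code: m for m in SUPPORTED_MODES}
MODE_BY_NAME = {m.name: m for m in SUPPORTED_MODES}
MODE_BY_LAYOUT = {(m.dtype, m.n_channels): m for m in SUPPORTED_MODES}


def to_ndarray(height, width, mode_code, data):
    """Decode raw image bytes into a (height, width, channels) array."""
    mode = MODE_BY_CODE[mode_code]
    flat = np.frombuffer(data, dtype=mode.dtype)
    return flat.reshape(height, width, mode.n_channels)


def from_ndarray(array):
    """Encode an array as (height, width, mode_code, raw bytes)."""
    if array.ndim == 2:
        # Treat a 2-D array as a single-channel image.
        array = array[:, :, np.newaxis]
    height, width, n_channels = array.shape
    key = (array.dtype.name, n_channels)
    if key not in MODE_BY_LAYOUT:
        raise ValueError("unsupported dtype/channel combination: %r" % (key,))
    return height, width, MODE_BY_LAYOUT[key].code, array.tobytes()
```

With a table like this, the conversion functions pick the dtype from the mode 
instead of assuming uint8, so float images can round-trip unchanged.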

This ticket is based on our implementation in spark-deep-learning. See 
https://github.com/tomasatdatabricks/spark-deep-learning/blob/537d1b125355955dbf9d9cc06c2615f0e30138dc/python/sparkdl/image/imageIO.py#L35-L93





