Tomas Nykodym created SPARK-22723:
-------------------------------------
Summary: Add support for other data types and add mode info to ImageSchema
Key: SPARK-22723
URL: https://issues.apache.org/jira/browse/SPARK-22723
Project: Spark
Issue Type: Improvement
Components: ML
Affects Versions: 2.3.0
Reporter: Tomas Nykodym
Priority: Minor
While working with ImageSchema, I came across two shortcomings that I had to
work around in our spark-deep-learning code. I think it would be a good idea to
add this functionality directly to ImageSchema.
First, ImageSchema currently handles only images stored as uint8. Since we
produce float-based images in some of our use cases, I had to write
alternatives to ImageSchema.toImage and ImageSchema.toNDArray.
Second, there is no description of what the OpenCV modes mean. It would be
useful to have a data structure describing properties such as the number of
channels and the data type for each mode.
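As a rough illustration of such a data structure, here is a minimal sketch
in Python. The names OcvMode, SUPPORTED_MODES, MODE_BY_CODE and MODE_BY_NAME
are hypothetical and not part of ImageSchema. The numeric codes follow the
OpenCV convention depth + ((channels - 1) << 3), and the set of modes listed
is only an assumption about what might be supported.

```python
from collections import namedtuple

# Hypothetical descriptor for one OpenCV mode; field names are illustrative
# and are not taken from ImageSchema.
OcvMode = namedtuple("OcvMode", ["name", "code", "n_channels", "dtype"])

# OpenCV depth codes (CV_8U = 0, CV_32F = 5) paired with a dtype name.
_DEPTHS = {"8U": (0, "uint8"), "32F": (5, "float32")}


def _make_mode(depth, n_channels):
    """Build a descriptor using OpenCV's type encoding."""
    depth_code, dtype = _DEPTHS[depth]
    code = depth_code + ((n_channels - 1) << 3)
    return OcvMode("CV_%sC%d" % (depth, n_channels), code, n_channels, dtype)


# Assumed set of supported modes: 1, 3 and 4 channels for uint8 and float32.
SUPPORTED_MODES = [_make_mode(d, c) for d in ("8U", "32F") for c in (1, 3, 4)]

# Lookup tables keyed by the integer mode stored in an image row, and by name.
MODE_BY_CODE = {m.code: m for m in SUPPORTED_MODES}
MODE_BY_NAME = {m.name: m for m in SUPPORTED_MODES}
```

With a table like this, code that receives an image row could look up
MODE_BY_CODE[row.mode] to find the channel count and element type instead of
hard-coding them.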
The aim of this ticket is to add support for both to ImageSchema. To be more
specific, I would like to add the following:
1. Support for images stored as floats (CV_32FC* formats).
   ImageSchema.toImage and ImageSchema.toNDArray would need to be updated.
2. A description of the supported OpenCV modes, in particular the number of
   channels and the data type.
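To make the first item concrete, below is a minimal sketch of a
dtype-aware round trip between a NumPy array and an image record. It is not
the ImageSchema implementation: to_image_struct and to_ndarray are
hypothetical helpers, a plain dict stands in for the Spark Row, and the mode
table is an assumed subset of OpenCV type codes.

```python
import numpy as np

# Assumed subset of OpenCV type codes -> (numpy dtype name, channel count).
_MODE_INFO = {
    0: ("uint8", 1), 16: ("uint8", 3), 24: ("uint8", 4),
    5: ("float32", 1), 21: ("float32", 3), 29: ("float32", 4),
}
_CODE_BY_KEY = {info: code for code, info in _MODE_INFO.items()}


def to_image_struct(array, origin=""):
    """Pack a (height, width, nChannels) array into a dict shaped like an
    image row. Hypothetical helper; a dict stands in for the Spark Row."""
    if array.ndim != 3:
        raise ValueError("expected shape (height, width, nChannels)")
    height, width, n_channels = array.shape
    key = (array.dtype.name, n_channels)
    if key not in _CODE_BY_KEY:
        raise ValueError("unsupported dtype/channels: %s" % (key,))
    return {
        "origin": origin,
        "height": height,
        "width": width,
        "nChannels": n_channels,
        "mode": _CODE_BY_KEY[key],
        # Raw bytes in row-major order, using the array's own element type.
        "data": bytearray(np.ascontiguousarray(array).tobytes()),
    }


def to_ndarray(image):
    """Rebuild the array, choosing the dtype from the mode instead of
    assuming uint8."""
    dtype, n_channels = _MODE_INFO[image["mode"]]
    shape = (image["height"], image["width"], n_channels)
    return np.frombuffer(image["data"], dtype=dtype).reshape(shape)
```

The key point is that the element type is derived from the mode on the way
back, so a float32 image survives the round trip without being reinterpreted
as uint8.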
This ticket is based on our implementation in spark-deep-learning. See
https://github.com/tomasatdatabricks/spark-deep-learning/blob/537d1b125355955dbf9d9cc06c2615f0e30138dc/python/sparkdl/image/imageIO.py#L35-L93