[
https://issues.apache.org/jira/browse/SPARK-25157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Amit Baghel updated SPARK-25157:
--------------------------------
Priority: Major (was: Minor)
> Streaming of image files from directory
> ---------------------------------------
>
> Key: SPARK-25157
> URL: https://issues.apache.org/jira/browse/SPARK-25157
> Project: Spark
> Issue Type: New Feature
> Components: ML, Structured Streaming
> Affects Versions: 2.3.1
> Reporter: Amit Baghel
> Priority: Major
>
> We are doing video analytics on video streams using Spark. At present there
> is no direct way to stream video frames or image files into Spark and process
> them with Structured Streaming and Dataset. We currently use Kafka to stream
> images and then process them in Spark. We need a method in Spark to stream
> images from a directory. Currently *{{DataStreamReader}}* doesn't support
> images. With the introduction of the *org.apache.spark.ml.image.ImageSchema*
> class, we think streaming capabilities can be added for images. It is fine if
> some Structured Streaming features are not supported, since the input is
> binary files. The image schema defined in ImageSchema can be used for the
> Dataset. This method could be similar to the *mmlspark* *streamImages* method.
> [https://github.com/Azure/mmlspark/blob/4413771a8830e4760f550084da60ea0616bf80b9/src/io/image/src/main/python/ImageReader.py]