[
https://issues.apache.org/jira/browse/SPARK-25157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Amit Baghel updated SPARK-25157:
--------------------------------
Description: We are doing video analytics on video streams using Spark. At
present there is no direct way to stream video frames or image files into Spark
and process them using Structured Streaming and Dataset. We currently use Kafka
to stream images and then process them in Spark. We need a method in Spark to
stream images from a directory. Currently *{{DataStreamReader}}* doesn't support
images. With the introduction of the *org.apache.spark.ml.image.ImageSchema*
class, we think streaming capabilities can be added for image files. It is fine
if this source does not support some Structured Streaming features, since the
input is binary files. The schema defined in the ImageSchema class can be used
for the Dataset. This method could be similar to the *mmlspark* *streamImages*
method.
[https://github.com/Azure/mmlspark/blob/4413771a8830e4760f550084da60ea0616bf80b9/src/io/image/src/main/python/ImageReader.py]
(was: We are doing video analytics for video streams using Spark. At present
there is no direct way to stream video frames or image files to Spark and
process using Structured Streaming and Dataset. We are using Kafka to stream
images and then doing processing at spark. We need a method in Spark to stream
images from directory. Currently *{{DataStreamReader}}* doesn't support Images.
With the introduction of *org.apache.spark.ml.image.ImageSchema* class, we
think streaming capabilities can be added for images. It is fine if it won't
support some of the structured streaming features as it is a binary file.
Schema used in ImageSchema class for image can be used in Dataset. This method
could be similar to *mmlspark* *streamImages* method.
[https://github.com/Azure/mmlspark/blob/4413771a8830e4760f550084da60ea0616bf80b9/src/io/image/src/main/python/ImageReader.py])
> Streaming of image files from directory
> ---------------------------------------
>
> Key: SPARK-25157
> URL: https://issues.apache.org/jira/browse/SPARK-25157
> Project: Spark
> Issue Type: New Feature
> Components: ML, Structured Streaming
> Affects Versions: 2.3.1
> Reporter: Amit Baghel
> Priority: Major
>
> We are doing video analytics on video streams using Spark. At present there
> is no direct way to stream video frames or image files into Spark and process
> them using Structured Streaming and Dataset. We currently use Kafka to stream
> images and then process them in Spark. We need a method in Spark to stream
> images from a directory. Currently *{{DataStreamReader}}* doesn't support
> images. With the introduction of the
> *org.apache.spark.ml.image.ImageSchema* class, we think streaming
> capabilities can be added for image files. It is fine if this source does not
> support some Structured Streaming features, since the input is binary files.
> The schema defined in the ImageSchema class can be used for the Dataset. This
> method could be similar to the *mmlspark* *streamImages* method.
> [https://github.com/Azure/mmlspark/blob/4413771a8830e4760f550084da60ea0616bf80b9/src/io/image/src/main/python/ImageReader.py]
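
To make the request concrete, here is a minimal sketch of the behaviour being
asked for: each poll of a directory returns only the image files that have
appeared since the previous poll, the way a micro-batch file source would. This
is plain Python and not Spark code. The names ImageFileRow and poll_new_images
are hypothetical and exist only for this illustration. Each row carries just
the origin path and the raw file bytes; the ImageSchema row also has height,
width, nChannels and mode, which would require decoding the image and are left
out here.

```python
import os
import tempfile
from typing import List, NamedTuple, Set, Tuple


class ImageFileRow(NamedTuple):
    """Hypothetical row: a reduced stand-in for the ImageSchema image struct."""
    origin: str  # path of the source file
    data: bytes  # raw, undecoded file contents


def poll_new_images(
    directory: str,
    seen: Set[str],
    extensions: Tuple[str, ...] = (".png", ".jpg", ".jpeg"),
) -> List[ImageFileRow]:
    """Return one row per image file that appeared since the last poll.

    `seen` is mutated in place so that a file is emitted only once.
    """
    rows = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if path in seen or not name.lower().endswith(extensions):
            continue
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            rows.append(ImageFileRow(origin=path, data=f.read()))
        seen.add(path)
    return rows


if __name__ == "__main__":
    watch_dir = tempfile.mkdtemp()
    seen_paths: Set[str] = set()
    with open(os.path.join(watch_dir, "frame-0001.png"), "wb") as out:
        out.write(b"placeholder bytes, not a real image")
    for row in poll_new_images(watch_dir, seen_paths):
        print(row.origin, len(row.data))
    # A second poll with no new files yields an empty batch.
    print(len(poll_new_images(watch_dir, seen_paths)))
```

A real streaming source would also need to handle partially written files,
checkpoint the set of processed paths, and decode the bytes into the
ImageSchema fields; none of that is attempted in this sketch.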