I'd like to use the Spark 2.0 (streaming) API to consume data from a custom data source. The source provides an API for random access to a stream of data represented as a "topic", which consists of a collection of partitions that can be accessed/consumed simultaneously.
I want to implement a streaming job on top of the new Spark 2.0 API that fetches data from the aforementioned source in a distributed manner, i.e. creates a separate task per offset range per partition, the way the Kafka direct stream does. I'd like to know if there is a better way to achieve this with the new Spark 2.0 API. Is there a reference implementation I could look at?
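
For context, the Kafka direct pattern I'm referring to boils down to a custom RDD whose partitions each carry an offset range, so every task reads exactly its own slice of one source partition, in parallel with the others. Below is a minimal sketch of that pattern adapted to a hypothetical custom source; the ChannelClient trait, the OffsetRange fields, and the read(...) signature are my own placeholders standing in for whatever the real source's API looks like:

    import org.apache.spark.{Partition, SparkContext, TaskContext}
    import org.apache.spark.rdd.RDD

    // Offsets to read from one source partition during one batch.
    case class OffsetRange(channelPartition: Int, fromOffset: Long, untilOffset: Long)

    // Hypothetical client for the custom source; replace with the real connector.
    trait ChannelClient extends Serializable {
      def read(partition: Int, from: Long, until: Long): Iterator[String]
    }

    // One Spark partition (and hence one task) per offset range per source
    // partition, mirroring how the Kafka direct stream's RDD is laid out.
    private class ChannelRDDPartition(val index: Int, val range: OffsetRange)
      extends Partition

    class ChannelRDD(sc: SparkContext, ranges: Seq[OffsetRange], client: ChannelClient)
      extends RDD[String](sc, Nil) {

      override protected def getPartitions: Array[Partition] =
        ranges.zipWithIndex.map { case (r, i) => new ChannelRDDPartition(i, r) }.toArray

      override def compute(split: Partition, context: TaskContext): Iterator[String] = {
        val p = split.asInstanceOf[ChannelRDDPartition]
        // Each task consumes only its own offset range, directly from the source.
        client.read(p.range.channelPartition, p.range.fromOffset, p.range.untilOffset)
      }
    }

In the Kafka case, the direct input DStream simply produces one such RDD per batch from the latest fetched offsets; the closest reference implementations to study are KafkaRDD and DirectKafkaInputDStream in the spark-streaming-kafka-0-10 module.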