I'd like to use the Spark 2.0 (streaming) API to consume data from a custom data source. The source provides an API for random access to a stream of data represented as a "topic", which consists of a collection of partitions that can be accessed/consumed simultaneously.
I want to implement a streaming job on top of the new Spark 2.0 API that fetches data from the aforementioned source in a distributed manner, i.e. creates a separate task per offset range per partition, the way the Kafka direct stream does. I'd like to know if there is a better way to achieve this with the new Spark 2.0 API. Is there a reference implementation I could look at?
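
For context, the Kafka direct pattern I'm referring to boils down to a custom RDD whose partitions each carry an offset range, so every task reads exactly its own slice of one source partition, in parallel with the others. Below is a minimal sketch of that pattern adapted to a hypothetical custom source; the ChannelClient trait, the OffsetRange fields, and the read(...) signature are my own placeholders standing in for whatever the real source's API looks like:

    import org.apache.spark.{Partition, SparkContext, TaskContext}
    import org.apache.spark.rdd.RDD

    // Offsets to read from one source partition during one batch.
    case class OffsetRange(channelPartition: Int, fromOffset: Long, untilOffset: Long)

    // Hypothetical client for the custom source; replace with the real connector.
    trait ChannelClient extends Serializable {
      def read(partition: Int, from: Long, until: Long): Iterator[String]
    }

    // One Spark partition (and hence one task) per offset range per source
    // partition, mirroring how the Kafka direct stream's RDD is laid out.
    private class ChannelRDDPartition(val index: Int, val range: OffsetRange)
      extends Partition

    class ChannelRDD(sc: SparkContext, ranges: Seq[OffsetRange], client: ChannelClient)
      extends RDD[String](sc, Nil) {

      override protected def getPartitions: Array[Partition] =
        ranges.zipWithIndex.map { case (r, i) => new ChannelRDDPartition(i, r) }.toArray

      override def compute(split: Partition, context: TaskContext): Iterator[String] = {
        val p = split.asInstanceOf[ChannelRDDPartition]
        // Each task consumes only its own offset range, directly from the source.
        client.read(p.range.channelPartition, p.range.fromOffset, p.range.untilOffset)
      }
    }

In the Kafka case, the direct input DStream simply produces one such RDD per batch from the latest fetched offsets; the closest reference implementations to study are KafkaRDD and DirectKafkaInputDStream in the spark-streaming-kafka-0-10 module.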