Also, you can put each message type in a different topic (this needs to be arranged upstream of your Spark Streaming app, i.e. in the publishing systems and the message brokers). Then, for each topic, you can have a dedicated ReceiverInputDStream, which becomes the root of a dedicated DAG pipeline instance for that message type. Moreover, each such pipeline instance runs in parallel with the others; a sketch of this layout follows below.
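A minimal sketch of the per-topic layout, assuming the receiver-based Kafka API in spark-streaming-kafka; the ZooKeeper address, group id, topic names, and the per-type processing are hypothetical placeholders:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    val conf = new SparkConf().setAppName("PerTopicDispatch")
    val ssc = new StreamingContext(conf, Seconds(10))

    // One dedicated ReceiverInputDStream per topic; each is the start of its
    // own DAG pipeline, and the receivers ingest their topics in parallel.
    val orders = KafkaUtils.createStream(ssc, "zk1:2181", "dispatch-group", Map("orders" -> 1))
    val clicks = KafkaUtils.createStream(ssc, "zk1:2181", "dispatch-group", Map("clicks" -> 1))

    // Independent per-type pipelines (createStream yields (key, value) pairs;
    // keys are dropped here).
    orders.map(_._2).foreachRDD(rdd => rdd.foreach(r => println(s"order: $r")))
    clicks.map(_._2).foreachRDD(rdd => rdd.foreach(r => println(s"click: $r")))

    ssc.start()
    ssc.awaitTermination()

One caveat: while the receivers run in parallel, by default Spark Streaming runs the output jobs of a batch one at a time, so you may want spark.streaming.concurrentJobs > 1 if the per-topic jobs should also be scheduled concurrently.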
From: Tathagata Das [mailto:t...@databricks.com]
Sent: Thursday, April 16, 2015 12:52 AM
To: Jianshi Huang
Cc: user; Shao, Saisai; Huang Jie
Subject: Re: How to do dispatching in Streaming?

It may be worthwhile to architect the computation in a different way.

    dstream.foreachRDD { rdd =>
      rdd.foreach { record =>
        // do different things for each record based on filters
      }
    }

TD

On Sun, Apr 12, 2015 at 7:52 PM, Jianshi Huang <jianshi.hu...@gmail.com> wrote:

Hi,

I have a Kafka topic that contains dozens of different types of messages, and for each type I need to create a DStream. Currently I have to filter the Kafka stream over and over, which is very inefficient.

So what's the best way to do dispatching in Spark Streaming? (one DStream -> multiple DStreams)

Thanks,
--
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
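A minimal sketch of the single-pass dispatch TD suggests, filled out so it compiles; the Message type, its msgType field, and the handler functions are hypothetical placeholders:

    import org.apache.spark.streaming.dstream.DStream

    // Hypothetical record type and handlers for illustration.
    case class Message(msgType: String, payload: String)

    def handleOrder(m: Message): Unit = println(s"order: ${m.payload}")
    def handleClick(m: Message): Unit = println(s"click: ${m.payload}")

    def dispatch(stream: DStream[Message]): Unit =
      stream.foreachRDD { rdd =>
        rdd.foreach { record =>
          // One pass over each batch: branch per record instead of
          // filtering the same stream once per message type.
          record.msgType match {
            case "order" => handleOrder(record)
            case "click" => handleClick(record)
            case _       => // unknown type: drop or log
          }
        }
      }

Note that the body of rdd.foreach runs on the executors, so anything the handlers close over must be serializable.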