You can get the topic for a given partition from the offset ranges. You can either filter on that, or just have a single RDD and match on the topic when doing mapPartitions or foreachPartition (which I think is the better idea).
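A rough sketch of that second approach, assuming Spark 1.x's direct stream API; the topic names, broker address, and processing bodies are placeholders:

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.TaskContext
import org.apache.spark.streaming.kafka._
import org.apache.spark.streaming.{Seconds, StreamingContext}

// kafkaParams / topics are illustrative; substitute your own
val kafkaParams = Map("metadata.broker.list" -> "broker:9092")
val topics = Set("topicA", "topicB")

val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, topics)

stream.foreachRDD { rdd =>
  // With the direct stream, RDD partitions correspond 1:1 to Kafka
  // topic-partitions, so each Spark partition contains exactly one topic.
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges

  rdd.foreachPartition { iter =>
    // Look up which topic this partition came from
    val topic = offsetRanges(TaskContext.get.partitionId).topic
    topic match {
      case "topicA" => iter.foreach { record => /* topicA-specific processing */ }
      case "topicB" => iter.foreach { record => /* topicB-specific processing */ }
    }
  }
}
```

The cast to HasOffsetRanges has to happen on the first RDD in the chain (before any shuffle), since that's the only point where the partition-to-topic mapping still holds.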
http://spark.apache.org/docs/latest/streaming-kafka-integration.html#approach-2-direct-approach-no-receivers

On Wed, Sep 30, 2015 at 5:02 PM, Udit Mehta <ume...@groupon.com> wrote:
> Hi,
>
> I am using spark direct stream to consume from multiple topics in Kafka. I
> am able to consume fine but I am stuck at how to separate the data for each
> topic since I need to process data differently depending on the topic.
> I basically want to split the RDD consisting on N topics into N RDD's each
> having 1 topic.
>
> Any help would be appreciated.
>
> Thanks in advance,
> Udit