Hrrm, that's interesting. Did you try with subscribe pattern, out of curiosity?
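Something like this is what I had in mind -- just a rough, untested sketch, reusing the ssc and kafkaParams from your snippet and swapping the fixed topic list for a regex (the pattern itself is made up):

    import java.util.regex.Pattern
    import org.apache.spark.streaming.kafka010.KafkaUtils
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.SubscribePattern

    // ssc and kafkaParams as in your existing code; the regex is just a placeholder
    val stream = KafkaUtils.createDirectStream[Array[Byte], Array[Byte]](
        ssc, PreferConsistent,
        SubscribePattern[Array[Byte], Array[Byte]](Pattern.compile("mytopic.*"), kafkaParams))
      .map(r => (r.key(), r.value()))

If I'm remembering the consumer behavior right, new partitions are only noticed after a metadata refresh, so it could take up to metadata.max.age.ms after the --alter before either Subscribe or SubscribePattern sees them.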
I haven't tested repartitioning on the underlying new Kafka consumer, so
it's possible I misunderstood something.

On Aug 12, 2016 2:47 PM, "Srikanth" <srikanth...@gmail.com> wrote:

> I did try a test with spark 2.0 + spark-streaming-kafka-0-10-assembly.
> The partition count was increased using "bin/kafka-topics.sh --alter" after
> the spark job was started.
> I don't see messages from the new partitions in the DStream.
>
>     KafkaUtils.createDirectStream[Array[Byte], Array[Byte]](
>         ssc, PreferConsistent,
>         Subscribe[Array[Byte], Array[Byte]](topics, kafkaParams))
>       .map(r => (r.key(), r.value()))
>
> Also, the number of partitions did not increase either.
>
>     dataStream.foreachRDD( (rdd, curTime) => {
>       logger.info(s"rdd has ${rdd.getNumPartitions} partitions.")
>
> Should I be setting some parameter/config? Is the doc for the new
> integration available?
>
> Thanks,
> Srikanth
>
> On Fri, Jul 22, 2016 at 2:15 PM, Cody Koeninger <c...@koeninger.org> wrote:
>
>> No, restarting from a checkpoint won't do it; you need to re-define the
>> stream.
>>
>> Here's the jira for the 0.10 integration:
>>
>> https://issues.apache.org/jira/browse/SPARK-12177
>>
>> I haven't gotten docs completed yet, but there are examples at
>>
>> https://github.com/koeninger/kafka-exactly-once/tree/kafka-0.10
>>
>> On Fri, Jul 22, 2016 at 1:05 PM, Srikanth <srikanth...@gmail.com> wrote:
>>
>>> In Spark 1.x, if we restart from a checkpoint, will it read from new
>>> partitions?
>>>
>>> If you can, please point us to some doc/link that talks about the
>>> Kafka 0.10 integration in Spark 2.0.
>>>
>>> On Fri, Jul 22, 2016 at 1:33 PM, Cody Koeninger <c...@koeninger.org> wrote:
>>>
>>>> For the kafka 0.8 integration, you are literally starting a
>>>> streaming job against a fixed set of topicpartitions. That set will not
>>>> change throughout the job, so you'll need to restart the spark job if
>>>> you change kafka partitions.
>>>>
>>>> For the kafka 0.10 / spark 2.0 integration, if you use subscribe
>>>> or subscribepattern, it should pick up new partitions as they are
>>>> added.
>>>>
>>>> On Fri, Jul 22, 2016 at 11:29 AM, Srikanth <srikanth...@gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I'd like to understand how Spark Streaming (direct) would handle Kafka
>>>>> partition addition. Will a running job be aware of new partitions and
>>>>> read from them, given that it uses Kafka APIs to query offsets and
>>>>> offsets are handled internally?
>>>>>
>>>>> Srikanth