Re: Not all KafkaReceivers processing the data Why?

2016-09-14 Thread Jeremy Smith
Take a look at how the messages are actually distributed across the
partitions. If the message keys have a low cardinality, you might get poor
distribution (i.e. all the messages are actually only in two of the five
partitions, leading to what you see in Spark).

If you take a look at the Kafka data directories, you can probably get an
idea of the distribution by just examining the sizes of each partition.

Jeremy

On Wed, Sep 14, 2016 at 12:33 PM, Rachana Srivastava <
rachana.srivast...@markmonitor.com> wrote:

> Hello all,
>
>
>
> I have created a Kafka topic with 5 partitions.  And I am using
> createStream receiver API like following.   But somehow only one receiver
> is getting the input data. Rest of receivers are not processign anything.
> Can you please help?
>
>
>
> JavaPairDStream messages = null;
>
>
>
> if(sparkStreamCount > 0){
>
> // We create an input DStream for each partition of the
> topic, unify those streams, and then repartition the unified stream.
>
> List> kafkaStreams = new
> ArrayList>(sparkStreamCount);
>
> for (int i = 0; i < sparkStreamCount; i++) {
>
> kafkaStreams.add(
> KafkaUtils.createStream(jssc, contextVal.getString(KAFKA_ZOOKEEPER),
> contextVal.getString(KAFKA_GROUP_ID), kafkaTopicMap));
>
> }
>
> messages = jssc.union(kafkaStreams.get(0),
> kafkaStreams.subList(1, kafkaStreams.size()));
>
> }
>
> else{
>
> messages =  KafkaUtils.createStream(jssc,
> contextVal.getString(KAFKA_ZOOKEEPER), contextVal.getString(KAFKA_GROUP_ID),
> kafkaTopicMap);
>
> }
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>


Not all KafkaReceivers processing the data Why?

2016-09-14 Thread Rachana Srivastava
Hello all,

I have created a Kafka topic with 5 partitions.  And I am using createStream 
receiver API like following.   But somehow only one receiver is getting the 
input data. Rest of receivers are not processign anything.  Can you please help?

JavaPairDStream messages = null;

if(sparkStreamCount > 0){
// We create an input DStream for each partition of the topic, 
unify those streams, and then repartition the unified stream.
List> kafkaStreams = new 
ArrayList>(sparkStreamCount);
for (int i = 0; i < sparkStreamCount; i++) {
kafkaStreams.add( KafkaUtils.createStream(jssc, 
contextVal.getString(KAFKA_ZOOKEEPER), contextVal.getString(KAFKA_GROUP_ID), 
kafkaTopicMap));
}
messages = jssc.union(kafkaStreams.get(0), 
kafkaStreams.subList(1, kafkaStreams.size()));
}
else{
messages =  KafkaUtils.createStream(jssc, 
contextVal.getString(KAFKA_ZOOKEEPER), contextVal.getString(KAFKA_GROUP_ID), 
kafkaTopicMap);
}



[cid:image001.png@01D20E84.3558F520]