Hi Kanagha,

For reading, the KafkaConsumer that KafkaSpout uses internally ensures
that data is received in order per partition. Because the spout might
read multiple partitions but emits only a single (logical) output stream,
data from multiple partitions is interleaved within this output stream
(the relative order within each partition is preserved, though). How the
partitions are distributed downstream depends on the connection pattern
between your spout and the downstream bolt. (If you use shuffleGrouping,
the data of a single partition is distributed over all downstream bolt
instances. Each bolt instance still sees its share of a partition in
order, but it gets only some of that partition's data. After the first
bolt, Storm no longer guarantees the order, because the data of a single
partition is spread over multiple parallel bolt instances in this case.)

If you want each partition to be processed by a single bolt instance, you
need to extract the partitionId (i.e., add it to the Storm tuple) in the
spout and use fieldsGrouping on partitionId for the downstream bolts. I
guess KafkaSpout does not support this out of the box. You can either
patch KafkaSpout itself or inherit from it to build your own
"PartitionKafkaSpout" that adds the partitionId to the output tuples.
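
To make the difference between the two groupings concrete, here is a
rough plain-Java sketch. It does not use Storm at all: the class, the Msg
type, and the routing functions are made up for illustration, round-robin
merely stands in for shuffleGrouping, and a hash of partitionId modulo the
task count stands in for fieldsGrouping (Storm's real implementation
differs in its details).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation (no Storm dependency) of how tuples reach bolt tasks.
class PartitionGroupingSketch {

    // A message as a partition-aware spout might emit it (hypothetical type).
    static final class Msg {
        final int partitionId;
        final long offset;

        Msg(int partitionId, long offset) {
            this.partitionId = partitionId;
            this.offset = offset;
        }

        @Override
        public String toString() {
            return "p" + partitionId + "@" + offset;
        }
    }

    // Fields-grouping-like routing: the task depends only on partitionId,
    // so every message of one partition lands on the same task.
    static int fieldsTask(Msg m, int numTasks) {
        return Math.floorMod(Integer.hashCode(m.partitionId), numTasks);
    }

    // Routes the stream; returns task index -> messages in arrival order.
    static Map<Integer, List<Msg>> route(List<Msg> stream, int numTasks, boolean byPartition) {
        Map<Integer, List<Msg>> perTask = new LinkedHashMap<>();
        int next = 0; // round-robin counter standing in for shuffleGrouping
        for (Msg m : stream) {
            int task = byPartition ? fieldsTask(m, numTasks) : (next++ % numTasks);
            perTask.computeIfAbsent(task, t -> new ArrayList<>()).add(m);
        }
        return perTask;
    }

    // Counts how many tasks received at least one message of the partition.
    static long tasksSeeing(Map<Integer, List<Msg>> perTask, int partitionId) {
        return perTask.values().stream()
                .filter(msgs -> msgs.stream().anyMatch(m -> m.partitionId == partitionId))
                .count();
    }

    // Spout-like stream: two partitions interleaved, each in offset order.
    static List<Msg> sampleStream() {
        List<Msg> stream = new ArrayList<>();
        for (long offset = 0; offset < 4; offset++) {
            stream.add(new Msg(0, offset));
            stream.add(new Msg(1, offset));
        }
        return stream;
    }

    public static void main(String[] args) {
        List<Msg> stream = sampleStream();
        System.out.println("shuffle-like: " + route(stream, 3, false));
        System.out.println("fields-like:  " + route(stream, 3, true));
    }
}
```

In a real topology the corresponding wiring would be a fieldsGrouping on
the spout's "partitionId" output field, once the spout actually emits
that field.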

(Or maybe ask at [email protected] ;))

For writing, you are correct. KafkaBolt uses key-based partitioning on
write and if you use fieldsGrouping on the key, it should work as intended.
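
As a rough illustration of why this works, here is a plain-Java sketch
(again without Storm or Kafka; all names are made up, and
String.hashCode is only a stand-in, since Kafka's real partitioner uses a
different hash function). Because both the task and the partition are
derived from the key alone, all records for one key pass through a single
task and end up in a single partition.

```java
// Plain-Java illustration of key-based routing on the write path.
class KeyPartitioningSketch {

    // Stand-in for fieldsGrouping on the key field: same key -> same task.
    static int taskFor(String key, int numTasks) {
        return Math.floorMod(key.hashCode(), numTasks);
    }

    // Stand-in for key-based partitioning on write: same key -> same partition.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        String[] keys = {"user-1", "user-2", "user-1", "user-3", "user-1"};
        for (String key : keys) {
            // Repeated keys print the same task and partition every time.
            System.out.println(key + " -> task " + taskFor(key, 3)
                    + ", partition " + partitionFor(key, 4));
        }
    }
}
```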


-Matthias

On 06/05/2016 07:51 AM, Kanagha wrote:
> Hi,
> 
> I'm looking at the documentation for using KafkaSpout/KafkaBolt.
> 
> https://github.com/apache/storm/tree/master/external/storm-kafka
> 
> How is ordering guaranteed while reading messages from Kafka using
> KafkaSpout?
> Does the parallelism_hint set when a KafkaSpout is added to a topology,
> need to match the number of partitions in a topic?
> 
> Similarly while writing back to Kafka, I believe fieldsGrouping can be used
> so that tuples that have same field value will go to the same task and can
> be written to the same partition.
> Would like to get suggestions on this. Thanks!
> 
> Thanks
> Kanagha
> 
