Hi Kanagha,

For reading, the KafkaConsumer that KafkaSpout uses internally ensures that data is received in order per partition. Because the spout might read multiple partitions but emits only a single (logical) output stream, data from multiple partitions interleaves within this output stream (the relative order within each partition is preserved, though).

How the partitions are distributed depends on the connection pattern between your spout and the downstream bolt. (If you use shuffleGrouping, the data of a single partition is distributed over all downstream bolt instances. Order is still preserved within a partition, but each bolt instance gets only some of the data of each partition. After the first bolt, the order is not guaranteed by Storm any more, because the data of a single partition is spread out over multiple parallel bolt instances in this case.)
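
To make the effect of the grouping choice concrete, here is a small plain-Java simulation. It is only a sketch: it uses no Storm or Kafka classes, all names in it (GroupingOrderSketch, Msg, routeRoundRobin, routeByPartition) are made up for illustration, round robin stands in for shuffleGrouping so the result is deterministic, and "partition modulo task count" stands in for a grouping keyed on a partition id.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation only: no Storm or Kafka classes are used, and every
// name below is hypothetical.
class GroupingOrderSketch {

    // One emitted tuple: the Kafka partition it came from and its offset there.
    static final class Msg {
        final int partition;
        final long offset;

        Msg(int partition, long offset) {
            this.partition = partition;
            this.offset = offset;
        }
    }

    // Stand-in for one spout task that reads several partitions and emits them
    // as a single interleaved stream, in offset order within each partition.
    static List<Msg> interleave(int partitions, int perPartition) {
        List<Msg> stream = new ArrayList<>();
        for (long offset = 0; offset < perPartition; offset++) {
            for (int p = 0; p < partitions; p++) {
                stream.add(new Msg(p, offset));
            }
        }
        return stream;
    }

    private static List<List<Msg>> emptyTasks(int tasks) {
        List<List<Msg>> out = new ArrayList<>();
        for (int t = 0; t < tasks; t++) {
            out.add(new ArrayList<>());
        }
        return out;
    }

    // Shuffle-like routing: tuples are spread over tasks regardless of content.
    // Round robin is an assumption made here to keep the run deterministic.
    static List<List<Msg>> routeRoundRobin(List<Msg> stream, int tasks) {
        List<List<Msg>> out = emptyTasks(tasks);
        for (int i = 0; i < stream.size(); i++) {
            out.get(i % tasks).add(stream.get(i));
        }
        return out;
    }

    // Fields-like routing keyed on the partition id: the same partition id
    // always maps to the same task (modulo is an illustrative hash).
    static List<List<Msg>> routeByPartition(List<Msg> stream, int tasks) {
        List<List<Msg>> out = emptyTasks(tasks);
        for (Msg m : stream) {
            out.get(m.partition % tasks).add(m);
        }
        return out;
    }

    // Number of tasks that received at least one tuple of the given partition.
    static int tasksSeeingPartition(List<List<Msg>> routed, int partition) {
        int count = 0;
        for (List<Msg> task : routed) {
            for (Msg m : task) {
                if (m.partition == partition) {
                    count++;
                    break;
                }
            }
        }
        return count;
    }

    // True if, on every task, offsets of each partition are strictly increasing.
    static boolean orderedPerPartition(List<List<Msg>> routed) {
        for (List<Msg> task : routed) {
            Map<Integer, Long> last = new HashMap<>();
            for (Msg m : task) {
                Long prev = last.put(m.partition, m.offset);
                if (prev != null && prev >= m.offset) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Msg> stream = interleave(3, 4);
        List<List<Msg>> shuffled = routeRoundRobin(stream, 2);
        List<List<Msg>> keyed = routeByPartition(stream, 2);
        System.out.println("shuffle-like: partition 0 seen by "
                + tasksSeeingPartition(shuffled, 0) + " task(s)");
        System.out.println("keyed on partition: partition 0 seen by "
                + tasksSeeingPartition(keyed, 0) + " task(s)");
    }
}
```

In this simulation each task still sees the tuples of a partition in offset order under both routings. What changes is that the shuffle-like routing splits partition 0 across both tasks, while the routing keyed on the partition id keeps it on one task.
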

If you want each partition to be processed by a single bolt instance, you need to extract the partitionId in the spout (i.e., add it to the Storm tuple) and use fieldsGrouping on partitionId for the downstream bolts. I guess KafkaSpout does not support this out of the box -- you can either patch KafkaSpout itself, or inherit from it to build your own "PartitionKafkaSpout" that adds the partitionId to the output tuples. (Or maybe ask at [email protected] ;))

For writing, you are correct. KafkaBolt uses key-based partitioning on write, and if you use fieldsGrouping on the key, it should work as intended.

-Matthias

On 06/05/2016 07:51 AM, Kanagha wrote:
> Hi,
>
> I'm looking at the documentation for using KafkaSpout/KafkaBolt.
>
> https://github.com/apache/storm/tree/master/external/storm-kafka
>
> How is ordering guaranteed while reading messages from Kafka using
> KafkaSpout?
> Does the parallelism_hint set when a KafkaSpout is added to a topology,
> need to match the number of partitions in a topic?
>
> Similarly while writing back to Kafka, I believe fieldsGrouping can be used
> so that tuples that have same field value will go to the same task and can
> be written to the same partition.
> Would like to get suggestions on this. Thanks!
>
> Thanks
> Kanagha
>
