Hi Raymond,

If you want all messages delivered in order, then you should create the topic with 1 partition. If you want ordering guarantees for messages with the same key, then you need to produce the messages with a key.
Using the console producer you can do that by adding
--property "parse.key=true" --property "key.separator=,"

Regards,
Damian

On Sat, 26 May 2018 at 21:32, Raymond Xie <xie3208...@gmail.com> wrote:

> Thank you so much Hans for the enlightening explanation; it is
> definitely very helpful to me as a beginner.
>
> So for my case, what are the right options I should put together to run
> the commands for producer and consumer respectively?
>
> Thanks.
>
> *------------------------------------------------*
> *Sincerely yours,*
>
> *Raymond*
>
> On Sat, May 26, 2018 at 4:26 PM, Hans Jespersen <h...@confluent.io> wrote:
>
> > There are two concepts in Kafka that are not always familiar to people
> > who have used other pub/sub systems.
> >
> > 1) partitions:
> >
> > Kafka topics are partitioned, which means a single topic is sharded into
> > multiple pieces that are distributed across multiple brokers in the
> > cluster for parallel processing.
> >
> > Order is guaranteed per partition (not per topic).
> >
> > You can think of each Kafka topic partition like an exclusive queue in
> > traditional messaging systems, and order is not guaranteed when the data
> > is spread out across multiple queues in traditional messaging either.
> >
> > 2) keys
> >
> > Kafka messages have keys in addition to the value (i.e. body) and the
> > header. When messages are published with the same key they will all be
> > sent in order to the same partition.
> >
> > If messages are published with a “null” key then they will be spread out
> > round robin across all partitions (which is what you have done).
> >
> > Conclusion
> >
> > You will see ordered delivery if you either use a key when you publish
> > or create a topic with one partition.
> >
> > -hans
> >
> > On May 26, 2018, at 7:59 AM, Raymond Xie <xie3208...@gmail.com> wrote:
> >
> > Thanks. By default, can you explain to me why I received the messages in
> > the wrong order?
> > Note there are only 9 lines from 1 to 9, but on the consumer side
> > their original order becomes messed up.
> >
> > ~~~sent from my cell phone, sorry if there is any typo
> >
> > On Sat, May 26, 2018 at 12:16 AM, Hans Jespersen <h...@confluent.io> wrote:
> >
> >> If you create a topic with one partition they will be in order.
> >>
> >> Alternatively if you publish with the same key for every message they
> >> will be in the same order even if your topic has more than 1 partition.
> >>
> >> Either way above will work for Kafka.
> >>
> >> -hans
> >>
> >> > On May 25, 2018, at 8:56 PM, Raymond Xie <xie3208...@gmail.com> wrote:
> >> >
> >> > Hello,
> >> >
> >> > I just started learning Kafka and have the environment set up on my
> >> > Hortonworks sandbox in VMware at home.
> >> >
> >> > test.csv is what I want the producer to send out:
> >> >
> >> > more test1.csv | ./kafka-console-producer.sh --broker-list sandbox.hortonworks.com:6667 --topic kafka-topic2
> >> >
> >> > 1, abc
> >> > 2, def
> >> > ...
> >> > 8, vwx
> >> > 9, zzz
> >> >
> >> > What I received was all the content of test.csv, but not in its
> >> > original order:
> >> >
> >> > kafka-console-consumer.sh --zookeeper 192.168.112.129:2181 --topic kafka-topic2
> >> >
> >> > 2, def
> >> > 1, abc
> >> > ...
> >> > 9, zzz
> >> > 8, vwx
> >> >
> >> > I read on Google that partitioning could be a feasible solution;
> >> > however, my questions are:
> >> >
> >> > 1. For small files like this one, should I really do the partitioning?
> >> > How small a partition would be acceptable to ensure the sequence?
> >> > 2. For big files, each partition could still contain multiple lines;
> >> > how do I ensure the lines in each partition won't get messed up on
> >> > the consumer side?
> >> >
> >> > I also want to know what the best practice is for processing large
> >> > volumes of data through Kafka. There should be a better way than
> >> > console commands.
> >> >
> >> > Thank you very much.
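To make the two ordering rules discussed in this thread concrete, here is a toy Python model of how records end up in partitions. It is only a sketch and does not talk to Kafka: the function names `assign_partition` and `produce` are hypothetical, CRC32 is a stand-in for Kafka's real key hash (the default partitioner uses a different hash function), and the round-robin handling of null keys simply follows Hans's description above. The sample lines are taken from the test file in the original post.

```python
import zlib
from collections import defaultdict
from itertools import count


def assign_partition(key, num_partitions, round_robin):
    """Toy partitioner: hash the key if there is one, otherwise rotate."""
    if key is None:
        # Null key: spread records across partitions in turn (assumption
        # based on the round-robin behaviour described in the thread).
        return next(round_robin) % num_partitions
    # Same key always maps to the same partition. CRC32 is a stand-in,
    # not the hash Kafka actually uses.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def produce(records, num_partitions):
    """Append (key, value) records to per-partition logs in send order."""
    partitions = defaultdict(list)
    round_robin = count()
    for key, value in records:
        target = assign_partition(key, num_partitions, round_robin)
        partitions[target].append(value)
    return dict(partitions)


lines = ["1, abc", "2, def", "8, vwx", "9, zzz"]

# No key, two partitions: the lines are split up, so a consumer reading
# both partitions has no guarantee about the overall order.
unkeyed = produce([(None, v) for v in lines], num_partitions=2)

# Same key for every line: everything lands in one partition, in order.
same_key = produce([("test1.csv", v) for v in lines], num_partitions=2)

# One partition: order is preserved even without a key.
single = produce([(None, v) for v in lines], num_partitions=1)

print(unkeyed)
print(same_key)
print(single)
```

In this model each partition keeps its own records in send order, which mirrors the "order is guaranteed per partition (not per topic)" rule; the interleaving a consumer sees across partitions is what the model leaves undefined.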