If you specify a key with each message then all messages with the same key get 
sent to the same partition.
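
To illustrate the idea, here is a minimal sketch in Python. It is not Kafka client code: Kafka's default partitioner hashes the serialized key with murmur2, while this sketch uses `zlib.crc32` as a stand-in. `NUM_PARTITIONS`, `partition_for`, and `route` are hypothetical names made up for the example.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count for the topic


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Pick a partition by hashing the key bytes and taking the modulo.

    crc32 is only a stand-in for the hash a real partitioner would use.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(messages):
    """Append each (key, value) message to the partition chosen by its key."""
    partitions = defaultdict(list)
    for key, value in messages:
        partitions[partition_for(key)].append((key, value))
    return partitions


# All events for "item-1" land in one partition, in the order they were sent.
messages = [
    ("item-1", "created"),
    ("item-2", "created"),
    ("item-1", "updated"),
    ("item-1", "deleted"),
]
routed = route(messages)
```

With a real producer you would get the same effect by using the item Id as the record key when sending. Note that the mapping only stays stable as long as the number of partitions does not change.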

> On Dec 26, 2016, at 23:32, Ali Akhtar <ali.rac...@gmail.com> wrote:
>
> How would I route the messages to a specific partition?
>
>> On 27 Dec 2016 10:25 a.m., "Asaf Mesika" <asaf.mes...@gmail.com> wrote:
>>
>> There is a much easier approach: you can route all messages of a given Id
>> to a specific partition. Since each partition has a single writer you get
>> the ordering you wish for. Of course this won't work if your updates occur
>> in different hosts.
>> Also maybe Kafka streams can help shard them based on item Id to a second
>> topic.
>>> On Thu, 22 Dec 2016 at 4:31 Ali Akhtar <ali.rac...@gmail.com> wrote:
>>>
>>> The batch size can be large, so in memory ordering isn't an option,
>>> unfortunately.
>>>
>>> On Thu, Dec 22, 2016 at 7:09 AM, Jesse Hodges <hodges.je...@gmail.com>
>>> wrote:
>>>
>>>> Depending on the expected max out of order window, why not order them in
>>>> memory? Then you don't need to reread from Cassandra, in case of a problem
>>>> you can reread data from Kafka.
>>>>
>>>> -Jesse
>>>>
>>>>> On Dec 21, 2016, at 7:24 PM, Ali Akhtar <ali.rac...@gmail.com> wrote:
>>>>>
>>>>> - I'm receiving a batch of messages to a Kafka topic.
>>>>>
>>>>> Each message has a timestamp, however the messages can arrive / get
>>>>> processed out of order. I.e. event 1's timestamp could've been a few
>>>>> seconds before event 2, and event 2 could still get processed before
>>>>> event 1.
>>>>>
>>>>> - I know the number of messages that are sent per batch.
>>>>>
>>>>> - I need to process the messages in order. The messages are basically
>>>>> providing the history of an item. I need to be able to track the history
>>>>> accurately (i.e., if an event occurred 3 times, I need to accurately log
>>>>> the dates of the first, 2nd, and 3rd time it occurred).
>>>>>
>>>>> The approach I'm considering is:
>>>>>
>>>>> - Creating a Cassandra table which is ordered by the timestamp of the
>>>>> messages.
>>>>>
>>>>> - Once a batch of messages has arrived, writing them all to Cassandra,
>>>>> counting on them being ordered by the timestamp even if they are
>>>>> processed out of order.
>>>>>
>>>>> - Then iterating over the messages in the Cassandra table, to process
>>>>> them in order.
>>>>>
>>>>> However, I'm concerned about Cassandra's eventual consistency. Could it
>>>>> be that even though I wrote the messages, they are not there when I try
>>>>> to read them (which would be almost immediately after they are written)?
>>>>>
>>>>> Should I enforce consistency = ALL to make sure the messages will be
>>>>> available immediately after being written?
>>>>>
>>>>> Is there a better way to handle this thru either Kafka streams or
>>>>> Cassandra?
>>>>
>>>
>>