How would I route the messages to a specific partition?

On 27 Dec 2016 10:25 a.m., "Asaf Mesika" <asaf.mes...@gmail.com> wrote:

> There is a much easier approach: you can route all messages of a given Id
> to a specific partition. Since each partition has a single writer you get
> the ordering you wish for. Of course this won't work if your updates occur
> in different hosts.
> Also, maybe Kafka Streams can help shard them based on item Id to a second
> topic.
> On Thu, 22 Dec 2016 at 4:31 Ali Akhtar <ali.rac...@gmail.com> wrote:
>
> > The batch size can be large, so in memory ordering isn't an option,
> > unfortunately.
> >
> > On Thu, Dec 22, 2016 at 7:09 AM, Jesse Hodges <hodges.je...@gmail.com>
> > wrote:
> >
> > > Depending on the expected max out of order window, why not order them in
> > > memory? Then you don't need to reread from Cassandra, in case of a problem
> > > you can reread data from Kafka.
> > >
> > > -Jesse
> > >
> > > > On Dec 21, 2016, at 7:24 PM, Ali Akhtar <ali.rac...@gmail.com> wrote:
> > > >
> > > > - I'm receiving a batch of messages to a Kafka topic.
> > > >
> > > > > Each message has a timestamp, however the messages can arrive / get
> > > > > processed out of order. I.e. event 1's timestamp could've been a few seconds
> > > > > before event 2, and event 2 could still get processed before event 1.
> > > >
> > > > - I know the number of messages that are sent per batch.
> > > >
> > > > > - I need to process the messages in order. The messages are basically
> > > > > providing the history of an item. I need to be able to track the history
> > > > > accurately (i.e., if an event occurred 3 times, I need to accurately log the
> > > > > dates of the first, 2nd, and 3rd time it occurred).
> > > >
> > > > The approach I'm considering is:
> > > >
> > > > > - Creating a Cassandra table which is ordered by the timestamp of the
> > > > > messages.
> > > >
> > > > > - Once a batch of messages has arrived, writing them all to Cassandra,
> > > > > counting on them being ordered by the timestamp even if they are processed
> > > > > out of order.
> > > >
> > > > > - Then iterating over the messages in the Cassandra table, to process
> > > > > them in order.
> > > >
> > > > > However, I'm concerned about Cassandra's eventual consistency. Could it
> > > > > be that even though I wrote the messages, they are not there when I try to
> > > > > read them (which would be almost immediately after they are written)?
> > > >
> > > > > Should I enforce consistency = ALL to make sure the messages will be
> > > > > available immediately after being written?
> > > >
> > > > > Is there a better way to handle this thru either Kafka streams or
> > > > > Cassandra?
> > >
> >
>
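
On the routing question: as far as I know, the usual way is to set the
record key to the item Id when producing (in the Java client, something
like new ProducerRecord<>(topic, itemId, value)). For records with a
non-null key, the default partitioner hashes the key and takes it modulo
the partition count, so every message for one Id lands in the same
partition. Below is a minimal, Kafka-free sketch of that idea only. The
names partition_for, route and NUM_PARTITIONS are made up for the
illustration, and CRC32 stands in for the murmur2 hash Kafka actually uses.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumed partition count, for illustration only


def partition_for(item_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map an item Id to a partition: stable hash of the key, modulo the count.

    CRC32 is a stand-in; Kafka's default partitioner uses murmur2.
    """
    return zlib.crc32(item_id.encode("utf-8")) % num_partitions


def route(messages):
    """Group (item_id, payload) pairs by partition, preserving send order."""
    partitions = defaultdict(list)
    for item_id, payload in messages:
        partitions[partition_for(item_id)].append((item_id, payload))
    return partitions


messages = [
    ("item-1", "created"),
    ("item-2", "created"),
    ("item-1", "updated"),
    ("item-1", "deleted"),
]
routed = route(messages)

# All events for item-1 sit in one partition, in the order they were sent.
item1_events = [m for m in routed[partition_for("item-1")] if m[0] == "item-1"]
```

Note that this only keeps the order in which one producer sent the
messages for a given Id. If updates for the same Id are produced from
different hosts, their relative order is still not guaranteed, as Asaf
pointed out.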
