Hi there,

I'm working on a project that needs delayed event processing. Example:

Today a certain set of conditions X is met, and in 7 days and 12 hours
(delay Y) we want to take action Z.

The set of conditions X is out of scope for this problem.
The delay Y will be at least 1 minute and at most 90 days, with
one-minute resolution.
The action Z will be described by a JSON string.
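
To make the minute resolution concrete, here is a minimal sketch of how the
row key for an action could be derived. The function name bucket_key and the
YYYYMMDDHHMM key format are just assumptions for illustration, not something
we have settled on; "now" is expected to be a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone

MIN_DELAY = timedelta(minutes=1)   # smallest allowed delay Y
MAX_DELAY = timedelta(days=90)     # largest allowed delay Y


def bucket_key(now: datetime, delay: timedelta) -> str:
    """Return the minute-resolution row key at which an action becomes due.

    The key format (UTC, YYYYMMDDHHMM) is a hypothetical choice.
    """
    if not (MIN_DELAY <= delay <= MAX_DELAY):
        raise ValueError("delay must be between 1 minute and 90 days")
    due = (now + delay).astimezone(timezone.utc)
    # Truncate to the minute so all actions due in the same minute share a row.
    due = due.replace(second=0, microsecond=0)
    return due.strftime("%Y%m%d%H%M")


# Example from above: conditions met now, action due in 7 days and 12 hours.
now = datetime(2013, 7, 1, 9, 30, 45, tzinfo=timezone.utc)
key = bucket_key(now, timedelta(days=7, hours=12))
```

With this scheme every action scheduled for the same minute ends up under the
same row key, which is what lets us read and delete a whole row at once.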

Other details:
- for every point in the future there are probably hundreds of actions
which have to be processed
- all actions for a point in time will be processed at once (thus not
removing action by action as a typical queue would do)
- once all actions have been processed we remove the entire row (by key,
not the individual columns)
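
The access pattern in the list above can be sketched with a small in-memory
stand-in. This is not Cassandra client code: DelayedActionStore, schedule and
drain are hypothetical names, and a plain dict plays the role of the column
family (row key -> {timeuuid: JSON string}).

```python
import json
import uuid
from collections import defaultdict


class DelayedActionStore:
    """In-memory stand-in for the column family (illustration only)."""

    def __init__(self):
        # row key (point in time) -> {time-based UUID: JSON payload}
        self.rows = defaultdict(dict)

    def schedule(self, bucket: str, action: dict) -> uuid.UUID:
        """Add one action as a new column in the row for this point in time."""
        column = uuid.uuid1()  # time-based UUID, standing in for a timeuuid
        self.rows[bucket][column] = json.dumps(action)
        return column

    def drain(self, bucket: str) -> list:
        """Return every action in the row and drop the whole row by key."""
        columns = self.rows.pop(bucket, {})
        return [json.loads(payload) for payload in columns.values()]


store = DelayedActionStore()
store.schedule("201307082130", {"type": "email"})
store.schedule("201307082130", {"type": "sms"})
due_actions = store.drain("201307082130")  # all actions at once, row removed
```

The point of the sketch is that nothing is ever removed column by column: a
row is only written to until its minute arrives, then read once and deleted.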

Now I'm aware that Cassandra is not well suited for queues, which is
basically what we want here, except that we would have hundreds or
thousands of them, one for each specific point in time in the future. See
http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets
for more details.

Basically I expect the column family to contain around 130K (90*1440) rows
at most. Every row will have between 1 and a few million columns (expected
average around a few hundred thousand), where the column name is a timeuuid
and the value is a byte array / string of a few kilobytes at most.

Why not use real queues? We currently have Kafka and Cassandra in our infra
that could be suitable for this. However, creating a queue for every
possible moment in time would require a lot of queues (up to ~130K), which
is not what Kafka is meant for.

Why not a different datastore? We could use Redis, which I think might be
better suited for this, but we don't have that running.

I think this should work well given the details and usage pattern.

Do you have any thoughts on using Cassandra for this? Or should we use a
different system, like Redis?

Thanks.

Best regards,

Robin Verlangen
*Chief Data Architect*

W http://www.robinverlangen.nl
E ro...@us2.nl
