You can read more about it at https://www.datastax.com/blog/cassandra-anti-patterns-queues-and-queue-datasets

Your table schema does not include the most important piece of information: the partition key (and clustering keys, if any). Keep in mind that you can only query Cassandra efficiently by the exact partition key or the token of a partition key; otherwise you will have to rely on a materialized view or a secondary index, or worse, scan the entire table (all the nodes) to find your data.

A Cassandra schema should look like this:

CREATE TABLE xyz (
  a text,
  b text,
  c timeuuid,
  d int,
  e text,
  PRIMARY KEY ((a, b), c, d)
);

The line "PRIMARY KEY" contains arguably the most important piece of information of the table schema.
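
For example, with the table above, these are the kinds of queries Cassandra can and cannot serve efficiently (illustrative values only):

-- efficient: reads a single partition by its full partition key
SELECT * FROM xyz WHERE a = 'foo' AND b = 'bar';

-- also efficient: reads a token range of partitions
-- (this is how full-table scans should be broken up)
SELECT * FROM xyz WHERE token(a, b) >= -9223372036854775808 AND token(a, b) < 0;

-- inefficient: filters on a non-key column, which means scanning all nodes
-- (Cassandra rejects this unless ALLOW FILTERING is added)
SELECT * FROM xyz WHERE e = 'some value';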


On 19/02/2024 06:52, Gowtham S wrote:
Hi Bowen

    which is a well documented anti-pattern.

Can you please explain more about this? I'm not aware of it, and it will be helpful for making decisions.
Please find the table schema below.

*Table schema*
TopicName - text
Partition - int
MessageUUID - text
Actual data - text
OccurredTime - timestamp
Status - boolean

We are planning to read from the table by topic name where the status is not true, and produce those messages to the respective topic once Kafka is live again.
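
For illustration, assuming the table is called kafka_messages (a placeholder name) and TopicName alone is the partition key (nothing is decided yet), the read would look roughly like:

SELECT * FROM kafka_messages
WHERE TopicName = 'some-topic'
  AND Status = false
ALLOW FILTERING;  -- Status is not a key column, so Cassandra requires ALLOW FILTERING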

Thanks and regards,
Gowtham S


On Sat, 17 Feb 2024 at 18:10, Bowen Song via user <user@cassandra.apache.org> wrote:

    Hi Gowtham,

    On the face of it, it sounds like you are planning to use
    Cassandra for a queue-like application, which is a well documented
    anti-pattern. If that's not the case, can you please show the
    table schema and some example queries?

    Cheers,
    Bowen

    On 17/02/2024 08:44, Gowtham S wrote:

    Dear Cassandra Community,

    I am reaching out to seek your valuable feedback and insights on
    a proposed solution we are considering for managing Kafka outages
    using Cassandra.

    At our organization, we heavily rely on Kafka for real-time data
    processing and messaging. However, like any technology, Kafka is
    susceptible to occasional outages which can disrupt our
    operations and impact our services. To mitigate the impact of
    such outages and ensure continuity, we are exploring the
    possibility of leveraging Cassandra as a backup solution.

    Our proposed approach involves storing messages in Cassandra
    during Kafka outages. Subsequently, we plan to implement a
    scheduler that will read from Cassandra and attempt to write
    these messages back into Kafka once it is operational again.

    We believe that by adopting this strategy, we can achieve the
    following benefits:

    1. Improved Fault Tolerance: By having a backup mechanism in
       place, we can reduce the risk of data loss and ensure
       continuity of operations during Kafka outages.

    2. Enhanced Reliability: Cassandra's distributed architecture
       and built-in replication features make it well-suited for
       storing data reliably, even in the face of failures.

    3. Scalability: Both Cassandra and Kafka are designed to scale
       horizontally, allowing us to handle increased loads seamlessly.

    Before proceeding further with this approach, we would greatly
    appreciate any feedback, suggestions, or concerns from the
    community. Specifically, we are interested in hearing about:

      * Potential challenges or drawbacks of using Cassandra as a
        backup solution for Kafka outages.
      * Best practices or recommendations for implementing such a
        backup mechanism effectively.
      * Any alternative approaches or technologies that we should
        consider.

    Your expertise and insights are invaluable to us, and we are
    eager to learn from your experiences and perspectives. Please
    feel free to share your thoughts or reach out to us with any
    questions or clarifications.

    Thank you for taking the time to consider our proposal, and we
    look forward to hearing from you soon.

    Thanks and regards,
    Gowtham S
