This sounds like a really bad idea.

In Cassandra 4.0 RC1, when you have more than 150 tables or 40 keyspaces (code reference <https://github.com/apache/cassandra/blob/3282f5ecf187ecbb56b8d73ab9a9110c010898b0/src/java/org/apache/cassandra/config/Config.java#L562>), Cassandra will warn you about it:

   /Cluster already contains %d tables in %d keyspaces. Having a large
   number of tables will significantly slow down schema dependent
   cluster operations./

The warning exists for a good reason. You really need to reconsider your data model.
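To see how far a cluster already is from those thresholds, something like the following could work. It is only a sketch: it assumes `session` is a DataStax Python driver `Session` (or anything with a compatible `execute`), the function names are mine, and it counts every row in `system_schema`, including system keyspaces, which may not match exactly how Cassandra itself counts before it warns.

```python
# Thresholds as quoted above for Cassandra 4.0 RC1.
TABLE_WARN_THRESHOLD = 150
KEYSPACE_WARN_THRESHOLD = 40


def schema_size(session):
    """Return (table_count, keyspace_count) as seen by the connected node.

    `session` is assumed to be a driver Session, or any object whose
    execute(query) returns an iterable of rows.
    """
    tables = sum(1 for _ in session.execute(
        "SELECT keyspace_name, table_name FROM system_schema.tables"))
    keyspaces = sum(1 for _ in session.execute(
        "SELECT keyspace_name FROM system_schema.keyspaces"))
    return tables, keyspaces


def over_warn_threshold(tables, keyspaces):
    """True when either count is above the thresholds quoted above."""
    return (tables > TABLE_WARN_THRESHOLD
            or keyspaces > KEYSPACE_WARN_THRESHOLD)
```

At around 10k new tables per month, `over_warn_threshold` would return True within the first day.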


On 30/05/2021 10:12, Sébastien Rebecchi wrote:
Hello,

I have a more general question about that, for which I cannot find a clear answer.

In my use case I have many tables (around 10k new tables created per month), and they are created only dynamically, from many clients, with several clients creating the same tables simultaneously.

What is the recommended way of creating tables dynamically? If I use "if not exists" queries and wait for schema agreement before and after each create statement, will that work correctly in Cassandra?

Sébastien.

On Fri, 28 May 2021 at 20:45, Sébastien Rebecchi <srebec...@kameleoon.com> wrote:

    Thank you for your answer.

    If I still send all my create operations from many clients, but
    always to the same single coordinator node, would that prevent
    schema mismatch?

    Sébastien.


    On Fri, 28 May 2021 at 01:14, Kane Wilson <k...@raft.so> wrote:

            Which client operations could trigger a schema change at
            node level? Do you mean that, for example, creating a new
            table triggers a schema change globally, not only at the
            single keyspace/table level?

        Yes, any DDL statement (creating tables, altering, dropping,
        etc) triggers a schema change across the cluster (globally).
        All nodes need to be told of this schema change.

            I don't have schema changes, except keyspace and table
            creations. But they are indeed done from multiple sources,
            on demand, with a "create if not exists" statement. Thank
            you for your answer, I will see if I can precreate them
            then.

        Yep, definitely do that. You don't want to be issuing
        simultaneous create statements from different clients. IF NOT
        EXISTS won't necessarily catch all cases.
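A rough sketch of what pre-creating from a single place could look like: one process issues the DDL statements strictly one after another and stops as soon as the cluster does not settle. This assumes the DataStax Python driver, whose `ResponseFuture` exposes `is_schema_agreed` after a DDL request. The function name and the idea of passing in a list of statements are illustrative only.

```python
def precreate_tables(session, ddl_statements):
    """Run DDL statements sequentially from one process.

    `session` is assumed to be a DataStax Python driver Session.
    Returns the number of statements applied. Raises RuntimeError at
    the first statement after which the driver did not observe schema
    agreement, so no further DDL piles up on a disagreeing cluster.
    """
    applied = 0
    for ddl in ddl_statements:
        future = session.execute_async(ddl)
        future.result()  # raises if the statement itself failed
        if not future.is_schema_agreed:
            raise RuntimeError(
                "schema agreement not reached after: " + ddl)
        applied += 1
    return applied
```

How long the driver waits for agreement is governed by the `max_schema_agreement_wait` setting on `Cluster`. This only helps if nothing else in the system issues DDL at the same time.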

            As for the schema mismatch, what is the best way of fixing
            that issue? Could Cassandra recover from that on its own,
            or is there a nodetool command to force schema agreement?
            I have heard that we have to restart the nodes one by one,
            but that seems a very heavy procedure.

        A rolling restart is usually enough to fix the issue. You
        might want to repair afterwards, and check that data didn't
        make it to different versions of the table on different nodes
        (in which case some more intervention may be necessary to save
        that data).
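Before and after a rolling restart it helps to know whether the nodes actually disagree. `nodetool describecluster` lists the schema versions. Roughly the same check can be sketched from a client by reading `schema_version` from `system.local` and `system.peers`. The helper names below are mine, and `session` is assumed to be a DataStax Python driver `Session` with the default named-tuple rows. Note that the two queries may be served by different coordinators unless the session is pinned to one host, and that peers which are down can show stale values.

```python
from collections import defaultdict


def schema_versions(session):
    """Map each schema version to the nodes reporting it.

    The coordinator that answers the system.local query is labelled
    "local"; peers are labelled by their address.
    """
    versions = defaultdict(list)
    for row in session.execute(
            "SELECT schema_version FROM system.local"):
        versions[str(row.schema_version)].append("local")
    for row in session.execute(
            "SELECT peer, schema_version FROM system.peers"):
        versions[str(row.schema_version)].append(str(row.peer))
    return dict(versions)


def in_agreement(versions):
    """True when every node reported the same schema version."""
    return len(versions) == 1
```

If `in_agreement` stays False well after the last DDL statement, that is the situation the rolling restart is meant to fix.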
        -- raft.so <https://raft.so> - Cassandra consulting, support, and
        managed services
