You might have run into the bottleneck of the driver's IO thread. Try increasing the driver's connections-per-server limit to 2 or 3 if you've only got 1 server in the cluster. Alternatively, run two client processes in parallel.
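If you're on the DataStax Java driver 4.x, that limit can be raised in the driver's application.conf (the key name assumes driver 4.x; check your version's reference configuration):

```
datastax-java-driver {
  advanced.connection.pool.local.size = 2  # default is 1 connection per server
}
```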

On 24/04/2024 07:19, Nathan Marz wrote:
Tried it again with one more client thread, and that had no effect on performance. This is unsurprising as there are only 2 CPUs on this node and they were already at 100%. These were good ideas, but I'm still unable to even match the performance of batch commit mode with group commit mode.

On Tue, Apr 23, 2024 at 12:46 PM Bowen Song via user <> wrote:

    To achieve 10k loop iterations per second, each iteration must
    take 0.1 milliseconds or less. Considering that each iteration
    needs to lock and unlock the semaphore (two syscalls) and make
    network requests (more syscalls), that's a lot of context
    switches. It may be a bit too much to ask of a single thread. I
    would suggest trying multi-threading or multi-processing, and
    seeing if the combined insert rate is higher.
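    As a sketch of that suggestion (insertAsync is a hypothetical
    stand-in, not the real driver call; a real client would issue
    session.executeAsync there):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicLong;

public class MultiThreadedInserts {

    // Hypothetical stand-in for the driver's async insert; a real client
    // would call session.executeAsync(boundStatement) here instead.
    static CompletableFuture<Void> insertAsync(ExecutorService io) {
        return CompletableFuture.runAsync(() -> { /* network I/O */ }, io);
    }

    // Each worker thread runs its own semaphore-bounded async-insert loop,
    // spreading the syscall and context-switch cost over several CPUs.
    public static long run(int threads, int insertsPerThread, int numTickets) {
        ExecutorService workers = Executors.newFixedThreadPool(threads);
        ExecutorService io = Executors.newFixedThreadPool(2);
        AtomicLong completed = new AtomicLong();
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            workers.submit(() -> {
                Semaphore sem = new Semaphore(numTickets);
                try {
                    for (int i = 0; i < insertsPerThread; i++) {
                        sem.acquire();                   // bound in-flight requests
                        insertAsync(io).whenComplete((v, e) -> {
                            completed.incrementAndGet();
                            sem.release();
                        });
                    }
                    sem.acquire(numTickets);             // drain remaining in-flight
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        workers.shutdown();
        io.shutdown();
        return completed.get();
    }
}
```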

    I should also note that executeAsync() also has implicit limits on
    the number of in-flight requests, which default to 1024 requests
    per connection and 1 connection per server. See the driver
    documentation for details.
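    On the DataStax Java driver 4.x, those limits are tunable in the
    driver config (key names assume 4.x; check your driver version's
    reference configuration):

```
datastax-java-driver {
  advanced.connection.max-requests-per-connection = 2048  # default 1024
}
```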

    On 23/04/2024 23:18, Nathan Marz wrote:
    It's using the async API, so why would it need multiple threads?
    Using the exact same approach I'm able to get 38k / second with
    periodic commitlog_sync. For what it's worth, I do see 100% CPU
    utilization in every single one of these tests.

    On Tue, Apr 23, 2024 at 11:01 AM Bowen Song via user
    <> wrote:

        Have you checked the thread CPU utilisation of the client
        side? You likely will need more than one thread to do
        insertion in a loop to achieve tens of thousands of inserts
        per second.

        On 23/04/2024 21:55, Nathan Marz wrote:
        Thanks for the explanation.

        I tried again with commitlog_sync_group_window at 2ms,
        concurrent_writes at 512, and doing 1000 individual inserts
        at a time with the same loop + semaphore approach. This only
        nets 9k / second.

        I got much higher throughput in the other modes with a
        BatchStatement of 100 inserts rather than 100x as many
        individual inserts.

        On Tue, Apr 23, 2024 at 10:45 AM Bowen Song via user
        <> wrote:

            I suspect you are abusing batch statements. Batch
            statements should only be used where atomicity or
            isolation is needed. Using batch statements won't make
            inserting multiple partitions faster. In fact, it often
            will make that slower.

            Also, the linear relationship between
            commitlog_sync_group_window and write throughput is
            expected. That's because the max number of uncompleted
            writes is limited by the write concurrency, and a write
            is not considered "complete" before it is synced to disk
            when commitlog sync is in group or batch mode. That
            means within each interval, only a limited number of
            writes can complete. The ways to increase that include:
            adding more nodes, syncing the commitlog at shorter
            intervals and allowing more concurrent writes.
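            That cap can be put into a rough formula (my own
            back-of-envelope reading of the mechanism, not an
            official one): since a write only completes after the
            next commitlog fsync, at most the write concurrency's
            worth of writes can finish per sync window.

```java
public class GroupCommitBound {
    // Idealized ceiling: in "group" commitlog sync mode, a write is only
    // acknowledged after the next fsync, so each sync window can complete
    // at most `concurrentWrites` writes.
    static double maxWritesPerSecond(int concurrentWrites, double windowMillis) {
        return concurrentWrites / (windowMillis / 1000.0);
    }

    public static void main(String[] args) {
        // e.g. 128 concurrent writes with a 20 ms window caps out near 6,400/s
        System.out.println(maxWritesPerSecond(128, 20.0));
    }
}
```

            Real throughput will sit well below this ceiling; it
            only shows why halving the window roughly doubles the
            achievable rate.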

            On 23/04/2024 20:43, Nathan Marz wrote:
            Thanks. I raised concurrent_writes to 128 and
            set commitlog_sync_group_window to 20ms. This causes a
            single execute of a BatchStatement containing 100
            inserts to succeed. However, the throughput I'm seeing
            is atrocious.

            With these settings, I'm executing 10 BatchStatement
            concurrently at a time using the semaphore + loop
            approach I showed in my first message. So as requests
            complete, more are sent out such that there are 10
            in-flight at a time. Each BatchStatement has 100
            individual inserts. I'm seeing only 730 inserts /
            second. Again, with periodic mode I see 38k / second
            and with batch I see 14k / second. My expectation was
            that group commit mode throughput would be somewhere
            between those two.

            If I set commitlog_sync_group_window to 100ms, the
            throughput drops to 14 / second.

            If I set commitlog_sync_group_window to 10ms, the
            throughput increases to 1587 / second.

            If I set commitlog_sync_group_window to 5ms, the
            throughput increases to 3200 / second.

            If I set commitlog_sync_group_window to 1ms, the
            throughput increases to 13k / second, which is slightly
            less than batch commit mode.

            Is group commit mode supposed to have better
            performance than batch mode?

            On Tue, Apr 23, 2024 at 8:46 AM Bowen Song via user
            <> wrote:

                The default commitlog_sync_group_window is very
                long for SSDs. Try reducing it if you are using
                SSD-backed storage for the commit log. 10-15 ms is
                a good starting point. You may also want to
                increase the value of concurrent_writes; consider
                at least doubling or quadrupling it from the
                default. You'll need even higher write concurrency
                for a longer commitlog_sync_group_window.
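                As a concrete cassandra.yaml starting point along
                those lines (key names as in recent Cassandra
                versions; older releases use
                commitlog_sync_group_window_in_ms):

```
commitlog_sync: group
commitlog_sync_group_window: 15ms   # SSD-friendly; default is much longer
concurrent_writes: 128              # quadruple the default of 32
```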

                On 23/04/2024 19:26, Nathan Marz wrote:
                "batch" mode works fine. I'm having trouble with
                "group" mode. The only config for that is
                "commitlog_sync_group_window", and I have that set
                to the default 1000ms.

                On Tue, Apr 23, 2024 at 8:15 AM Bowen Song via
                user <> wrote:

                    Why would you want to set
                    commitlog_sync_batch_window to as long as 1
                    second when commitlog_sync is set to batch
                    mode? The documentation on this says:

                        /This window should be kept short because
                        the writer threads will be unable to do
                        extra work while waiting. You may need to
                        increase concurrent_writes for the same
                        reason./
                    If you want to use batch mode, at least ensure
                    commitlog_sync_batch_window is reasonably
                    short. The default is 2 milliseconds.

                    On 23/04/2024 18:32, Nathan Marz wrote:
                    I'm doing some benchmarking of Cassandra on a
                    single m6gd.large instance. It works fine
                    with periodic or batch commitlog_sync
                    options, but I'm having tons of issues when I
                    change it to "group". I have
                    "commitlog_sync_group_window" set to 1000ms.

                    My client is doing writes like this (pseudocode):

                    Semaphore sem = new Semaphore(numTickets);
                    while(true) {
                        sem.acquire();
                        session.executeAsync(insertStatement.bind(
                            genUUIDStr(), genUUIDStr()))
                            .whenComplete((t, u) -> sem.release());
                    }

                    If I set numTickets higher than 20, I get
                    tons of timeout errors.

                    I've also tried doing single commands with
                    BatchStatement with many inserts at a time,
                    and that fails with timeouts when the batch
                    size exceeds 20.

                    Increasing the write request timeout in
                    cassandra.yaml makes it time out at slightly
                    higher numbers of concurrent requests.

                    With periodic I'm able to get about 38k
                    writes / second, and with batch I'm able to
                    get about 14k / second.

                    Any tips on what I should be doing to get
                    group commitlog_sync to work properly? I
                    didn't expect to have to do anything other
                    than change the config.
