Building the giant batch string wasn't as bad as I thought, and at first I had great(!) results (using "unlogged" batches): 2500 rows/sec (batches of 100 in 48 threads) ran very smoothly, and the load on the cassandra server nodes averaged about 1.0 or less continuously.
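
For reference, the batch string I'm building looks roughly like this
(placeholder table/column names, only a few of my 40+ columns shown, and
value escaping left out -- it assumes a connected Session and a simple
Event holder class):

    StringBuilder cql = new StringBuilder("BEGIN UNLOGGED BATCH\n");
    for (Event e : events) {                          // ~100 rows per batch
        cql.append("INSERT INTO ks.events (a, b, day, in_time_uuid, x, y, z) VALUES ('")
           .append(e.a).append("', '")
           .append(e.b).append("', '")
           .append(e.day).append("', now(), ")        // timeuuid generated server-side, just for the sketch
           .append(e.x).append(", ")
           .append(e.y).append(", ")
           .append(e.z).append(");\n");
    }
    cql.append("APPLY BATCH;");
    session.execute(cql.toString());                  // one synchronous round trip per batch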

But then I upped it to 5000 rows/sec, and the load on the server nodes jumped to a continuous 8-10 on all 3 nodes, with peaks over 14. I also tried running 2 separate clients at 2500 rows/sec each, with the same results. I don't see any compactions while at this load, so would this likely be the result of GC thrashing?

Seems like I'm spending a lot of effort and am still not getting very close to being able to insert 10k rows (about 10MB of data) per second, which is pretty disappointing.

On 08/20/2013 07:16 PM, Nate McCall wrote:
Thrift allows for larger, free-form batch construction. The increase would come from doing a lot more in the same payload message. Otherwise, CQL is more efficient.

If you do build those giant strings, yes, you should see a performance improvement.


On Tue, Aug 20, 2013 at 8:03 PM, Keith Freeman <8fo...@gmail.com> wrote:

    Thanks.  Can you tell me why using thrift would improve
    performance?

    Also, if I do try to build those giant strings for a prepared
    batch statement, should I expect another performance improvement?



    On 08/20/2013 05:06 PM, Nate McCall wrote:
    Ugh - sorry, I knew Sylvain and Michaël had worked on this
    recently but it is only in 2.0 - I could have sworn it got marked
    for inclusion back into 1.2 but I was wrong:
    https://issues.apache.org/jira/browse/CASSANDRA-4693

    This is indeed an issue if you don't know the column count
    beforehand (or have a very large number of columns, as in your
    case).  Again, apologies - I would not have recommended that
    route if I had known it was only in 2.0.

    I would be willing to bet you could hit those insert numbers
    pretty easily with thrift given the shape of your mutation.


    On Tue, Aug 20, 2013 at 5:00 PM, Keith Freeman <8fo...@gmail.com> wrote:

        So I tried inserting prepared statements separately (no
        batch), and my server nodes' load definitely dropped
        significantly. Throughput from my client improved a bit, but
        only a few %.  I was able to *almost* get 5000 rows/sec (sort
        of) by also reducing the rows/insert-thread to 20-50 and
        eliminating all overhead from the timing, i.e. timing only
        the tight for loop of inserts.  But that's still a lot slower
        than I expected.

        I couldn't do batches because the driver doesn't allow
        prepared statements in a batch (QueryBuilder API).  It
        appears the batch itself could possibly be a prepared
        statement, but since I have 40+ columns on each insert, that
        would take some ugly code to build, so I haven't tried it yet.

        I'm using CL "ONE" on the inserts and RF 2 in my schema.


        On 08/20/2013 08:04 AM, Nate McCall wrote:
        John makes a good point re: prepared statements (I'd
        increase batch sizes again once you've done this as well -
        separate, incremental runs, of course, so you can gauge the
        effect of each).  That should take out some of the
        processing overhead of statement validation in the server
        (some - that load spike still seems high though).

        I'd actually be really interested in what your results
        were after doing so - I've not tried any A/B testing here
        for prepared statements on inserts.

        Given your load is on the server, I'm not sure adding more
        async indirection on the client would buy you too much though.

        Also, at what RF and consistency level are you writing?


        On Tue, Aug 20, 2013 at 8:56 AM, Keith Freeman <8fo...@gmail.com> wrote:

            Ok, I'll try prepared statements.  But while sending my
            statements async might speed up my client, it wouldn't
            improve throughput on the cassandra nodes, would it?
            They're running at pretty high loads and only about 10%
            idle, so my concern is that they can't handle the data
            any faster, i.e. that something's wrong on the server
            side.  I don't really think there's anything on the
            client side that matters for this problem.

            Of course I know there are obvious h/w things I can do
            to improve server performance: SSDs, more RAM, more
            cores, etc.  But I thought the servers I have would be
            able to handle more rows/sec than, say, MySQL, since write
            speed is supposed to be one of Cassandra's strengths.


            On 08/19/2013 09:03 PM, John Sanda wrote:
            I'd suggest using prepared statements that you
            initialize at application startup, and switching to
            Session.executeAsync coupled with the Google Guava
            Futures API to get better throughput on the client side.
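
            Roughly something like this (table/column names are made
            up, and you'd bind your real 40+ columns -- uses
            com.datastax.driver.core.* and
            com.google.common.util.concurrent.*):

                PreparedStatement insert = session.prepare(
                    "INSERT INTO ks.events (a, b, day, in_time_uuid, x) VALUES (?, ?, ?, ?, ?)");

                for (Event e : events) {
                    ResultSetFuture f = session.executeAsync(
                        insert.bind(e.a, e.b, e.day, e.uuid, e.x));
                    Futures.addCallback(f, new FutureCallback<ResultSet>() {
                        public void onSuccess(ResultSet rs) { /* row acked */ }
                        public void onFailure(Throwable t)  { /* log and/or retry */ }
                    });
                }

            ResultSetFuture is a Guava ListenableFuture, so you can
            also collect the futures and wait on Futures.allAsList(...)
            if you want some back-pressure between groups of rows.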


            On Mon, Aug 19, 2013 at 10:14 PM, Keith Freeman <8fo...@gmail.com> wrote:

                Sure, I've tried different numbers for batches and
                threads, but generally I'm running 10-30 threads at
                a time on the client, each sending a batch of 100
                insert statements in every call, using the
                QueryBuilder.batch() API from the latest datastax
                java driver, then calling the Session.execute()
                function (synchronous) on the Batch.
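
                In rough outline (placeholder names, trimmed to a few
                of my 40+ columns), each thread does:

                    Batch batch = QueryBuilder.batch();
                    for (Event e : slice) {                        // ~100 rows per call
                        batch.add(QueryBuilder.insertInto("ks", "events")
                                .value("a", e.a)
                                .value("b", e.b)
                                .value("day", e.day)
                                .value("in_time_uuid", e.uuid)
                                .value("x", e.x));                 // ...plus the rest of the columns
                    }
                    session.execute(batch);                        // synchronous, blocks until the batch completes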

                I can't post my actual code, but my client does this
                on each iteration (roughly sketched below):
                -- divides up the set of inserts by the number of
                threads
                -- stores the current time
                -- tells all the threads to send their inserts
                -- then when they've all returned checks the
                elapsed time
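
                Roughly, the iteration loop looks like this
                (insertBatch is just a stand-in for the batch-building
                code above, and Event is my row holder):

                    // one benchmark iteration: rows pre-divided into one slice per thread
                    long runIteration(ExecutorService pool, List<List<Event>> slices) throws Exception {
                        long start = System.nanoTime();
                        List<Future<?>> pending = new ArrayList<Future<?>>();
                        for (final List<Event> slice : slices) {
                            pending.add(pool.submit(new Runnable() {
                                public void run() { insertBatch(slice); }  // build + execute one batch of ~100
                            }));
                        }
                        for (Future<?> f : pending) {
                            f.get();                                       // wait for every thread to return
                        }
                        return (System.nanoTime() - start) / 1000000L;     // elapsed ms for this iteration
                    }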

                At about 2000 rows for each iteration, 20 threads
                with 100 inserts each finish in about 1 second.  For
                4000 rows, 40 threads with 100 inserts each finish
                in about 1.5 - 2 seconds, and as I said all 3
                cassandra nodes have a heavy CPU load while the
                client is hardly loaded.  I've tried with 10 threads
                and more inserts per batch, or up to 60 threads with
                fewer, but it doesn't seem to make a lot of
                difference.


                On 08/19/2013 05:00 PM, Nate McCall wrote:
                How big are the batch sizes? In other words, how
                many rows are you sending per insert operation?

                Other than the above, not much else to suggest
                without seeing some example code (on pastebin,
                gist or similar, ideally).

                On Mon, Aug 19, 2013 at 5:49 PM, Keith Freeman <8fo...@gmail.com> wrote:

                    I've got a 3-node cassandra cluster
                    (16G/4-core VMs on ESXi v5, on 2.5GHz machines
                    not shared with any other VMs).  I'm inserting
                    time-series data into a single column-family
                    using "wide rows" (timeuuids) and a 3-part
                    partition key, so my primary key is something
                    like ((a, b, day), in-time-uuid), with x, y, z
                    as regular columns.
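
                    In CQL terms the table is roughly like this
                    (placeholder names; the real table has ~40 more
                    columns):

                        session.execute(
                            "CREATE TABLE ks.events (" +
                            "  a text, b text, day text," +
                            "  in_time_uuid timeuuid," +
                            "  x text, y text, z text," +
                            "  PRIMARY KEY ((a, b, day), in_time_uuid))");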

                    My java client is feeding rows (about 1k of
                    raw data size each) in batches using multiple
                    threads, and the fastest I can get it to run
                    reliably is about 2000 rows/second.  Even at
                    that speed, all 3 cassandra nodes are very CPU
                    bound, with loads of 6-9 each (and the client
                    machine is hardly breaking a sweat).  I've
                    tried turning off compression in my table,
                    which reduced the loads slightly but not much.
                    There are no other updates or reads occurring,
                    except for DataStax OpsCenter.

                    I was expecting to be able to insert at least
                    10k rows/second with this configuration, and
                    after a lot of reading of docs, blogs, and
                    Google, I can't really figure out what's slowing
                    my client down.  When I increase the insert
                    speed of my client beyond 2000/second, the
                    server responses are just too slow and the
                    client falls behind.  I had a single-node
                    MySQL database that could handle 10k of these
                    data rows/second, so I really feel like I'm
                    missing something in Cassandra.  Any ideas?






--
            - John






