Re: insert performance (1.2.8)

2013-08-26 Thread Keith Freeman
That sounds interesting, but I'm not sure exactly what you mean.  My key 
is like this: ((f1, f2, day), timeuuid), and f1/f2 are roughly 
well-distributed, so my inserts are spread pretty evenly across about 
22k combinations of f1+f2 each day.


Are you saying that you get better performance by keeping the wide rows 
less wide, or by spreading inserts into a single row out over time?  I 
just don't know what you mean by shuffling.


On 08/26/2013 03:06 PM, Jake Luciani wrote:

How are you inserting the data? Is it all one partition at once?

We've had the experience that shuffling the inserts across rows for 
wide rows gave us normal insert rates.  When you mutate an entire 
wide row at once it hits a bottleneck.



On Mon, Aug 26, 2013 at 4:49 PM, Keith Freeman 8fo...@gmail.com wrote:


I can believe that I'm IO bound with the current disk
configuration, but that doesn't explain the CPU load, does it?  If
I'm hitting a limit of disk performance, I should see a slowdown
but not the jump in CPU, right?


On 08/22/2013 11:52 AM, Nate McCall wrote:

Given the backups in the flushing stages, I think you are IO
bound. SSDs will work best for the data volume. Use rotational
media for the commitlog as it is largely sequential.

Quick experiment: disable commit log on the keyspace and see if
your test goes faster (WITH DURABLE_WRITES = false on keyspace
creation).
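
A minimal sketch of that experiment with the DataStax Java driver (the
keyspace name and replication settings below are placeholders, not the
actual schema):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    // Sketch of the experiment: recreate the test keyspace with
    // durable_writes = false so writes bypass the commit log entirely.
    // Keyspace name and replication settings are hypothetical placeholders.
    public class DurableWritesExperiment {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect();
            session.execute(
                "CREATE KEYSPACE test_ks WITH replication = " +
                "{'class': 'SimpleStrategy', 'replication_factor': 2} " +
                "AND durable_writes = false");
            // ... re-run the insert benchmark against test_ks and compare ...
            cluster.shutdown();  // driver 1.x; later versions renamed this close()
        }
    }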


On Wed, Aug 21, 2013 at 5:41 PM, Keith Freeman 8fo...@gmail.com wrote:

We have 2 partitions on the same physical disk for commit-log
and data.  Definitely non-optimal; we're planning to install
SSDs for the commit-log partition but don't have them yet.

Can this explain the high server loads?






Re: insert performance (1.2.8)

2013-08-21 Thread Keith Freeman
Building the giant batch string wasn't as bad as I thought, and at first 
I had great(!) results (using unlogged batches): 2500 rows/sec 
(batches of 100 in 48 threads) ran very smoothly, and the load on the 
cassandra server nodes averaged about 1.0 or less continuously.
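
For reference, a minimal sketch of that batch-string construction (the
table and column names are hypothetical stand-ins; the real table has
40+ columns):

    import java.util.ArrayList;
    import java.util.List;

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    // Sketch of the "giant batch string": 100 literal INSERTs concatenated
    // into one UNLOGGED batch and sent as a single statement. The table and
    // columns are hypothetical stand-ins for the real 40+ column schema.
    public class UnloggedBatchSketch {

        static String buildBatch(List<String[]> rows) {
            StringBuilder sb = new StringBuilder("BEGIN UNLOGGED BATCH\n");
            for (String[] r : rows) {
                sb.append("INSERT INTO ts_data (a, b, day, seq, payload) VALUES ('")
                  .append(r[0]).append("', '").append(r[1]).append("', '")
                  .append(r[2]).append("', ").append(r[3]).append(", '")
                  .append(r[4]).append("');\n");
            }
            return sb.append("APPLY BATCH;").toString();
        }

        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect("test_ks");
            List<String[]> rows = new ArrayList<String[]>();
            for (int i = 0; i < 100; i++) {
                rows.add(new String[] { "a" + i % 10, "b" + i % 10,
                                        "2013-08-21", String.valueOf(i), "data-" + i });
            }
            session.execute(buildBatch(rows));  // one round trip for 100 rows
            cluster.shutdown();
        }
    }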


But then I upped it to 5000 rows/sec, and the load on the server nodes 
jumped to a continuous 8-10 on all 3, with peaks over 14.  I also 
tried running 2 separate clients at 2500 rows/sec each, with the same 
results.  I don't see any compactions while at this load, so would this 
likely be the result of GC thrashing?


Seems like I'm spending a lot of effort and am still not getting very 
close to being able to insert 10k rows (about 10M of data in total) per 
second, which is pretty disappointing.



Re: insert performance (1.2.8)

2013-08-21 Thread Nate McCall
The only thing I can think to suggest at this point is upping that batch
size - say to 500 - and seeing what happens.

Do you have any monitoring on this cluster? If not, what do you see as the
output of 'nodetool tpstats' while you run this test?



Re: insert performance (1.2.8)

2013-08-21 Thread Nate McCall
What's the disk setup like on these systems? You have some pending tasks in
MemtablePostFlusher and FlushWriter which may mean there is contention on
flushing discarded segments from the commit log.


On Wed, Aug 21, 2013 at 5:14 PM, Keith Freeman 8fo...@gmail.com wrote:

  Ok, I tried batching 500 at a time; it made no noticeable difference in the
 server loads.  I have been monitoring JMX via jconsole, if that's what you
 mean.  I also did tpstats on all 3 nodes while it was under load (the 5000
 rows/sec test).  The attached file contains a screen shot of the JMX and the
 output of the 3 tpstats commands.



Re: insert performance (1.2.8)

2013-08-20 Thread Keith Freeman
Ok, I'll try prepared statements.  But while sending my statements 
async might speed up my client, it wouldn't improve throughput on the 
cassandra nodes, would it?  They're running at pretty high loads and only 
about 10% idle, so my concern is that they can't handle the data any 
faster - that something's wrong on the server side.  I don't really think 
there's anything on the client side that matters for this problem.


Of course I know there are obvious h/w things I can do to improve server 
performance: SSDs, more RAM, more cores, etc.  But I thought the servers 
I have would be able to handle more rows/sec than, say, MySQL, since write 
speed is supposed to be one of Cassandra's strengths.


On 08/19/2013 09:03 PM, John Sanda wrote:
I'd suggest using prepared statements that you initialize at 
application start up and switching to use Session.executeAsync coupled 
with Google Guava Futures API to get better throughput on the client side.
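
A rough sketch of that suggestion (the table, columns, and wrapper class
are hypothetical; ResultSetFuture in the DataStax driver is a Guava
ListenableFuture, so it plugs straight into Futures.addCallback):

    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.ResultSetFuture;
    import com.datastax.driver.core.Session;
    import com.google.common.util.concurrent.FutureCallback;
    import com.google.common.util.concurrent.Futures;

    // Sketch of the suggestion: prepare once at application start-up, then
    // fire inserts asynchronously and track completion with Guava callbacks.
    // The table, columns, and method signature are hypothetical.
    public class AsyncInserter {
        private final Session session;
        private final PreparedStatement insert;

        AsyncInserter(Session session) {
            this.session = session;
            // Prepared once, reused for every row.
            this.insert = session.prepare(
                "INSERT INTO ts_data (a, b, day, seq, payload) VALUES (?, ?, ?, ?, ?)");
        }

        void write(String a, String b, String day, int seq, String payload) {
            ResultSetFuture f = session.executeAsync(insert.bind(a, b, day, seq, payload));
            Futures.addCallback(f, new FutureCallback<ResultSet>() {
                public void onSuccess(ResultSet rs) { /* count it, sample latency */ }
                public void onFailure(Throwable t)  { /* log, back off, retry  */ }
            });
        }
    }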



On Mon, Aug 19, 2013 at 10:14 PM, Keith Freeman 8fo...@gmail.com wrote:


Sure, I've tried different numbers for batches and threads, but
generally I'm running 10-30 threads at a time on the client, each
sending a batch of 100 insert statements in every call, using the
QueryBuilder.batch() API from the latest datastax java driver,
then calling the Session.execute() function (synchronous) on the
Batch.

I can't post my code, but my client does this on each iteration:
-- divides up the set of inserts by the number of threads
-- stores the current time
-- tells all the threads to send their inserts
-- then when they've all returned checks the elapsed time

At about 2000 rows for each iteration, 20 threads with 100 inserts
each finish in about 1 second.  For 4000 rows, 40 threads with 100
inserts each finish in about 1.5 - 2 seconds, and as I said all 3
cassandra nodes have a heavy CPU load while the client is hardly
loaded.  I've tried with 10 threads and more inserts per batch, or
up to 60 threads with fewer, doesn't seem to make a lot of
difference.
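
A hedged reconstruction of that loop - not the actual code; the Row
holder, table, and column names are made up:

    import java.util.List;
    import java.util.concurrent.CountDownLatch;

    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.querybuilder.Batch;
    import com.datastax.driver.core.querybuilder.QueryBuilder;

    // Rough reconstruction of the client loop described above: each worker
    // thread builds a QueryBuilder.batch() of its rows and executes it
    // synchronously. Row, table, and column names are hypothetical.
    public class BatchWorker implements Runnable {
        static class Row { String a, b, day, payload; int seq; }

        private final Session session;
        private final List<Row> slice;        // this thread's share of the rows
        private final CountDownLatch done;    // parent times the whole iteration

        BatchWorker(Session session, List<Row> slice, CountDownLatch done) {
            this.session = session; this.slice = slice; this.done = done;
        }

        public void run() {
            Batch batch = QueryBuilder.batch();
            for (Row r : slice) {
                batch.add(QueryBuilder.insertInto("ts_data")
                        .value("a", r.a).value("b", r.b).value("day", r.day)
                        .value("seq", r.seq).value("payload", r.payload));
            }
            session.execute(batch);  // synchronous, one round trip per batch
            done.countDown();
        }
    }

The parent thread would then await the CountDownLatch and compute the
elapsed time for the iteration.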


On 08/19/2013 05:00 PM, Nate McCall wrote:

How big are the batch sizes? In other words, how many rows are
you sending per insert operation?

Other than the above, not much else to suggest without seeing
some example code (on pastebin, gist or similar, ideally).





Re: insert performance (1.2.8)

2013-08-20 Thread Nate McCall
John makes a good point re: prepared statements (I'd increase batch sizes
again once you do this as well - separate, incremental runs of course so
you can gauge the effect of each). That should take out some of the
processing overhead of statement validation in the server (some - that load
spike still seems high though).

I'd actually be really interested as to what your results are after doing
so - I've not tried any A/B testing here for prepared statements on
inserts.

Given your load is on the server, I'm not sure adding more async
indirection on the client would buy you too much though.

Also, at what RF and consistency level are you writing?







Re: insert performance (1.2.8)

2013-08-20 Thread Przemek Maciolek
I had similar issues (I sent a note on the list a few weeks ago but nobody
responded). I think there's a serious bottleneck in using wide rows with
composite keys. I made a trivial benchmark, which you can check here:
http://pastebin.com/qAcRcqbF - it's written in cql-rb, but I ran the test
using astyanax with cql3 enabled and the results were the same.
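
The two table shapes being compared look roughly like the sketch below (a
hypothetical reconstruction; the actual definitions are in the pastebin):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    // Hypothetical reconstruction of the two table shapes being compared;
    // the actual definitions are in the pastebin linked above.
    public class BenchmarkSchemas {
        public static void main(String[] args) {
            Session session = Cluster.builder()
                    .addContactPoint("127.0.0.1").build().connect("test_ks");
            // Composite partition key: rows hash to an (a, b) partition.
            session.execute("CREATE TABLE composite_test ("
                    + "a text, b text, seq int, val text, "
                    + "PRIMARY KEY ((a, b), seq))");
            // Single partition key with a wide row clustered by seq.
            session.execute("CREATE TABLE widerow_test ("
                    + "a text, seq int, val text, "
                    + "PRIMARY KEY (a, seq))");
        }
    }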

In my case, inserting 10 000 entries took the following times (in seconds):

Using composite keys
Separately: 12.892867
Batch: 189.731306

This means I got about 1000 rows/s when inserting them separately and 52
(!!!) when inserting them in a huge batch.

Using just a partition key and wide row
Separately: 11.292507
Batch: 0.093355

Again, about 1000 rows/s when inserting them one by one. But batching
obviously improves things, and I easily got over 100 000 rows/s (10 000
rows in under 0.1 s).

Anyone else with similar experiences?

Thanks,
Przemek



Re: insert performance (1.2.8)

2013-08-20 Thread Nate McCall
Thanks for putting this up - sorry I missed your post the other week. I
would be really curious as to your results if you added a prepared statement
for those inserts.



Re: insert performance (1.2.8)

2013-08-20 Thread Przemek Maciolek
AFAIK, batch prepared statements were added just recently:
https://issues.apache.org/jira/browse/CASSANDRA-4693 and many client
libraries don't support it yet. (And I believe that the problem is
related to batch operations.)
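
Once both sides do support it, the client-side shape looks roughly like
this sketch (assuming a 2.0 cluster and the java-driver 2.0 BatchStatement
API; the table and columns are hypothetical):

    import com.datastax.driver.core.BatchStatement;
    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.Session;

    // Sketch of what CASSANDRA-4693 enables once the cluster (2.0+) and the
    // client library both support it: binding one prepared statement many
    // times inside a single batch. Table and column names are hypothetical.
    public class PreparedBatchSketch {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect("test_ks");
            PreparedStatement ps = session.prepare(
                "INSERT INTO ts_data (a, b, day, seq, payload) VALUES (?, ?, ?, ?, ?)");
            BatchStatement batch = new BatchStatement(BatchStatement.Type.UNLOGGED);
            for (int i = 0; i < 100; i++) {
                batch.add(ps.bind("a1", "b1", "2013-08-20", i, "data-" + i));
            }
            session.execute(batch);  // one protocol message, 100 bound inserts
            cluster.close();
        }
    }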




Re: insert performance (1.2.8)

2013-08-20 Thread Keith Freeman
So I tried inserting prepared statements separately (no batch), and my 
server nodes' load definitely dropped significantly.  Throughput from my 
client improved a bit, but only a few %.  I was able to *almost* get 
5000 rows/sec (sort of) by also reducing the rows per insert-thread to 
20-50 and eliminating all overhead from the timing, i.e. timing only the 
tight for loop of inserts.  But that's still a lot slower than I expected.


I couldn't do batches because the driver doesn't allow prepared 
statements in a batch (QueryBuilder API).  It appears the batch itself 
could possibly be a prepared statement, but since I have 40+ columns on 
each insert, that would take some ugly code to build, so I haven't tried 
it yet (a sketch of what that code might look like is below).
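
A sketch of what that ugly code might look like (hypothetical table and
columns; it emits one batch string with rows-times-columns bind markers):

    // Sketch of the "ugly code" in question: generating one BATCH statement
    // whose bind markers cover every column of every row, which could then
    // be prepared once. Table and column names are hypothetical; the real
    // table has 40+ columns.
    public class PreparedBatchCql {
        static String preparedBatchCql(int rowsPerBatch, String[] cols) {
            StringBuilder sb = new StringBuilder("BEGIN UNLOGGED BATCH\n");
            for (int i = 0; i < rowsPerBatch; i++) {
                sb.append("INSERT INTO ts_data (");
                for (int c = 0; c < cols.length; c++) {
                    sb.append(cols[c]).append(c < cols.length - 1 ? ", " : ") VALUES (");
                }
                for (int c = 0; c < cols.length; c++) {
                    sb.append('?').append(c < cols.length - 1 ? ", " : ");\n");
                }
            }
            return sb.append("APPLY BATCH;").toString();
        }

        public static void main(String[] args) {
            // Binding it means flattening all rows' values into one array.
            System.out.println(preparedBatchCql(
                    2, new String[] { "a", "b", "day", "seq", "payload" }));
        }
    }

As Nate points out in his follow-up, though, preparing bind markers inside
a BATCH only landed in 2.0 (CASSANDRA-4693), so this wouldn't have helped
on 1.2 anyway.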


I'm using CL ONE on the inserts and RF 2 in my schema.


Re: insert performance (1.2.8)

2013-08-20 Thread Nate McCall
Ugh - sorry, I knew Sylvain and Michaël had worked on this recently but it
is only in 2.0 - I could have sworn it got marked for inclusion back into
1.2 but I was wrong:
https://issues.apache.org/jira/browse/CASSANDRA-4693

This is indeed an issue if you don't know the column count beforehand (or
have a very large number of them, like in your case). Again, apologies, I
would not have recommended that route if I had known it was only in 2.0.

I would be willing to bet you could hit those insert numbers pretty easily
with thrift given the shape of your mutation.
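
For reference, a bare-bones thrift batch_mutate sketch (the keyspace,
column family, and names are hypothetical; note that a CQL3 table with a
composite partition key needs CompositeType-encoded keys and column names
over thrift, which is elided here):

    import java.nio.ByteBuffer;
    import java.util.Arrays;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.Column;
    import org.apache.cassandra.thrift.ColumnOrSuperColumn;
    import org.apache.cassandra.thrift.ConsistencyLevel;
    import org.apache.cassandra.thrift.Mutation;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TFramedTransport;
    import org.apache.thrift.transport.TSocket;

    // Bare-bones thrift batch_mutate sketch. Keyspace, column family, and
    // names are hypothetical. Caveat: a CQL3 table with a composite
    // partition key needs CompositeType-encoded row keys and column names
    // over thrift; that encoding is elided here.
    public class ThriftBatchSketch {
        public static void main(String[] args) throws Exception {
            TFramedTransport transport =
                new TFramedTransport(new TSocket("127.0.0.1", 9160));
            Cassandra.Client client =
                new Cassandra.Client(new TBinaryProtocol(transport));
            transport.open();
            client.set_keyspace("test_ks");

            Column col = new Column(ByteBuffer.wrap("payload".getBytes("UTF-8")));
            col.setValue(ByteBuffer.wrap("data-0".getBytes("UTF-8")));
            col.setTimestamp(System.currentTimeMillis() * 1000);  // microseconds

            ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
            cosc.setColumn(col);
            Mutation m = new Mutation();
            m.setColumn_or_supercolumn(cosc);
            List<Mutation> mutations = Arrays.asList(m);

            // One row key -> {column family -> mutations}; a real client
            // would pack many rows into this map before each call.
            Map<ByteBuffer, Map<String, List<Mutation>>> mutationMap =
                new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
            mutationMap.put(ByteBuffer.wrap("rowkey-0".getBytes("UTF-8")),
                    Collections.singletonMap("ts_data", mutations));

            client.batch_mutate(mutationMap, ConsistencyLevel.ONE);
            transport.close();
        }
    }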



Re: insert performance (1.2.8)

2013-08-20 Thread Keith Freeman

Thanks.  Can you tell me why using thrift would improve performance?

Also, if I do try to build those giant strings for a prepared batch 
statement, should I expect another performance improvement?




Re: insert performance (1.2.8)

2013-08-20 Thread Nate McCall
Thrift allows for larger, free-form batch construction. The win comes
from doing a lot more in the same payload message. Otherwise CQL is more
efficient.

If you do build those giant strings, yes, you should see a performance
improvement.


On Tue, Aug 20, 2013 at 8:03 PM, Keith Freeman 8fo...@gmail.com wrote:

  Thanks.  Can you tell me why would using thrift would improve performance?

 Also, if I do try to build those giant strings for a prepared batch
 statement, should I expect another performance improvement?



 On 08/20/2013 05:06 PM, Nate McCall wrote:

 Ugh - sorry, I knew Sylvain and Michaël had worked on this recently but
 it is only in 2.0 - I could have sworn it got marked for inclusion back
 into 1.2 but I was wrong:
 https://issues.apache.org/jira/browse/CASSANDRA-4693

  This is indeed an issue if you don't know the column count before hand
 (or had a very large number of them like in your case). Again, apologies, I
 would not have recommended that route if I knew it was only in 2.0.

  I would be willing to bet you could hit those insert numbers pretty
 easily with thrift given the shape of your mutation.


 On Tue, Aug 20, 2013 at 5:00 PM, Keith Freeman 8fo...@gmail.com wrote:

  So I tried inserting prepared statements separately (no batch), and my
 server nodes load definitely dropped significantly.  Throughput from my
 client improved a bit, but only a few %.  I was able to *almost* get 5000
 rows/sec (sort of) by also reducing the rows/insert-thread to 20-50 and
 eliminating all overhead from the timing, i.e. timing only the tight for
 loop of inserts.  But that's still a lot slower than I expected.

 I couldn't do batches because the driver doesn't allow prepared
 statements in a batch (QueryBuilder API).  It appears the batch itself
 could possibly be a prepared statement, but since I have 40+ columns on
 each insert that would take some ugly code to build so I haven't tried it
 yet.

 I'm using CL ONE on the inserts and RF 2 in my schema.


 On 08/20/2013 08:04 AM, Nate McCall wrote:

 John makes a good point re:prepared statements (I'd increase batch sizes
 again once you did this as well - separate, incremental runs of course so
 you can gauge the effect of each). That should take out some of the
 processing overhead of statement validation in the server (some - that load
 spike still seems high though).

  I'd actually be really interested as to what your results were after
 doing so - i've not tried any A/B testing here for prepared statements on
 inserts.

  Given your load is on the server, i'm not sure adding more async
 indirection on the client would buy you too much though.

  Also, at what RF and consistency level are you writing?


 On Tue, Aug 20, 2013 at 8:56 AM, Keith Freeman 8fo...@gmail.com wrote:

  Ok, I'll try prepared statements.   But while sending my statements
 async might speed up my client, it wouldn't improve throughput on the
 cassandra nodes, would it?  They're running at pretty high loads and only
 about 10% idle, so my concern is that they can't handle the data any
 faster, so something's wrong on the server side.  I don't really think
 there's anything on the client side that matters for this problem.

 Of course I know there are obvious h/w things I can do to improve server
 performance: SSDs, more RAM, more cores, etc.  But I thought the servers I
 have would be able to handle more rows/sec than say Mysql, since write
 speed is supposed to be one of Cassandra's strengths.


 On 08/19/2013 09:03 PM, John Sanda wrote:

 I'd suggest using prepared statements that you initialize at application
 startup and switching to Session.executeAsync coupled with Google
 Guava Futures API to get better throughput on the client side.


 On Mon, Aug 19, 2013 at 10:14 PM, Keith Freeman 8fo...@gmail.comwrote:

  Sure, I've tried different numbers for batches and threads, but
 generally I'm running 10-30 threads at a time on the client, each sending a
 batch of 100 insert statements in every call, using the
 QueryBuilder.batch() API from the latest datastax java driver, then calling
 the Session.execute() function (synchronous) on the Batch.

 I can't post my code, but my client does this on each iteration:
 -- divides up the set of inserts by the number of threads
 -- stores the current time
 -- tells all the threads to send their inserts
 -- then when they've all returned checks the elapsed time

 At about 2000 rows for each iteration, 20 threads with 100 inserts each
 finish in about 1 second.  For 4000 rows, 40 threads with 100 inserts each
 finish in about 1.5 - 2 seconds, and as I said all 3 cassandra nodes have a
 heavy CPU load while the client is hardly loaded.  I've tried with 10
 threads and more inserts per batch, or up to 60 threads with fewer, doesn't
 seem to make a lot of difference.


 On 08/19/2013 05:00 PM, Nate McCall wrote:

  How big are the batch sizes? In other words, how many rows are you
 sending per insert operation?

 Other than the above, not much else to suggest without seeing some example
 code (on pastebin, gist or similar, ideally).

insert performance (1.2.8)

2013-08-19 Thread Keith Freeman
I've got a 3-node cassandra cluster (16G/4-core VMs ESXi v5 on 2.5Ghz 
machines not shared with any other VMs).  I'm inserting time-series data 
into a single column-family using wide rows (timeuuids) and have a 
3-part partition key so my primary key is something like ((a, b, day), 
in-time-uuid), x, y, z.


My java client is feeding rows (about 1k of raw data size each) in 
batches using multiple threads, and the fastest I can get it to run 
reliably is about 2000 rows/second.  Even at that speed, all 3 cassandra 
nodes are very CPU bound, with loads of 6-9 each (and the client machine 
is hardly breaking a sweat).  I've tried turning off compression in my 
table which reduced the loads slightly but not much.  There are no other 
updates or reads occurring, except the datastax opscenter.


I was expecting to be able to insert at least 10k rows/second with this 
configuration, and after a lot of reading of docs, blogs, and google, 
can't really figure out what's slowing my client down.  When I increase 
the insert speed of my client beyond 2000/second, the server responses 
are just too slow and the client falls behind.  I had a single-node 
Mysql database that can handle 10k of these data rows/second, so I 
really feel like I'm missing something in Cassandra.  Any ideas?




Re: insert performance (1.2.8)

2013-08-19 Thread Nate McCall
How big are the batch sizes? In other words, how many rows are you sending
per insert operation?

Other than the above, not much else to suggest without seeing some example
code (on pastebin, gist or similar, ideally).

On Mon, Aug 19, 2013 at 5:49 PM, Keith Freeman 8fo...@gmail.com wrote:

 I've got a 3-node cassandra cluster (16G/4-core VMs ESXi v5 on 2.5Ghz
 machines not shared with any other VMs).  I'm inserting time-series data
 into a single column-family using wide rows (timeuuids) and have a 3-part
 partition key so my primary key is something like ((a, b, day),
 in-time-uuid), x, y, z.

 My java client is feeding rows (about 1k of raw data size each) in batches
 using multiple threads, and the fastest I can get it to run reliably is about
 2000 rows/second.  Even at that speed, all 3 cassandra nodes are very CPU
 bound, with loads of 6-9 each (and the client machine is hardly breaking a
 sweat).  I've tried turning off compression in my table which reduced the
 loads slightly but not much.  There are no other updates or reads
 occurring, except the datastax opscenter.

 I was expecting to be able to insert at least 10k rows/second with this
 configuration, and after a lot of reading of docs, blogs, and google, can't
 really figure out what's slowing my client down.  When I increase the
 insert speed of my client beyond 2000/second, the server responses are just
 too slow and the client falls behind.  I had a single-node Mysql database
 that can handle 10k of these data rows/second, so I really feel like I'm
 missing something in Cassandra.  Any ideas?




Re: insert performance (1.2.8)

2013-08-19 Thread Keith Freeman
Sure, I've tried different numbers for batches and threads, but 
generally I'm running 10-30 threads at a time on the client, each 
sending a batch of 100 insert statements in every call, using the 
QueryBuilder.batch() API from the latest datastax java driver, then 
calling the Session.execute() function (synchronous) on the Batch.


I can't post my code, but my client does this on each iteration:
-- divides up the set of inserts by the number of threads
-- stores the current time
-- tells all the threads to send their inserts
-- then when they've all returned checks the elapsed time

At about 2000 rows for each iteration, 20 threads with 100 inserts each 
finish in about 1 second.  For 4000 rows, 40 threads with 100 inserts 
each finish in about 1.5 - 2 seconds, and as I said all 3 cassandra 
nodes have a heavy CPU load while the client is hardly loaded.  I've 
tried with 10 threads and more inserts per batch, or up to 60 threads 
with fewer, doesn't seem to make a lot of difference.


On 08/19/2013 05:00 PM, Nate McCall wrote:
How big are the batch sizes? In other words, how many rows are you 
sending per insert operation?


Other than the above, not much else to suggest without seeing some 
example code (on pastebin, gist or similar, ideally).


On Mon, Aug 19, 2013 at 5:49 PM, Keith Freeman 8fo...@gmail.com 
mailto:8fo...@gmail.com wrote:


I've got a 3-node cassandra cluster (16G/4-core VMs ESXi v5 on
2.5Ghz machines not shared with any other VMs).  I'm inserting
time-series data into a single column-family using wide rows
(timeuuids) and have a 3-part partition key so my primary key is
something like ((a, b, day), in-time-uuid), x, y, z.

My java client is feeding rows (about 1k of raw data size each) in
batches using multiple threads, and the fastest I can get it run
reliably is about 2000 rows/second.  Even at that speed, all 3
cassandra nodes are very CPU bound, with loads of 6-9 each (and
the client machine is hardly breaking a sweat).  I've tried
turning off compression in my table which reduced the loads
slightly but not much.  There are no other updates or reads
occurring, except the datastax opscenter.

I was expecting to be able to insert at least 10k rows/second with
this configuration, and after a lot of reading of docs, blogs, and
google, can't really figure out what's slowing my client down.
 When I increase the insert speed of my client beyond 2000/second,
the server responses are just too slow and the client falls
behind.  I had a single-node Mysql database that can handle 10k of
these data rows/second, so I really feel like I'm missing
something in Cassandra.  Any ideas?






Re: insert performance (1.2.8)

2013-08-19 Thread John Sanda
I'd suggest using prepared statements that you initialize at application
startup and switching to Session.executeAsync coupled with Google
Guava Futures API to get better throughput on the client side.
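
A hedged sketch of that suggestion: the statement prepared once at startup,
executeAsync() per row, and Guava futures to bound the number of in-flight
writes (table and column names are invented):

import java.util.ArrayList;
import java.util.List;

import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import com.google.common.util.concurrent.Futures;

public class AsyncInserter {
    private final Session session;
    private final PreparedStatement ps;

    public AsyncInserter(Session session) {
        this.session = session;
        // Prepared once at application startup, as suggested above.
        this.ps = session.prepare(
            "INSERT INTO events (f1, f2, day, ts, payload) VALUES (?, ?, ?, ?, ?)");
    }

    public void insertAll(List<Object[]> rows) throws Exception {
        List<ResultSetFuture> inFlight = new ArrayList<ResultSetFuture>();
        for (Object[] row : rows) {
            inFlight.add(session.executeAsync(ps.bind(row)));
            if (inFlight.size() >= 128) {           // cap concurrent requests
                Futures.allAsList(inFlight).get();  // drain the window
                inFlight.clear();
            }
        }
        Futures.allAsList(inFlight).get();          // drain the tail
    }
}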


On Mon, Aug 19, 2013 at 10:14 PM, Keith Freeman 8fo...@gmail.com wrote:

  Sure, I've tried different numbers for batches and threads, but generally
 I'm running 10-30 threads at a time on the client, each sending a batch of
 100 insert statements in every call, using the QueryBuilder.batch() API
 from the latest datastax java driver, then calling the Session.execute()
 function (synchronous) on the Batch.

 I can't post my code, but my client does this on each iteration:
 -- divides up the set of inserts by the number of threads
 -- stores the current time
 -- tells all the threads to send their inserts
 -- then when they've all returned checks the elapsed time

 At about 2000 rows for each iteration, 20 threads with 100 inserts each
 finish in about 1 second.  For 4000 rows, 40 threads with 100 inserts each
 finish in about 1.5 - 2 seconds, and as I said all 3 cassandra nodes have a
 heavy CPU load while the client is hardly loaded.  I've tried with 10
 threads and more inserts per batch, or up to 60 threads with fewer, doesn't
 seem to make a lot of difference.


 On 08/19/2013 05:00 PM, Nate McCall wrote:

  How big are the batch sizes? In other words, how many rows are you
 sending per insert operation?

  Other than the above, not much else to suggest without seeing some
 example code (on pastebin, gist or similar, ideally).

 On Mon, Aug 19, 2013 at 5:49 PM, Keith Freeman 8fo...@gmail.com wrote:

 I've got a 3-node cassandra cluster (16G/4-core VMs ESXi v5 on 2.5Ghz
 machines not shared with any other VMs).  I'm inserting time-series data
 into a single column-family using wide rows (timeuuids) and have a 3-part
 partition key so my primary key is something like ((a, b, day),
 in-time-uuid), x, y, z.

 My java client is feeding rows (about 1k of raw data size each) in
 batches using multiple threads, and the fastest I can get it to run reliably
 is about 2000 rows/second.  Even at that speed, all 3 cassandra nodes are
 very CPU bound, with loads of 6-9 each (and the client machine is hardly
 breaking a sweat).  I've tried turning off compression in my table which
 reduced the loads slightly but not much.  There are no other updates or
 reads occurring, except the datastax opscenter.

 I was expecting to be able to insert at least 10k rows/second with this
 configuration, and after a lot of reading of docs, blogs, and google, can't
 really figure out what's slowing my client down.  When I increase the
 insert speed of my client beyond 2000/second, the server responses are just
 too slow and the client falls behind.  I had a single-node Mysql database
 that can handle 10k of these data rows/second, so I really feel like I'm
 missing something in Cassandra.  Any ideas?






-- 

- John


Re: insert performance

2012-02-23 Thread Philippe
Definitely multi-thread writes... probably with a little batching (10 or so).
That's how I get my peak throughput.
On 23 Feb 2012 04:48, Deno Vichas d...@syncopated.net wrote:

 all,

 would I be better off (I'm in java land) with spawning a bunch of
 threads that all add a single item to a mutator, or a single thread that
 adds a bunch of items to a mutator?


 thanks,
 deno
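
A hedged sketch of the combination Philippe describes, from memory of the
Hector-era API (column family and column names are invented): several writer
threads, each flushing its mutator every ~10 insertions.

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;

public class BatchedWriter implements Runnable {
    private final Keyspace keyspace;
    private final String[] keys;  // this thread's share of the items

    BatchedWriter(Keyspace keyspace, String[] keys) {
        this.keyspace = keyspace;
        this.keys = keys;
    }

    public void run() {
        Mutator<String> mutator =
                HFactory.createMutator(keyspace, StringSerializer.get());
        int pending = 0;
        for (String key : keys) {
            mutator.addInsertion(key, "MyCF",
                    HFactory.createStringColumn("data", "value"));
            if (++pending == 10) {  // small batches, as suggested
                mutator.execute();
                pending = 0;
            }
        }
        if (pending > 0) mutator.execute();  // flush the remainder
    }
}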




insert performance

2012-02-22 Thread Deno Vichas

all,

would I be better off (I'm in java land) with spawning a bunch of
threads that all add a single item to a mutator, or a single thread that
adds a bunch of items to a mutator?


thanks,
deno



Re: Question about insert performance in multiple node cluster

2011-03-01 Thread Oleg Anastasyev
Does your test client talk to a single node or to both?



Question about insert performance in multiple node cluster

2011-02-28 Thread Flachbart, Dirk (HP Software - TransactionVision)
Hi,

We are trying to use Cassandra for high-performance insertion of simple 
key/value records. I have set up Cassandra on two of my machines in my local 
network (Windows 2008 server), using pretty much the default configuration. I 
created a test driver in java (using thrift) which inserts a single 1K data 
column (keys are unique strings of integer values) with multiple threads. On 
each machine I am able to achieve around 9,000 inserts/sec when running the 
test driver with the local Cassandra server.
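
A hedged sketch of such a test driver, from memory of the Thrift API of that
era (keyspace and column family names are invented; each thread would run its
own key range like this):

import java.nio.ByteBuffer;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class ThriftInsertDriver {
    public static void main(String[] args) throws Exception {
        TFramedTransport transport =
                new TFramedTransport(new TSocket("127.0.0.1", 9160));
        Cassandra.Client client =
                new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        client.set_keyspace("Keyspace1");

        byte[] payload = new byte[1024];  // the single 1K data column
        for (int i = 0; i < 100000; i++) {
            Column col = new Column();
            col.setName(ByteBuffer.wrap("data".getBytes()));
            col.setValue(ByteBuffer.wrap(payload));
            col.setTimestamp(System.currentTimeMillis() * 1000);
            // Keys are unique strings of integer values, as described above.
            client.insert(ByteBuffer.wrap(Integer.toString(i).getBytes()),
                    new ColumnParent("Standard1"), col, ConsistencyLevel.ANY);
        }
        transport.close();
    }
}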

Then I set up a cluster with both machines, and ran the same test again (the 
test driver is still local to one of the Cassandra nodes). Surprisingly I did 
not see any improvement in the insert performance, I got the same 9000 
inserts/sec as when running with a single node. I know that I shouldn't expect 
linear scaling to 18,000 operations/sec, but shouldn't I see at least some 
significant improvement? The CPU isn't fully loaded on either of the machines, 
and the network utilization is low too (1000 mbit network). Later on I also 
tested adding a third node, but that didn't improve anything either.

I suspect I'm doing something wrong with setting up the cluster. The only 
changes I made on the second machine were:


-  AutoBootstrap=true

-  Setting 'Seed' to the IP of the other node


Did I miss anything? Or am I simply wrong in expecting the throughput to scale 
when using multiple nodes?



Thanks,
Dirk




Re: Question about insert performance in multiple node cluster

2011-02-28 Thread Ryan King
On Mon, Feb 28, 2011 at 9:24 AM, Flachbart, Dirk (HP Software -
TransactionVision) dirk.flachb...@hp.com wrote:
 Hi,



 We are trying to use Cassandra for high-performance insertion of simple
 key/value records. I have set up Cassandra on two of my machines in my local
 network (Windows 2008 server), using pretty much the default configuration.
 I created a test driver in java (using thrift) which inserts a single 1K
 data column (keys are unique strings of integer values) with multiple
 threads. On each machine I am able to achieve around 9,000 inserts/sec when
 running the test driver with the local Cassandra server.



 Then I set up a cluster with both machines, and ran the same test again (the
 test driver is still local to one of the Cassandra nodes). Surprisingly I
 did not see any improvement in the insert performance, I got the same 9000
 inserts/sec as when running with a single node. I know that I shouldn’t
 expect linear scaling to 18,000 operations/sec, but shouldn’t I see at least
 some significant improvement? The CPU isn’t fully loaded on either of the
 machines, and the network utilization is low too (1000 mbit network). Later
 on I also tested adding a third node, but that didn’t improve anything
 either.



 I suspect I’m doing something wrong with setting up the cluster. The only
 changes I made on the second machine were:



 -  AutoBootstrap=true

 -  Setting ‘Seed’ to the IP of the other node





 Did I miss anything? Or am I simply wrong in expecting the throughput to
 scale when using multiple nodes?

What's your replication factor? Which consistency level are you using?
Is the ring evenly balanced? Did you double the number of client
threads when you added the second server?

-ryan


Re: Question about insert performance in multiple node cluster

2011-02-28 Thread Peter Schuller
 What's your replication factor? Which consistency level are you using?
 Is the ring evenly balanced? Did you double the number of client
 threads when you added the second server?

And are you on 100 mbit networking? 9k requests/second inserting 1k each
is roughly 9 MB/s (~72 Mbit/s), suspiciously close to saturating a 100 mbit
link.

-- 
/ Peter Schuller


RE: Question about insert performance in multiple node cluster

2011-02-28 Thread Flachbart, Dirk (HP Software - TransactionVision)
Nope, I'm on a Gigabit network. The windows task manager on both machines shows 
a network utilization of around 12 percent.

Regards,
Dirk


-Original Message-
From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller
Sent: Monday, February 28, 2011 12:53 PM
To: user@cassandra.apache.org
Cc: Ryan King; Flachbart, Dirk (HP Software - TransactionVision)
Subject: Re: Question about insert performance in multiple node cluster

 What's your replication factor? Which consistency level are you using?
 Is the ring evenly balanced? Did you double the number of client
 threads when you added the second server?

And are you on 100 mbit networking? 9k requests/second inserting 1k each
is roughly 9 MB/s (~72 Mbit/s), suspiciously close to saturating a 100 mbit
link.

-- 
/ Peter Schuller


Re: Question about insert performance in multiple node cluster

2011-02-28 Thread Peter Schuller
 Replication factor is set to 1, and I'm using ConsistencyLevel.ANY. And yep, 
 I tried doubling the threads from 16 to 32 when running with the second 
 server, didn't make a difference.

Are you sure the client isn't the bottleneck? Have you tried running
the client on independent (and perhaps multiple) machines? What does
nodetool tpstats say while you run the test? (Try running it several
times in a row and observe how it changes.)

 Regarding the ring balancing - I assume it should be balanced. I'm using 
 RandomPartitioner, and the keys are generated by simply incrementing an 
 Integer counter value, so they should be spread fairly evenly across the two 
 servers (at least that is my understanding based on the Wiki documentation).

A 'nodetool compact' on all nodes (when not actively writing) followed
by 'nodetool ring' should confirm that you're balanced across the
nodes.

-- 
/ Peter Schuller


Re: Question about insert performance in multiple node cluster

2011-02-28 Thread Ryan King
On Mon, Feb 28, 2011 at 2:05 PM, Flachbart, Dirk (HP Software -
TransactionVision) dirk.flachb...@hp.com wrote:
 Replication factor is set to 1, and I'm using ConsistencyLevel.ANY. And yep, 
 I tried doubling the threads from 16 to 32 when running with the second 
 server, didn't make a difference.

 Regarding the ring balancing - I assume it should be balanced. I'm using 
 RandomPartitioner, and the keys are generated by simply incrementing an 
 Integer counter value, so they should be spread fairly evenly across the two 
 servers (at least that is my understanding based on the Wiki documentation).

What does nodetool cfstats say?

-ryan


0.6 insert performance .... Re: [RELEASE] 0.6.1

2010-04-19 Thread Masood Mortazavi
I wonder if anyone can use:
 * Add logging of GC activity (CASSANDRA-813)
to confirm this:
  http://www.slideshare.net/schubertzhang/cassandra-060-insert-throughput

- m.


On Sun, Apr 18, 2010 at 6:58 PM, Eric Evans eev...@rackspace.com wrote:


 Hot on the trails of 0.6.0 comes our latest, 0.6.1. This stable point
 release contains a number of important bugfixes[1] and is a painless
 upgrade from 0.6.0.

 Enjoy!

 [1]: http://bit.ly/9NqwAb (changelog)

 --
 Eric Evans
 eev...@rackspace.com




RE: 0.6 insert performance .... Re: [RELEASE] 0.6.1

2010-04-19 Thread Mark Jones
I'm seeing some issues like this as well; in fact, I think seeing your graphs 
has helped me understand the dynamics of my cluster better.

Using some ballpark figures for inserting single-column objects of ~500 bytes 
onto individual nodes (not when combined as a cluster):

Node1: Inserts 12000/s
Node2: Inserts 12000/s
Node3: Inserts 9000/s
Node4: Inserts 6000/s

When combined as a cluster, inserts are around 7000/s (replication factor of 2)

When GC kicks in anywhere in the cluster, quorum writes slow down for everyone 
associated with that node.  And the fact that there are 4 nodes almost implies 
garbage collection will be going on somewhere almost all the time.

So while I should be able to write more than 12,000/second, my slowest node in 
the cluster seems to overwhelm the faster nodes and drag everyone down.  I'm 
still running tests of various combinations to see where things work out.

From: Masood Mortazavi [mailto:masoodmortaz...@gmail.com]
Sent: Monday, April 19, 2010 6:15 AM
To: user@cassandra.apache.org; d...@cassandra.apache.org
Subject: 0.6 insert performance  Re: [RELEASE] 0.6.1

I wonder if anyone can use:
 * Add logging of GC activity (CASSANDRA-813)
to confirm this:
  http://www.slideshare.net/schubertzhang/cassandra-060-insert-throughput

- m.

On Sun, Apr 18, 2010 at 6:58 PM, Eric Evans 
eev...@rackspace.com wrote:

Hot on the trails of 0.6.0 comes our latest, 0.6.1. This stable point
release contains a number of important bugfixes[1] and is a painless
upgrade from 0.6.0.

Enjoy!

[1]: http://bit.ly/9NqwAb (changelog)

--
Eric Evans
eev...@rackspace.com



RE: 0.6 insert performance .... Re: [RELEASE] 0.6.1

2010-04-19 Thread Daniel Kluesing
We see this behavior as well with 0.6, heap usage graphs look almost identical. 
The GC is a noticeable bottleneck; we've tried jdku19 and jrockit VMs. It 
basically kills any kind of soft real-time behavior.

From: Masood Mortazavi [mailto:masoodmortaz...@gmail.com]
Sent: Monday, April 19, 2010 4:15 AM
To: user@cassandra.apache.org; d...@cassandra.apache.org
Subject: 0.6 insert performance  Re: [RELEASE] 0.6.1

I wonder if anyone can use:
 * Add logging of GC activity (CASSANDRA-813)
to confirm this:
  http://www.slideshare.net/schubertzhang/cassandra-060-insert-throughput

- m.

On Sun, Apr 18, 2010 at 6:58 PM, Eric Evans 
eev...@rackspace.com wrote:

Hot on the trails of 0.6.0 comes our latest, 0.6.1. This stable point
release contains a number of important bugfixes[1] and is a painless
upgrade from 0.6.0.

Enjoy!

[1]: http://bit.ly/9NqwAb (changelog)

--
Eric Evans
eev...@rackspace.com



Re: 0.6 insert performance .... Re: [RELEASE] 0.6.1

2010-04-19 Thread Jonathan Ellis
It's hard to tell from those slides, but it looks like the slowdown
doesn't hit until after several GCs.

Perhaps this is compaction kicking in, not GCs?  Definitely the extra
I/O + CPU load from compaction will cause a drop in throughput.

On Mon, Apr 19, 2010 at 6:14 AM, Masood Mortazavi
masoodmortaz...@gmail.com wrote:
 I wonder if anyone can use:
  * Add logging of GC activity (CASSANDRA-813)
 to confirm this:
   http://www.slideshare.net/schubertzhang/cassandra-060-insert-throughput

 - m.


 On Sun, Apr 18, 2010 at 6:58 PM, Eric Evans eev...@rackspace.com wrote:

 Hot on the trails of 0.6.0 comes our latest, 0.6.1. This stable point
 release contains a number of important bugfixes[1] and is a painless
 upgrade from 0.6.0.

 Enjoy!

 [1]: http://bit.ly/9NqwAb (changelog)

 --
 Eric Evans
 eev...@rackspace.com





Re: 0.6 insert performance .... Re: [RELEASE] 0.6.1

2010-04-19 Thread Schubert Zhang
Since the scale of the GC graph in the slides is different from the throughput
ones, I will do another test for this issue.
Thanks for your advices, Masood and Jonathan.

---
Here, I'll just post my cassandra.in.sh:
JVM_OPTS=" \
-ea \
-Xms128M \
-Xmx6G \
-XX:TargetSurvivorRatio=90 \
-XX:+AggressiveOpts \
-XX:+UseParNewGC \
-XX:+UseConcMarkSweepGC \
-XX:+CMSParallelRemarkEnabled \
-XX:SurvivorRatio=128 \
-XX:MaxTenuringThreshold=0 \
-Dcom.sun.management.jmxremote.port=8081 \
-Dcom.sun.management.jmxremote.ssl=false \
-Dcom.sun.management.jmxremote.authenticate=false"

On Tue, Apr 20, 2010 at 5:46 AM, Masood Mortazavi masoodmortaz...@gmail.com
 wrote:

 Minimizing GC pauses or minimizing time slots allocated to GC pauses --
 either through configuration or re-implementations of garbage collection
 bottlenecks (i.e. object-generation bottlenecks) -- seems to be the
 immediate approach. (Other approaches appear to be more intrusive.)
 At code level, using the GC logs, one can investigate further. There may be
 places where some object recycling can make a larger difference.
 Trying this first will probably bear more immediate fruit.

 - m.


 On Mon, Apr 19, 2010 at 9:11 AM, Daniel Kluesing d...@bluekai.com wrote:

  We see this behavior as well with 0.6, heap usage graphs look almost
 identical. The GC is a noticeable bottleneck; we’ve tried jdku19 and jrockit
 VMs. It basically kills any kind of soft real-time behavior.



 *From:* Masood Mortazavi [mailto:masoodmortaz...@gmail.com]
 *Sent:* Monday, April 19, 2010 4:15 AM
 *To:* user@cassandra.apache.org; d...@cassandra.apache.org
 *Subject:* 0.6 insert performance  Re: [RELEASE] 0.6.1



 I wonder if anyone can use:

  * Add logging of GC activity (CASSANDRA-813)
 to confirm this:
   http://www.slideshare.net/schubertzhang/cassandra-060-insert-throughput

 - m.

  On Sun, Apr 18, 2010 at 6:58 PM, Eric Evans eev...@rackspace.com
 wrote:


 Hot on the trails of 0.6.0 comes our latest, 0.6.1. This stable point
 release contains a number of important bugfixes[1] and is a painless
 upgrade from 0.6.0.

 Enjoy!

 [1]: http://bit.ly/9NqwAb (changelog)

 --
 Eric Evans
 eev...@rackspace.com