Re: Cassandra Performance Benchmarking.

2013-01-21 Thread aaron morton
You can also see what it looks like from the server side. nodetool proxyhistograms will show you full request latency recorded by the coordinator. nodetool cfhistograms will show you the local read latency; this is just the time it takes to read data on a replica and does not include network

Re: Cassandra Performance Benchmarking.

2013-01-21 Thread Pradeep Kumar Mantha
Hi, Thanks for the information. I upgraded my Cassandra version to 1.2.0 and tried running the experiment again to find the statistics. My application took nearly 529 seconds to query 76896 keys. Please find the statistics below for 32 threads (where each thread queries 76896
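A rough throughput check for the figures quoted in this message might look like the sketch below. It assumes the 529 seconds is wall-clock time for all 32 threads running concurrently on one client; the snippet is truncated, so that reading is an assumption, and the function name is made up for illustration.

```python
# Figures quoted in the message; how they combine is an assumption.
KEYS_PER_THREAD = 76896   # read queries issued by each thread
ELAPSED_SECONDS = 529     # reported run time
THREADS = 32              # threads on one client node


def reads_per_second(keys, seconds, threads=1):
    """Return the read rate for `threads` workers that each issue `keys` reads."""
    return keys * threads / seconds


per_thread = reads_per_second(KEYS_PER_THREAD, ELAPSED_SECONDS)
aggregate = reads_per_second(KEYS_PER_THREAD, ELAPSED_SECONDS, THREADS)

print(f"per thread: {per_thread:.1f} reads/s")
print(f"per client: {aggregate:.1f} reads/s")
```

If the 529 seconds instead covers a single thread's work, only the per-thread figure applies.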

Re: Cassandra Performance Benchmarking.

2013-01-18 Thread Tyler Hobbs
You just need to increase the ConnectionPool size to handle the number of threads you have using it concurrently. Set the pool_size kwarg to at least the number of threads you're using. On Thu, Jan 17, 2013 at 6:46 PM, Pradeep Kumar Mantha pradeep...@gmail.com wrote: Thanks Tyler. I just
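A minimal sketch of that advice, assuming pycassa is installed and a cluster is reachable. The helper names and server addresses are hypothetical; the 'Blast' keyspace and the timeout of 60 are taken from elsewhere in this thread.

```python
def pool_settings(n_threads, servers, timeout=60):
    """Build keyword arguments for pycassa.ConnectionPool.

    pool_size is set to the number of client threads so that no thread
    has to wait for another thread to return a connection.
    """
    return {
        "server_list": list(servers),
        "pool_size": max(n_threads, 1),
        "timeout": timeout,
    }


def make_pool(keyspace, n_threads, servers):
    """Create one shared pool; call this once and reuse it from every thread."""
    import pycassa  # imported lazily so pool_settings works without pycassa
    return pycassa.ConnectionPool(keyspace, **pool_settings(n_threads, servers))


# Hypothetical addresses; replace with the real data nodes.
settings = pool_settings(32, ["10.0.0.1:9160", "10.0.0.2:9160"])
print(settings)
```

With a live cluster, `make_pool('Blast', 32, servers)` would return the shared pool to hand to each thread.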

Re: Cassandra Performance Benchmarking.

2013-01-18 Thread Pradeep Kumar Mantha
Hi, Thanks Tyler. Below is the *global* connection pool I am trying to use, where server_list contains the IPs of all 12 data nodes I am using and pool_size is the number of threads. I set timeout to 60 to avoid connection retry errors. pool = pycassa.ConnectionPool('Blast',

Re: Cassandra Performance Benchmarking.

2013-01-18 Thread Tyler Hobbs
The fact that it's still exactly 521 seconds is very suspicious. I can't debug your script over the mailing list, but do some sanity checks to make sure there's not a bottleneck somewhere you don't expect. On Fri, Jan 18, 2013 at 12:44 PM, Pradeep Kumar Mantha pradeep...@gmail.com wrote: Hi,

Cassandra Performance Benchmarking.

2013-01-17 Thread Pradeep Kumar Mantha
Hi, I am trying to maximize execution of the number of read queries/second. Here is my cluster configuration. Replication - Default 12 Data Nodes. 16 Client Nodes - used for querying. Each client node executes 32 threads - each thread executes 76896 read queries using cassandra-cli tool.

Re: Cassandra Performance Benchmarking.

2013-01-17 Thread Edward Capriolo
Wow, you managed to do a load test through the cassandra-cli. There should be a merit badge for that. You should use the built-in stress tool or YCSB. The CLI has to do much more string conversion than a normal client would and it is not built for performance. You will definitely get better

Re: Cassandra Performance Benchmarking.

2013-01-17 Thread Pradeep Kumar Mantha
Hi, Thanks. I would like to benchmark Cassandra with our application so that we understand the details of how the actual benchmarking is done. I am not sure how easy it would be to integrate YCSB with our application, so I am trying different client interfaces to Cassandra. I found for 12 Data

Re: Cassandra Performance Benchmarking.

2013-01-17 Thread Pradeep Kumar Mantha
Thanks Tyler. I just moved the pool and cf, which store the connection pool and CF information, to global scope. I increased the number of server_list values from 1 to 4 (I think I can increase them to a maximum of 12, since I have 12 data nodes). When I created 8 threads using the Python threading package, I see