Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

2010-02-14 Thread Weijun Li
Hello,

 

I saw some Cassandra benchmark reports mentioning read latency of less than 
50ms or even 30ms, but my benchmark with 0.5 doesn’t seem to support that. 
Here are my settings:

 

Nodes: 2 machines, each with 2x2.5GHz quad-core Xeon (8 cores) and 8GB RAM

ReplicationFactor=2, Partitioner=Random

JVM Xmx: 4GB

Memtable size: 512MB (I haven’t figured out how to enable the binary memtable, 
so I set both memtable thresholds to 512MB)

Flushing threads: 2-4

Payload: ~1000 bytes, 3 columns in one CF.

Read/write time measurement: startTime is taken right before each Java Thrift 
call; transport objects are pre-created when each thread starts (see the 
sketch below).
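
For reference, a minimal sketch of that per-thread measurement, assuming the
usual 0.5 Thrift client setup (port 9160, one transport per thread); doGet()
is a hypothetical stand-in for the actual get/get_slice call:

    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;

    // One transport per thread, opened once up front; only the call is timed.
    public class ReadBench implements Runnable {
        private final TTransport transport;
        private final TBinaryProtocol protocol;

        public ReadBench(String host) throws Exception {
            transport = new TSocket(host, 9160); // default Thrift port
            transport.open();                    // pre-created, reused for every call
            protocol = new TBinaryProtocol(transport);
        }

        public void run() {
            long start = System.currentTimeMillis(); // right before the Thrift call
            doGet(protocol);                         // stand-in for the real read
            long latency = System.currentTimeMillis() - start;
            System.out.println("read latency: " + latency + " ms");
        }

        private void doGet(TBinaryProtocol p) {
            // hypothetical: issue the Cassandra.Client get()/get_slice() here
        }
    }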

 

The results show that total write throughput is around 2000/sec (for the 
2-node cluster), which is not bad, but read throughput is only around 750/sec. 
Moreover, the average read latency per thread is more than 100ms. I’m running 
100 threads for the test and each thread randomly picks a node for its Thrift 
calls, so each thread does only about 7.5 reads/sec, meaning each Thrift call 
takes 1000/7.5 = 133ms. Without replication the cluster write throughput is 
around 3300/sec and read throughput is around 1400/sec, so read latency is 
still around 70ms even without replication.
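
As a sanity check on that arithmetic: with a fixed pool of synchronous client
threads, average latency ≈ threads / throughput (Little's law), which
reproduces both numbers:

    // Little's law for a closed-loop benchmark: avg latency = threads / throughput.
    public class LatencyCheck {
        public static void main(String[] args) {
            int threads = 100;
            System.out.printf("RF=2: %.0f ms%n", 1000.0 * threads / 750.0);  // ~133 ms
            System.out.printf("RF=1: %.0f ms%n", 1000.0 * threads / 1400.0); // ~71 ms
        }
    }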

 

Is there anything wrong with my benchmark test? How can I achieve a reasonable 
read latency (< 30ms)?

 

Thanks,

-Weijun

 

 



Re: Cassandra benchmark shows OK throughput but high read latency (> 100ms)?

2010-02-14 Thread Jonathan Ellis
Are you I/O bound?  What is your on-disk data set size?  What does
iostat tell you?
http://spyced.blogspot.com/2010/01/linux-performance-basics.html

Do you have a lot of pending compactions?  (tpstats will tell you)

Have you increased KeysCachedFraction?
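
For reference, a hedged sketch of those checks on 0.5 (the host, CF name, and
cache fraction below are illustrative; in 0.5 the command-line tool ships as
bin/nodeprobe, and KeysCachedFraction is a per-ColumnFamily attribute in
storage-conf.xml):

    # disk utilization and request queue depth (see the blog post above)
    iostat -x 5

    # pending/active thread-pool stages
    bin/nodeprobe -host 192.168.1.10 tpstats

    <!-- storage-conf.xml: cache 10% of keys instead of the small default -->
    <ColumnFamily Name="Standard1" CompareWith="BytesType" KeysCachedFraction="0.1"/>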

On Sun, Feb 14, 2010 at 8:18 PM, Weijun Li weiju...@gmail.com wrote:
 [rest of quoted message snipped]



Re: Bootstrap hung

2010-02-14 Thread ruslan usifov
ERROR [MESSAGE-STREAMING-POOL:1] 2010-02-12 19:08:25,500
DebuggableThreadPoolExecutor.java (line 80) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.IOException: Cannot perform an operation
on the socket because the buffer is too small or the queue is full
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: Cannot perform an operation on the socket
because the buffer is too small or the queue is full
        at sun.nio.ch.SocketDispatcher.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:33)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
        at sun.nio.ch.IOUtil.write(IOUtil.java:60)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
        at sun.nio.ch.FileChannelImpl.transferToTrustedChannel(FileChannelImpl.java:449)
        at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:520)
        at org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:95)
        at org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        ... 3 more
ERROR [MESSAGE-STREAMING-POOL:1] 2010-02-12 19:08:25,515
CassandraDaemon.java (line 78) Fatal exception in thread
Thread[MESSAGE-STREAMING-POOL:1,5,main]
java.lang.RuntimeException: java.io.IOException: Cannot perform an operation
on the socket because the buffer is too small or the queue is full
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: Cannot perform an operation on the socket
because the buffer is too small or the queue is full
        at sun.nio.ch.SocketDispatcher.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:33)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
        at sun.nio.ch.IOUtil.write(IOUtil.java:60)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
        at sun.nio.ch.FileChannelImpl.transferToTrustedChannel(FileChannelImpl.java:449)
        at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:520)
        at org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:95)
        at org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        ... 3 more


2010/2/12 Jonathan Ellis jbel...@gmail.com

 Care to include a stack trace?  Those are useful when reporting problems.

 On Fri, Feb 12, 2010 at 2:31 PM, ruslan usifov ruslan.usi...@gmail.com
 wrote:
  Yes