Re: Bootstrap hung

2010-02-15 Thread Jonathan Ellis
created https://issues.apache.org/jira/browse/CASSANDRA-794 for this

On Fri, Feb 12, 2010 at 2:38 PM, ruslan usifov ruslan.usi...@gmail.com wrote:
 Also, I have a problem with StreamInitiateVerbHandler. The problem is in
 PendingFile.getTargetFile, namely the difference in path separators between
 Windows and Unix, so I changed PendingFile.java like this:

     public PendingFile(String targetFile, long expectedBytes, String table)
     {
         // Collapse any run of backslashes or forward slashes into one "/"
         targetFile_ = targetFile.replaceAll("(?:\\\\|/)+", "/");
         expectedBytes_ = expectedBytes;
         table_ = table;
         ptr_ = 0;
     }

     public void setTargetFile(String file)
     {
         targetFile_ = file.replaceAll("(?:\\\\|/)+", "/");
     }
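
For reference, the separator normalization described in the quoted message can be tried in isolation. The following is only a sketch under assumptions, not the change tracked in CASSANDRA-794: the class name PathNormalizer and the sample paths are invented for illustration and are not part of Cassandra.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of Cassandra): collapses every run of
// backslashes and/or forward slashes into a single forward slash.
class PathNormalizer {
    // In a Java string literal, "\\\\" becomes the regex \\, which matches
    // one literal backslash; the character class also accepts "/".
    private static final Pattern SEPARATORS = Pattern.compile("[\\\\/]+");

    static String normalize(String path) {
        return SEPARATORS.matcher(path).replaceAll("/");
    }

    public static void main(String[] args) {
        // Sample paths are made up for the example.
        System.out.println(normalize("C:\\data\\Keyspace1\\Standard1-1-Data.db"));
        System.out.println(normalize("/var/lib//cassandra/data"));
    }
}
```

Precompiling the pattern avoids recompiling the regex on every call, which String.replaceAll would otherwise do.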




Re: Bootstrap hung

2010-02-15 Thread Jonathan Ellis
are you mixing windows and unix machines in the same cluster?

On Fri, Feb 12, 2010 at 2:38 PM, ruslan usifov ruslan.usi...@gmail.com wrote:
 Also, I have a problem with StreamInitiateVerbHandler. The problem is in
 PendingFile.getTargetFile, namely the difference in path separators between
 Windows and Unix, so I changed PendingFile.java like this:

     public PendingFile(String targetFile, long expectedBytes, String table)
     {
         // Collapse any run of backslashes or forward slashes into one "/"
         targetFile_ = targetFile.replaceAll("(?:\\\\|/)+", "/");
         expectedBytes_ = expectedBytes;
         table_ = table;
         ptr_ = 0;
     }

     public void setTargetFile(String file)
     {
         targetFile_ = file.replaceAll("(?:\\\\|/)+", "/");
     }




Re: Bootstrap hung

2010-02-15 Thread Gary Dusbabek
Ruslan,

I think this indicates that SO_SNDBUF is too small on Windows.
(Windows is the source and FreeBSD is the destination, correct?)

I've created https://issues.apache.org/jira/browse/CASSANDRA-795 to
track this.  Can you apply the patch attached to it to see if it
addresses the problem?

Thanks.

Gary
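
As a rough illustration of the kind of workaround one might try when a single large transferTo call overwhelms the socket send buffer, the sketch below sends a file in bounded chunks. It is not the patch attached to CASSANDRA-795; the class name, method names, and chunk size are assumptions made up for this example, and an in-memory channel stands in for the socket.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch (not Cassandra code): stream a file in bounded chunks
// instead of handing the whole file to one transferTo call.
class ChunkedTransfer {
    // Assumes a blocking target channel. On a real SocketChannel one could
    // also request a larger SO_SNDBUF via socket().setSendBufferSize(bytes).
    static long transferInChunks(FileChannel source, WritableByteChannel target,
                                 long chunkSize) throws IOException {
        long position = 0;
        long size = source.size();
        while (position < size) {
            long count = Math.min(chunkSize, size - position);
            // transferTo may move fewer bytes than requested, so advance
            // by the number it actually reports.
            position += source.transferTo(position, count, target);
        }
        return position;
    }

    // Demo: write a temporary file of fileSize bytes, copy it into memory in
    // chunks, and return the number of bytes that arrived.
    static long copyTempFile(int fileSize, long chunkSize) {
        try {
            Path tmp = Files.createTempFile("chunked-transfer", ".bin");
            try {
                Files.write(tmp, new byte[fileSize]);
                ByteArrayOutputStream sink = new ByteArrayOutputStream();
                try (FileChannel in = FileChannel.open(tmp, StandardOpenOption.READ);
                     WritableByteChannel out = Channels.newChannel(sink)) {
                    transferInChunks(in, out, chunkSize);
                }
                return sink.size();
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(copyTempFile(10_000, 4_096) + " bytes received");
    }
}
```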

2010/2/15 ruslan usifov ruslan.usi...@gmail.com:
 ERROR [MESSAGE-STREAMING-POOL:1] 2010-02-12 19:08:25,500 DebuggableThreadPoolExecutor.java (line 80) Error in ThreadPoolExecutor
 java.lang.RuntimeException: java.io.IOException: Unable to perform the operation on the socket because the buffer is too small or the queue is full
     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:619)
 Caused by: java.io.IOException: Unable to perform the operation on the socket because the buffer is too small or the queue is full
     at sun.nio.ch.SocketDispatcher.write0(Native Method)
     at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:33)
     at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
     at sun.nio.ch.IOUtil.write(IOUtil.java:60)
     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
     at sun.nio.ch.FileChannelImpl.transferToTrustedChannel(FileChannelImpl.java:449)
     at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:520)
     at org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:95)
     at org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
     ... 3 more
 ERROR [MESSAGE-STREAMING-POOL:1] 2010-02-12 19:08:25,515 CassandraDaemon.java (line 78) Fatal exception in thread Thread[MESSAGE-STREAMING-POOL:1,5,main]
 java.lang.RuntimeException: java.io.IOException: Unable to perform the operation on the socket because the buffer is too small or the queue is full
     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:619)
 Caused by: java.io.IOException: Unable to perform the operation on the socket because the buffer is too small or the queue is full
     at sun.nio.ch.SocketDispatcher.write0(Native Method)
     at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:33)
     at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
     at sun.nio.ch.IOUtil.write(IOUtil.java:60)
     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
     at sun.nio.ch.FileChannelImpl.transferToTrustedChannel(FileChannelImpl.java:449)
     at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:520)
     at org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:95)
     at org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
     ... 3 more


 2010/2/12 Jonathan Ellis jbel...@gmail.com

 Care to include a stack trace?  Those are useful when reporting problems.

 On Fri, Feb 12, 2010 at 2:31 PM, ruslan usifov ruslan.usi...@gmail.com
 wrote:
  Yes
 
 
 




Re: Bootstrap hung

2010-02-15 Thread ruslan usifov
Yes, as a test case.

2010/2/15 Jonathan Ellis jbel...@gmail.com

 are you mixing windows and unix machines in the same cluster?

 On Fri, Feb 12, 2010 at 2:38 PM, ruslan usifov ruslan.usi...@gmail.com
 wrote:
  Also, I have a problem with StreamInitiateVerbHandler. The problem is in
  PendingFile.getTargetFile, namely the difference in path separators
  between Windows and Unix, so I changed PendingFile.java like this:

  public PendingFile(String targetFile, long expectedBytes, String table)
  {
      // Collapse any run of backslashes or forward slashes into one "/"
      targetFile_ = targetFile.replaceAll("(?:\\\\|/)+", "/");
      expectedBytes_ = expectedBytes;
      table_ = table;
      ptr_ = 0;
  }

  public void setTargetFile(String file)
  {
      targetFile_ = file.replaceAll("(?:\\\\|/)+", "/");
  }
 
 



Nodeprobe Not Working Properly

2010-02-15 Thread Shahan Khan


Hi,

I just installed Cassandra using the Debian package on two servers, _db1a_
and _db1b_.

When I run the command _nodeprobe -host db1a ring_, it only works on the
server db1a.

iptables is set to allow everything.

I also added -Djava.rmi.server.hostname=192.168.1.13 to cassandra.in.sh, as
mentioned on this page [1].

What am I doing wrong? Any other recommendations?

Thank you,

Shahan

- 

db1a 

db1a:~# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination

Chain FORWARD (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

db1a:~# nodeprobe -host db1a ring
Address Status Load Range Ring
 Token(bytes[eaaca3c3bd3caba3e14ee0f85d5cda8a])
192.168.1.13 Up 3.04 KB Token(bytes[d1deccd61a6632f9040546c5fa57427e])||

db1b


db1b:~# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination

Chain FORWARD (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

db1b:~# nodeprobe -host db1a ring
Error connecting to remote JMX agent!
java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619)
    at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)
    at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
    at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:128)
    at javax.management.remote.rmi.RMIServerImpl_Stub.newClient(Unknown Source)
    at javax.management.remote.rmi.RMIConnector.getConnection(RMIConnector.java:2343)
    at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:296)
    at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:267)
    at org.apache.cassandra.tools.NodeProbe.connect(NodeProbe.java:153)
    at org.apache.cassandra.tools.NodeProbe.<init>(NodeProbe.java:115)
    at org.apache.cassandra.tools.NodeProbe.main(NodeProbe.java:514)
Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:310)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:176)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:163)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:381)
    at java.net.Socket.connect(Socket.java:537)
    at java.net.Socket.connect(Socket.java:487)
    at java.net.Socket.<init>(Socket.java:384)
    at java.net.Socket.<init>(Socket.java:198)
    at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40)
    at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:146)
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
    ... 10 more

Links:
--
[1]
http://www.mail-archive.com/cassandra-user@incubator.apache.org/msg00629.html


Re: Nodeprobe Not Working Properly

2010-02-15 Thread Brandon Williams
On Mon, Feb 15, 2010 at 1:13 PM, Shahan Khan cont...@shahan.me wrote:

 db1b:~# nodeprobe -host db1a ring

 Error connecting to remote JMX agent!

 java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested
 exception is:


This seems to indicate that db1a resolves to 127.0.0.1 on db1b, when it
actually needs to resolve to the 192.168 address.  Try passing the IP
address as the host and it should work.

-Brandon
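
One way to check how a name resolves from a given machine is a small Java program run on that machine. This is only a sketch: the class name ResolveCheck is made up for illustration, and the host names to test are passed in as arguments.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic (not part of Cassandra or nodeprobe): reports
// whether a host name or IP literal resolves to a loopback address.
class ResolveCheck {
    static boolean resolvesToLoopback(String host) {
        try {
            // IP literals are parsed directly; names go through the
            // local resolver (hosts file, DNS).
            return InetAddress.getByName(host).isLoopbackAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Cannot resolve " + host, e);
        }
    }

    public static void main(String[] args) {
        // Pass the names to check, e.g. the host given to nodeprobe.
        String[] hosts = args.length > 0 ? args : new String[] { "127.0.0.1" };
        for (String host : hosts) {
            System.out.println(host + " -> loopback: " + resolvesToLoopback(host));
        }
    }
}
```

If a cluster host name reports loopback: true, the name is mapped to 127.0.0.1 on the machine where the check was run.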


Re: Nodeprobe Not Working Properly

2010-02-15 Thread Shahan Khan


I tried Brandon's suggestion, but I am still getting the same error when
connecting to the remote server.

Any other suggestions? Is it possible that it's a bug?


Thanks, 

Shahan 

db1a = 192.168.1.13 

db1b = 192.168.1.14


= 

db1a:~# nodeprobe -host 192.168.1.14 ring
Error connecting to remote JMX agent!
java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619)
    at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)
    at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
    at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:128)
    at javax.management.remote.rmi.RMIServerImpl_Stub.newClient(Unknown Source)
    at javax.management.remote.rmi.RMIConnector.getConnection(RMIConnector.java:2343)
    at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:296)
    at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:267)
    at org.apache.cassandra.tools.NodeProbe.connect(NodeProbe.java:153)
    at org.apache.cassandra.tools.NodeProbe.<init>(NodeProbe.java:115)
    at org.apache.cassandra.tools.NodeProbe.main(NodeProbe.java:514)
Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:310)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:176)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:163)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:381)
    at java.net.Socket.connect(Socket.java:537)
    at java.net.Socket.connect(Socket.java:487)
    at java.net.Socket.<init>(Socket.java:384)
    at java.net.Socket.<init>(Socket.java:198)
    at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40)
    at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:146)
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
    ... 10 more


===


db1b:~# nodeprobe -host 192.168.1.14 info
Token(bytes[eaaca3c3bd3caba3e14ee0f85d5cda8a])
Load : 4 KB
Generation No : 1266260277
Uptime (seconds) : 9933
Heap Memory (MB) : 54.83 / 1016.13

db1b:~# nodeprobe -host 192.168.1.14 ring
Address Status Load Range Ring
 Token(bytes[eaaca3c3bd3caba3e14ee0f85d5cda8a])
192.168.1.13 Up 3.52 KB Token(bytes[d1deccd61a6632f9040546c5fa57427e])||

On Mon, 15 Feb 2010 13:19:42 -0600, Brandon Williams wrote:

 On Mon, Feb 15, 2010 at 1:13 PM, Shahan Khan wrote:

  db1b:~# nodeprobe -host db1a ring

  Error connecting to remote JMX agent!

  java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested
  exception is:

 This seems to indicate that db1a resolves as 127.0.0.1 on db1b, when it
 actually needs to resolve to the 192.168 address. Try passing the ip
 address as the host and it should work.

 -Brandon

 



RE: Cassandra benchmark shows OK throughput but high read latency (>100ms)?

2010-02-15 Thread Weijun Li
It seems that read latency is sensitive to the number of threads (or Thrift
clients): after reducing the number of threads to 15, read latency decreased
to ~20ms.

The other problem is this: if I keep a mixed write and read load (e.g., 8
write threads plus 7 read threads) running continuously against the 2-node
cluster, the read latency goes up gradually (along with the size of the
Cassandra data files), and in the end it reaches ~40ms (up from ~20ms) even
with only 15 threads. During this process the data files grew from 1.6GB to
over 3GB even though I kept writing the same keys/values to Cassandra. It
seems that Cassandra keeps appending to the sstable data files and only
cleans them up during node cleanup or compaction (please correct me if this
is incorrect).
 
Here's my test settings:

JVM xmx: 6GB
KCF: 0.3
Memtable: 512MB.
Number of records: 1 million (payload is 1000 bytes)

I used JMX and iostat to watch the cluster but can't find any clue to the
increasing read latency: JVM memory, GC, CPU usage, tpstats and I/O
saturation all seem to be clean. One exception is that the wait time in
iostat goes up quickly once in a while, but it is a small number most of the
time. Another thing I noticed is that the JVM doesn't use more than 1GB of
memory (out of the 6GB I specified for the JVM) even though I set KCF to 0.3
and increased the memtable size to 512MB.

Did I miss anything here? How can I diagnose this kind of increasing read
latency issue? Is there any performance tuning guide available?

Thanks,
-Weijun


-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com] 
Sent: Sunday, February 14, 2010 6:22 PM
To: cassandra-user@incubator.apache.org
Subject: Re: Cassandra benchmark shows OK throughput but high read latency
(>100ms)?

are you i/o bound?  what is your on-disk data set size?  what does
iostat tell you?
http://spyced.blogspot.com/2010/01/linux-performance-basics.html

do you have a lot of pending compactions?  (tpstats will tell you)

have you increased KeysCachedFraction?

On Sun, Feb 14, 2010 at 8:18 PM, Weijun Li weiju...@gmail.com wrote:
 Hello,

 I saw some Cassandra benchmark reports mentioning read latency that is less
 than 50ms or even 30ms. But my benchmark with 0.5 doesn't seem to support
 that. Here are my settings:

 Nodes: 2 machines. 2x2.5GHz Xeon Quad Core (thus 8 cores), 8GB RAM
 ReplicationFactor=2 Partitioner=Random
 JVM Xmx: 4GB
 Memory table size: 512MB (haven't figured out how to enable binary memtable
 so I set both memtable numbers to 512MB)
 Flushing threads: 2-4
 Payload: ~1000 bytes, 3 columns in one CF.
 Read/write time measure: get startTime right before each Java Thrift call;
 transport objects are pre-created upon creation of each thread.

 The result shows that total write throughput is around 2000/sec (for 2 nodes
 in the cluster), which is not bad, and read throughput is just around
 750/sec. However, for each thread the average read latency is more than
 100ms. I'm running 100 threads for the testing and each thread randomly
 picks a node for the Thrift call. So the reads/sec of each thread is just
 around 7.5, meaning the duration of each Thrift call is 1000/7.5=133ms.
 Without replication the cluster write throughput is around 3300/s, and read
 throughput is around 1400/s, so the read latency is still around 70ms
 without replication.

 Is there anything wrong in my benchmark test? How can I achieve a reasonable
 read latency (<30ms)?

 Thanks,

 -Weijun
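
The per-call latency estimate in the quoted benchmark (100 threads sharing about 750 reads/sec, hence roughly 133ms per call) can be reproduced with a few lines of arithmetic. This sketch assumes each thread issues one blocking call at a time; the class and method names are invented for illustration.

```java
// Hypothetical helper: estimates average per-call latency for a closed-loop
// benchmark in which every thread issues one blocking call at a time.
class LatencyEstimate {
    // Each thread completes (opsPerSecond / threads) calls per second,
    // so one call takes 1000 * threads / opsPerSecond milliseconds.
    static double avgLatencyMs(int threads, double opsPerSecond) {
        return 1000.0 * threads / opsPerSecond;
    }

    public static void main(String[] args) {
        // Figures taken from the message: 100 threads, 750 reads/sec with
        // replication and 1400 reads/sec without.
        System.out.println("with replication:    " + avgLatencyMs(100, 750) + " ms");
        System.out.println("without replication: " + avgLatencyMs(100, 1400) + " ms");
    }
}
```

The same relation explains why dropping to 15 threads lowers the measured latency: with fewer calls in flight, each call spends less time queued behind the others.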