Re: cassandra read latency help

2012-05-18 Thread Piavlo

  
  
On 05/18/2012 08:49 AM, Viktor Jevdokimov wrote:
 Row cache is OK as long as keys are not heavily updated; otherwise it
 frequently invalidates and pressures GC.

According to http://www.datastax.com/docs/1.0/operations/tuning :
"As of Cassandra 1.0, column family row caches are stored in native
memory by default (outside of the Java heap). This results in both a
smaller per-row memory footprint and reduced JVM heap requirements,
which helps keep the heap size manageable for good JVM garbage
collection performance."
AFAIU it's outside of the Java heap only if JNA is used.

I then tried the row cache for a few CFs (with Cassandra 1.0.9), and to
my surprise it just killed read latency and caused very high CPU usage;
the row cache hit rate was ~20% and reads/writes ~50/50.
The CFs are compressed (does that matter? does the row cache keep rows
compressed or not?)
AFAIU the JNA off-heap cache stores the rows in serialized form, so
where does the high CPU come from?
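
For context, a sketch of the knobs involved (cassandra-cli; the CF name
and cache size are made up). My understanding is that the off-heap
SerializingCache has to deserialize a whole row on every cache hit,
which is one place the CPU can go:

    update column family MyCF
        with rows_cached = 200000
        and row_cache_provider = 'SerializingCacheProvider';

Switching row_cache_provider to 'ConcurrentLinkedHashCacheProvider'
keeps rows on-heap as live objects (no per-hit deserialization, at the
cost of GC pressure), which might be worth comparing.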


  


Re: cassandra read latency help

2012-05-18 Thread Gurpreet Singh
Hi Viktor,

As I mentioned, my goal is to finally achieve a throughput of 100 reads
per second; it's not a batch size of 100.

The writes are already done, and I am not doing them anymore. I loaded
the system with about 130 million keys.

I am just running a read workload in my experiment as of now.

1. No invalidation of the cache is happening, because no writes are
happening. GC is not an issue; it's an off-heap native cache by default
in 1.0.9.
2. Reads are with batch size 1.
3. Read qps is 25.

I am keeping the max read qps constant, and just varying the number of
threads doing the reads.

row cache hit ratio = 0.66

Observations:

1. With 20 threads doing reads, avg latency is 50 ms
2. With 6 threads doing reads, avg latency is 30 ms
3. With 2 threads doing reads, avg latency is 15 ms
4. With 3 threads, latency is 20 ms

Looks like the number of disks (2) is limiting the concurrency of the
system here. Any other explanations?
/G
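
As a rough sanity check on the two-disk explanation (the per-seek
figures below are typical assumptions for 7200 rpm drives, not
measurements from this system):

    random read on a 7200 rpm SATA disk (seek + rotation): ~8-10 ms
    observed ~15 ms per key at 2 threads                   ~ 1-2 seeks/read

If that holds, the reads are seek-bound: with only two spindles,
concurrent readers queue behind each other's seeks, so average latency
grows with thread count even at a constant 25 qps.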


On Thu, May 17, 2012 at 10:49 PM, Viktor Jevdokimov 
viktor.jevdoki...@adform.com wrote:

 Row cache is OK as long as keys are not heavily updated; otherwise it
 frequently invalidates and pressures GC.


 The high latency is from your batch of 100 keys. Review your data model to
 avoid such reads, if you need low latency.


 500M rows on one node, or on the cluster? Reading 100 random rows at total
 of 40KB data from a data set of 180GB uncompressed under 30ms is not an
 easy task.



Best regards / Pagarbiai
 Viktor Jevdokimov
 Senior Developer

  From: Gurpreet Singh [mailto:gurpreet.si...@gmail.com]
 Sent: Thursday, May 17, 2012 20:24
 To: user@cassandra.apache.org
 Subject: Re: cassandra read latency help


 Thanks Viktor for the advice.

 Right now, I just have 1 node that I am testing against, and I am using
 CL ONE.

 Are you suggesting that the page cache might be doing better than the
 row cache?
 I am getting a row cache hit ratio of 0.66 right now.


 /G


 On Thu, May 17, 2012 at 12:26 AM, Viktor Jevdokimov 
 viktor.jevdoki...@adform.com wrote:

  Gurpreet Singh wrote:
  Any ideas on what could help here bring down the read latency even more ?
 

 Avoid Cassandra forwarding requests to other nodes:
 - Use consistency level ONE;
 - Create a data model that does a single request with a single key, since
 different keys may belong to different nodes and require forwarding
 requests to them;
 - Use a smart client to calculate the token for a key and select the
 appropriate node (primary or replica) by token range;
 - Turn off the Dynamic Snitch (it may forward a request to another
 replica even if it has the data);
 - Have all or hot data in the page cache (no HDD disk IO), or use SSD;
 - If you do regular updates to a key, do not use the row cache;
 otherwise you may try it.







RE: sstableloader 1.1 won't stream

2012-05-18 Thread Pieter Callewaert
Hi,

Sorry to say I didn't look further into this. I'm now using CentOS 6.2 for
the loader without any problems.

Kind regards,
Pieter Callewaert

-Original Message-
From: sj.climber [mailto:sj.clim...@gmail.com] 
Sent: Friday, 18 May 2012 3:56
To: cassandra-u...@incubator.apache.org
Subject: Re: sstableloader 1.1 won't stream

Pieter, Aaron,

Any further progress on this?  I'm running into the same issue, although in my 
case I'm trying to stream from Ubuntu 10.10 to a 2-node cluster (also Cassandra 
1.1.0, and running on separate Ubuntu 10.10 hosts).

Thanks in advance!





Re: Zurich / Swiss / Alps meetup

2012-05-18 Thread Benoit Perroud
+1 !



2012/5/17 Sasha Dolgy sdo...@gmail.com:
 All,

 A year ago I made a simple query to see if there were any users based in and
 around Zurich, Switzerland or the Alps region, interested in participating
 in some form of Cassandra User Group / Meetup.  At the time, 1-2 replies
 happened.  I didn't do much with that.

 Let's try this again.  Who all is interested?  I often am jealous about all
 the fun I miss out on with the regular meetups that happen stateside ...

 Regards,
 -sd

 --
 Sasha Dolgy
 sasha.do...@gmail.com



-- 
sent from my Nokia 3210


Re: cassandra read latency help

2012-05-18 Thread Radim Kolar
To get 100 random reads per second on a large dataset (100 GB) you need
more disks in RAID 0 than 2.
Better to add more nodes than to stick too many disks into a node. You
also need to adjust the I/O scheduler in the OS.
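
For example, on Linux the scheduler can be inspected and switched per
block device at runtime (the device name is illustrative; deadline or
noop is commonly suggested over cfq for this kind of random-read load):

    cat /sys/block/sdb/queue/scheduler        # shows e.g. "noop deadline [cfq]"
    echo deadline > /sys/block/sdb/queue/scheduler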


Re: Cassandra 1.0.6 multi data center read question

2012-05-18 Thread Tom Duffield (Mailing Lists)
Hey Roshan, 
Read requests accepted by the coordinator node in your PROD environment will
only be sent to your DR data center if you use a consistency level that
requires it. The easiest way to ensure you are only reading from production
is to use LOCAL_QUORUM or ONE on all reads in your PROD system. Unless you
manage your Cassandra ring closely, other consistency levels could result in
data being read from DR.
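
A minimal illustration with cassandra-cli (keyspace and column family
names are made up; if I remember the cli correctly, it sets the level
for subsequent operations like this):

    use MyKeyspace;
    consistencylevel as LOCAL_QUORUM;
    get MyColumnFamily['some-key'];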

Hope this helps!

Tom 

-- 
Tom Duffield (Mailing Lists)
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


On Friday, May 18, 2012 at 12:51 AM, Roshan wrote:

 Hi 
 
 I have setup an Cassandra cluster in production and a separate cluster in
 our DR environment. The setup is basically 2 data center setup.
 
 I want to create a separate keyspace on production (production has some
 other keyspaces) and only that keyspace will sync the data with DR.
 
 If I do a read operation on the production, will that read operation goes to
 DR as well? If so can I disable that call?
 
 My primary purpose is to keep the DR up to date, and I don't want
 production to communicate with DR otherwise.
 
 Thanks.
 
 /Roshan 
 
 
 




unable to nodetool to remote EC2

2012-05-18 Thread ramesh

I updated the cassandra-env.sh:
JMX_HOST=10.20.30.40
JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=$JMX_HOST"

netstat -ltn shows port 7199 is listening.

I tried both public and private IP for connecting but neither helps.

However, I am able to connect locally from within the server.

I get this error when I connect remotely:

Error connection to remote JMX agent!
java.rmi.ConnectException: Connection refused to host: 10.20.30.40; nested exception is:
java.net.ConnectException: Connection timed out
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:601)
at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:198)
at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:184)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:110)
at javax.management.remote.rmi.RMIServerImpl_Stub.newClient(Unknown Source)
at javax.management.remote.rmi.RMIConnector.getConnection(RMIConnector.java:2329)
at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:279)
at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:248)
at org.apache.cassandra.tools.NodeProbe.connect(NodeProbe.java:144)
at org.apache.cassandra.tools.NodeProbe.<init>(NodeProbe.java:114)
at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:623)
Caused by: java.net.ConnectException: Connection timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:529)
at java.net.Socket.connect(Socket.java:478)
at java.net.Socket.<init>(Socket.java:375)
at java.net.Socket.<init>(Socket.java:189)
at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:22)
at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:128)
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:595)
... 10 more


Any help appreciated.
Regards
Ramesh


Couldn't detect any schema definitions in local storage - after handling schema disagreement according to FAQ

2012-05-18 Thread Piavlo

 Hi,

I had a schema disagreement problem in a Cassandra 1.0.9 cluster, where
one node had a different schema version.
So I followed the FAQ at
http://wiki.apache.org/cassandra/FAQ#schema_disagreement
disabled gossip, disabled thrift, drained, and finally stopped the
cassandra process. On startup I noticed

 INFO [main] 2012-05-18 16:23:11,879 DatabaseDescriptor.java (line 467)
Couldn't detect any schema definitions in local storage.

in the log, and after
INFO [main] 2012-05-18 16:23:15,463 StorageService.java (line 619) 
Bootstrap/Replace/Move completed! Now serving reads.
it started throwing Fatal exceptions for all read/write operations 
endlessly.


I had to stop the cassandra process again (no draining was done).

On the second start it did come up OK, immediately loading the correct
cluster schema version:
 INFO [main] 2012-05-18 16:54:44,303 DatabaseDescriptor.java (line 499)
Loading schema version 9db34ef0-a0be-11e1--f9687e034cf7


But now this node appears to have started with no data from the keyspace
that had the schema disagreement.

The original keyspace sstables now appear under the snapshots dir.

# nodetool -h localhost ring
Address         DC       Rack  Status  State   Load       Owns    Token
                                                                  141784319550391026443072753096570088106
10.49.127.4     eu-west  1a    Up      Normal  8.19 GB    16.67%  0
10.241.29.65    eu-west  1b    Up      Normal  8.18 GB    16.67%  28356863910078205288614550619314017621
10.59.46.236    eu-west  1c    Up      Normal  8.22 GB    16.67%  56713727820156410577229101238628035242
10.50.33.232    eu-west  1a    Up      Normal  8.2 GB     16.67%  85070591730234615865843651857942052864
10.234.71.33    eu-west  1b    Up      Normal  8.15 GB    16.67%  113427455640312821154458202477256070485
10.58.249.118   eu-west  1c    Up      Normal  660.98 MB  16.67%  141784319550391026443072753096570088106

#

The problem node is the one with 660.98 MB of data (which is OpsCenter
keyspace data that was not invalidated).


So I have some questions:

1) What did I do wrong? Why was cassandra throwing exceptions on the
first startup?
2) Why was the keyspace data invalidated? Is that expected?
3) If the answer to #2 is "yes, it's expected", then what's the point in
doing http://wiki.apache.org/cassandra/FAQ#schema_disagreement
if all keyspace data is lost anyway? It makes more sense to just do
http://wiki.apache.org/cassandra/Operations#Replacing_a_Dead_Node
4) AFAIU I could also stop cassandra again, move the old sstables from
the snapshot back to the keyspace data dir, and run repair for all the
keyspace CFs? That way it finishes faster and causes less load than
running a repair with no previous keyspace data at all?
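
(For reference, a minimal sketch of what step 4 would look like; the
path, keyspace name and snapshot tag are illustrative:)

    # with cassandra stopped on the affected node
    cd /var/lib/cassandra/data/MyKeyspace
    mv snapshots/<snapshot-tag>/* .
    # after starting cassandra again
    nodetool -h localhost repair MyKeyspace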


The first startup log is below:

 INFO [main] 2012-05-18 16:23:07,367 AbstractCassandraDaemon.java (line 
105) Logging initialized
 INFO [main] 2012-05-18 16:23:07,382 AbstractCassandraDaemon.java (line 
126) JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_24
 INFO [main] 2012-05-18 16:23:07,383 AbstractCassandraDaemon.java (line 
127) Heap size: 2600468480/2600468480
 INFO [main] 2012-05-18 16:23:07,383 AbstractCassandraDaemon.java (line 
128) Classpath: 
/etc/cassandra/conf:/usr/share/java/jna.jar:/usr/share/java/mx4j-tools.jar:/usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/apache-cassandra-1.0.9.jar:/usr/share/cassandra/lib/apache-cassandra-clientutil-1.0.9.jar:/usr/share/cassandra/lib/apache-cassandra-thrift-1.0.9.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.4.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.2.jar:/usr/share/cassandra/lib/guava-r08.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jline-0.9.94.jar:/usr/share/cassandra/lib/joda-time-1.6.2.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.6.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.6.1.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.4.1.jar:/usr/share/cassandra//lib/jamm-0.2.5.jar
 INFO [main] 2012-05-18 16:23:10,661 CLibrary.java (line 109) JNA 
mlockall successful
 INFO [main] 2012-05-18 16:23:10,692 DatabaseDescriptor.java (line 114) 
Loading settings from file:/etc/cassandra/ssa/cassandra.yaml
 INFO [main] 2012-05-18 16:23:10,868 DatabaseDescriptor.java (line 168) 
DiskAccessMode 'auto' determined to 

Re: unable to nodetool to remote EC2

2012-05-18 Thread Tyler Hobbs
Your firewall rules need to allow TCP traffic on any port >= 1024 for JMX
to work.  It initially connects on port 7199, but then the client is asked
to reconnect on a randomly chosen port.

You can open the firewall, SSH to the node first, or set up something like
this: http://simplygenius.com/2010/08/jconsole-via-socks-ssh-tunnel.html
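
Along the lines of that post, something like this (user, host and SOCKS
port are illustrative):

    # open a SOCKS tunnel to the node
    ssh -f -N -D 8123 user@ec2-host
    # run jconsole through the tunnel
    jconsole -J-DsocksProxyHost=localhost -J-DsocksProxyPort=8123 \
        service:jmx:rmi:///jndi/rmi://ec2-host:7199/jmxrmi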

On Fri, May 18, 2012 at 1:31 PM, ramesh dbgroup...@gmail.com wrote:

 [original message and stack trace snipped]




-- 
Tyler Hobbs
DataStax http://datastax.com/


Re: unable to nodetool to remote EC2

2012-05-18 Thread ramesh

On 05/18/2012 01:35 PM, Tyler Hobbs wrote:
 [quoted message snipped]



It helped.
Thanks Tyler for the info and the link to the post.

Regards
Ramesh


Re: Migrating a column family from one cluster to another

2012-05-18 Thread Rob Coli
On Thu, May 17, 2012 at 9:37 AM, Bryan Fernandez bfernande...@gmail.com wrote:
 What would be the recommended
 approach to migrating a few column families from a six node cluster to a
 three node cluster?

The easiest way (if you are not using counters) is:

1) make sure all filenames of sstables are unique [1]
2) copy all sstable files from the 6 nodes to all 3 nodes
3) run a cleanup compaction on the 3 nodes
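
A rough sketch of steps 2-3 (paths and host names are illustrative; run
the copy with Cassandra stopped on the target):

    # from each of the 6 source nodes
    rsync -av /var/lib/cassandra/data/MyKeyspace/ target1:/var/lib/cassandra/data/MyKeyspace/
    # then on each of the 3 target nodes, once cassandra is back up
    nodetool -h localhost cleanup MyKeyspace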

=Rob
[1] https://issues.apache.org/jira/browse/CASSANDRA-1983

-- 
=Robert Coli
AIMGTALK - rc...@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb


Re: Migrating a column family from one cluster to another

2012-05-18 Thread Poziombka, Wade L
How do counters affect this?  Why would it be different?

Sent from my iPhone

On May 18, 2012, at 15:40, Rob Coli rc...@palominodb.com wrote:

 On Thu, May 17, 2012 at 9:37 AM, Bryan Fernandez bfernande...@gmail.com 
 wrote:
 What would be the recommended
 approach to migrating a few column families from a six node cluster to a
 three node cluster?
 
 The easiest way (if you are not using counters) is :
 
 1) make sure all filenames of sstables are unique [1]
 2) copy all sstablefiles from the 6 nodes to all 3 nodes
 3) run a cleanup compaction on the 3 nodes
 
 =Rob
 [1] https://issues.apache.org/jira/browse/CASSANDRA-1983
 


Re: Migrating a column family from one cluster to another

2012-05-18 Thread Rob Coli
On Fri, May 18, 2012 at 1:41 PM, Poziombka, Wade L
wade.l.poziom...@intel.com wrote:
 How do counters affect this?  Why would it be different?

Oh, actually this is an obsolete caution as of Cassandra 0.8beta1 :

https://issues.apache.org/jira/browse/CASSANDRA-1938

Sorry! :)

=Rob
PS - for historical reference: before this ticket the counts were
based on the IP address of the nodes, and things would be hosed if you
did the copy-all-the-sstables operation. It is easy for me to forget
that almost no one was using cassandra counters before 0.8, heh.

-- 
=Robert Coli
AIMGTALK - rc...@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb


Re: cassandra read latency help

2012-05-18 Thread Gurpreet Singh
Thanks Radim.
Radim, actually 100 reads per second is achievable even with 2 disks,
but achieving them with a really low avg latency per key is the issue.

I am wondering if anyone has played with index_interval, and how much of
a difference reducing index_interval would make to reads. I am thinking
of devoting a 32 GB RAM machine to this node, and decreasing
index_interval from 128 to a value of 8.

For 500 million keys, this would mean 500/8 ~ 64 million keys in memory.

index overhead = 64 million * (32 + avg key size) (
http://www.datastax.com/docs/1.0/cluster_architecture/cluster_planning)
my avg key size = 8, hence
overhead = 64 million * 40 bytes = 2.56 GB (is this number the same as
the size in memory?).
If yes, then it's not too bad, and it eliminates the index disk read for
a large majority of the keys.
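
For reference, the knob lives in cassandra.yaml (a per-node setting, so
it takes a restart of each node to pick up):

    # cassandra.yaml -- sample one row key out of every N from the primary index
    index_interval: 8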

Also, my data has uniformly 2 columns. Will sstable compression help my
reads in any way?
Thanks
Gurpreet





On Fri, May 18, 2012 at 6:19 AM, Radim Kolar h...@filez.com wrote:

 To get 100 random reads per second on a large dataset (100 GB) you need
 more disks in RAID 0 than 2.
 Better to add more nodes than to stick too many disks into a node. You
 also need to adjust the I/O scheduler in the OS.



Cassandra 1.1.0 NullCompressor and DecoratedKey errors

2012-05-18 Thread Ron Siemens

We have some production Solaris boxes so I can't use SnappyCompressor (no
library included for Solaris), so I set it to JavaDeflate.  I've also noticed
higher load with 1.1.0 versus 1.0.6: could this be JavaDeflate, or is that
what the old default was?  Anyway, I thought I would try no compression,
since I found code like this in one of the issue discussions about
SnappyCompression.

// NullCompressor.java -- pass-through "compressor".
// The package is my assumption: ICompressor and the bundled compressors
// live in org.apache.cassandra.io.compress, and the server complained
// when the class was elsewhere (see the follow-up below).
package org.apache.cassandra.io.compress;

import java.io.IOException;
import java.util.Map;

public class NullCompressor implements ICompressor
{
    public static final NullCompressor instance = new NullCompressor();

    // Factory method Cassandra invokes reflectively with the CF's
    // compression options.
    public static NullCompressor create( Map<String, String> compressionOptions )
    {
        return instance;
    }

    public int initialCompressedBufferLength( int chunkLength )
    {
        return chunkLength;
    }

    // "Compress" by copying the input straight through to the output buffer.
    public int compress( byte[] input, int inputOffset, int inputLength,
                         ICompressor.WrappedArray output, int outputOffset ) throws IOException
    {
        System.arraycopy( input, inputOffset, output.buffer, outputOffset, inputLength );
        return inputLength;
    }

    // Uncompress is the same pass-through copy.
    public int uncompress( byte[] input, int inputOffset, int inputLength,
                           byte[] output, int outputOffset ) throws IOException
    {
        System.arraycopy( input, inputOffset, output, outputOffset, inputLength );
        return inputLength;
    }
}
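
If I have the cli syntax right, the CF would then reference it by its
fully qualified name (the column family name is made up):

    update column family MyCF
        with compression_options = {sstable_compression: 'org.apache.cassandra.io.compress.NullCompressor'};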

But now I get some curious errors in the Cassandra log that I hadn't seen
previously:

ERROR [ReadStage:294] 2012-05-18 15:33:40,039 AbstractCassandraDaemon.java 
(line 134) Exception in thread Thread[ReadStage:294,5,main]
java.lang.AssertionError: DecoratedKey(105946799083363489728328364782061531811, 
57161d05b50004b3130008007e04c057161d05b60004ae380008007f04c057161d05b700048d610008008004c057161d05b80004c1040008008104c057161d05b900048ac10008008204c057161d05ba0004ae8b0008008304c057161d05bb000474950008008404c057161d05bc0004bb240008008504c057161d05bd0004ba320008008604c057161d05be0004be9a0008008704c057161d05bf0004b9fa0008008804c057161d05c48e7f0008008904c057161d05c10004ba590008008a04c057161d05c20004b64d0008008b04c057161d05c30004bae30008008c04c057161d05c40004bee50008008d04c057161d05c5000487590008008e04c057161d05c60004bad8008f04c057161d05c70004badb0008009004c057161d05c80004bf140008009104c057161d05c90004b7ec0008009204c057161d05ca0004bace0008009304c057161d05cb0004ba170008009404c057161d05cc000484a10008009504c057161d05cd000495670008009604c057161d05ce0004ab98009704c057161d05cf0004b6110008009804c057161d05d4af550008009904c057161d05d10004abfc0008009a04c057161d05d20004bf350008009b04c057161d05d30004bacd0008009c04c057161d05d40004bd0a0008009d04c057161d05d50004bac10008009e04c057161d05d60004af530008009f04c057161d05d70004b97a000800a004c057161d05d80004af13000800a104c057161d05d90004a25600085452535f32373138004300130001020008004fb6cd4b0004c05716072b78000100080001000104c05716072b790004b87900074348535f3435360309001800030002be188961212f0cd18f5ddb69e0a336ed4fb6cd4f0004c057164b73f0001b00080001000204c057164b73f1000479ba00080002000104c057164b73f20004b4380003000104c057164b73f30004b462)
 != DecoratedKey(53124083656910387079795798648228312597, 5448535f323739) in 
/home/apollo/cassandra/data/ggstores3/ndx_items_category/ggstores3-ndx_items_category-hc-1-Data.db
at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.init(SSTableSliceIterator.java:58)
at 
org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:66)
at 
org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:78)
at 
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:233)
at 
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:61)
at 
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1273)
at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1155)
at 

Re: Cassandra 1.1.0 NullCompressor and DecoratedKey errors

2012-05-18 Thread Ron Siemens

I decided to wipe cassandra clean and try again.  Haven't seen the error
again yet, but I will report if I do.  It may have been a symptom of having
some previous data around, as my steps were:

1. shutdown and wipe data
2. run with NullCompressor
3. notice Cassandra complain that the compressor is not in package
org.apache.cassandra.io
4. shutdown
5. move the compressor to the expected package
6. run with NullCompressor

I can't remember if I did another wipe after 4, so there may have been some
data in a bad state.  It seems the client side didn't care what package the
compressor was in, but the server side did.

Unless I see the error again, I'm guessing there was some data left over
between trials.

Ron


On May 18, 2012, at 3:38 PM, Ron Siemens wrote:

 [original message quoted above, snipped]

Re: unable to nodetool to remote EC2

2012-05-18 Thread ramesh

On 05/18/2012 01:35 PM, Tyler Hobbs wrote:
 [quoted message snipped]

Got JConsole to work this way, but I'm unable to get a similar script
for nodetool to work. Is there any guide or pointers on performing
nodetool operations remotely, with or without authentication?

Also, is DataStax OpsCenter a replacement for nodetool?

regards,
Ramesh