RE: Exception when setting tokens for the cassandra nodes

2013-04-30 Thread Rahul
Oh, my bad. Thanks mate, that worked.
On Apr 29, 2013 10:03 PM, moshe.kr...@barclays.com wrote:

 For starters: If you are using the Murmur3 partitioner, which is the
 default in cassandra.yaml, then you need to calculate the tokens using:

 python -c 'print [str(((2**64 / 2) * i) - 2**63) for i in range(2)]'

 which gives the following values:

 ['-9223372036854775808', '0']
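
 The same recipe generalizes to any node count; as a rough sketch (N=6 is just
 an example value, adjust it to the number of nodes in your cluster):

 python -c 'N = 6; print [str(((2**64 / N) * i) - 2**63) for i in range(N)]'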


 From: Rahul [mailto:rahule...@gmail.com]
 Sent: Monday, April 29, 2013 7:23 PM
 To: user@cassandra.apache.org
 Subject: Exception when setting tokens for the cassandra nodes

 Hi,

 I am testing out Cassandra 1.2 on two of my local servers, but I am facing
 problems with assigning tokens to my nodes. When I use nodetool to set a
 token, I end up getting a Java exception.

 My test setup is as follows:

 Node1: local ip 1 (seed)

 Node2: local ip 2 (seed)


 Since I have two nodes, I calculated the tokens as 0 and 2^127/2
 = 85070591730234615865843651857942052864. I was able to set token 0 for my
 first node using nodetool move 0, but when I try to
 set 85070591730234615865843651857942052864 for my second node,
 it throws an UndeclaredThrowableException in the main thread. The full stack
 trace is attached below.


 user@server~$ nodetool move 85070591730234615865843651857942052864


 Exception in thread main java.lang.reflect.UndeclaredThrowableException
 at $Proxy0.getTokenToEndpointMap(Unknown Source)
 at org.apache.cassandra.tools.NodeProbe.getTokenToEndpointMap(NodeProbe.java:288)
 at org.apache.cassandra.tools.NodeCmd.printRing(NodeCmd.java:215)
 at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:1051)
 Caused by: javax.management.InstanceNotFoundException: org.apache.cassandra.db:type=StorageService
 at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
 at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:643)
 at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:668)
 at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1463)
 at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:96)
 at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1327)
 at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1419)
 at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:656)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:601)
 at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
 at sun.rmi.transport.Transport$1.run(Transport.java:177)
 at sun.rmi.transport.Transport$1.run(Transport.java:174)
 at java.security.AccessController.doPrivileged(Native Method)
 at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
 at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:553)
 at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:808)
 at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:667)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:722)
 at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:273)
 at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:251)
 at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:160)
 at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
 at javax.management.remote.rmi.RMIConnectionImpl_Stub.getAttribute(Unknown Source)
 at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.getAttribute(RMIConnector.java:901)
 at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:280)

 Any suggestions towards solving this problem would be deeply appreciated.

 thanks,

 rahul


nodetool status OWNS and multiple DCs

2013-04-30 Thread Sergey Naumov
Hello.

I have set up a test cluster of 2 DCs with 1 node in each DC. In each config
I specified 256 virtual nodes and chose the GossipingPropertyFileSnitch.

For node1:
~/Cassandra$ cat /etc/cassandra/cassandra-rackdc.properties
dc=DC1
rack=RAC1

For node2:
~/Cassandra$ cat /etc/cassandra/cassandra-rackdc.properties
dc=DC2
rack=RAC1

When I call nodetool status, it does not show 100% ownership of tokens for
each DC:
:~/Cassandra$ nodetool status
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load      Tokens  Owns   Host ID                               Rack
UN  172.16.0.1  97.43 KB  256     51,5%  7ffd2432-46c9-443c-aa0c-8bfd960d2acc  RAC1
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load      Tokens  Owns   Host ID                               Rack
UN  172.16.0.2  91.78 KB  256     48,5%  93d3bdbf-8625-4e54-8c7c-087fbfe419f5  RAC1

I thought that each datacenter has 100% coverage of the token range. What does
the value in the Owns field mean, and how does it affect replication (for
example, with replication factors DC1:1, DC2:2)?

Thanks in advance,
Sergey Naumov.


Exporting all data within a keyspace

2013-04-30 Thread Chidambaran Subramanian
Is there any easy way of exporting all data for a keyspace (and, conversely,
importing it)?

Regards
Chiddu


Re: Exporting all data within a keyspace

2013-04-30 Thread Kumar Ranjan
Try sstable2json and json2sstable. They work per column family, so you can
fetch all the column families, iterate over the list of CFs, and use the
sstable2json tool to extract the data. Remember this will only fetch on-disk
data, so anything in the memtable/cache that has yet to be flushed will be
missed. So run compaction and then run the script.
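
A rough sketch of that loop (the keyspace name MyKeyspace and the data path are
placeholders, and the glob depends on the Cassandra version: 1.1+ keeps one
subdirectory per CF, older versions keep the SSTables directly under the
keyspace directory):

for f in /var/lib/cassandra/data/MyKeyspace/*/*-Data.db; do
  # one JSON dump per SSTable, named after the SSTable file
  sstable2json "$f" > "$(basename "$f" .db).json"
done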

On Tuesday, April 30, 2013, Chidambaran Subramanian wrote:

 Is there any easy way of exporting all data for a keyspace (and
 conversely) importing it.

 Regards
 Chiddu



Re: Exporting all data within a keyspace

2013-04-30 Thread Brian O'Neill

You could always do something like this as well:
http://brianoneill.blogspot.com/2012/05/dumping-data-from-cassandra-like.html

-brian

---
Brian O'Neill
Lead Architect, Software Development
Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 •
healthmarketscience.com


This information transmitted in this email message is for the intended
recipient only and may contain confidential and/or privileged material. If
you received this email in error and are not the intended recipient, or the
person responsible to deliver it to the intended recipient, please contact
the sender at the email above and delete this email and any attachments and
destroy any copies thereof. Any review, retransmission, dissemination,
copying or other use of, or taking any action in reliance upon, this
information by persons or entities other than the intended recipient is
strictly prohibited.
 


From:  Kumar Ranjan winnerd...@gmail.com
Reply-To:  user@cassandra.apache.org
Date:  Tuesday, April 30, 2013 9:11 AM
To:  user@cassandra.apache.org user@cassandra.apache.org
Subject:  Re: Exporting all data within a keyspace

Try sstable2json and json2sstable. They work per column family, so you can
fetch all the column families, iterate over the list of CFs, and use the
sstable2json tool to extract the data. Remember this will only fetch on-disk
data, so anything in the memtable/cache that has yet to be flushed will be
missed. So run compaction and then run the script.

On Tuesday, April 30, 2013, Chidambaran Subramanian  wrote:
 Is there any easy way of exporting all data for a keyspace (and conversely)
 importing it.
 
 Regards
 Chiddu




Re: normal thread counts?

2013-04-30 Thread William Oberman
I use phpcassa.

I did a thread dump.  99% of the threads look very similar (I'm using 1.1.9
in terms of matching source lines).  The thread names are all like this:
WRITE-/10.x.y.z.  There are a LOT of duplicates (in terms of the same
IP).  Many many many of the threads are trying to talk to IPs that aren't
in the cluster (I assume they are the IP's of dead hosts).  The stack trace
is basically the same for them all, attached at the bottom.

There are a lot of things I could talk about in terms of my situation, but
what I think might be pertinent to this thread: I hit a tipping point
recently and upgraded a 9 node cluster from AWS m1.large to m1.xlarge
(rolling, one at a time).  7 of the 9 upgraded fine and work great.  2 of
the 9 keep struggling.  I've replaced them many times now, each time using
this process:
http://www.datastax.com/docs/1.1/cluster_management#replacing-a-dead-node
And even this morning the only two nodes with a high number of threads are
those two (yet again).  And at some point they'll OOM.

Seems like there is something about my cluster (caused by the recent
upgrade?) that causes a thread leak on OutboundTcpConnection. But I don't
know how to escape from the trap.  Any ideas?
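
A quick way to quantify this from a thread dump (a sketch; <cassandra_pid> is a
placeholder, and WRITE-/<ip> is the thread-name pattern mentioned above):

jstack <cassandra_pid> | grep -o 'WRITE-/[0-9.]*' | sort | uniq -c | sort -rn | head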



  stackTrace = [ {
className = sun.misc.Unsafe;
fileName = Unsafe.java;
lineNumber = -2;
methodName = park;
nativeMethod = true;
   }, {
className = java.util.concurrent.locks.LockSupport;
fileName = LockSupport.java;
lineNumber = 158;
methodName = park;
nativeMethod = false;
   }, {
className =
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject;
fileName = AbstractQueuedSynchronizer.java;
lineNumber = 1987;
methodName = await;
nativeMethod = false;
   }, {
className = java.util.concurrent.LinkedBlockingQueue;
fileName = LinkedBlockingQueue.java;
lineNumber = 399;
methodName = take;
nativeMethod = false;
   }, {
className = org.apache.cassandra.net.OutboundTcpConnection;
fileName = OutboundTcpConnection.java;
lineNumber = 104;
methodName = run;
nativeMethod = false;
   } ];
--




On Mon, Apr 29, 2013 at 4:31 PM, aaron morton aa...@thelastpickle.com wrote:

  I used JMX to check current number of threads in a production cassandra
 machine, and it was ~27,000.

 That does not sound too good.

 My first guess would be lots of client connections. What client are you
 using, does it do connection pooling ?
 See the comments in cassandra.yaml around rpc_server_type; the default,
 sync, uses one thread per connection, you may be better with HSHA. But
 if your app is leaking connections you should probably deal with that first.

 Cheers

-
 Aaron Morton
 Freelance Cassandra Consultant
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 30/04/2013, at 3:07 AM, William Oberman ober...@civicscience.com
 wrote:

 Hi,

 I'm having some issues.  I keep getting:
 
 ERROR [GossipStage:1] 2013-04-28 07:48:48,876 AbstractCassandraDaemon.java
 (line 135) Exception in thread Thread[GossipStage:1,5,main]
 java.lang.OutOfMemoryError: unable to create new native thread
 --
 after a day or two of runtime.  I've checked and my system settings seem
 acceptable:
 memlock=unlimited
 nofiles=10
 nproc=122944

 I've messed with heap sizes from 6-12GB (15 physical, m1.xlarge in AWS),
 and I keep OOM'ing with the above error.

 I've found some (what seem to me) to be obscure references to the stack
 size interacting with # of threads.  If I'm understanding it correctly, to
 reason about Java mem usage I have to think of OS + Heap as being locked
 down, and the stack gets the leftovers of physical memory and each thread
 gets a stack.

 For me, the system ulimit setting on stack is 10240k (no idea if java sees
 or respects this setting).  My -Xss for cassandra is the default (I hope,
 don't remember messing with it) of 180k.  I used JMX to check current
 number of threads in a production cassandra machine, and it was ~27,000.
  Is that a normal thread count?  Could my OOM be related to stack + number
 of threads, or am I overlooking something more simple?
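
 (A rough back-of-the-envelope check on that: 27,000 threads at roughly 180 KB of
 stack each is about 4.6 GB of native memory reserved for stacks alone, outside
 the heap:

 python -c 'print 27000 * 180 / 1024.0 / 1024.0'   # KB -> GB, prints ~4.63

 which fits the unable to create new native thread failures on a 15 GB box with
 a large heap.)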

 will





SSTables not opened on new cluster

2013-04-30 Thread Philippe
Hello,

I'm trying to bring up a copy of an existing 3-node cluster running 1.0.8
into a 3-node cluster running 1.0.11.
The new cluster has been configured to have the same tokens and the same
partitioner.

Initially, I copied the files in the data directory of each node into their
corresponding node on the new cluster. When starting the new cluster, it
didn't pick up the KS & CF.
So I removed the directories, created the schema, stopped cassandra and
copied the data back. None of the SSTables are opened at startup.

When I nodetool refresh them, it says No new SSTables were found for
XXX/YYY, where XXX is my KS and YYY is my CF.

I'm totally stumped: any ideas as to why this is happening ? I've checked
that there are actual files there and that the permissions are correct.

Thanks.


What does a healthy node look like?

2013-04-30 Thread Steppacher Ralf
Hi,

I am having trouble finding quantitative information on what a healthy
Cassandra node should look like (CPU usage, number of flushes, SSTables,
compactions, GC), given a certain hardware spec and read/write load. I have
trouble gauging whether our first and only Cassandra node needs tuning or
is simply overloaded.
If anyone could point me to some data, that would be very helpful.

(So far I have run the node with the default settings in cassandra.yaml and 
cassandra-env. The log claims that the server is occasionally under memory 
pressure and I get frequent timeouts for writes.  I see what I think are many 
flushes, compactions and GCs in the log. Some toying with heap and new gen 
sizes, key cache, and the compaction throughput settings did not improve the 
overall situation much.)


Thanks!
Ralf


Re: error casandra ring an hadoop connection ¿?

2013-04-30 Thread aaron morton
 java.lang.RuntimeException: UnavailableException()

Looks like the pig script could talk to one node, but the coordinator could not
process the request at the consistency level requested. Check that all the nodes
are up, that the RF is set to the correct value, and the CL you are using.
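
A quick way to check those (a sketch; <node> and <your_keyspace> are placeholders):

nodetool -h <node> ring       # every node should show Up / Normal
cassandra-cli -h <node>       # then run: describe <your_keyspace>;  to see the RF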

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 30/04/2013, at 4:55 AM, Miguel Angel Martin junquera 
mianmarjun.mailingl...@gmail.com wrote:

 
 
 hi all:
 
 I can run Pig with Cassandra and Hadoop in EC2.
 
 I am trying to run Pig against the Cassandra ring and Hadoop.
 The Cassandra ring has the tasktrackers and datanodes, too.
 
 And I am running Pig from another machine where I have installed the
 namenode/jobtracker.
 I have a simple script to load data from the pygmalion keyspace and column
 family account and dump the result to test.
 I installed another simple local Cassandra on the namenode/jobtracker machine
 and I can run Pig jobs OK, but when I try to run the script against the
 Cassandra ring config, changing the environment variable PIG_INITIAL_ADDRESS to
 the IP of one of the nodes of the Cassandra ring, I get this error:
 
 
 ---
 
 
 java.lang.RuntimeException: UnavailableException()
   at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:384)
   at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:390)
   at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:313)
   at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
   at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
   at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:184)
   at 
 org.apache.cassandra.hadoop.pig.CassandraStorage.getNext(CassandraStorage.java:226)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
   at 
 org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
   at 
 org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: UnavailableException()
   at 
 org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12924)
   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
   at 
 org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:734)
   at 
 org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:718)
   at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:346)
   ... 17 more
 
 
 
 Can anybody help me, or does anyone have any idea?
 Thanks in advance.
 PS:
 1.- The ports are open in EC2.
 2.- The keyspace and CF are created in the Cassandra cluster in EC2 too, and
 likewise in the namenode Cassandra installation.
 3.- I have this bash_profile configuration:
 # .bash_profile
 
 # Get the aliases and functions
 if [ -f ~/.bashrc ]; then
 . ~/.bashrc
 fi
 
 # User specific environment and startup programs
 
 PATH=$PATH:$HOME/.local/bin:$HOME/bin
 export PATH=$PATH:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin
 export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk.x86_64
 export CASSANDRA_HOME=/home/ec2-user/apache-cassandra-1.2.4
 export PIG_HOME=/home/ec2-user/pig-0.11.1-src
 export PIG_INITIAL_ADDRESS=10.210.164.233
 #export PIG_INITIAL_ADDRESS=127.0.0.1
 export PIG_RPC_PORT=9160
 export PIG_CONF_DIR=/home/ec2-user/hadoop-1.1.1/conf
 export PIG_PARTITIONER=org.apache.cassandra.dht.Murmur3Partitioner
 #export PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner
 
 
 4.- I export all the Cassandra jars in hadoop-env.sh for all the Hadoop nodes.
 5.- I have the same error running Pig in local mode.
 
 6.- If I change to the RandomPartitioner
 and reload the changes, I have this error:
 
 java.lang.RuntimeException: InvalidRequestException(why:Start token sorts 
 after end token)
   at 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:384)
   at 
 

Re: Compaction, Slow Ring, and bad behavior

2013-04-30 Thread aaron morton
Check the logs for warnings from the GCInspector. 
If you see messages that correlate with compaction running, limit compaction to
help stabilise things (a cassandra.yaml sketch follows this list):

* set concurrent_compactors to 2
* if you have wide rows reduce in_memory_compaction_limit
* reduce compaction_throughput
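
For reference, a sketch of how those knobs look in cassandra.yaml (the values
here are illustrative only, not recommendations):

concurrent_compactors: 2
in_memory_compaction_limit_in_mb: 32
compaction_throughput_mb_per_sec: 8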

If you have a lot (more than 200 million) of rows, check the size of the bloom
filters using nodetool cfstats. If it's around 1GB, consider increasing the
bloom_filter_fp_chance per CF to 0.01 or 0.1.
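
Something along these lines will pull out just the bloom filter sizes per CF
(a sketch):

nodetool cfstats | grep -E 'Column Family:|Bloom Filter Space Used'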

 I've tried changing the amount of RAM between 8G and 12G,
More JVM memory is not always the answer; try to get back to stable on the
defaults or something close to them and then tune from there.

  sometimes gets stuck on a compaction with near-idle disk throughput
Wide rows can slow down compaction, check the row size with nodetool cfstats or 
nodetool cfhistograms

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 30/04/2013, at 5:33 AM, Drew from Zhrodague drewzhroda...@zhrodague.net 
wrote:

   Hi, we have a 9-node ring on m1.xlarge AWS hosts. We started having 
 some trouble a while ago, and it's making me pull out all of my hair.
 
   The host in position #3 has been replaced 4 times. Each time, the host 
 joins the ring, I do a nodetool repair -pr, and she seems fine for about a 
 day. Then she gets real slow, sometimes OOMs, sometimes takes down the host 
 in position #5, sometimes gets stuck on a compaction with near-idle disk 
 throughput, and eventually dies without any kind of error message or reason 
 for failing.
 
   Sometimes our cluster gets so slow that it is almost unusable - we get 
 timeout errors from our application, AWS sends us voluminous alerts about 
 latency.
 
   I've tried changing the amount of RAM between 8G and 12G, changing the 
 MAX_HEAP_SIZE and HEAP_NEWSIZE, repeatedly forcing a stop compaction, setting 
 astronomical ulimit values, and praying to available gods. I'm a bit 
 confused. We're not using super-wide rows, most things are default.
 
   EL5, Cassandra 1.1.9, Java 1.6.0
 
 
 -- 
 
 Drew from Zhrodague
 lolcat divinator
 d...@zhrodague.net



Re: Really odd issue (AWS related?)

2013-04-30 Thread Ben Chobot
We've also had issues with ephemeral drives in a single AZ in us-east-1, so 
much so that we no longer use that AZ. Though our issues tended to be obvious 
from instance boot - they wouldn't suddenly degrade.

On Apr 28, 2013, at 2:27 PM, Alex Major wrote:

 Hi Mike,
 
 We had issues with the ephemeral drives when we first got started, although 
 we never got to the bottom of it so I can't help much with troubleshooting 
 unfortunately. Contrary to a lot of the comments on the mailing list we've 
 actually had a lot more success with EBS drives (PIOPs!). I'd definitely
 suggest trying to stripe 4 EBS drives (RAID 0) and using PIOPs.
 
 You could be having a noisy neighbour problem, I don't believe that m1.large 
 or m1.xlarge instances get all of the actual hardware, virtualisation on EC2 
 still sucks in isolating resources.
 
 We've also had more success with Ubuntu on EC2, not so much with our 
 Cassandra nodes but some of our other services didn't run as well on Amazon 
 Linux AMIs.
 
 Alex
 
 
 
 On Sun, Apr 28, 2013 at 7:12 PM, Michael Theroux mthero...@yahoo.com wrote:
 I forgot to mention,
 
 When things go really bad, I'm seeing I/O waits in the 80-95% range.  I 
 restarted cassandra once when a node is in this situation, and it took 45 
 minutes to start (primarily reading SSTables).  Typically, a node would start 
 in about 5 minutes.
 
 Thanks,
 -Mike
  
 On Apr 28, 2013, at 12:37 PM, Michael Theroux wrote:
 
 Hello,
 
 We've done some additional monitoring, and I think we have more information.
 We've been collecting vmstat information every minute, attempting to catch
 a node with issues.
 
 So, it appears, that the cassandra node runs fine.  Then suddenly, without 
 any correlation to any event that I can identify, the I/O wait time goes way 
 up, and stays up indefinitely.  Even non-cassandra  I/O activities (such as 
 snapshots and backups) start causing large I/O Wait times when they 
 typically would not.  Previous to an issue, we would typically see I/O wait 
 times 3-4% with very few blocked processes on I/O.  Once this issue 
 manifests itself, i/O wait times for the same activities jump to 30-40% with 
 many blocked processes.  The I/O wait times do go back down when there is 
 literally no activity.   
 
 -  Updating the node to the latest Amazon Linux patches and rebooting the 
 instance doesn't correct the issue.
 -  Backing up the node, and replacing the instance does correct the issue.  
 I/O wait times return to normal.
 
 One relatively recent change we've made is we upgraded to m1.xlarge 
 instances which has 4 ephemeral drives available.  We create a logical 
 volume from the 4 drives with the idea that we should be able to get 
 increased I/O throughput.  When we ran m1.large instances, we had the same 
 setup, although it was only using 2 ephemeral drives.  We chose to use LVM, 
 vs. madm because we were having issues having madm create the raid volume 
 reliably on restart (and research showed that this was a common problem).  
 LVM just worked (and had worked for months before this upgrade).
 
 For reference, this is the script we used to create the logical volume:
 
 vgcreate mnt_vg /dev/sdb /dev/sdc /dev/sdd /dev/sde
 lvcreate -L 1600G -n mnt_lv -i 4 mnt_vg -I 256K
 blockdev --setra 65536 /dev/mnt_vg/mnt_lv
 sleep 2
 mkfs.xfs /dev/mnt_vg/mnt_lv
 sleep 3
 mkdir -p /data && mount -t xfs -o noatime /dev/mnt_vg/mnt_lv /data
 sleep 3
 
 Another tidbit... thus far (and this may be only a coincidence), we've only
 had to replace DB nodes within a single availability zone within us-east.
 Other availability zones, in the same region, have yet to show an issue.
 
 It looks like I'm going to need to replace a third DB node today.  Any 
 advice would be appreciated.
 
 Thanks,
 -Mike
 
 
 On Apr 26, 2013, at 10:14 AM, Michael Theroux wrote:
 
 Thanks.
 
 We weren't monitoring this value when the issue occurred, and this 
 particular issue has not appeared for a couple of days (knock on wood).  
 Will keep an eye out though,
 
 -Mike
 
 On Apr 26, 2013, at 5:32 AM, Jason Wee wrote:
 
 top command? st : time stolen from this vm by the hypervisor
 
 jason
 
 
 On Fri, Apr 26, 2013 at 9:54 AM, Michael Theroux mthero...@yahoo.com 
 wrote:
 Sorry, Not sure what CPU steal is :)
 
 I have AWS console with detailed monitoring enabled... things seem to 
 track close to the minute, so I can see the CPU load go to 0... then jump 
 at about the minute Cassandra reports the dropped messages,
 
 -Mike
 
 On Apr 25, 2013, at 9:50 PM, aaron morton wrote:
 
 The messages appear right after the node wakes up.
 Are you tracking CPU steal ? 
 
 -
 Aaron Morton
 Freelance Cassandra Consultant
 New Zealand
 
 @aaronmorton
 http://www.thelastpickle.com
 
 On 25/04/2013, at 4:15 AM, Robert Coli rc...@eventbrite.com wrote:
 
 On Wed, Apr 24, 2013 at 5:03 AM, Michael Theroux mthero...@yahoo.com 
 wrote:
 Another related question.  Once we see messages being dropped on one 
 node, our 

Re: Exporting all data within a keyspace

2013-04-30 Thread Chidambaran Subramanian
Thanks guys,both are good pointers

Regards
Chiddu

On Tue, Apr 30, 2013 at 7:09 PM, Brian O'Neill b...@alumni.brown.eduwrote:


 You could always do something like this as well:

 http://brianoneill.blogspot.com/2012/05/dumping-data-from-cassandra-like.html

 -brian

 ---

 Brian O'Neill

 Lead Architect, Software Development

 *Health Market Science*

 *The Science of Better Results*

 2700 Horizon Drive • King of Prussia, PA • 19406

 M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42  •

 healthmarketscience.com


 This information transmitted in this email message is for the intended
 recipient only and may contain confidential and/or privileged material. If
 you received this email in error and are not the intended recipient, or the
 person responsible to deliver it to the intended recipient, please contact
 the sender at the email above and delete this email and any attachments and
 destroy any copies thereof. Any review, retransmission, dissemination,
 copying or other use of, or taking any action in reliance upon, this
 information by persons or entities other than the intended recipient is
 strictly prohibited.



 From: Kumar Ranjan winnerd...@gmail.com
 Reply-To: user@cassandra.apache.org
 Date: Tuesday, April 30, 2013 9:11 AM
 To: user@cassandra.apache.org user@cassandra.apache.org
 Subject: Re: Exporting all data within a keyspace

 Try sstable2json and json2sstable. They work per column family, so you
 can fetch all the column families, iterate over the list of CFs, and use
 the sstable2json tool to extract the data. Remember this will only fetch
 on-disk data, so anything in the memtable/cache that has yet to be flushed
 will be missed. So run compaction and then run the script.

 On Tuesday, April 30, 2013, Chidambaran Subramanian wrote:

 Is there any easy way of exporting all data for a keyspace (and
 conversely) importing it.

 Regards
 Chiddu




CfP 2013 Workshop on Middleware for HPC and Big Data Systems (MHPC'13)

2013-04-30 Thread MHPC 2013
we apologize if you receive multiple copies of this message
===

CALL FOR PAPERS

2013 Workshop on

Middleware for HPC and Big Data Systems

MHPC '13

as part of Euro-Par 2013, Aachen, Germany

===

Date: August 27, 2013

Workshop URL: http://m-hpc.org

Springer LNCS

SUBMISSION DEADLINE:

May 31, 2013 - LNCS Full paper submission (rolling abstract submission)
June 28, 2013 - Lightning Talk abstracts


SCOPE

Extremely large, diverse, and complex data sets are generated from
scientific applications, the Internet, social media and other applications.
Data may be physically distributed and shared by an ever larger community.
Collecting, aggregating, storing and analyzing large data volumes
presents major challenges. Processing such amounts of data efficiently
has been an issue for scientific discovery and technological
advancement. In addition, making the data accessible, understandable and
interoperable involves unsolved problems. Novel middleware architectures,
algorithms, and application development frameworks are required.

In this workshop we are particularly interested in original work at the
intersection of HPC and Big Data with regard to middleware handling
and optimizations. Scope is existing and proposed middleware for HPC
and big data, including analytics libraries and frameworks.

The goal of this workshop is to bring together software architects,
middleware and framework developers, data-intensive application developers
as well as users from the scientific and engineering community to exchange
their experience in processing large datasets and to report their scientific
achievement and innovative ideas. The workshop also offers a dedicated forum
for these researchers to access the state of the art, to discuss problems
and requirements, to identify gaps in current and planned designs, and to
collaborate in strategies for scalable data-intensive computing.

The workshop will be one day in length, composed of 20 min paper
presentations, each followed by 10 min discussion sections.
Presentations may be accompanied by interactive demonstrations.


TOPICS

Topics of interest include, but are not limited to:

- Middleware including: Hadoop, Apache Drill, YARN, Spark/Shark, Hive, Pig,
  Sqoop, HBase, HDFS, S4, CIEL, Oozie, Impala, Storm and Hyrack
- Data intensive middleware architecture
- Libraries/Frameworks including: Apache Mahout, Giraph, UIMA and GraphLab
- NG Databases including Apache Cassandra, MongoDB and CouchDB/Couchbase
- Schedulers including Cascading
- Middleware for optimized data locality/in-place data processing
- Data handling middleware for deployment in virtualized HPC environments
- Parallelization and distributed processing architectures at the
middleware level
- Integration with cloud middleware and application servers
- Runtime environments and system level support for data-intensive computing
- Skeletons and patterns
- Checkpointing
- Programming models and languages
- Big Data ETL
- Stream processing middleware
- In-memory databases for HPC
- Scalability and interoperability
- Large-scale data storage and distributed file systems
- Content-centric addressing and networking
- Execution engines, languages and environments including CIEL/Skywriting
- Performance analysis, evaluation of data-intensive middleware
- In-depth analysis and performance optimizations in existing data-handling
middleware, focusing on indexing/fast storing or retrieval between compute
and storage nodes
- Highly scalable middleware optimized for minimum communication
- Use cases and experience for popular Big Data middleware
- Middleware security, privacy and trust architectures

DATES

Papers:
Rolling abstract submission
May 31, 2013 - Full paper submission
July 8, 2013 - Acceptance notification
October 3, 2013 - Camera-ready version due

Lightning Talks:
June 28, 2013 - Deadline for lightning talk abstracts
July 15, 2013 - Lightning talk notification

August 27, 2013 - Workshop Date


TPC

CHAIR

Michael Alexander (chair), TU Wien, Austria
Anastassios Nanos (co-chair), NTUA, Greece
Jie Tao (co-chair), Karlsruhe Institute of Technology, Germany
Lizhe Wang (co-chair), Chinese Academy of Sciences, China
Gianluigi Zanetti (co-chair), CRS4, Italy

PROGRAM COMMITTEE

Amitanand Aiyer, Facebook, USA
Costas Bekas, IBM, Switzerland
Jakob Blomer, CERN, Switzerland
William Gardner, University of Guelph, Canada
José Gracia, HPC Center of the University of Stuttgart, Germany
Zhenghua Guom,  Indiana University, USA
Marcus Hardt,  Karlsruhe Institute of Technology, Germany
Sverre Jarp, CERN, Switzerland
Christopher Jung,  Karlsruhe Institute of Technology, Germany
Andreas Knüpfer - Technische Universität Dresden, Germany
Nectarios Koziris, National Technical University of Athens, Greece
Yan Ma, Chinese Academy of Sciences, China
Martin Schulz - Lawrence Livermore National Laboratory
Viral Shah, 

Re: cassandra-shuffle time to completion and required disk space

2013-04-30 Thread aaron morton
 These are taken just before starting shuffle (ran repair/cleanup the day 
 before).
 During shuffle disabled all reads/writes to the cluster.
 
 nodetool status keyspace:
 
 Load   Tokens  Owns (effective)  Host ID
 80.95 GB   256 16.7% 754f9f4c-4ba7-4495-97e7-1f5b6755cb27

I'm a little confused when nodetool status was showing 256 tokens before the 
shuffle was run. 

Did you set num_tokens during the upgrade process ? But I doubt that would 
change anything as the inital_token is set in the system tables during 
bootstrap. 

`bin/cassandra-shuffle ls` will show the list of moves the shuffle process
was/is going to run. What does that say?

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 30/04/2013, at 5:08 AM, John Watson j...@disqus.com wrote:

 That's what we tried first before the shuffle. And ran into the space issue.
 
 That's detailed in another thread title: Adding nodes in 1.2 with vnodes 
 requires huge disks
 
 
 On Mon, Apr 29, 2013 at 4:08 AM, Sam Overton s...@acunu.com wrote:
 An alternative to running shuffle is to do a rolling bootstrap/decommission. 
 You would set num_tokens on the existing hosts (and restart them) so that 
 they split their ranges, then bootstrap in N new hosts, then decommission the 
 old ones.
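
 A rough outline of that procedure (a sketch, not step-by-step docs):

 # on each existing node: set num_tokens (e.g. 256) in cassandra.yaml, then restart it
 # bootstrap the N new nodes as usual, with num_tokens set
 # once the new nodes are up, on each old node run:
 nodetool decommission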
 
 
 
 On 28 April 2013 22:21, John Watson j...@disqus.com wrote:
 The amount of time/space cassandra-shuffle requires when upgrading to using 
 vnodes should really be apparent in documentation (when some is made).
 
 Only semi-noticeable remark about the exorbitant amount of time is a bullet 
 point in: http://wiki.apache.org/cassandra/VirtualNodes/Balance
 
 Shuffling will entail moving a lot of data around the cluster and so has the 
 potential to consume a lot of disk and network I/O, and to take a 
 considerable amount of time. For this to be an online operation, the shuffle 
 will need to operate on a lower priority basis to other streaming operations, 
 and should be expected to take days or weeks to complete.
 
 We tried running shuffle on a QA version of our cluster and 2 things were 
 brought to light:
  - Even with no reads/writes it was going to take 20 days
  - Each machine needed enough free diskspace to potentially hold the entire 
 cluster's sstables on disk
 
 Regards,
 
 John
 
 
 
 -- 
 Sam Overton
 Acunu | http://www.acunu.com | @acunu
 



Re: nodetool status OWNS and multiple DCs

2013-04-30 Thread aaron morton
 I thought that each datacenter has 100% coverage of the token range. What does
 the value in the Owns field mean, and how does it affect replication (for
 example, with replication factors DC1:1, DC2:2)?
Run the command and specify your keyspace; that will tell nodetool to use the
Replication Strategy specified for the KS when calculating the layout. If it's
NTS you should see what you expect.
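
For example (a sketch; MyKeyspace is a placeholder for your keyspace name):

nodetool status MyKeyspace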

Cheers
  
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 30/04/2013, at 8:50 PM, Sergey Naumov sknau...@gmail.com wrote:

 Hello.
 
 I have set up a test cluster of 2 DCs with 1 node in each DC. In each config I
 specified 256 virtual nodes and chose the GossipingPropertyFileSnitch.
 
 For node1:
 ~/Cassandra$ cat /etc/cassandra/cassandra-rackdc.properties
 dc=DC1
 rack=RAC1
 
 For node2:
 ~/Cassandra$ cat /etc/cassandra/cassandra-rackdc.properties
 dc=DC2
 rack=RAC1
 
 When I call nodetool status, it does not show 100% ownership of tokens for each
 DC:
 :~/Cassandra$ nodetool status
 Datacenter: DC1
 ===============
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address     Load      Tokens  Owns   Host ID                               Rack
 UN  172.16.0.1  97.43 KB  256     51,5%  7ffd2432-46c9-443c-aa0c-8bfd960d2acc  RAC1
 Datacenter: DC2
 ===============
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address     Load      Tokens  Owns   Host ID                               Rack
 UN  172.16.0.2  91.78 KB  256     48,5%  93d3bdbf-8625-4e54-8c7c-087fbfe419f5  RAC1
 
 I thought that each datacenter has 100% coverage of the token range. What does
 the value in the Owns field mean, and how does it affect replication (for
 example, with replication factors DC1:1, DC2:2)?
 
 Thanks in advance,
 Sergey Naumov.



Re: normal thread counts?

2013-04-30 Thread aaron morton
  Many many many of the threads are trying to talk to IPs that aren't in the 
 cluster (I assume they are the IP's of dead hosts). 
Are these IP's from before the upgrade ? Are they IP's you expect to see ? 

Cross reference them with the output from nodetool gossipinfo to see why the 
node thinks they should be used. 
Could you provide a list of the thread names ? 

One way to remove those IPs may be to do a rolling restart with
-Dcassandra.load_ring_state=false in the JVM opts at the bottom of
cassandra-env.sh
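
For example (a sketch), adding this line near the bottom of cassandra-env.sh
before the rolling restart:

JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false"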

The OutboundTcpConnection threads are created in pairs by the
OutboundTcpConnectionPool, which is created here
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/MessagingService.java#L502
The threads are created in the OutboundTcpConnectionPool constructor; I'm checking
to see if this could be the source of the leak.

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 1/05/2013, at 2:18 AM, William Oberman ober...@civicscience.com wrote:

 I use phpcassa.
 
 I did a thread dump.  99% of the threads look very similar (I'm using 1.1.9 
 in terms of matching source lines).  The thread names are all like this: 
 WRITE-/10.x.y.z.  There are a LOT of duplicates (in terms of the same IP).  
 Many many many of the threads are trying to talk to IPs that aren't in the 
 cluster (I assume they are the IP's of dead hosts).  The stack trace is 
 basically the same for them all, attached at the bottom.   
 
 There is a lot of things I could talk about in terms of my situation, but 
 what I think might be pertinent to this thread: I hit a tipping point 
 recently and upgraded a 9 node cluster from AWS m1.large to m1.xlarge 
 (rolling, one at a time).  7 of the 9 upgraded fine and work great.  2 of the 
 9 keep struggling.  I've replaced them many times now, each time using this 
 process:
 http://www.datastax.com/docs/1.1/cluster_management#replacing-a-dead-node
 And even this morning the only two nodes with a high number of threads are 
 those two (yet again).  And at some point they'll OOM.
 
 Seems like there is something about my cluster (caused by the recent 
 upgrade?) that causes a thread leak on OutboundTcpConnection. But I don't
 know how to escape from the trap.  Any ideas?
 
 
 
   stackTrace = [ { 
 className = sun.misc.Unsafe;
 fileName = Unsafe.java;
 lineNumber = -2;
 methodName = park;
 nativeMethod = true;
}, { 
 className = java.util.concurrent.locks.LockSupport;
 fileName = LockSupport.java;
 lineNumber = 158;
 methodName = park;
 nativeMethod = false;
}, { 
 className = 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject;
 fileName = AbstractQueuedSynchronizer.java;
 lineNumber = 1987;
 methodName = await;
 nativeMethod = false;
}, { 
 className = java.util.concurrent.LinkedBlockingQueue;
 fileName = LinkedBlockingQueue.java;
 lineNumber = 399;
 methodName = take;
 nativeMethod = false;
}, { 
 className = org.apache.cassandra.net.OutboundTcpConnection;
 fileName = OutboundTcpConnection.java;
 lineNumber = 104;
 methodName = run;
 nativeMethod = false;
} ];
 --
 
 
 
 
 On Mon, Apr 29, 2013 at 4:31 PM, aaron morton aa...@thelastpickle.com wrote:
  I used JMX to check current number of threads in a production cassandra 
 machine, and it was ~27,000.
 That does not sound too good. 
 
 My first guess would be lots of client connections. What client are you 
 using, does it do connection pooling ?
 See the comments in cassandra.yaml around rpc_server_type; the default,
 sync, uses one thread per connection, you may be better with HSHA. But if your
 app is leaking connections you should probably deal with that first.
 
 Cheers
 
 -
 Aaron Morton
 Freelance Cassandra Consultant
 New Zealand
 
 @aaronmorton
 http://www.thelastpickle.com
 
 On 30/04/2013, at 3:07 AM, William Oberman ober...@civicscience.com wrote:
 
 Hi,
 
 I'm having some issues.  I keep getting:
 
 ERROR [GossipStage:1] 2013-04-28 07:48:48,876 AbstractCassandraDaemon.java 
 (line 135) Exception in thread Thread[GossipStage:1,5,main]
 java.lang.OutOfMemoryError: unable to create new native thread
 --
 after a day or two of runtime.  I've checked and my system settings seem 
 acceptable:
 memlock=unlimited
 nofiles=10
 nproc=122944
 
 I've messed with heap sizes from 6-12GB (15 physical, m1.xlarge in AWS), and 
 I keep OOM'ing with the above error.
 
 I've found some (what seem to me) to be obscure references to the stack size 
 interacting with # of threads.  If I'm understanding it correctly, to reason 
 about Java mem usage I have to think of OS + Heap as being locked down, and 
 the stack gets the leftovers of physical memory and each thread gets a 
 stack.
 
 For me, the system ulimit setting on stack is 10240k (no 

Re: SSTables not opened on new cluster

2013-04-30 Thread aaron morton
Double check the file permissions ? 

Write some data (using cqlsh or cassandra-cli) and flush to make sure the new 
files are created where you expect them to be. 
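
For example (a sketch; MyKS and the default data path are placeholders):

nodetool flush MyKS                     # after writing a test row with cqlsh/cassandra-cli
ls /var/lib/cassandra/data/MyKS/        # new *-Data.db files should show up here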

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 1/05/2013, at 4:15 AM, Philippe watche...@gmail.com wrote:

 Hello,
 
 I'm trying to bring up a copy of an existing 3-node cluster running 1.0.8 
 into a 3-node cluster running 1.0.11.
 The new cluster has been configured to have the same tokens and the same 
 partitioner. 
 
 Initially, I copied the files in the data directory of each node into their 
 corresponding node on the new cluster. When starting the new cluster, it 
 didn't pick up the KS & CF.
 So I removed the directories, created the schema, stopped cassandra and copied
 the data back. None of the SSTables are opened at startup.
 
 When I nodetool refresh them, it says No new SSTables were found for
 XXX/YYY, where XXX is my KS and YYY is my CF.
 
 I'm totally stumped: any ideas as to why this is happening ? I've checked 
 that there are actual files there and that the permissions are correct.
 
 Thanks.
 



Re: SSTables not opened on new cluster

2013-04-30 Thread Philippe
Hi Aaron,
thanks for the response.

Permissions are correct : owner is cassandra (ubuntu) and permissions
are drwxr-xr-x

When I created the schema, the KS were created as directory in the
.../data/ directory
When I use cassandra-cli, set the CL to QUORUM, ensure two instances are up
(nodetool ring) and set a bogus column, I get an Unavailable exception
which is weird.

I just noticed that when the new cluster nodes start up, auto_bootstrap is
set to true. Given that it's no longer in the YAML file, I didn't set it to
false; would that explain it ?

Once I removed it and set logging to TRACE, I get these relevant log lines

SliceQueryFilter.java (line 123) collecting 1 of 2147483647:
XXX:5123@1367333147034
a bunch of Schema.java (line 382) Adding
org.apache.cassandra.config.CFMetaData@45f8acdc
[cfId=1009,ksName=XXX,cfName=YYY...
Removing compacted SSTable files from XXX
No bootstrapping, leaving or moving nodes - empty pending ranges for XXX
(my KS)
Table.java (line 317) Initializing XXX.YYY (KS and CF)

Any ideas ?


2013/4/30 aaron morton aa...@thelastpickle.com

 Double check the file permissions ?

 Write some data (using cqlsh or cassandra-cli) and flush to make sure the
 new files are created where you expect them to be.

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Consultant
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 1/05/2013, at 4:15 AM, Philippe watche...@gmail.com wrote:

  Hello,
 
  I'm trying to bring up a copy of an existing 3-node cluster running
 1.0.8 into a 3-node cluster running 1.0.11.
  The new cluster has been configured to have the same tokens and the same
 partitioner.
 
  Initially, I copied the files in the data directory of each node into
 their corresponding node on the new cluster. When starting the new cluster,
  it didn't pick up the KS & CF.
   So I removed the directories, created the schema, stopped cassandra and
 copied the data back. None of the SSTables are opened at startup.
 
   When I nodetool refresh them, it says No new SSTables were found for
   XXX/YYY, where XXX is my KS and YYY is my CF.
 
  I'm totally stumped: any ideas as to why this is happening ? I've
 checked that there are actual files there and that the permissions are
 correct.
 
  Thanks.
 




Re: normal thread counts?

2013-04-30 Thread aaron morton
The issue below could result in abandoned threads under high contention, so 
we'll get that fixed. 

But we are not sure how/why it would be called so many times. If you could 
provide a full list of threads and the output from nodetool gossipinfo that 
would help. 

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 1/05/2013, at 8:34 AM, aaron morton aa...@thelastpickle.com wrote:

  Many many many of the threads are trying to talk to IPs that aren't in the 
 cluster (I assume they are the IP's of dead hosts). 
 Are these IP's from before the upgrade ? Are they IP's you expect to see ? 
 
 Cross reference them with the output from nodetool gossipinfo to see why the 
 node thinks they should be used. 
 Could you provide a list of the thread names ? 
 
 One way to remove those IPs that may be to rolling restart with 
 -Dcassandra.load_ring_state=false i the JVM opts at the bottom of 
 cassandra-env.sh
 
 The OutboundTcpConnection threads are created in pairs by the 
 OutboundTcpConnectionPool, which is created here 
 https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/MessagingService.java#L502
  The threads are created in the OutboundTcpConnectionPool constructor 
 checking to see if this could be the source of the leak. 
 
 Cheers
 
 -
 Aaron Morton
 Freelance Cassandra Consultant
 New Zealand
 
 @aaronmorton
 http://www.thelastpickle.com
 
 On 1/05/2013, at 2:18 AM, William Oberman ober...@civicscience.com wrote:
 
 I use phpcassa.
 
 I did a thread dump.  99% of the threads look very similar (I'm using 1.1.9 
 in terms of matching source lines).  The thread names are all like this: 
 WRITE-/10.x.y.z.  There are a LOT of duplicates (in terms of the same IP). 
  Many many many of the threads are trying to talk to IPs that aren't in the 
 cluster (I assume they are the IP's of dead hosts).  The stack trace is 
 basically the same for them all, attached at the bottom.   
 
 There is a lot of things I could talk about in terms of my situation, but 
 what I think might be pertinent to this thread: I hit a tipping point 
 recently and upgraded a 9 node cluster from AWS m1.large to m1.xlarge 
 (rolling, one at a time).  7 of the 9 upgraded fine and work great.  2 of 
 the 9 keep struggling.  I've replaced them many times now, each time using 
 this process:
 http://www.datastax.com/docs/1.1/cluster_management#replacing-a-dead-node
 And even this morning the only two nodes with a high number of threads are 
 those two (yet again).  And at some point they'll OOM.
 
 Seems like there is something about my cluster (caused by the recent 
 upgrade?) that causes a thread leak on OutboundTcpConnection. But I don't
 know how to escape from the trap.  Any ideas?
 
 
 
   stackTrace = [ { 
 className = sun.misc.Unsafe;
 fileName = Unsafe.java;
 lineNumber = -2;
 methodName = park;
 nativeMethod = true;
}, { 
 className = java.util.concurrent.locks.LockSupport;
 fileName = LockSupport.java;
 lineNumber = 158;
 methodName = park;
 nativeMethod = false;
}, { 
 className = 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject;
 fileName = AbstractQueuedSynchronizer.java;
 lineNumber = 1987;
 methodName = await;
 nativeMethod = false;
}, { 
 className = java.util.concurrent.LinkedBlockingQueue;
 fileName = LinkedBlockingQueue.java;
 lineNumber = 399;
 methodName = take;
 nativeMethod = false;
}, { 
 className = org.apache.cassandra.net.OutboundTcpConnection;
 fileName = OutboundTcpConnection.java;
 lineNumber = 104;
 methodName = run;
 nativeMethod = false;
} ];
 --
 
 
 
 
 On Mon, Apr 29, 2013 at 4:31 PM, aaron morton aa...@thelastpickle.com 
 wrote:
  I used JMX to check current number of threads in a production cassandra 
 machine, and it was ~27,000.
 That does not sound too good. 
 
 My first guess would be lots of client connections. What client are you 
 using, does it do connection pooling ?
  See the comments in cassandra.yaml around rpc_server_type; the default,
  sync, uses one thread per connection, you may be better with HSHA. But if
  your app is leaking connections you should probably deal with that first.
 
 Cheers
 
 -
 Aaron Morton
 Freelance Cassandra Consultant
 New Zealand
 
 @aaronmorton
 http://www.thelastpickle.com
 
 On 30/04/2013, at 3:07 AM, William Oberman ober...@civicscience.com wrote:
 
 Hi,
 
 I'm having some issues.  I keep getting:
 
 ERROR [GossipStage:1] 2013-04-28 07:48:48,876 AbstractCassandraDaemon.java 
 (line 135) Exception in thread Thread[GossipStage:1,5,main]
 java.lang.OutOfMemoryError: unable to create new native thread
 --
 after a day or two of runtime.  I've checked and my system settings seem 
 acceptable:
 memlock=unlimited
 nofiles=10
 nproc=122944