Re: Completely removing a node from the cluster

2011-08-23 Thread Jonathan Colby
I ran into this. I also tried log_ring_state=false which also did not help. The way I got through this was to stop the entire cluster and start the nodes one-by-one. I realize this is not a practical solution for everyone, but if you can afford to stop the cluster for a few minutes, it's

Re: Re: Urgent:!! Re: Need to maintenance on a cassandra node, are there problems with this process

2011-08-19 Thread jonathan . colby
Hi - From what I understand, Peter's recommendation should work for you. They have both worked for me. No need to copy anything by hand on the new node. Bootstrap/repair does that for you. From the Wiki: If a node goes down entirely, then you have two options: (Recommended approach)

upgrade from 0.7.6 to 0.8.4

2011-08-16 Thread Jonathan Colby
Hi - sorry if this was asked before but I couldn't find any answers about it. Is the upgrade path from 0.7.6 to 0.8.4 possible via a simple rolling restart? Are nodes with these different versions compatible - i.e., can one node be upgraded in order to see if we run into any problems

Re: Re: Cassandra start/stop scripts

2011-07-27 Thread jonathan . colby
A simple kill without -9 should work. Have you tried that? On , Jason Pell jasonmp...@gmail.com wrote: Check out the rpm packages from Cassandra they have init.d scripts that work very nicely, there are debs as well for ubuntu Sent from my iPhone On Jul 27, 2011, at 3:19, Priyanka

eliminate need to repair by using column TTL??

2011-07-22 Thread jonathan . colby
One of the main reasons for regularly running repair is to make sure deletes are propagated in the cluster, ie, data is not resurrected if a node never received the delete call. And repair-on-read takes care of repairing inconsistencies on-the-fly. So if I were to set a universal TTL on all

Re: Re: eliminate need to repair by using column TTL??

2011-07-22 Thread jonathan . colby
Good points, Aaron. I realize now how expensive repair on read is. I'm going to keep doing repairs regularly but still have a max TTL on all columns to make sure we don't have really old data we no longer need getting buried in the cluster. On , aaron morton aa...@thelastpickle.com wrote:

Repair question - why is so much data transferred?

2011-07-21 Thread Jonathan Colby
I regularly run repair on my cassandra cluster. However, I often see that during the repair operation very large amounts of data are transferred to other nodes. My question is, if only some data is out of sync, why are entire Data files being transferred?

Re: Re: Repair question - why is so much data transferred?

2011-07-21 Thread jonathan . colby
situation. Thanks. Looking forward to the release where these 2 things are fixed. On , Jonathan Ellis jbel...@gmail.com wrote: On Thu, Jul 21, 2011 at 9:14 AM, Jonathan Colby jonathan.co...@gmail.com wrote: I regularly run repair on my cassandra cluster. However, I often see that during

Re: Decorator Algorithm

2011-06-24 Thread Jonathan Colby
Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 23 Jun 2011, at 19:58, Jonathan Colby wrote: Hi - I'd like to understand more how the token is hashed with the key to determine on which node the data is stored - called decorating in cassandra speak. Can

Decorator Algorithm

2011-06-23 Thread Jonathan Colby
Hi - I'd like to understand more how the token is hashed with the key to determine on which node the data is stored - called decorating in cassandra speak. Can anyone share any documentation on this or describe this more in detail? Yes, I could look at the code, but I was hoping to be able
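A minimal sketch of the idea asked about here, assuming the RandomPartitioner: the row key is hashed with MD5, the digest is read as a signed big integer and its absolute value becomes the key's token, and the key belongs to the first node whose token is greater than or equal to it, wrapping around the ring. The helper names and the four-node ring are illustrative assumptions, not Cassandra's actual code.

```python
import bisect
import hashlib
from typing import Dict


def md5_token(key: bytes) -> int:
    """Token for a row key: MD5 digest as a signed big-endian integer, absolute value."""
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def owner(key: bytes, ring: Dict[int, str]) -> str:
    """Return the node owning `key`: first node token >= key token, wrapping to the start."""
    tokens = sorted(ring)
    i = bisect.bisect_left(tokens, md5_token(key))
    return ring[tokens[i % len(tokens)]]


# Hypothetical four-node ring with evenly spaced tokens.
ring = {i * 2 ** 127 // 4: "node%d" % i for i in range(4)}
print(owner(b"some-row-key", ring))
```

Replica placement beyond this first node depends on the replication strategy and is not modelled here.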

Re: insufficient space to compact even the two smallest files, aborting

2011-06-23 Thread Jonathan Colby
A compaction is triggered when the min number of same-sized SSTable files is found. So what's actually the purpose of the max part of the threshold? On Jun 23, 2011, at 12:55 AM, aaron morton wrote: Setting them to 2 and 2 means compaction can only ever compact 2 files at a time, so

simple question about merged SSTable sizes

2011-06-22 Thread Jonathan Colby
The way compaction works, x same-sized files are merged into a new SSTable. This repeats itself and the SSTables get bigger and bigger. So what is the upper limit?? If you are not deleting stuff fast enough, wouldn't the SSTable sizes grow indefinitely? I ask because we have some rather
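The growth pattern described in this question can be illustrated with a toy model. This is only a sketch under stated assumptions: every flush produces a table of equal size, a merge fires as soon as `min_threshold` tables of the same size exist, and merged output is the plain sum of its inputs (no overwrites, deletes, or TTL expiry shrinking it). It is not Cassandra's actual bucketing logic.

```python
from collections import Counter


def simulate(flushes: int, min_threshold: int = 4, flush_size: int = 1) -> Counter:
    """Return {sstable_size: count} after `flushes` memtable flushes."""
    tables = Counter()
    for _ in range(flushes):
        tables[flush_size] += 1
        size = flush_size
        # Cascade: whenever min_threshold tables of one size exist, merge them.
        while tables[size] >= min_threshold:
            tables[size] -= min_threshold
            merged = size * min_threshold  # assumption: nothing is purged during the merge
            tables[merged] += 1
            size = merged
    return +tables  # unary plus drops sizes whose count fell to zero


for n in (5, 16, 64):
    print(n, dict(simulate(n)))
```

In this model the largest table keeps growing with the total amount of data written, while merges of the largest tier become rarer, because each tier needs `min_threshold` times more flushes than the one below it.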

Re: simple question about merged SSTable sizes

2011-06-22 Thread Jonathan Colby
to hit a dead end. On Jun 22, 2011, at 6:50 PM, Eric tamme wrote: On Wed, Jun 22, 2011 at 12:35 PM, Jonathan Colby jonathan.co...@gmail.com wrote: The way compaction works, x same-sized files are merged into a new SSTable. This repeats itself and the SSTables get bigger and bigger. So

Re: simple question about merged SSTable sizes

2011-06-22 Thread Jonathan Colby
and avoid very large SSTables/node if possible. Edward On Wed, Jun 22, 2011 at 12:35 PM, Jonathan Colby jonathan.co...@gmail.com wrote: The way compaction works, x same-sized files are merged into a new SSTable. This repeats itself and the SSTables get bigger and bigger. So what

Re: simple question about merged SSTable sizes

2011-06-22 Thread Jonathan Colby
Thanks Ryan. Done that : ) 1 TB is the striped size. We might look into bigger disks for our blades. On Jun 22, 2011, at 7:09 PM, Ryan King wrote: On Wed, Jun 22, 2011 at 10:00 AM, Jonathan Colby jonathan.co...@gmail.com wrote: Thanks for the explanation. I'm still a bit skeptical

Re: simple question about merged SSTable sizes

2011-06-22 Thread Jonathan Colby
Awesome tip on TTL. We can really use this as a catch-all to make sure all columns are purged based on time. Fits our use case well. I forgot this feature existed. On Jun 22, 2011, at 7:11 PM, Eric tamme wrote: Second, compacting such large files is an IO killer. What can be tuned

Re: New web client future API

2011-06-20 Thread Jonathan Colby
I just took a look at the demo. This is really great stuff! I will try this on our cluster as soon as possible. I like this because it allows people not too familiar with the cassandra CLI or Thrift a way to query cassandra data. On Jun 20, 2011, at 10:56 AM, Markus Wiesenbacher |

Re: jsvc hangs shell

2011-06-17 Thread Jonathan Colby
jsvc is not very flexible. Check out the wrapper software. We swear by it. http://wrapper.tanukisoftware.com/doc/english/download.jsp On Jun 17, 2011, at 2:52 AM, Ken Brumer wrote: Anton Belyaev anton.belyaev at gmail.com writes: I guess it is not trivial to modify the package to make

Re: Re: minor vs major compaction and purging data

2011-06-13 Thread jonathan . colby
? What would be the difference between cleanup and compactions? On Sat, Jun 11, 2011 at 8:14 AM, Jonathan Ellis jbel...@gmail.com wrote: Yes. On Sat, Jun 11, 2011 at 6:08 AM, Jonathan Colby jonathan.co...@gmail.com wrote: I've been reading inconsistent descriptions of what major

minor vs major compaction and purging data

2011-06-11 Thread Jonathan Colby
I've been reading inconsistent descriptions of what major and minor compactions do. So my question for clarification: Are tombstones purged (i.e., space reclaimed) for minor AND major compactions? Thanks.

Compacting Large Row

2011-06-11 Thread Jonathan Colby
I'm seeing this in my logs. We are storing emails in cassandra and some of them might be rather large. Is this bad? What exactly is happening when this appears? INFO [CompactionExecutor:1] 2011-06-11 13:39:19,217 CompactionIterator.java (line 150) Compacting large row

after a while nothing happening with repair

2011-06-09 Thread Jonathan Colby
When I run repair on a node in my 0.7.6-2 cluster, the repair starts to stream data and activity is seen in the logs. However, after a while (a day or so) it seems like everything freezes up. The repair command is still running (the command prompt has not returned) and netstats shows output

fixing unbalanced cluster !?

2011-06-09 Thread Jonathan Colby
I got myself into a situation where one node (10.47.108.100) has a lot more data than the other nodes. In fact, the 1 TB disk on this node is almost full. I added 3 new nodes and let cassandra automatically calculate new tokens by taking the highest loaded nodes. Unfortunately there is

Re: fixing unbalanced cluster !?

2011-06-09 Thread Jonathan Colby
balancing should be an iteration on the above steps moving through the range. On 6/9/11 6:21 AM, Jonathan Colby wrote: I got myself into a situation where one node (10.47.108.100) has a lot more data than the other nodes. In fact, the 1 TB disk on this node is almost full. I added 3 new nodes

no additional log output after running repair

2011-05-31 Thread Jonathan Colby
I'm trying to run a repair on a 0.7.6-2 node. After running the repair command, this line shows up in the cassandra.log, but nothing else. It's been hours. Nothing is seen in the logs from other servers or with nodetool commands like netstats or tpstats. How do I know if the repair is

Re: exception when adding a node replication factor (3) exceeds number of endpoints (1) - SOLVED

2011-05-28 Thread Jonathan Colby
OK, it seems a phantom node (one that was removed from the cluster) kept being passed around in gossip as a down endpoint and was messing up the gossip algorithm. I had the luxury of being able to stop the entire cluster and bring the nodes up one by one. That purged the bad node from gossip.

new thing going on with repair in 0.7.6??

2011-05-28 Thread Jonathan Colby
It might just not have occurred to me in the previous 0.7.4 version, but when I do a repair on a node in v0.7.6, it seems like data is also synced with neighboring nodes. My understanding of repair is that the data is reconciled on the node being repaired, i.e., data is removed or added to that

average repair/bootstrap durations

2011-05-27 Thread Jonathan Colby
Hi - Operations like repair and bootstrap on nodes in our cluster (average load 150GB each) take a very long time. By long I mean 1-2 days. With nodetool netstats I can see the progress % very slowly progressing. I guess there are some throttling mechanisms built into cassandra. And yes

Re: average repair/bootstrap durations

2011-05-27 Thread Jonathan Colby
Thanks Ed! I was thinking about surrendering more memory to mmap operations. I'm going to try bringing the Xmx down to 4G On Fri, May 27, 2011 at 5:19 PM, Edward Capriolo edlinuxg...@gmail.com wrote: On Fri, May 27, 2011 at 9:08 AM, Jonathan Colby jonathan.co...@gmail.com wrote: Hi

Re: Re: nodetool move trying to stream data to node no longer in cluster

2011-05-27 Thread Jonathan Colby
rounds and then disappears. Hope that helps. - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 26 May 2011, at 19:58, Jonathan Colby wrote: @Aaron - Unfortunately I'm still seeing message like:   is down, removing

Re: nodetool move trying to stream data to node no longer in cluster

2011-05-26 Thread Jonathan Colby
you check from the other nodes in the cluster to see if they are receiving the stream ? cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 26 May 2011, at 00:42, Jonathan Colby wrote: I recently

nodetool move trying to stream data to node no longer in cluster

2011-05-25 Thread Jonathan Colby
I recently removed a node (with decommission) from our cluster. I added a couple new nodes and am now trying to rebalance the cluster using nodetool move. However, netstats shows that the node being moved is trying to stream data to the node that I already decommissioned yesterday. The

Re: Database grows 10X bigger after running nodetool repair

2011-05-25 Thread jonathan . colby
I'm not sure if this is the absolute best advice, but perhaps running clean on the data will help cleanup any data that isn't assigned to this token - in case you've moved the cluster around before. Any exceptions in the logs, eg EOF ? I experienced this and it caused the repairs to trip

Re: Re: nodetool move trying to stream data to node no longer in cluster

2011-05-25 Thread jonathan . colby
, Jonathan Colby wrote: I recently removed a node (with decommission) from our cluster. I added a couple new nodes and am now trying to rebalance the cluster using nodetool move. However, netstats shows that the node being moved is trying to stream data to the node that I already

extremely high temporary disk utilization 0.7.5

2011-05-21 Thread Jonathan Colby
On each of our nodes we have an average of 80 - 100 GB actual cassandra data on 1 TB disks. There is normally plenty of capacity on the nodes. Swap is OFF. OS is Debian 64 bit. Every once in a while, the disk usage will skyrocket to 500+ GB, even once filling up the 1 TB disk (at least

Re: jsvc hangs shell

2011-05-11 Thread jonathan . colby
We use the Java Service Wrapper from Tanuki Software and are very happy with it. It's a lot more robust than jsvc. http://wrapper.tanukisoftware.com/doc/english/download.jsp The free community version will be enough in most cases. Jon On May 11, 2011 10:30pm, Anton Belyaev

Re: What will be the steps for adding new nodes

2011-04-18 Thread Jonathan Colby
Your questions are pretty fundamental. I recommend reading through the documentation to get a better understanding of how Cassandra works. Here's good documentation from DataStax: http://www.datastax.com/docs/0.7/operations/clustering#adding-capacity In a nutshell: you only bootstrap new

recurring EOFException exception in 0.7.4

2011-04-15 Thread Jonathan Colby
I've been struggling with these kinds of exceptions for some time now. I thought it might have been a one-time thing, so on the 2 nodes where I saw this problem I pulled in fresh data with a repair on an empty data directory. Unfortunately, this problem is now coming up on a new node that has,

Re: Questions about the nodetool ring.

2011-04-12 Thread Jonathan Colby
This is normal when you just add single nodes. When no token is assigned, the new node takes a portion of the ring from the most heavily loaded node. As a consequence of this, the nodes will be out of balance. In other words, when you double the number of nodes you would not have this
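The point about doubling can be checked with a small calculation. This is a sketch assuming the RandomPartitioner token space of 0 to 2**127 and evenly spaced tokens; the function name is made up for illustration.

```python
RING_SIZE = 2 ** 127  # assumed RandomPartitioner token space


def balanced_tokens(node_count: int) -> list:
    """Evenly spaced tokens for `node_count` nodes, starting at 0."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


old = balanced_tokens(4)
new = balanced_tokens(8)
# Every old token is still a valid balanced position after doubling,
# so the four new nodes can simply take the midpoints between them.
print(set(old) <= set(new))
print(sorted(set(new) - set(old)))
```

Under these assumptions, doubling the cluster leaves the existing tokens untouched, whereas adding a single node to a balanced ring would require moving existing nodes to restore an even split.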

Re: Questions about the nodetool ring.

2011-04-12 Thread Jonathan Colby
? Thanks. On Tue, Apr 12, 2011 at 5:15 PM, Jonathan Colby jonathan.co...@gmail.com wrote: This is normal when you just add single nodes. When no token is assigned, the new node takes a portion of the ring from the most heavily loaded node. As a consequence of this, the nodes

repair never completes with finished successfully

2011-04-12 Thread Jonathan Colby
There are a few other threads related to problems with the nodetool repair in 0.7.4. However I'm not seeing any errors, just never getting a message that the repair completed successfully. In my production and test cluster (with just a few MB data) the repair nodetool prompt never returns

Re: repair never completes with finished successfully

2011-04-12 Thread Jonathan Colby
hang if a neighbour dies and fails to send a requested stream. It will timeout after 24 hours (I think). Aaron On 12 Apr 2011, at 23:39, Karl Hiramoto wrote: On 12/04/2011 13:31, Jonathan Colby wrote: There are a few other threads related to problems with the nodetool repair in 0.7.4

quick repair tool question

2011-04-12 Thread Jonathan Colby
does a repair just compare the existing data from sstables on the node being repaired, or will it figure out which data this node should have and copy it in? I'm trying to refresh all the data for a given node (without reassigning the token) starting with an emptied out data directory. I

Re: quick repair tool question

2011-04-12 Thread Jonathan Colby
/6003676843 - 0% Pool Name Active Pending Completed Commands n/a 0 5765 Responses n/a 0 9811 On Apr 12, 2011, at 4:59 PM, Jonathan Colby wrote: does a repair just compare the existing data

Re: Cassandra 2 DC deployment

2011-04-12 Thread Jonathan Colby
When the down data center comes back up, the Quorum reads will result in a read-repair, so you will get valid data. Besides that, hinted handoff will take care of getting data replicated to a previously down node. Your example is a little unrealistic because you could theoretically have a

Re: Help on decommission

2011-04-12 Thread Jonathan Colby
How long has it been in Leaving status? Is the cluster under stress test load while you are doing the decommission? On Apr 12, 2011, at 6:53 PM, Baskar Duraikannu wrote: I have setup a 4 node cluster for testing. When I setup the cluster, I have setup initial tokens in such a way that each

Re: flush_largest_memtables_at messages in 7.4

2011-04-12 Thread Jonathan Colby
Your JVM heap has reached 78%, so cassandra automatically flushes its memtables. You need to explain more about your configuration: 32 or 64 bit OS, what is max heap, how much RAM installed? If this happens under stress test conditions it's probably understandable. You should look into

Re: quick repair tool question

2011-04-12 Thread Jonathan Colby
cool! and I thought I made that one up myself : ) On Apr 13, 2011, at 2:13 AM, Chris Burroughs wrote: On 04/12/2011 11:11 AM, Jonathan Colby wrote: I'm not sure if this is the kosher way to rebuild the sstable data, but it seemed to work. http://wiki.apache.org/cassandra/Operations

Re: repair never completes with finished successfully

2011-04-12 Thread Jonathan Colby
or Stored remote tree depending on which returns first at DEBUG level 3) Queuing comparison If we do not have the 3rd log then we did not get a replay from either local or remote. Aaron On 13 Apr 2011, at 00:57, Jonathan Colby wrote: There is no Repair session message either. It just

Re: unrepairable sstable data rows

2011-04-11 Thread Jonathan Colby
Thanks for the answer Aaron. There are Data, Index, Filter, and Statistics files associated with SSTables. What files must be physically moved/deleted? I tried just moving the Data file and Cassandra would not start. I see this exception: WARN [WrapperSimpleAppMain] 2011-04-11

exceptions during bootstrap 0.7.4

2011-04-11 Thread Jonathan Colby
Seeing these exceptions on a node during the bootstrap phase of a move . Cassandra 0.7.4. Anyone able to shed more light on what may be causing this? btw - the move was done to assign a new token, decommission phase seemed to have gone ok. bootstrapping is still in progress (i hope) INFO

help! seed node needs to be replaced

2011-04-11 Thread Jonathan Colby
My seed node (1 of 4) having the wraparound range (token 0) needs to be replaced. Should I bootstrap the node with a new IP, then add it back as a seed? Should I run remove token on another node to take over the range?

Re: help! seed node needs to be replaced

2011-04-11 Thread Jonathan Colby
I shutdown cassandra, deleted (with a backup) the contents of the data directory and did a nodetool move 0. It seems to be populating the node with its range of data. Hope that was a good idea. On Apr 11, 2011, at 10:38 PM, Jonathan Colby wrote: My seed node (1 of 4) having

Re: help! seed node needs to be replaced

2011-04-11 Thread Jonathan Colby
the earlier EOF error during bootstrap ? Aaron On 12 Apr 2011, at 08:42, Jonathan Colby wrote: I shutdown cassandra, deleted (with a backup) the contents of the data directory and did a nodetool move 0. It seems to be populating the node with its range of data. Hope that was a good idea

unrepairable sstable data rows

2011-04-10 Thread Jonathan Colby
It appears we have several unserializable or unreadable rows. These were not fixed even after doing a scrub on all nodes - even though the scrub seemed to have completed successfully. I'm trying to fix these by doing a repair, but these exceptions are thrown exactly when doing a repair.

Re: auto_bootstrap

2011-04-09 Thread Jonathan Colby
I can't explain the technical reason why it's not advisable to bootstrap a seed. However, from what I've read you would bootstrap the node as a non-seed first, then add it as seed once it has finished bootstrapping. On Apr 8, 2011, at 9:30 PM, mcasandra wrote: in yaml: # Set to true to

Re: nodetool move hammers the next node in the ring

2011-04-09 Thread Jonathan Colby
. This is similar to https://issues.apache.org/jira/browse/CASSANDRA-2156 but that ticket will not cover this case. I've added this use case to the comments, please check there if you want to follow along. Cheers Aaron On 6 Apr 2011, at 16:26, Jonathan Colby wrote: thanks for the response Aaron

Is the repair still going on or did it fail because of exceptions?

2011-04-08 Thread Jonathan Colby
It seems on my cluster there are a few unserializable Rows. I'm trying to run a repair on the nodes, but it also seems that the replica nodes have unreadable or unserializable rows. The problem is, I cannot determine if the repair is still going on, or if it was interrupted because of these

Re: consistency ONE and null

2011-04-07 Thread Jonathan Colby
nonsense words and other nonsense are a direct result of using swype to type on the screen On 7 Apr 2011 00:10, Jonathan Colby jonathan.co...@gmail.com wrote: Let's say you have RF of 3 and a write was written to 2 nodes. 1 was not written because the node had a network hiccup (but came

reoccurring exceptions seen

2011-04-07 Thread Jonathan Colby
These types of exceptions are seen sporadically in our cassandra logs. They occur especially after running a repair with the nodetool. I assume there are a few corrupt rows. Is this cause for panic? Will a repair fix this, or is it best to do a decommission + bootstrap via a move for

Re: nodetool move hammers the next node in the ring

2011-04-06 Thread Jonathan Colby
and whats the RF? Aaron On 6 Apr 2011, at 01:16, Jonathan Colby wrote: When doing a move, decommission, loadbalance, etc. data is streamed to the next node in such a way that it really strains the receiving node - to the point where it has a problem serving requests. Any way

Re: Location-aware replication based on objects' access pattern

2011-04-06 Thread Jonathan Colby
Good to see a discussion on this. This also has practical use for business continuity where you can control that the clients in a given data center first write replicas to their own data center, then to the other data center for backup. If I understand correctly, a write takes the token into

consistency ONE and null

2011-04-06 Thread Jonathan Colby
Let's say you have RF of 3 and a write was written to 2 nodes. 1 was not written because the node had a network hiccup (but came back online again). My question is, if you are reading a key with a CL of ONE, and you happen to land on that node that didn't get the write, will the read fail
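The scenario in this question can be checked against the usual overlap rule: a read is only guaranteed to touch a replica that holds the write when the number of replicas written plus the number of replicas read exceeds the replication factor. The sketch below is plain arithmetic with a hypothetical helper name; it ignores read repair and hinted handoff, which may deliver the write to the lagging replica later.

```python
def read_sees_latest_write(replication_factor: int, write_acks: int, read_replicas: int) -> bool:
    """True when every possible read set must overlap the replicas that took the write."""
    return write_acks + read_replicas > replication_factor


# The case from the question: RF 3, write reached 2 replicas, read at ONE.
print(read_sees_latest_write(3, 2, 1))  # False: the read may land on the replica that missed it
# Reading from two replicas closes the gap.
print(read_sees_latest_write(3, 2, 2))  # True
```

So with these numbers a read at ONE is not guaranteed to see the write; whether the client gets an empty result or an older value depends on what the lagging replica currently holds.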

Re: Re: nodetool cleanup - results in more disk use?

2011-04-05 Thread jonathan . colby
as previously. Am I missing something or am I just reading the docs wrong ? Cheers Aaron On 4 Apr 2011, at 22:20, Jonathan Colby wrote: hi Aaron - The Datastax documentation brought to light the fact that over time, major compactions will be performed on bigger and bigger

if nodetool operations abort with timeout, did the operation continue?

2011-04-05 Thread Jonathan Colby
when doing a nodetool move , after about 15 minutes I got the below exception. The cassandra log seems to indicate that the move is still ongoing. Is this anything to worry about? Exception in thread main java.rmi.UnmarshalException: Error unmarshaling return header; nested exception is:

Disable Swap? batch_mutate failed: out of sequence response

2011-04-05 Thread Jonathan Colby
Hi Jonathan - Would you recommend disabling system swap as a rule? I'm running on Debian 64bit and am seeing light swapping: total used free shared buffers cached Mem: 8003 7969 33 0 0 4254 -/+ buffers/cache:

extreme memory consumption

2011-04-05 Thread Jonathan Colby
I've seen the other posts about memory consumption, but I'm seeing some weird behavior with 0.7.4 with 5 GB heap size (64 bit system with 8 GB ram total)... note the virtual mem used 20.6 GB ?! and Shared 8.4 GB ?! PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

nothing happening in the cluster after a nodetool move

2011-04-05 Thread Jonathan Colby
I added a node to the cluster and I am having a difficult time reassigning the new tokens. It seems after a while nothing shows up in the new node's logs and it just stays in status Leaving. nodetool netstats on all nodes shows Nothing streaming to/from. There is no activity in the other

Update: Re: nothing happening in the cluster after a nodetool move

2011-04-05 Thread Jonathan Colby
SSTableReader.java (line 154) Opening /var/lib/cassandra/data/DFS/main-f-129 INFO [CompactionExecutor:1] 2011-04-05 22:46:02,228 SSTableReader.java (line 154) Opening /var/lib/cassandra/data/DFS/main-f-130 On Apr 5, 2011, at 10:46 PM, Jonathan Colby wrote: I added a node to the cluster and I am

Re: nodetool cleanup - results in more disk use?

2011-04-04 Thread Jonathan Colby
thresholds are applied per bucket of files that share a similar size, there are normally more small files and fewer large files. Aaron On 2 Apr 2011, at 01:45, Jonathan Colby wrote: I discovered that a Garbage collection cleans up the unused old SSTables. But I still wonder whether cleanup

Re: changing replication strategy and effects on replica nodes

2011-04-01 Thread Jonathan Colby
, but is a little messy. Depending on your setup it may also be possible to copy / move the nodes manually by moving sstable files. I've not done it myself, are you able to run a test ? Hope that helps. Aaron On 1 Apr 2011, at 02:04, Jonathan Colby wrote: From my understanding of replica

nodetool cleanup - results in more disk use?

2011-04-01 Thread Jonathan Colby
I ran nodetool cleanup on a node in my cluster and discovered the disk usage went from 3.3 GB to 5.4 GB. Why is this? I thought cleanup just removed hinted handoff information. I read that *during* cleanup extra disk space will be used similar to a compaction. But I was expecting the disk

Re: nodetool cleanup - results in more disk use?

2011-04-01 Thread Jonathan Colby
I discovered that a Garbage collection cleans up the unused old SSTables. But I still wonder whether cleanup really does a full compaction. This would be undesirable if so. On Apr 1, 2011, at 4:08 PM, Jonathan Colby wrote: I ran nodetool cleanup on a node in my cluster and discovered the disk

changing replication strategy and effects on replica nodes

2011-03-31 Thread Jonathan Colby
From my understanding of replica copies, cassandra picks which nodes to replicate the data based on replication strategy, and those same replica partner nodes are always used according to token ring distribution. If you change the replication strategy, does cassandra pick new nodes to

Re: How to determine if repair need to be run

2011-03-31 Thread Jonathan Colby
silly question, would every cassandra installation need to have manual repairs done on it? It would seem cassandra's read repair and regular compaction would take care of keeping the data clean. Am I missing something? On Mar 30, 2011, at 7:46 PM, Peter Schuller wrote: I just wanted to

Re: How to determine if repair need to be run

2011-03-31 Thread Jonathan Colby
Peter - Thanks a lot for elaborating on repairs. Still, it's a bit fuzzy to me why it is so important to run a repair before the GCGraceSeconds kicks in. Does this mean a delete does not get replicated? In other words when I delete something on a node, doesn't cassandra set tombstones

difference between compaction, repair, clean

2011-03-30 Thread Jonathan Colby
I'm a little unclear on the differences between the nodetool operations: - compaction - repair - clean I understand that compaction consolidates the SSTables and physically performs deletes by taking into account the Tombstones. But what does clean and repair do then?

Re: Central monitoring of Cassandra cluster

2011-03-25 Thread Jonathan Colby
Cacti and Munin are great for graphing, Nagios is good for monitoring. I wrote a very simple JMX proxy that you can send a request to and it retrieves the desired JMX beans. There are JMX proxies out there if you don't want to write your own, for example

how does cassandra pick its replicant peers?

2011-03-25 Thread Jonathan Colby
Does anyone know how cassandra chooses the nodes for its other replicant copies? The first node gets the first copy because its token is assigned for that key. But what about the other copies of the data? Do the replicant nodes stay the same based on the token range? Or are the other

Quorum, Hector, and datacenter preference

2011-03-24 Thread Jonathan Colby
Hi - Our cluster is spread between 2 datacenters. We have a straight-forward IP assignment so that OldNetworkTopology (rackinferring snitch) works well. We have cassandra clients written in Hector in each of those data centers. The Hector clients all have a list of all cassandra nodes

Re: Quorum, Hector, and datacenter preference

2011-03-24 Thread Jonathan Colby
, 2011, at 2:02 PM, Jonathan Colby wrote: Hi - Our cluster is spread between 2 datacenters. We have a straight-forward IP assignment so that OldNetworkTopology (rackinferring snitch) works well. We have cassandra clients written in Hector in each of those data centers. The Hector

Deleting old SSTables

2011-03-22 Thread Jonathan Colby
According to the Wiki Page on compaction: once compaction is finished, the old SSTable files may be deleted* * http://wiki.apache.org/cassandra/MemtableSSTable I thought the old SSTables would be deleted automatically, but this wiki page got me thinking otherwise. Question is, if it is true

Changing memtable_throughput_in_mb on a running system

2011-03-22 Thread Jonathan Colby
It seems some settings like memtable_throughput_in_mb are Keyspace-specific (at least with 0.7.4). How can these settings best be changed on a running cluster? PS - preferably by a sysadmin using nodetool or cassandra-cli Thanks! Jon

Re: Deleting old SSTables

2011-03-22 Thread Jonathan Colby
itself if it detects that it is low on space. A compaction marker is also added to obsolete sstables so they can be deleted on startup if the server does not perform a GC before being restarted. On Tue, Mar 22, 2011 at 8:30 AM, Jonathan Colby jonathan.co...@gmail.com wrote: According

Meaning of TotalReadLatencyMicros and TotalWriteLatencyMicrosStatistics

2011-03-22 Thread Jonathan Colby
Hi - On our recently live cassandra cluster of 5 nodes, we've noticed that the latency readings, especially for reads, have gone up drastically. TotalReadLatencyMicros 5413483 TotalWriteLatencyMicros 1811824 I understand these are in microseconds, but what meaning do they have

cassandra nodes with mixed hard disk sizes

2011-03-21 Thread Jonathan Colby
This is a two part question ... 1. If you have cassandra nodes with different sized hard disks, how do you deal with assigning the token ring such that the nodes with larger disks get more data? In other words, given equally distributed token ranges, when the smaller disk nodes run out of

Re: script to modify cassandra.yaml file

2011-03-21 Thread Jonathan Colby
We use Puppet to manage the cassandra.yaml in a different location from the installation. Ours is in /etc/cassandra/cassandra.yaml. You can set the environment variable CASSANDRA_CONF (I believe that is the name; check cassandra.in.sh) and the startup script will pick up this as the configuration file to

Replacing a dead seed

2011-03-17 Thread Jonathan Colby
Hi - If a seed crashes (i.e., suddenly unavailable due to HW problem), what is the best way to replace the seed in the cluster? I've read that you should not bootstrap a seed. Therefore I came up with this procedure, but it seems pretty complicated. any better ideas? 1. update the seed

OldNetworkTopologyStrategy with one data center

2011-03-15 Thread Jonathan Colby
Hi - I have a question. Obviously there is no purpose in running OldNetworkTopologyStrategy in one data center. However, we want to share the same configuration in our production (multiple data centers) and pre-production (one data center) environments. My question is will

where to find the stress testing programs?

2011-03-15 Thread Jonathan Colby
According to the Cassandra Wiki and OReilly book supposedly there is a contrib directory within the cassandra download containing the Python Stress Test script stress.py. It's not in the binary tarball of 0.7.3. Anyone know where to find it? Anyone know of other, maybe better stress testing

Re: Virtual IP / hardware load balancing for cassandra nodes

2010-12-20 Thread Jonathan Colby
, but not sufficient. The real test is the JMX values. Dave Viner On Mon, Dec 20, 2010 at 6:25 AM, Jonathan Colby jonathan.co...@gmail.com wrote: I was unable to find example or documentation on my question. I'd like to know what the best way is to group a cluster of cassandra nodes behind

Quorum and Datacenter loss

2010-12-12 Thread Jonathan Colby
Hi cassandra experts - We're planning a cassandra cluster across 2 datacenters (datacenter-aware, random partitioning) with QUORUM consistency. It seems to me that with 2 datacenters, if one datacenter is lost, the reads/writes to cassandra will fail in the surviving datacenter because of the

Re: Quorum and Datacenter loss

2010-12-12 Thread Jonathan Colby
Thanks a lot Peter. So basically we would need to choose a consistency other than QUORUM. I think in our case consistency is not necessarily an issue since our data is write-once, read-many (immutable data). I suppose having a replication factor of 4 would result in two nodes in each
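The concern in this thread comes down to quorum arithmetic. The sketch below assumes QUORUM means a majority of all replicas (floor(RF / 2) + 1) counted across both datacenters; the function names and datacenter labels are made up for illustration.

```python
def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def survives_dc_loss(replicas_per_dc: dict, lost_dc: str) -> bool:
    """True if the replicas left after losing `lost_dc` can still form a quorum."""
    total = sum(replicas_per_dc.values())
    remaining = total - replicas_per_dc[lost_dc]
    return remaining >= quorum(total)


# RF 4 split evenly: quorum is 3, but only 2 replicas survive either outage.
print(survives_dc_loss({"dc1": 2, "dc2": 2}, "dc1"))  # False
# RF 3 split 2/1: QUORUM survives only the loss of the smaller side.
print(survives_dc_loss({"dc1": 2, "dc2": 1}, "dc2"))  # True
print(survives_dc_loss({"dc1": 2, "dc2": 1}, "dc1"))  # False
```

With only two datacenters, no split of replicas lets a cluster-wide QUORUM survive the loss of either side, which matches the conclusion above that a different consistency level is needed.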

understanding the cassandra storage scaling

2010-12-09 Thread Jonathan Colby
I have a very basic question which I have been unable to find in online documentation on cassandra. It seems like every node in a cassandra cluster contains all the data ever stored in the cluster (i.e., all nodes are identical). I don't understand how you can scale this on commodity servers

Re: understanding the cassandra storage scaling

2010-12-09 Thread Jonathan Colby
sense for R to be close to N in which case cassandra is useful so the database doesn't have a single point of failure but not so much b/c of the size of the data. But for large clusters it rarely makes sense to have N=R, usually N R. On Thu, Dec 9, 2010 at 12:28 PM, Jonathan Colby

Re: understanding the cassandra storage scaling

2010-12-09 Thread Jonathan Colby
Awesome! Thank you guys for the really quick answers and the links to the presentations. On Thu, Dec 9, 2010 at 12:06 PM, Sylvain Lebresne sylv...@yakaz.com wrote: This helps a little but unfortunately it's still a bit fuzzy for me. So is it not true that each node contains all the data in the