Re: Replication-aware compaction

2011-06-07 Thread David Boxenhorn
Thanks! I'm actually on vacation now, so I hope to look into this next week.

On Mon, Jun 6, 2011 at 10:25 PM, aaron morton aa...@thelastpickle.com wrote:
 You should consider upgrading to 0.7.6 to get a fix for Gossip. Earlier 0.7 
 releases were prone to marking nodes up and down when they should not have 
 been. See 
 https://github.com/apache/cassandra/blob/cassandra-0.7/CHANGES.txt#L22

 Are the TimedOutExceptions on the client for read or write requests? During 
 the burst times, which stages are backing up (check nodetool tpstats)? Compaction 
 should not affect writes too much (assuming the commit log and data are on 
 separate spindles).

 You could also look at the read and write latency stats for a 
 particular CF using nodetool cfstats or JConsole; these give you the 
 stats for local operations. Also take a look at the I/O stats on 
 the box: http://spyced.blogspot.com/2010/01/linux-performance-basics.html

 Hope that helps.

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 7 Jun 2011, at 00:30, David Boxenhorn wrote:

 Version 0.7.3.

 Yes, I am talking about minor compactions. I have three nodes, RF=3,
 and 3 GB of data (before replication). Not many users (yet). It seems like 3
 nodes should be plenty. But when all 3 nodes are compacting, I
 sometimes get timeouts on the client, and I see in my logs that each
 one is full of notifications that the other nodes have died (and come
 back to life after about a second). My cluster can tolerate one node
 being out of commission, so I would rather have longer compactions one
 at a time than shorter compactions all at the same time.

 I think that our usage pattern of bursty writes causes the three nodes
 to decide to compact at the same time. These bursts are followed by
 periods of relative quiet, so there should be time for the other two
 nodes to compact one at a time.


 On Mon, Jun 6, 2011 at 2:36 PM, aaron morton aa...@thelastpickle.com 
 wrote:

 Are you talking about minor (automatic) compactions? Can you provide some 
 more information on what's happening to make the node unusable, and what 
 version you are using? Compaction is not a lightweight process, but it should 
 not hurt the node that badly; it is considered an online operation.

 Delaying compaction will only make it run for longer and take more 
 resources.

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 6 Jun 2011, at 20:14, David Boxenhorn wrote:

 Is there some deep architectural reason why compaction can't be
 replication-aware?

 What I mean is, if one node is doing compaction, its replicas
 shouldn't be doing compaction at the same time. Or, at least a quorum
 of nodes should be available at all times.

 For example, if RF=3, and one node is doing compaction, the nodes to
 its right and left in the ring should wait on compaction until that
 node is done.

 Of course, my real problem is that compaction makes a node pretty much
 unavailable. If we can fix that problem then this is not necessary.






Replication-aware compaction

2011-06-06 Thread David Boxenhorn
Is there some deep architectural reason why compaction can't be
replication-aware?

What I mean is, if one node is doing compaction, its replicas
shouldn't be doing compaction at the same time. Or, at least a quorum
of nodes should be available at all times.

For example, if RF=3, and one node is doing compaction, the nodes to
its right and left in the ring should wait on compaction until that
node is done.

Of course, my real problem is that compaction makes a node pretty much
unavailable. If we can fix that problem then this is not necessary.
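
In the meantime, a crude approximation would be to stagger forced
compactions myself during the quiet periods, one node at a time, e.g.
from cron. A rough sketch (hostnames, keyspace and schedule are made up,
and nodetool compact forces a major compaction rather than coordinating
the minor ones, so this only approximates the idea; 8080 is the default
JMX port in 0.7):

    # /etc/cron.d/cassandra-compact: one node per time slot
    0 2 * * *  cass  nodetool -h node-a -p 8080 compact MyKeyspace
    0 4 * * *  cass  nodetool -h node-b -p 8080 compact MyKeyspace
    0 6 * * *  cass  nodetool -h node-c -p 8080 compact MyKeyspace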


Re: Replication-aware compaction

2011-06-06 Thread David Boxenhorn
Version 0.7.3.

Yes, I am talking about minor compactions. I have three nodes, RF=3,
and 3 GB of data (before replication). Not many users (yet). It seems like 3
nodes should be plenty. But when all 3 nodes are compacting, I
sometimes get timeouts on the client, and I see in my logs that each
one is full of notifications that the other nodes have died (and come
back to life after about a second). My cluster can tolerate one node
being out of commission, so I would rather have longer compactions one
at a time than shorter compactions all at the same time.

I think that our usage pattern of bursty writes causes the three nodes
to decide to compact at the same time. These bursts are followed by
periods of relative quiet, so there should be time for the other two
nodes to compact one at a time.
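
To put numbers on the flapping, I count the notifications with something
like the following (the log path, host and exact message wording are from
my setup, so treat it as a sketch):

    # how often peers were marked up / dead in this node's log
    grep -c 'is now UP' /var/log/cassandra/system.log
    grep -ci 'dead' /var/log/cassandra/system.log

    # what this node currently believes about the ring
    nodetool -h 10.0.0.1 -p 8080 ring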


 On Mon, Jun 6, 2011 at 2:36 PM, aaron morton aa...@thelastpickle.com wrote:

 Are you talking about minor (automatic) compactions? Can you provide some 
 more information on what's happening to make the node unusable, and what 
 version you are using? Compaction is not a lightweight process, but it should 
 not hurt the node that badly; it is considered an online operation.

 Delaying compaction will only make it run for longer and take more resources.

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 6 Jun 2011, at 20:14, David Boxenhorn wrote:

  Is there some deep architectural reason why compaction can't be
  replication-aware?
 
  What I mean is, if one node is doing compaction, its replicas
  shouldn't be doing compaction at the same time. Or, at least a quorum
  of nodes should be available at all times.
 
  For example, if RF=3, and one node is doing compaction, the nodes to
  its right and left in the ring should wait on compaction until that
  node is done.
 
  Of course, my real problem is that compaction makes a node pretty much
  unavailable. If we can fix that problem then this is not necessary.




Re: Replication-aware compaction

2011-06-06 Thread aaron morton
You should consider upgrading to 0.7.6 to get a fix for Gossip. Earlier 0.7 
releases were prone to marking nodes up and down when they should not have 
been. See https://github.com/apache/cassandra/blob/cassandra-0.7/CHANGES.txt#L22

Are the TimedOutExceptions on the client for read or write requests? During 
the burst times, which stages are backing up (check nodetool tpstats)? Compaction 
should not affect writes too much (assuming the commit log and data are on 
separate spindles).
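
For example, to watch the stages during a burst (the host is a
placeholder; 8080 is the default JMX port in 0.7):

    # refresh the thread pool stats every 10s; growing Pending counts
    # in ReadStage or MutationStage show which path is backing up
    watch -n 10 'nodetool -h 10.0.0.1 -p 8080 tpstats'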

You could also look at the read and write latency stats for a particular 
CF using nodetool cfstats or JConsole; these give you the stats for the 
local operations. Also take a look at the I/O stats on the box: 
http://spyced.blogspot.com/2010/01/linux-performance-basics.html
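
Something along these lines (host is again a placeholder):

    # per-CF read/write latency, local operations only
    nodetool -h 10.0.0.1 -p 8080 cfstats

    # disk utilisation every 5s; %util pinned near 100 on the data
    # volume while compacting points at I/O contention
    iostat -x 5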

Hope that helps.

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 7 Jun 2011, at 00:30, David Boxenhorn wrote:

 Version 0.7.3.
 
 Yes, I am talking about minor compactions. I have three nodes, RF=3,
 and 3 GB of data (before replication). Not many users (yet). It seems like 3
 nodes should be plenty. But when all 3 nodes are compacting, I
 sometimes get timeouts on the client, and I see in my logs that each
 one is full of notifications that the other nodes have died (and come
 back to life after about a second). My cluster can tolerate one node
 being out of commission, so I would rather have longer compactions one
 at a time than shorter compactions all at the same time.
 
 I think that our usage pattern of bursty writes causes the three nodes
 to decide to compact at the same time. These bursts are followed by
 periods of relative quiet, so there should be time for the other two
 nodes to compact one at a time.
 
 
 On Mon, Jun 6, 2011 at 2:36 PM, aaron morton aa...@thelastpickle.com wrote:
 
 Are you talking about minor (automatic) compactions? Can you provide some 
 more information on what's happening to make the node unusable, and what 
 version you are using? Compaction is not a lightweight process, but it should 
 not hurt the node that badly; it is considered an online operation.
 
 Delaying compaction will only make it run for longer and take more 
 resources.
 
 Cheers
 
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 On 6 Jun 2011, at 20:14, David Boxenhorn wrote:
 
 Is there some deep architectural reason why compaction can't be
 replication-aware?
 
 What I mean is, if one node is doing compaction, its replicas
 shouldn't be doing compaction at the same time. Or, at least a quorum
 of nodes should be available at all times.
 
 For example, if RF=3, and one node is doing compaction, the nodes to
 its right and left in the ring should wait on compaction until that
 node is done.
 
 Of course, my real problem is that compaction makes a node pretty much
 unavailable. If we can fix that problem then this is not necessary.