[jira] Resolved: (CASSANDRA-2153) client temporarily freezes due to hard-coded JMX port

2011-02-11 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2153.
---

Resolution: Not A Problem

use --jmxport

 client temporarily freezes due to hard-coded JMX port
 -

 Key: CASSANDRA-2153
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2153
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0
Reporter: Yang Yang

 When you run "show keyspaces" inside cassandra-cli on the current 0.7 head
 (svn), the client's CliSessionState.java hardcodes the JMX port to 8080. If a
 web server is listening on 8080, it accepts the connection and cassandra-cli
 tries to talk to it, but the protocol makes no sense to cassandra-cli, so it
 freezes for about 10 seconds before finally giving up.
 Changing the hardcoded value to my actual JMX port fixes the issue, so the
 port should be read from a config file or a command-line argument.
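
The request above (take the JMX port from an argument instead of a constant) can be sketched roughly as follows. This is only an illustration, not the actual cassandra-cli code: the class and method names are hypothetical, and it assumes the value follows "--jmxport" as a separate, space-separated argument. The 8080 fallback is the hardcoded value mentioned in the report.

```java
/** Hypothetical sketch; not the actual cassandra-cli implementation. */
final class JmxPortSketch {
    /** Fallback used only when no --jmxport argument is given (8080 per the report). */
    static final int DEFAULT_JMX_PORT = 8080;

    /** Returns the port given after "--jmxport", or the default if the option is absent. */
    static int resolveJmxPort(String[] args) {
        // Stop one short of the end so args[i + 1] is always a valid index.
        for (int i = 0; i < args.length - 1; i++) {
            if ("--jmxport".equals(args[i])) {
                int port = Integer.parseInt(args[i + 1]);
                if (port < 1 || port > 65535) {
                    throw new IllegalArgumentException("JMX port out of range: " + port);
                }
                return port;
            }
        }
        return DEFAULT_JMX_PORT;
    }

    public static void main(String[] args) {
        System.out.println("JMX port: " + resolveJmxPort(args));
    }
}
```

The point of the sketch is that the resolved value, rather than a constant, is what should reach whatever opens the JMX connection.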

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Reopened: (CASSANDRA-2153) client temporarily freezes due to hard-coded JMX port

2011-02-11 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reopened CASSANDRA-2153:
---

  Assignee: Pavel Yaskevich

Hmm, looks like we accept --jmxport as an option but it doesn't get passed to 
CliSessionState's NodeProbe

 client temporarily freezes due to hard-coded JMX port
 -

 Key: CASSANDRA-2153
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2153
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0
Reporter: Yang Yang
Assignee: Pavel Yaskevich
 Fix For: 0.7.2


 When you run "show keyspaces" inside cassandra-cli on the current 0.7 head
 (svn), the client's CliSessionState.java hardcodes the JMX port to 8080. If a
 web server is listening on 8080, it accepts the connection and cassandra-cli
 tries to talk to it, but the protocol makes no sense to cassandra-cli, so it
 freezes for about 10 seconds before finally giving up.
 Changing the hardcoded value to my actual JMX port fixes the issue, so the
 port should be read from a config file or a command-line argument.





[jira] Updated: (CASSANDRA-2153) client temporarily freezes due to hard-coded JMX port

2011-02-11 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2153:
--

Fix Version/s: 0.7.2

 client temporarily freezes due to hard-coded JMX port
 -

 Key: CASSANDRA-2153
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2153
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0
Reporter: Yang Yang
Assignee: Pavel Yaskevich
 Fix For: 0.7.2


 When you run "show keyspaces" inside cassandra-cli on the current 0.7 head
 (svn), the client's CliSessionState.java hardcodes the JMX port to 8080. If a
 web server is listening on 8080, it accepts the connection and cassandra-cli
 tries to talk to it, but the protocol makes no sense to cassandra-cli, so it
 freezes for about 10 seconds before finally giving up.
 Changing the hardcoded value to my actual JMX port fixes the issue, so the
 port should be read from a config file or a command-line argument.





[jira] Commented: (CASSANDRA-2153) client temporarily freezes due to hard-coded JMX port

2011-02-11 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993522#comment-12993522
 ] 

Pavel Yaskevich commented on CASSANDRA-2153:


Re-tested on both cassandra-0.7 and trunk, and it works fine. See 
CliOptions.java:157 (the place where the JMX port from the arguments gets 
passed to CliSessionState).

 client temporarily freezes due to hard-coded JMX port
 -

 Key: CASSANDRA-2153
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2153
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0
Reporter: Yang Yang
Assignee: Pavel Yaskevich
 Fix For: 0.7.2


 When you run "show keyspaces" inside cassandra-cli on the current 0.7 head
 (svn), the client's CliSessionState.java hardcodes the JMX port to 8080. If a
 web server is listening on 8080, it accepts the connection and cassandra-cli
 tries to talk to it, but the protocol makes no sense to cassandra-cli, so it
 freezes for about 10 seconds before finally giving up.
 Changing the hardcoded value to my actual JMX port fixes the issue, so the
 port should be read from a config file or a command-line argument.





[jira] Created: (CASSANDRA-2154) Update BootstrapTest to include multiple columnfamilies

2011-02-11 Thread Jonathan Ellis (JIRA)
Update BootstrapTest to include multiple columnfamilies
---

 Key: CASSANDRA-2154
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2154
 Project: Cassandra
  Issue Type: Test
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 0.7.2


The goal is to make sure we catch any future regressions like CASSANDRA-1992





[jira] Updated: (CASSANDRA-2154) Update BootstrapTest to include multiple columnfamilies

2011-02-11 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2154:
--

Component/s: Tests

 Update BootstrapTest to include multiple columnfamilies
 ---

 Key: CASSANDRA-2154
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2154
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 0.7.2


 The goal is to make sure we catch any future regressions like CASSANDRA-1992





[jira] Updated: (CASSANDRA-2154) Update BootstrapperTest to include multiple columnfamilies

2011-02-11 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2154:
---

Summary: Update BootstrapperTest to include multiple columnfamilies  (was: 
Update BootstrapTest to include multiple columnfamilies)

 Update BootstrapperTest to include multiple columnfamilies
 --

 Key: CASSANDRA-2154
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2154
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 0.7.2


 The goal is to make sure we catch any future regressions like CASSANDRA-1992





[jira] Commented: (CASSANDRA-2154) Update BootstrapperTest to include multiple columnfamilies

2011-02-11 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993530#comment-12993530
 ] 

Pavel Yaskevich commented on CASSANDRA-2154:


What about creating a dedicated test called BootstrapTest in test/distributed 
for this? 

 Update BootstrapperTest to include multiple columnfamilies
 --

 Key: CASSANDRA-2154
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2154
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 0.7.2


 The goal is to make sure we catch any future regressions like CASSANDRA-1992





[jira] Created: (CASSANDRA-2155) Fix counter bug (regression from svn commit r1068504)

2011-02-11 Thread Sylvain Lebresne (JIRA)
Fix counter bug (regression from svn commit r1068504)
-

 Key: CASSANDRA-2155
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2155
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
 Fix For: 0.8
 Attachments: 0001-Fix-regression-from-svn-commit-1068504.patch

A line was mistakenly removed by the merge from 0.7 at r1068504





[jira] Updated: (CASSANDRA-2155) Fix counter bug (regression from svn commit r1068504)

2011-02-11 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-2155:


Attachment: 0001-Fix-regression-from-svn-commit-1068504.patch

 Fix counter bug (regression from svn commit r1068504)
 -

 Key: CASSANDRA-2155
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2155
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
 Fix For: 0.8

 Attachments: 0001-Fix-regression-from-svn-commit-1068504.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 A line was mistakenly removed by the merge from 0.7 at r1068504





[jira] Commented: (CASSANDRA-2154) Update BootstrapperTest to include multiple columnfamilies

2011-02-11 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993570#comment-12993570
 ] 

Jonathan Ellis commented on CASSANDRA-2154:
---

Improving the distributed bootstrap test is also good, but we should cover this 
in the unit test as well.

 Update BootstrapperTest to include multiple columnfamilies
 --

 Key: CASSANDRA-2154
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2154
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 0.7.2


 The goal is to make sure we catch any future regressions like CASSANDRA-1992





svn commit: r1069879 - /cassandra/trunk/src/java/org/apache/cassandra/service/StorageProxy.java

2011-02-11 Thread jbellis
Author: jbellis
Date: Fri Feb 11 16:43:24 2011
New Revision: 1069879

URL: http://svn.apache.org/viewvc?rev=1069879&view=rev
Log:
fix merge
patch by slebresne; reviewed by jbellis for CASSANDRA-2155

Modified:
cassandra/trunk/src/java/org/apache/cassandra/service/StorageProxy.java

Modified: cassandra/trunk/src/java/org/apache/cassandra/service/StorageProxy.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/service/StorageProxy.java?rev=1069879&r1=1069878&r2=1069879&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/service/StorageProxy.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/service/StorageProxy.java Fri Feb 11 16:43:24 2011
@@ -222,7 +222,8 @@ public class StorageProxy implements Sto
                 // unhinted writes
                 if (destination.equals(FBUtilities.getLocalAddress()))
                 {
-                    insertLocal(rm, responseHandler);
+                    if (insertLocalMessages)
+                        insertLocal(rm, responseHandler);
                 }
                 else
                 {




[jira] Resolved: (CASSANDRA-2155) Fix counter bug (regression from svn commit r1068504)

2011-02-11 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2155.
---

Resolution: Fixed

committed

 Fix counter bug (regression from svn commit r1068504)
 -

 Key: CASSANDRA-2155
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2155
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
 Fix For: 0.8

 Attachments: 0001-Fix-regression-from-svn-commit-1068504.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 A line was mistakenly removed by the merge from 0.7 at r1068504





[jira] Updated: (CASSANDRA-1938) Use UUID as node identifiers in counters instead of IP addresses

2011-02-11 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-1938:


Attachment: (was: 0003-Thrift-change-to-CfDef.patch)

 Use UUID as node identifiers in counters instead of IP addresses 
 -

 Key: CASSANDRA-1938
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1938
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 0.8

 Attachments: 1938_discussion

   Original Estimate: 56h
  Remaining Estimate: 56h

 The use of IP addresses as node identifiers in the partition of a given
 counter is fragile. Changes of the node's IP addresses can result in data
 loss. This patch proposes to use UUIDs instead.
 NOTE: this breaks the on-disk file format (for counters)
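
As a rough illustration of why a stable identifier helps, the sketch below assumes a counter is kept as one partial count per node and that the counter value is the sum of those parts. It is not Cassandra's counter implementation or on-disk format; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical sketch of a per-node sharded counter; not Cassandra's actual format. */
final class ShardedCounterSketch {
    // One partial count per node, keyed by an identifier that survives IP changes.
    private final Map<UUID, Long> shards = new HashMap<>();

    /** Adds delta to the partial count owned by the given node. */
    void add(UUID nodeId, long delta) {
        shards.merge(nodeId, delta, Long::sum);
    }

    /** The counter value is the sum of all per-node partial counts. */
    long total() {
        long sum = 0;
        for (long partial : shards.values()) {
            sum += partial;
        }
        return sum;
    }

    /** Number of distinct nodes that have contributed to this counter. */
    int shardCount() {
        return shards.size();
    }
}
```

With a UUID as the key, a node that changes its IP address keeps updating the same partial count, whereas an IP-based key would no longer identify that node's earlier contribution.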





[jira] Updated: (CASSANDRA-1938) Use UUID as node identifiers in counters instead of IP addresses

2011-02-11 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-1938:


Attachment: (was: 0001-Use-uuid-instead-of-IP-for-counters.patch)

 Use UUID as node identifiers in counters instead of IP addresses 
 -

 Key: CASSANDRA-1938
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1938
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 0.8

 Attachments: 1938_discussion

   Original Estimate: 56h
  Remaining Estimate: 56h

 The use of IP addresses as node identifiers in the partition of a given
 counter is fragile. Changes of the node's IP addresses can result in data
 loss. This patch proposes to use UUIDs instead.
 NOTE: this breaks the on-disk file format (for counters)





[jira] Updated: (CASSANDRA-1938) Use UUID as node identifiers in counters instead of IP addresses

2011-02-11 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-1938:


Attachment: (was: 0002-Merge-old-shard-locally.patch)

 Use UUID as node identifiers in counters instead of IP addresses 
 -

 Key: CASSANDRA-1938
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1938
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 0.8

 Attachments: 1938_discussion

   Original Estimate: 56h
  Remaining Estimate: 56h

 The use of IP addresses as node identifiers in the partition of a given
 counter is fragile. Changes of the node's IP addresses can result in data
 loss. This patch proposes to use UUIDs instead.
 NOTE: this breaks the on-disk file format (for counters)





[jira] Updated: (CASSANDRA-1938) Use UUID as node identifiers in counters instead of IP addresses

2011-02-11 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-1938:


Attachment: 0003-Thrift-change-to-CfDef.patch
0002-Merge-old-shard-locally.patch
0001-Use-uuid-instead-of-IP-for-counters.patch

Rebased patch attached. I've also run more tests, in particular some bootstrap 
followed by decommission, with and without cleanup in between (while 
writing/reading counters during all this), and those tests pass (they don't with 
current trunk). So confidence++ on this. 

 Use UUID as node identifiers in counters instead of IP addresses 
 -

 Key: CASSANDRA-1938
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1938
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 0.8

 Attachments: 0001-Use-uuid-instead-of-IP-for-counters.patch, 
 0002-Merge-old-shard-locally.patch, 0003-Thrift-change-to-CfDef.patch, 
 1938_discussion

   Original Estimate: 56h
  Remaining Estimate: 56h

 The use of IP addresses as node identifiers in the partition of a given
 counter is fragile. Changes of the node's IP addresses can result in data
 loss. This patch proposes to use UUIDs instead.
 NOTE: this breaks the on-disk file format (for counters)





[jira] Commented: (CASSANDRA-2155) Fix counter bug (regression from svn commit r1068504)

2011-02-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993608#comment-12993608
 ] 

Hudson commented on CASSANDRA-2155:
---

Integrated in Cassandra #724 (See 
[https://hudson.apache.org/hudson/job/Cassandra/724/])
fix merge
patch by slebresne; reviewed by jbellis for CASSANDRA-2155


 Fix counter bug (regression from svn commit r1068504)
 -

 Key: CASSANDRA-2155
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2155
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
 Fix For: 0.8

 Attachments: 0001-Fix-regression-from-svn-commit-1068504.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 A line was mistakenly removed by the merge from 0.7 at r1068504





[jira] Resolved: (CASSANDRA-1016) Plugins

2011-02-11 Thread Stu Hood (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood resolved CASSANDRA-1016.
-

Resolution: Duplicate

Resolving as a dupe of CASSANDRA-1311, which is much further along. Thanks for 
the initial work here!

 Plugins
 ---

 Key: CASSANDRA-1016
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1016
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Ryan King
Assignee: Jeff Hodges
 Fix For: 0.8

 Attachments: CASSANDRA-1016-2.patch, CASSANDRA-1016.patch


 As discussed at the Digg-hosted hackathon.
 First off, this needs a better name, the idea isn't exactly like coprocessors 
 from BigTable and this entry should be considered a stub for now (Stu and 
 Marius should be able to provide more details).
 The idea is that for mutation operations, we should allow the user to run a 
 routine that has access to the old version of the data and the new 
 version, and can take action.
 At a bare minimum, this should be capable of implementing distributed 
 secondary indexes.





[jira] Commented: (CASSANDRA-1906) Sanitize configuration code

2011-02-11 Thread Jon Hermes (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993670#comment-12993670
 ] 

Jon Hermes commented on CASSANDRA-1906:
---

CFS has started to accumulate DefaultTs for settings that can be changed at 
runtime for per-node settings (min/max compaction threshold, and mem 
size/ops/time (soon row/key cache save period CASSANDRA-2100)). If we want to 
keep these ephemeral (non-migration) changes, then Tables (KSs) and 
StorageService (global per-node settings) should follow suit, and there 
should be one for every config option that doesn't impact the cluster (e.g. 
changing the saved_caches dir = Good, changing the partitioner or token 
ephemerally = Bad).

This reduces it to three paths:
- Read from cassandra.yaml for SS settings at boot time,
- Read from the schema and accept migrations to the schema for permanent KS/CF 
settings,
- Change any per-node value at runtime in SS/Table/CFS,
... and the first and third may well be combined for a scant two _code_ paths 
(compare to current 4+ code paths) separated by the permanent/non-permanent 
taxonomy.
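
To make the boot-time versus runtime paths concrete, here is a minimal sketch of a setting that carries a configured value plus an optional ephemeral per-node override. It assumes that is roughly what the "DefaultTs" above refer to; the class name and methods are hypothetical and are not the actual Cassandra classes.

```java
import java.util.Objects;

/** Hypothetical holder: a configured value plus an optional runtime (per-node) override. */
final class RuntimeSetting<T> {
    private final T configured;  // value read at boot time, e.g. from cassandra.yaml or the schema
    private volatile T override; // ephemeral per-node value; null means "not overridden"

    RuntimeSetting(T configured) {
        this.configured = Objects.requireNonNull(configured);
    }

    /** Returns the runtime override if one is set, otherwise the configured value. */
    T get() {
        T current = override;
        return current != null ? current : configured;
    }

    /** Applies a per-node change that is not persisted and is lost on restart. */
    void set(T value) {
        override = Objects.requireNonNull(value);
    }

    /** Drops the runtime override and falls back to the configured value. */
    void reset() {
        override = null;
    }

    boolean isOverridden() {
        return override != null;
    }
}
```

In this sketch the first and third paths share one code path: the boot-time read supplies the constructor argument, and a runtime change is just a call to set().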

 Sanitize configuration code
 ---

 Key: CASSANDRA-1906
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1906
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jon Hermes
Assignee: Jon Hermes
Priority: Minor
 Fix For: 0.8

   Original Estimate: 24h
  Remaining Estimate: 24h

 Multipart:
 - Drop deprecated YAML config. Only config allowed is via thrift/JMX. Make 
 this gratuitously easy to do with sane defaults and accepting changesets as 
 opposed to full definitions.
 - Combine common code between KS/CF/ColumnDefs and between thrift/avro defs.
 - Provide an obvious and clean interface for changing settings locally versus 
 globally (JMX vs. thrift). Dox here.





[jira] Issue Comment Edited: (CASSANDRA-1906) Sanitize configuration code

2011-02-11 Thread Jon Hermes (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993670#comment-12993670
 ] 

Jon Hermes edited comment on CASSANDRA-1906 at 2/11/11 8:08 PM:


CFS has started to accumulate DefaultTs for settings that can be changed at 
runtime for per-node settings (min/max compaction threshold, and mem 
size/ops/time (soon row/key cache save period CASSANDRA-2100)). If we want to 
keep these ephemeral (non-migration) changes, then Tables (KSs) and 
StorageService (global per-node settings) should follow suit, and there 
should be one for every config option that doesn't impact the cluster (e.g. 
changing the max_hint_window ephemerally = Good, changing the partitioner or 
token ephemerally = Bad/impossible).

This reduces it to three paths:
- Read from cassandra.yaml for SS settings at boot time,
- Read from the schema and accept migrations to the schema for permanent KS/CF 
settings,
- Change any per-node value at runtime in SS/Table/CFS,

... and the first and third may well be combined for a scant two _code_ paths 
(compare to current 4+ code paths) separated by the permanent/non-permanent 
taxonomy.

  was (Author: jhermes):
CFS has started to accumulate DefaultTs for settings that can be changed 
at runtime for per-node settings (min/max compaction threshold, and mem 
size/ops/time (soon row/key cache save period CASSANDRA-2100)). If we want to 
keep these ephemeral (non-migration) changes, then Tables (KSs) and 
StorageService (Global per-node settings) should follow in suit, and there 
should be one for every config option that doesn't impact the cluster (i.e. 
changing the saved_caches dir/ = Good, changing the partitioner or token 
ephemerally = bad).

This reduces it to three paths:
- Read from cassandra.yaml for SS settings at boot time,
- Read from the schema and accept migrations to the schema for permanent KS/CF 
settings,
- Change any per-node value at runtime in SS/Table/CFS,
... and the first and third may well be combined for a scant two _code_ paths 
(compare to current 4+ code paths) separated by the permanent/non-permanent 
taxonomy.
  
 Sanitize configuration code
 ---

 Key: CASSANDRA-1906
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1906
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jon Hermes
Assignee: Jon Hermes
Priority: Minor
 Fix For: 0.8

   Original Estimate: 24h
  Remaining Estimate: 24h

 Multipart:
 - Drop deprecated YAML config. Only config allowed is via thrift/JMX. Make 
 this gratuitously easy to do with sane defaults and accepting changesets as 
 opposed to full definitions.
 - Combine common code between KS/CF/ColumnDefs and between thrift/avro defs.
 - Provide an obvious and clean interface for changing settings locally versus 
 globally (JMX vs. thrift). Dox here.





[jira] Updated: (CASSANDRA-2154) Update StreamingTransferTest to include multiple ColumnFamilies

2011-02-11 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2154:
---

Description: The goal is to make sure we catch any future regressions like 
CASSANDRA-1992. Create a new BootstrapTest in test/distributed to test data 
consistency after bootstrapping new nodes to the cluster.  (was: The goal is to 
make sure we catch any future regressions like CASSANDRA-1992)
Summary: Update StreamingTransferTest to include multiple 
ColumnFamilies  (was: Update BootstrapperTest to include multiple 
columnfamilies)

 Update StreamingTransferTest to include multiple ColumnFamilies
 ---

 Key: CASSANDRA-2154
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2154
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 0.7.2


 The goal is to make sure we catch any future regressions like CASSANDRA-1992. 
 Create a new BootstrapTest in test/distributed to test data consistency after 
 bootstrapping new nodes to the cluster.





[jira] Updated: (CASSANDRA-2154) Update StreamingTransferTest to include multiple ColumnFamilies

2011-02-11 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2154:
---

Remaining Estimate: 4h
 Original Estimate: 4h

 Update StreamingTransferTest to include multiple ColumnFamilies
 ---

 Key: CASSANDRA-2154
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2154
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 0.7.2

   Original Estimate: 4h
  Remaining Estimate: 4h

 The goal is to make sure we catch any future regressions like CASSANDRA-1992. 
 Also create a new BootstrapTest in test/distributed to test data consistency 
 after bootstrapping new nodes to the cluster.





[jira] Updated: (CASSANDRA-2154) Update StreamingTransferTest to include multiple ColumnFamilies

2011-02-11 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2154:
---

Description: The goal is to make sure we catch any future regressions like 
CASSANDRA-1992. Also create a new BootstrapTest in test/distributed to test 
data consistency after bootstrapping new nodes to the cluster.  (was: The goal 
is to make sure we catch any future regressions like CASSANDRA-1992. Create a 
new BootstrapTest in test/distributed to test data consistency after 
bootstrapping new nodes to the cluster.)

 Update StreamingTransferTest to include multiple ColumnFamilies
 ---

 Key: CASSANDRA-2154
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2154
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 0.7.2


 The goal is to make sure we catch any future regressions like CASSANDRA-1992. 
 Also create a new BootstrapTest in test/distributed to test data consistency 
 after bootstrapping new nodes to the cluster.





[jira] Updated: (CASSANDRA-1709) CQL keyspace and column family management

2011-02-11 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-1709:
--

Attachment: v1-0002-improved-error-messages.txt
v1-0001-CASSANDRA-1709-CREATE-COLUMNFAMILY-w-system-tests.txt

 CQL keyspace and column family management
 -

 Key: CASSANDRA-1709
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1709
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
  Labels: cql
 Fix For: 0.8

 Attachments: 
 v1-0001-CASSANDRA-1709-CREATE-COLUMNFAMILY-w-system-tests.txt, 
 v1-0002-improved-error-messages.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 CQL specification and implementation for schema management.
 This corresponds to the following RPC methods:
 * system_add_column_family()
 * system_add_keyspace()
 * system_drop_keyspace()
 * system_update_keyspace()
 * system_update_columnfamily()





[jira] Created: (CASSANDRA-2156) Compaction Throttling

2011-02-11 Thread Stu Hood (JIRA)
Compaction Throttling
-

 Key: CASSANDRA-2156
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2156
 Project: Cassandra
  Issue Type: New Feature
Reporter: Stu Hood
 Fix For: 0.8


Compaction is currently relatively bursty: we compact as fast as we can, and 
then we wait for the next compaction to be possible (hurry up and wait).

Instead, to properly amortize compaction, you'd like to compact exactly as fast 
as you need to in order to keep the sstable count under control.

For every new level of compaction, you need to increase the rate that you 
compact at: a rule of thumb that we're testing on our clusters is to determine 
the maximum number of buckets a node can support (aka, if the 15th bucket holds 
750 GB, we're not going to have more than 15 buckets), and then multiply the 
flush throughput by the number of buckets to get a minimum compaction 
throughput to maintain your sstable count.

Full explanation: for a min compaction threshold of {{T}}, the bucket at level 
{{N}} can contain {{SsubN = T^N}} 'units' (unit == memtable's worth of data on 
disk). Every time a new unit is added, it has a {{1/SsubN}} chance of causing 
the bucket at level N to fill. If the bucket at level N fills, it causes 
{{SsubN}} units to be compacted. So, for each active level in your system you 
have {{SsubN * 1 / SsubN}}, or {{1}} amortized unit to compact any time a new 
unit is added.
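
The amortization argument can be checked with a small counting model. This is only a sketch under the assumptions stated in the paragraph above (a level-N bucket holds T^N units and is compacted in full each time it fills); the function name and the chosen values of T and the unit count are hypothetical.

```python
def units_compacted_per_level(threshold, levels, units_added):
    """Count units compacted at each level after `units_added` flushed units.

    Model assumption: the bucket at level n holds threshold**n units and all
    of them are compacted every time the bucket fills.
    """
    totals = {}
    for level in range(1, levels + 1):
        bucket_size = threshold ** level
        fills = units_added // bucket_size
        totals[level] = fills * bucket_size
    return totals


units_added = 6400  # a multiple of 4**3, so every bucket fills a whole number of times
totals = units_compacted_per_level(threshold=4, levels=3, units_added=units_added)
for level in sorted(totals):
    # Ratio of compacted units to added units at this level.
    print(level, totals[level] / float(units_added))
```

In this model each active level compacts as many units as were added, which is the "1 amortized unit per level per new unit" claim.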





[jira] Updated: (CASSANDRA-2156) Compaction Throttling

2011-02-11 Thread Stu Hood (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-2156:


Attachment: for-0.6-0002-Make-compaction-throttling-configurable.txt
for-0.6-0001-Throttle-compaction-to-a-fixed-throughput.txt

Attaching a patch for 0.6 that implements compaction throttling for a fixed 
value.

Since it is relatively easy to automatically figure out the proper throughput, 
we might want to make throttling automatic rather than exposing a config option.
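
For readers who want a picture of what throttling to a fixed throughput can look like, here is a minimal Python sketch. It is not the attached patch: the `Throttle` class, its parameters, and the `FakeClock` test double are hypothetical names used for illustration. The idea is to track the total bytes processed and sleep whenever the work is ahead of the target rate.

```python
import time


class Throttle(object):
    """Hypothetical helper that keeps average throughput at or below a fixed rate."""

    def __init__(self, bytes_per_second, clock=time.time, sleep=time.sleep):
        if bytes_per_second <= 0:
            raise ValueError("bytes_per_second must be positive")
        self.bytes_per_second = float(bytes_per_second)
        self.clock = clock
        self.sleep = sleep
        self.started_at = None
        self.total_bytes = 0

    def throttle(self, nbytes):
        """Record nbytes of work; sleep if ahead of the target rate.

        Returns the number of seconds slept.
        """
        now = self.clock()
        if self.started_at is None:
            self.started_at = now
        self.total_bytes += nbytes
        # Time the work so far should have taken at the target rate.
        expected = self.total_bytes / self.bytes_per_second
        delay = expected - (now - self.started_at)
        if delay > 0:
            self.sleep(delay)
            return delay
        return 0.0


class FakeClock(object):
    """Test double so the example runs instantly instead of really sleeping."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


clock = FakeClock()
limiter = Throttle(100, clock=clock.time, sleep=clock.sleep)
slept = [limiter.throttle(50), limiter.throttle(50)]
print(slept, clock.now)
```

Injecting the clock and sleep functions is a design choice made here so the behaviour can be tested without waiting; a real implementation would pass nothing and use wall-clock time.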

 Compaction Throttling
 -

 Key: CASSANDRA-2156
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2156
 Project: Cassandra
  Issue Type: New Feature
Reporter: Stu Hood
 Fix For: 0.8

 Attachments: 
 for-0.6-0001-Throttle-compaction-to-a-fixed-throughput.txt, 
 for-0.6-0002-Make-compaction-throttling-configurable.txt


 Compaction is currently relatively bursty: we compact as fast as we can, and 
 then we wait for the next compaction to be possible (hurry up and wait).
 Instead, to properly amortize compaction, you'd like to compact exactly as 
 fast as you need to to keep the sstable count under control.
 For every new level of compaction, you need to increase the rate that you 
 compact at: a rule of thumb that we're testing on our clusters is to 
 determine the maximum number of buckets a node can support (aka, if the 15th 
 bucket holds 750 GB, we're not going to have more than 15 buckets), and then 
 multiply the flush throughput by the number of buckets to get a minimum 
 compaction throughput to maintain your sstable count.
 Full explanation: for a min compaction threshold of {{T}}, the bucket at 
 level {{N}} can contain {{SsubN = T^N}} 'units' (unit == memtable's worth of 
 data on disk). Every time a new unit is added, it has a {{1/SsubN}} chance of 
 causing the bucket at level N to fill. If the bucket at level N fills, it 
 causes {{SsubN}} units to be compacted. So, for each active level in your 
 system you have {{SsubN * 1 / SsubN}}, or {{1}} amortized unit to compact any 
 time a new unit is added.





[jira] Updated: (CASSANDRA-1709) CQL keyspace and column family management

2011-02-11 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-1709:
--

Attachment: v2-0003-updated-documentation-for-CREATE-COLUMNFAMILY.txt
v2-0002-improved-error-messages.txt
v2-0001-CASSANDRA-1709-CREATE-COLUMNFAMILY-w-system-tests.txt

 CQL keyspace and column family management
 -

 Key: CASSANDRA-1709
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1709
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
  Labels: cql
 Fix For: 0.8

 Attachments: v1-0002-improved-error-messages.txt, 
 v2-0001-CASSANDRA-1709-CREATE-COLUMNFAMILY-w-system-tests.txt, 
 v2-0002-improved-error-messages.txt, 
 v2-0003-updated-documentation-for-CREATE-COLUMNFAMILY.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 CQL specification and implementation for schema management.
 This corresponds to the following RPC methods:
 * system_add_column_family()
 * system_add_keyspace()
 * system_drop_keyspace()
 * system_update_keyspace()
 * system_update_columnfamily()





[jira] Updated: (CASSANDRA-1709) CQL keyspace and column family management

2011-02-11 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-1709:
--

Attachment: (was: 
v1-0001-CASSANDRA-1709-CREATE-COLUMNFAMILY-w-system-tests.txt)

 CQL keyspace and column family management
 -

 Key: CASSANDRA-1709
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1709
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
  Labels: cql
 Fix For: 0.8

 Attachments: v1-0002-improved-error-messages.txt, 
 v2-0001-CASSANDRA-1709-CREATE-COLUMNFAMILY-w-system-tests.txt, 
 v2-0002-improved-error-messages.txt, 
 v2-0003-updated-documentation-for-CREATE-COLUMNFAMILY.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 CQL specification and implementation for schema management.
 This corresponds to the following RPC methods:
 * system_add_column_family()
 * system_add_keyspace()
 * system_drop_keyspace()
 * system_update_keyspace()
 * system_update_columnfamily()





[jira] Updated: (CASSANDRA-1709) CQL keyspace and column family management

2011-02-11 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-1709:
--

Attachment: (was: v1-0002-improved-error-messages.txt)

 CQL keyspace and column family management
 -

 Key: CASSANDRA-1709
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1709
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
  Labels: cql
 Fix For: 0.8

 Attachments: 
 v2-0001-CASSANDRA-1709-CREATE-COLUMNFAMILY-w-system-tests.txt, 
 v2-0002-improved-error-messages.txt, 
 v2-0003-updated-documentation-for-CREATE-COLUMNFAMILY.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 CQL specification and implementation for schema management.
 This corresponds to the following RPC methods:
 * system_add_column_family()
 * system_add_keyspace()
 * system_drop_keyspace()
 * system_update_keyspace()
 * system_update_columnfamily()





[jira] Commented: (CASSANDRA-1709) CQL keyspace and column family management

2011-02-11 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993785#comment-12993785
 ] 

Eric Evans commented on CASSANDRA-1709:
---

The attached patches implement {{CREATE COLUMNFAMILY}}, with a system test, and 
updated documentation. There is more to come, but this part stands alone and 
could go in at any time.

Review of this patchset (before or after commit) is very welcome, but in the 
absence of feedback I will likely commit this early next week to minimize 
rebase/merge efforts.

 CQL keyspace and column family management
 -

 Key: CASSANDRA-1709
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1709
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
  Labels: cql
 Fix For: 0.8

 Attachments: 
 v2-0001-CASSANDRA-1709-CREATE-COLUMNFAMILY-w-system-tests.txt, 
 v2-0002-improved-error-messages.txt, 
 v2-0003-updated-documentation-for-CREATE-COLUMNFAMILY.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 CQL specification and implementation for schema management.
 This corresponds to the following RPC methods:
 * system_add_column_family()
 * system_add_keyspace()
 * system_drop_keyspace()
 * system_update_keyspace()
 * system_update_columnfamily()





svn commit: r1070007 - /cassandra/trunk/drivers/py/cqlsh

2011-02-11 Thread eevans
Author: eevans
Date: Sat Feb 12 01:53:30 2011
New Revision: 1070007

URL: http://svn.apache.org/viewvc?rev=1070007&view=rev
Log:
CQL interactive shell

Patch by eevans

Added:
cassandra/trunk/drivers/py/cqlsh   (with props)

Added: cassandra/trunk/drivers/py/cqlsh
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/drivers/py/cqlsh?rev=1070007&view=auto
==
--- cassandra/trunk/drivers/py/cqlsh (added)
+++ cassandra/trunk/drivers/py/cqlsh Sat Feb 12 01:53:30 2011
@@ -0,0 +1,206 @@
+#!/usr/bin/env python
+
+from optparse import OptionParser
+from StringIO import StringIO
+
+import cmd
+import sys
+import readline
+import os
+import re
+
+try:
+    from cql import Connection
+    from cql.errors import CQLException
+except ImportError:
+    sys.path.append(os.path.abspath(os.path.dirname(__file__)))
+    from cql import Connection
+    from cql.errors import CQLException
+
+HISTORY = os.path.join(os.path.expanduser('~'), '.cqlsh')
+CQLTYPES = ("bytes", "ascii", "utf8", "timeuuid", "uuid", "long", "int")
+
+RED = "\033[1;31m%s\033[0m"
+GREEN = "\033[1;32m%s\033[0m"
+BLUE = "\033[1;34m%s\033[0m"
+YELLOW = "\033[1;33m%s\033[0m"
+CYAN = "\033[1;36m%s\033[0m"
+MAGENTA = "\033[1;35m%s\033[0m"
+
+def startswith(words, text):
+    return [i for i in words if i.startswith(text)]
+
+class Shell(cmd.Cmd):
+    default_prompt  = "cqlsh> "
+    continue_prompt = "   ... "
+
+    def __init__(self, hostname, port, color=False, username=None,
+            password=None):
+        cmd.Cmd.__init__(self)
+        self.conn = Connection(hostname,
+                               port=port,
+                               username=username,
+                               password=password)
+
+        if os.path.exists(HISTORY):
+            readline.read_history_file(HISTORY)
+
+        if sys.stdin.isatty():
+            self.prompt = Shell.default_prompt
+        else:
+            self.prompt = ""
+
+        self.statement = StringIO()
+        self.color = color
+
+    def reset_statement(self):
+        self.set_prompt(Shell.default_prompt)
+        self.statement.truncate(0)
+
+    def get_statement(self, line):
+        self.statement.write("%s\n" % line)
+
+        if not line.endswith(";"):
+            self.set_prompt(Shell.continue_prompt)
+            return None
+
+        try:
+            return self.statement.getvalue()
+        finally:
+            self.reset_statement()
+
+    def default(self, arg):
+        if not arg.strip(): return
+        statement = self.get_statement(arg)
+        if not statement: return
+
+        result = self.conn.execute(statement)
+
+        if isinstance(result, list):
+            for row in result:
+                self.printout(row.key, BLUE, False)
+                for column in row.columns:
+                    self.printout(" | ", newline=False)
+                    # XXX: repr() is better than trying to print binary
+                    self.printout(repr(column.name), MAGENTA, False)
+                    self.printout(",", newline=False)
+                    self.printout(repr(column.value), YELLOW, False)
+                self.printout()
+        else:
+            if result: print result
+
+    def emptyline(self):
+        pass
+
+    def complete_select(self, text, line, begidx, endidx):
+        keywords = ('FIRST', 'REVERSED', 'FROM', 'WHERE', 'KEY')
+        return startswith(keywords, text.upper())
+    complete_SELECT = complete_select
+
+    def complete_update(self, text, line, begidx, endidx):
+        keywords = ('WHERE', 'KEY', 'SET')
+        return startswith(keywords, text.upper())
+    complete_UPDATE = complete_update
+
+    def complete_create(self, text, line, begidx, endidx):
+        words = line.split()
+        if len(words) < 3:
+            return startswith(['COLUMNFAMILY', 'KEYSPACE'], text.upper())
+
+        common = ['WITH', 'AND']
+
+        if words[1].upper() == 'COLUMNFAMILY':
+            types = startswith(CQLTYPES, text)
+            props = startswith(("comparator",
+                                "comment",
+                                "row_cache_size",
+                                "key_cache_size",
+                                "read_repair_chance",
+                                "gc_grace_seconds",
+                                "default_validation",
+                                "min_compaction_threshold",
+                                "max_compaction_threshold",
+                                "row_cache_save_period_in_seconds",
+                                "key_cache_save_period_in_seconds",
+                                "memtable_flush_after_mins",
+                                "memtable_throughput_in_mb",
+                                "memtable_operations_in_millions",
+                                "replicate_on_write"), text)
+            return 

[jira] Updated: (CASSANDRA-2156) Compaction Throttling

2011-02-11 Thread Stu Hood (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-2156:


Attachment: (was: 
for-0.6-0002-Make-compaction-throttling-configurable.txt)

 Compaction Throttling
 -

 Key: CASSANDRA-2156
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2156
 Project: Cassandra
  Issue Type: New Feature
Reporter: Stu Hood
 Fix For: 0.8

 Attachments: 
 for-0.6-0001-Throttle-compaction-to-a-fixed-throughput.txt, 
 for-0.6-0002-Make-compaction-throttling-configurable.txt


 Compaction is currently relatively bursty: we compact as fast as we can, and 
 then we wait for the next compaction to be possible (hurry up and wait).
 Instead, to properly amortize compaction, you'd like to compact exactly as 
 fast as you need to to keep the sstable count under control.
 For every new level of compaction, you need to increase the rate that you 
 compact at: a rule of thumb that we're testing on our clusters is to 
 determine the maximum number of buckets a node can support (aka, if the 15th 
 bucket holds 750 GB, we're not going to have more than 15 buckets), and then 
 multiply the flush throughput by the number of buckets to get a minimum 
 compaction throughput to maintain your sstable count.
 Full explanation: for a min compaction threshold of {{T}}, the bucket at 
 level {{N}} can contain {{SsubN = T^N}} 'units' (unit == memtable's worth of 
 data on disk). Every time a new unit is added, it has a {{1/SsubN}} chance of 
 causing the bucket at level N to fill. If the bucket at level N fills, it 
 causes {{SsubN}} units to be compacted. So, for each active level in your 
 system you have {{SsubN * 1 / SsubN}}, or {{1}} amortized unit to compact any 
 time a new unit is added.





[jira] Updated: (CASSANDRA-2156) Compaction Throttling

2011-02-11 Thread Stu Hood (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-2156:


Attachment: for-0.6-0002-Make-compaction-throttling-configurable.txt

 Compaction Throttling
 -

 Key: CASSANDRA-2156
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2156
 Project: Cassandra
  Issue Type: New Feature
Reporter: Stu Hood
 Fix For: 0.8

 Attachments: 
 for-0.6-0001-Throttle-compaction-to-a-fixed-throughput.txt, 
 for-0.6-0002-Make-compaction-throttling-configurable.txt


 Compaction is currently relatively bursty: we compact as fast as we can, and 
 then we wait for the next compaction to be possible (hurry up and wait).
 Instead, to properly amortize compaction, you'd like to compact exactly as 
 fast as you need to to keep the sstable count under control.
 For every new level of compaction, you need to increase the rate that you 
 compact at: a rule of thumb that we're testing on our clusters is to 
 determine the maximum number of buckets a node can support (aka, if the 15th 
 bucket holds 750 GB, we're not going to have more than 15 buckets), and then 
 multiply the flush throughput by the number of buckets to get a minimum 
 compaction throughput to maintain your sstable count.
 Full explanation: for a min compaction threshold of {{T}}, the bucket at 
 level {{N}} can contain {{SsubN = T^N}} 'units' (unit == memtable's worth of 
 data on disk). Every time a new unit is added, it has a {{1/SsubN}} chance of 
 causing the bucket at level N to fill. If the bucket at level N fills, it 
 causes {{SsubN}} units to be compacted. So, for each active level in your 
 system you have {{SsubN * 1 / SsubN}}, or {{1}} amortized unit to compact any 
 time a new unit is added.





[jira] Updated: (CASSANDRA-2156) Compaction Throttling

2011-02-11 Thread Stu Hood (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-2156:


Attachment: for-0.6-0001-Throttle-compaction-to-a-fixed-throughput.txt

 Compaction Throttling
 -

 Key: CASSANDRA-2156
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2156
 Project: Cassandra
  Issue Type: New Feature
Reporter: Stu Hood
 Fix For: 0.8

 Attachments: 
 for-0.6-0001-Throttle-compaction-to-a-fixed-throughput.txt, 
 for-0.6-0002-Make-compaction-throttling-configurable.txt


 Compaction is currently relatively bursty: we compact as fast as we can, and 
 then we wait for the next compaction to be possible (hurry up and wait).
 Instead, to properly amortize compaction, you'd like to compact exactly as 
 fast as you need to to keep the sstable count under control.
 For every new level of compaction, you need to increase the rate that you 
 compact at: a rule of thumb that we're testing on our clusters is to 
 determine the maximum number of buckets a node can support (aka, if the 15th 
 bucket holds 750 GB, we're not going to have more than 15 buckets), and then 
 multiply the flush throughput by the number of buckets to get a minimum 
 compaction throughput to maintain your sstable count.
 Full explanation: for a min compaction threshold of {{T}}, the bucket at 
 level {{N}} can contain {{SsubN = T^N}} 'units' (unit == memtable's worth of 
 data on disk). Every time a new unit is added, it has a {{1/SsubN}} chance of 
 causing the bucket at level N to fill. If the bucket at level N fills, it 
 causes {{SsubN}} units to be compacted. So, for each active level in your 
 system you have {{SsubN * 1 / SsubN}}, or {{1}} amortized unit to compact any 
 time a new unit is added.





[jira] Updated: (CASSANDRA-2156) Compaction Throttling

2011-02-11 Thread Stu Hood (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-2156:


Attachment: (was: 
for-0.6-0001-Throttle-compaction-to-a-fixed-throughput.txt)

 Compaction Throttling
 -

 Key: CASSANDRA-2156
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2156
 Project: Cassandra
  Issue Type: New Feature
Reporter: Stu Hood
 Fix For: 0.8

 Attachments: 
 for-0.6-0001-Throttle-compaction-to-a-fixed-throughput.txt, 
 for-0.6-0002-Make-compaction-throttling-configurable.txt


 Compaction is currently relatively bursty: we compact as fast as we can, and 
 then we wait for the next compaction to be possible (hurry up and wait).
 Instead, to properly amortize compaction, you'd like to compact exactly as 
 fast as you need to to keep the sstable count under control.
 For every new level of compaction, you need to increase the rate that you 
 compact at: a rule of thumb that we're testing on our clusters is to 
 determine the maximum number of buckets a node can support (aka, if the 15th 
 bucket holds 750 GB, we're not going to have more than 15 buckets), and then 
 multiply the flush throughput by the number of buckets to get a minimum 
 compaction throughput to maintain your sstable count.
 Full explanation: for a min compaction threshold of {{T}}, the bucket at 
 level {{N}} can contain {{SsubN = T^N}} 'units' (unit == memtable's worth of 
 data on disk). Every time a new unit is added, it has a {{1/SsubN}} chance of 
 causing the bucket at level N to fill. If the bucket at level N fills, it 
 causes {{SsubN}} units to be compacted. So, for each active level in your 
 system you have {{SsubN * 1 / SsubN}}, or {{1}} amortized unit to compact any 
 time a new unit is added.





[jira] Commented: (CASSANDRA-2156) Compaction Throttling

2011-02-11 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993810#comment-12993810
 ] 

Jonathan Ellis commented on CASSANDRA-2156:
---

Related: CASSANDRA-1882.

I'm pretty uncomfortable committing changes to 0.6 compaction at this point.

0.7 is (*looks furtively over his shoulder*) probably ok, if it defaults to off.

 Compaction Throttling
 -

 Key: CASSANDRA-2156
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2156
 Project: Cassandra
  Issue Type: New Feature
Reporter: Stu Hood
 Fix For: 0.8

 Attachments: 
 for-0.6-0001-Throttle-compaction-to-a-fixed-throughput.txt, 
 for-0.6-0002-Make-compaction-throttling-configurable.txt


 Compaction is currently relatively bursty: we compact as fast as we can, and 
 then we wait for the next compaction to be possible (hurry up and wait).
 Instead, to properly amortize compaction, you'd like to compact exactly as 
 fast as you need to to keep the sstable count under control.
 For every new level of compaction, you need to increase the rate that you 
 compact at: a rule of thumb that we're testing on our clusters is to 
 determine the maximum number of buckets a node can support (aka, if the 15th 
 bucket holds 750 GB, we're not going to have more than 15 buckets), and then 
 multiply the flush throughput by the number of buckets to get a minimum 
 compaction throughput to maintain your sstable count.
 Full explanation: for a min compaction threshold of {{T}}, the bucket at 
 level {{N}} can contain {{SsubN = T^N}} 'units' (unit == memtable's worth of 
 data on disk). Every time a new unit is added, it has a {{1/SsubN}} chance of 
 causing the bucket at level N to fill. If the bucket at level N fills, it 
 causes {{SsubN}} units to be compacted. So, for each active level in your 
 system you have {{SsubN * 1 / SsubN}}, or {{1}} amortized unit to compact any 
 time a new unit is added.





[jira] Commented: (CASSANDRA-2156) Compaction Throttling

2011-02-11 Thread Stu Hood (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993827#comment-12993827
 ] 

Stu Hood commented on CASSANDRA-2156:
-

Actually, this throttling probably needs to occur on the read side to properly 
account for cases with lots of updates... on the write side, we might have 
compacted the data down by 32x for example.
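
To make the concern concrete, here is a rough arithmetic sketch in Python. The function name and the 16 MB/s limit are hypothetical; only the 32x ratio comes from the comment above. If the throttle meters bytes written, the bytes read from disk scale with the compaction ratio.

```python
def implied_read_rate(write_limit_mb_s, compaction_ratio):
    """Read rate implied by a write-side limit when the input shrinks by compaction_ratio."""
    return write_limit_mb_s * compaction_ratio


write_limit = 16.0  # hypothetical write-side throttle, in MB/s
read_rate = implied_read_rate(write_limit, 32)
print(read_rate)
```

With these assumed numbers a 16 MB/s write-side limit would still allow 512 MB/s of reads, which is why metering the read side gives a tighter bound on disk load.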

 Compaction Throttling
 -

 Key: CASSANDRA-2156
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2156
 Project: Cassandra
  Issue Type: New Feature
Reporter: Stu Hood
 Fix For: 0.8

 Attachments: 
 for-0.6-0001-Throttle-compaction-to-a-fixed-throughput.txt, 
 for-0.6-0002-Make-compaction-throttling-configurable.txt


 Compaction is currently relatively bursty: we compact as fast as we can, and 
 then we wait for the next compaction to be possible (hurry up and wait).
 Instead, to properly amortize compaction, you'd like to compact exactly as 
 fast as you need to to keep the sstable count under control.
 For every new level of compaction, you need to increase the rate that you 
 compact at: a rule of thumb that we're testing on our clusters is to 
 determine the maximum number of buckets a node can support (aka, if the 15th 
 bucket holds 750 GB, we're not going to have more than 15 buckets), and then 
 multiply the flush throughput by the number of buckets to get a minimum 
 compaction throughput to maintain your sstable count.
 Full explanation: for a min compaction threshold of {{T}}, the bucket at 
 level {{N}} can contain {{SsubN = T^N}} 'units' (unit == memtable's worth of 
 data on disk). Every time a new unit is added, it has a {{1/SsubN}} chance of 
 causing the bucket at level N to fill. If the bucket at level N fills, it 
 causes {{SsubN}} units to be compacted. So, for each active level in your 
 system you have {{SsubN * 1 / SsubN}}, or {{1}} amortized unit to compact any 
 time a new unit is added.





[jira] Commented: (CASSANDRA-2156) Compaction Throttling

2011-02-11 Thread Stu Hood (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993828#comment-12993828
 ] 

Stu Hood commented on CASSANDRA-2156:
-

> I'm pretty uncomfortable committing changes to 0.6 compaction at this point.
Oh yea... I mostly posted this particular version for rcoli's benefit: it 
should go into trunk, and could probably be slipped into 0.7, depending on what 
the final patch looks like.

 Compaction Throttling
 -

 Key: CASSANDRA-2156
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2156
 Project: Cassandra
  Issue Type: New Feature
Reporter: Stu Hood
 Fix For: 0.8

 Attachments: 
 for-0.6-0001-Throttle-compaction-to-a-fixed-throughput.txt, 
 for-0.6-0002-Make-compaction-throttling-configurable.txt


 Compaction is currently relatively bursty: we compact as fast as we can, and 
 then we wait for the next compaction to be possible (hurry up and wait).
 Instead, to properly amortize compaction, you'd like to compact exactly as 
 fast as you need to to keep the sstable count under control.
 For every new level of compaction, you need to increase the rate that you 
 compact at: a rule of thumb that we're testing on our clusters is to 
 determine the maximum number of buckets a node can support (aka, if the 15th 
 bucket holds 750 GB, we're not going to have more than 15 buckets), and then 
 multiply the flush throughput by the number of buckets to get a minimum 
 compaction throughput to maintain your sstable count.
 Full explanation: for a min compaction threshold of {{T}}, the bucket at 
 level {{N}} can contain {{SsubN = T^N}} 'units' (unit == memtable's worth of 
 data on disk). Every time a new unit is added, it has a {{1/SsubN}} chance of 
 causing the bucket at level N to fill. If the bucket at level N fills, it 
 causes {{SsubN}} units to be compacted. So, for each active level in your 
 system you have {{SsubN * 1 / SsubN}}, or {{1}} amortized unit to compact any 
 time a new unit is added.





[jira] Updated: (CASSANDRA-1956) Convert row cache to row+filter cache

2011-02-11 Thread Stu Hood (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-1956:


Comment: was deleted

(was: Thanks for the patch Daniel! We actually have existing 'filter' 
implementations (in {{org.apache.cassandra.db.filter}}) that I think would make 
the most sense for use aside cache entries.

> What about just invalidating (removing from the cache) the row on delete and 
> letting it get rebuilt on the next read?
Also, regarding the tombstones in cache problem: I believe it came up in IRC 
the other day. The solution that seemed closest to our existing methods was to 
keep the tombstones in cache, but to add a thread that periodically walked the 
cache to perform GC (with our existing GC timeout) like we would during 
compaction.)

 Convert row cache to row+filter cache
 -

 Key: CASSANDRA-1956
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1956
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Daniel Doubleday
 Fix For: 0.7.2

 Attachments: 0001-row-cache-filter.patch


 Changing the row cache to a row+filter cache would make it much more useful. 
 We currently have to warn against using the row cache with wide rows, where 
 the read pattern is typically a peek at the head, but this usecase would be 
 perfectly supported by a cache that stored only columns matching the filter.
 Possible implementations:
 * (copout) Cache a single filter per row, and leave the cache key as is
 * Cache a list of filters per row, leaving the cache key as is: this is 
 likely to have some gotchas for weird usage patterns, and it requires the 
 list overhead
 * Change the cache key to rowkey+filterid: basically ideal, but you need a 
 secondary index to lookup cache entries by rowkey so that you can keep them 
 in sync with the memtable
 * others?
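
The third option (cache key of rowkey+filterid with a secondary index by rowkey) can be sketched as a small data structure. This is a Python illustration only, not Cassandra code: the class and method names are hypothetical, and "keeping in sync with the memtable" is simplified here to invalidating every cached filter result for a row when that row is written.

```python
from collections import defaultdict


class RowFilterCache(object):
    """Hypothetical cache keyed by (row_key, filter_id), indexed by row_key."""

    def __init__(self):
        self._entries = {}                        # (row_key, filter_id) -> columns
        self._filters_by_row = defaultdict(set)   # row_key -> set of filter_ids

    def put(self, row_key, filter_id, columns):
        self._entries[(row_key, filter_id)] = columns
        self._filters_by_row[row_key].add(filter_id)

    def get(self, row_key, filter_id):
        return self._entries.get((row_key, filter_id))

    def invalidate_row(self, row_key):
        """Drop every cached filter result for a row, e.g. after a write to it."""
        for filter_id in self._filters_by_row.pop(row_key, ()):
            del self._entries[(row_key, filter_id)]


cache = RowFilterCache()
cache.put("row1", "head-10", ["c1", "c2"])
cache.put("row1", "tail-5", ["c9"])
cache.put("row2", "head-10", ["x"])
cache.invalidate_row("row1")
print(cache.get("row1", "head-10"), cache.get("row2", "head-10"))
```

The secondary index is what makes the per-row invalidation cheap: without it, finding all entries for a row would require scanning every cache key.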





[jira] Created: (CASSANDRA-2157) Hector concurrentHClient pool gives out more connections than its quota

2011-02-11 Thread Yang Yang (JIRA)
Hector concurrentHClient pool gives out more connections than its quota
---

 Key: CASSANDRA-2157
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2157
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0
Reporter: Yang Yang


Hector's ConcurrentHClient.java can give up while waiting to grab a connection 
from the pool, at line 85 (all references below are to the latest 0.7.0 head):


    } else {
      try {
        cassandraClient = availableClientQueue.poll(maxWaitTimeWhenExhausted,
            TimeUnit.MILLISECONDS);
        if ( cassandraClient == null ) {
          numBlocked.decrementAndGet();
          throw new PoolExhaustedException(String.format(
              "maxWaitTimeWhenExhausted exceeded for thread %s on host %s",
              new Object[]{
                Thread.currentThread().getName(),
                cassandraHost.getName()}
              ));
        }
      } catch (InterruptedException ie) {
        //monitor.incCounter(Counter.POOL_EXHAUSTED);
        numActive.decrementAndGet();
      }

So if we specify a max wait time, it can give up and do a 
numActive.decrementAndGet().


But in HConnectionManager.java:

    public void operateWithFailover(Operation<?> op) throws HectorException {

in the main loop of this method, the call

    client = getClientFromLBPolicy(excludeHosts);

can throw an exception. In the catch block, there is a clause for it:

    } else if ( he instanceof PoolExhaustedException ) {
      retryable = true;
      --retries;
      if ( hostPools.size() == 1 ) {
        throw he;
      }
      monitor.incCounter(Counter.POOL_EXHAUSTED);
      excludeHosts.add(client.cassandraHost);
    }

I guess this clause was written for the timeout scenario above, so it is 
supposed to catch that exception. But getClientFromLBPolicy() constructs a 
general HectorException from the PoolExhaustedException thrown by 
borrowClient(), so every pool-grab timeout immediately propagates to the 
caller, which I guess is not the original intention.

So I think getClientFromLBPolicy() needs to rethrow the original exception 
directly, so as to trigger the logic in the catch block.

But after I made that change, I found that ActiveNum() from the pool is often 
negative and TillExhausted is higher than the quota, which does not make sense. 
This is because every code path goes through releaseClient() in the 
finally {} clause: when the pool grab fails, numActive.decrementAndGet() has 
already been executed, and it is executed again in the finally clause.

This ends up creating many connections to the server, which bogs the server 
down; we have seen it create a huge CPU load.
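
The double decrement described in this report can be reproduced with a small model. This is a Python sketch, not Hector's code: `CountingPool`, `PoolExhausted`, and the two `operate_*` functions are hypothetical stand-ins, and the pool is single-threaded with no waiting.

```python
class PoolExhausted(Exception):
    pass


class CountingPool(object):
    """Toy pool that tracks active clients with a counter."""

    def __init__(self, size):
        self.idle = list(range(size))
        self.num_active = 0

    def borrow(self):
        self.num_active += 1
        if not self.idle:
            # The failure path undoes its own increment before raising.
            self.num_active -= 1
            raise PoolExhausted()
        return self.idle.pop()

    def release(self, client):
        self.num_active -= 1
        self.idle.append(client)


def operate_buggy(pool):
    client = None
    try:
        client = pool.borrow()
    except PoolExhausted:
        pass
    finally:
        # Runs even when borrow() failed, so the counter is decremented twice.
        pool.release(client)


def operate_fixed(pool):
    client = None
    try:
        client = pool.borrow()
    except PoolExhausted:
        pass
    finally:
        # Only release what was actually borrowed.
        if client is not None:
            pool.release(client)


def run(operate):
    pool = CountingPool(1)
    held = pool.borrow()   # exhaust the pool
    operate(pool)          # this borrow fails
    pool.release(held)
    return pool.num_active, len(pool.idle)


buggy_result = run(operate_buggy)
fixed_result = run(operate_fixed)
print(buggy_result, fixed_result)
```

In the buggy variant the active count ends below zero and the pool holds more entries than its size of one, which matches the symptoms in the report; guarding the release restores both invariants.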





[jira] Commented: (CASSANDRA-2153) client temporarily freezes due to hard-coded JMX port

2011-02-11 Thread Yang Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993850#comment-12993850
 ] 

Yang Yang commented on CASSANDRA-2153:
--

If it is to be left as it is, it would also be helpful to put a comment there 
saying that these will be modified directly. I was kind of expecting a 
setter/getter style.



 client temporarily freezes due to hard-coded JMX port
 -

 Key: CASSANDRA-2153
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2153
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0
Reporter: Yang Yang

 when you do a show keyspaces inside cassandra-cli,
 on the current 0.7 head svn,
 the client CliSessionState.java hardcodes the JMX port to be 8080
 so if there is indeed a webserver listening on 8080, it gets the connection 
 and indeed tries to talk to it, but the protocol doesn't make sense to 
 cassandra-cli , so it freezes for about 10 seconds before finally giving up.
 changing the hardcoded value to my actual jmx port fixes the issue.
 so this should be read from config file  or argument
