date:20110825

[
https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Benjamin Coverston updated CASSANDRA-1608:
--

Attachment: 1608-v5.txt

I've made the changes requested in the last two comments. The latest
changes/merge seem to have caused a regression when the number of SSTables
increases beyond a few hundred. The next time I'll be able to look at this is
Friday; I'll try to figure out what on earth is going on.

Redesigned Compaction
-

Key: CASSANDRA-1608
URL: https://issues.apache.org/jira/browse/CASSANDRA-1608
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Chris Goffinet
Assignee: Benjamin Coverston
Attachments: 1608-22082011.txt, 1608-v2.txt, 1608-v4.txt, 1608-v5.txt

After seeing the I/O issues in CASSANDRA-1470, I've been doing some more
thinking on this subject that I wanted to lay out.
I propose we redo the concept of how compaction works in Cassandra. At the
moment, compaction is kicked off based on the write access pattern, not the
read access pattern. In most cases, you want the opposite. You want to be able
to track how well each SSTable is performing in the system. If we were to keep
in-memory statistics for each SSTable and prioritize SSTables by how often they
are accessed and by their bloom filter hit/miss ratios, we could intelligently
group the SSTables that are read most often and schedule them for compaction.
We could also schedule lower-priority maintenance on SSTables that are not
often accessed.
I also propose we limit each SSTable to a fixed size, which gives us the
ability to use our bloom filters in a more predictable manner. At the moment,
after a certain size, the bloom filters become less reliable. This would also
allow us to group the most-accessed data. Currently an SSTable can grow to a
point where large portions of its data might not actually be accessed very
often.
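
To make the prioritization idea concrete, here is a rough sketch of ranking
SSTables by read statistics. It is only an illustration under assumptions: the
class SSTableReadStats, its fields, and compactionCandidates are hypothetical
names, not Cassandra APIs, and the tie-break on bloom filter false positives is
just one possible reading of "hit/miss ratios".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-SSTable read statistics; not part of Cassandra's API. */
final class SSTableReadStats
{
    final String name;
    final long reads;               // reads that consulted this SSTable
    final long bloomFalsePositives; // filter said "maybe" but the key was absent

    SSTableReadStats(String name, long reads, long bloomFalsePositives)
    {
        this.name = name;
        this.reads = reads;
        this.bloomFalsePositives = bloomFalsePositives;
    }

    double falsePositiveRatio()
    {
        return reads == 0 ? 0.0 : (double) bloomFalsePositives / reads;
    }

    /** Most-read SSTables first; ties go to the worse bloom filter ratio. */
    static List<SSTableReadStats> compactionCandidates(List<SSTableReadStats> all, int limit)
    {
        List<SSTableReadStats> sorted = new ArrayList<SSTableReadStats>(all);
        sorted.sort((a, b) -> {
            int byReads = Long.compare(b.reads, a.reads);
            return byReads != 0
                 ? byReads
                 : Double.compare(b.falsePositiveRatio(), a.falsePositiveRatio());
        });
        return sorted.subList(0, Math.min(limit, sorted.size()));
    }
}
```

The SSTables left out of the returned list would be the ones eligible for
lower-priority maintenance.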

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Issue Comment Edited] (CASSANDRA-1608) Redesigned Compaction

[
https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090883#comment-13090883
]

Benjamin Coverston edited comment on CASSANDRA-1608 at 8/25/11 10:05 AM:
-

EDIT:
Somehow I screwed up the attached patch.. I'll fix it and resubmit.

was (Author: bcoverston):
I've made the changes requested in the last two comments. The latest
changes/merge seem to have caused a regression when the number of SSTables
increases beyond a few hundred. The next time I'll be able to look at this is
Friday; I'll try to figure out what on earth is going on.

Redesigned Compaction
-


[jira] [Updated] (CASSANDRA-1608) Redesigned Compaction

[
https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Benjamin Coverston updated CASSANDRA-1608:
--

Attachment: (was: 1608-v5.txt)

Redesigned Compaction
-


[jira] [Updated] (CASSANDRA-3075) Cassandra CLI unable to use list command with INTEGER column names, resulting in syntax error

[
https://issues.apache.org/jira/browse/CASSANDRA-3075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Jonathan Ellis updated CASSANDRA-3075:
--

Fix Version/s: 0.8.5
Assignee: Pavel Yaskevich

One possible solution would be to allow quoting CF names.

Cassandra CLI unable to use list command with INTEGER column names, resulting
in syntax error
-

I have a Column Family named 1105115.
I inserted the CF with Hector, and it did not
throw any exception concerning the name of the
column.
If I issue the command
list 1105115;
I get the following error:
[default@unknown] list 1105115;
Syntax error at position 5: mismatched input '1105115' expecting Identifier
I presume we are not supposed to name CFs as integers?
Or is there something I am missing from
the help content below:
[default@unknown] help list;
list cf;
list cf[startKey:];
list cf[startKey:endKey];
list cf[startKey:endKey] limit limit;
List a range of rows, and all of their columns, in the specified column
family.
The order of rows returned is dependant on the Partitioner in use.
Required Parameters:
- cf: Name of the column family to list rows from.
Optional Parameters:
- endKey: Key to end the range at. The end key will be included
in the result. Defaults to an empty byte array.
- limit: Number of rows to return. Default is 100.
- startKey: Key start the range from. The start key will be
included in the result. Defaults to an empty byte array.
Examples:
list Standard1;
list Super1[j:];
list Standard1[j:k] limit 40;

Column Family Info:
ColumnFamily: 1105115
Key Validation Class: org.apache.cassandra.db.marshal.BytesType
Default column value validator:
org.apache.cassandra.db.marshal.BytesType
Columns sorted by: org.apache.cassandra.db.marshal.AsciiType
Row cache size / save period in seconds: 0.0/0
Key cache size / save period in seconds: 20.0/14400
Memtable thresholds: 0.5203125/111/1440 (millions of ops/MB/minutes)
GC grace seconds: 864000
Compaction min/max thresholds: 4/32
Read repair chance: 1.0
Replicate on write: true
Built indexes: []


[jira] [Commented] (CASSANDRA-3076) AssertionError in new GCInspector log


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091037#comment-13091037
 ] 

T Jake Luciani commented on CASSANDRA-3076:
---

[junit] ERROR 10:20:16,755 Fatal exception in thread Thread[ScheduledTasks:1,5,main]
[junit] java.lang.AssertionError
[junit] at org.apache.cassandra.service.GCInspector.logGCResults(GCInspector.java:110)
[junit] at org.apache.cassandra.service.GCInspector.access$000(GCInspector.java:41)
[junit] at org.apache.cassandra.service.GCInspector$1.run(GCInspector.java:85)
[junit] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
[junit] at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
[junit] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
[junit] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
[junit] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
[junit] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
[junit] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
[junit] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
[junit] at java.lang.Thread.run(Thread.java:680)
[junit] -  ---

 AssertionError in new GCInspector log
 -

 Key: CASSANDRA-3076
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3076
 Project: Cassandra
  Issue Type: Bug
 Environment: Lion OSX
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Minor
 Fix For: 0.8.5


 Small regression from CASSANDRA-2868


[jira] [Resolved] (CASSANDRA-2761) JDBC driver does not build


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2761.
---

   Resolution: Fixed
Fix Version/s: (was: 1.0)

 JDBC driver does not build
 --

 Key: CASSANDRA-2761
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2761
 Project: Cassandra
  Issue Type: Bug
  Components: API
Affects Versions: 1.0
Reporter: Jonathan Ellis
Assignee: Rick Shaw
 Attachments: jdbc-driver-build-v1.txt, 
 v1-0001-CASSANDRA-2761-cleanup-nits.txt


 Need a way to build (and run tests for) the Java driver.
 Also: still some vestigial references to drivers/ in trunk build.xml.
 Should we remove drivers/ from the 0.8 branch as well?


[jira] [Issue Comment Edited] (CASSANDRA-3076) AssertionError in new GCInspector log


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091037#comment-13091037
 ] 

T Jake Luciani edited comment on CASSANDRA-3076 at 8/25/11 2:27 PM:


{code}
[junit] ERROR 10:20:16,755 Fatal exception in thread Thread[ScheduledTasks:1,5,main]
[junit] java.lang.AssertionError
[junit] at org.apache.cassandra.service.GCInspector.logGCResults(GCInspector.java:110)
[junit] at org.apache.cassandra.service.GCInspector.access$000(GCInspector.java:41)
[junit] at org.apache.cassandra.service.GCInspector$1.run(GCInspector.java:85)
[junit] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
[junit] at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
[junit] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
[junit] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
[junit] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
[junit] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
[junit] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
[junit] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
[junit] at java.lang.Thread.run(Thread.java:680)
[junit] -  ---
{code}

  was (Author: tjake):
[junit] ERROR 10:20:16,755 Fatal exception in thread Thread[ScheduledTasks:1,5,main]
[junit] java.lang.AssertionError
[junit] at org.apache.cassandra.service.GCInspector.logGCResults(GCInspector.java:110)
[junit] at org.apache.cassandra.service.GCInspector.access$000(GCInspector.java:41)
[junit] at org.apache.cassandra.service.GCInspector$1.run(GCInspector.java:85)
[junit] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
[junit] at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
[junit] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
[junit] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
[junit] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
[junit] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
[junit] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
[junit] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
[junit] at java.lang.Thread.run(Thread.java:680)
[junit] -  ---
  
 AssertionError in new GCInspector log
 -

 Key: CASSANDRA-3076
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3076
 Project: Cassandra
  Issue Type: Bug
 Environment: Lion OSX
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Minor
 Fix For: 0.8.5


 Small regression from CASSANDRA-2868


[jira] [Created] (CASSANDRA-3076) AssertionError in new GCInspector log

AssertionError in new GCInspector log
-

 Key: CASSANDRA-3076
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3076
 Project: Cassandra
  Issue Type: Bug
 Environment: Lion OSX
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Minor
 Fix For: 0.8.5


Small regression from CASSANDRA-2868


[jira] [Commented] (CASSANDRA-3074) comments and documentation for index_interval are misleading


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091038#comment-13091038
 ] 

Jonathan Ellis commented on CASSANDRA-3074:
---

the proposed changes conflate the index entries themselves (always one per key) 
and the sampling rate (which is what index_interval affects).

 comments and documentation for index_interval are misleading
 

 Key: CASSANDRA-3074
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3074
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.4
Reporter: Matthew F. Dennis
Assignee: Matthew F. Dennis
Priority: Minor
 Attachments: 3074-cassandra-0.8.patch


 The comments and documentation for index_interval are misleading.  They state 
 that the larger the *sampling*, the more effective the index is, at the cost of 
 space.  This is true, but in the context of the configuration variable it 
 implies that the larger the *setting* is, the larger the index is, while in 
 fact it's the opposite.
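
 A back-of-the-envelope sketch of that inverse relationship follows. It assumes
 the sample keeps every index_interval-th index entry in memory; the class and
 method names are made up for the illustration and are not Cassandra APIs.

```java
/** Illustrative arithmetic only; the names here are not Cassandra APIs. */
final class IndexSampling
{
    /** Entries held in memory when every indexInterval-th index entry is sampled. */
    static long sampledEntries(long keysInSSTable, int indexInterval)
    {
        if (indexInterval < 1)
            throw new IllegalArgumentException("indexInterval must be >= 1");
        // Ceiling division: a partial final interval still contributes one sample.
        return (keysInSSTable + indexInterval - 1) / indexInterval;
    }

    public static void main(String[] args)
    {
        long keys = 1000000L;
        for (int interval : new int[]{ 64, 128, 512 })
            System.out.println("index_interval=" + interval + " -> "
                               + sampledEntries(keys, interval) + " sampled entries");
    }
}
```

 Raising the setting shrinks the in-memory sample, so a lookup has to scan more
 of the on-disk index to reach a given key.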


[jira] [Updated] (CASSANDRA-3076) AssertionError in new GCInspector log


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-3076:
--

Attachment: 3076.txt

 AssertionError in new GCInspector log
 -

 Key: CASSANDRA-3076
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3076
 Project: Cassandra
  Issue Type: Bug
 Environment: Lion OSX
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Minor
 Fix For: 0.8.5

 Attachments: 3076.txt


 Small regression from CASSANDRA-2868


[jira] [Updated] (CASSANDRA-3076) AssertionError in new GCInspector log


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-3076:
--

Fix Version/s: (was: 0.8.5)
   0.7.9

 AssertionError in new GCInspector log
 -

 Key: CASSANDRA-3076
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3076
 Project: Cassandra
  Issue Type: Bug
 Environment: Lion OSX
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Minor
 Fix For: 0.7.9

 Attachments: 3076.txt


 Small regression from CASSANDRA-2868


[jira] [Commented] (CASSANDRA-3076) AssertionError in new GCInspector log


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091042#comment-13091042
 ] 

Jonathan Ellis commented on CASSANDRA-3076:
---

The assert is basically saying if total gc time has increased, count should 
have increased as well.

If that's valid, then the if (previousTotal.equals(total)) continue check 
should handle this.  If it's not, we should probably remove the assert entirely.
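
For readers without the source at hand, the shape of the logic under discussion
might be sketched as follows. This is not the actual GCInspector code:
GcDeltaTracker and sample are hypothetical names, and the sketch simply
tolerates "time advanced but count did not" instead of asserting on it.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical tracker of per-collector GC deltas; not the real GCInspector. */
final class GcDeltaTracker
{
    private final Map<String, Long> gcTimes = new HashMap<String, Long>();
    private final Map<String, Long> gcCounts = new HashMap<String, Long>();

    /**
     * Records a new sample and returns the average milliseconds per collection
     * since the previous sample, or -1 when there is nothing to report.
     */
    long sample(String collector, long totalTimeMs, long count)
    {
        Long previousTotal = gcTimes.put(collector, totalTimeMs);
        if (previousTotal == null)
            previousTotal = 0L;
        Long previousCount = gcCounts.put(collector, count);
        if (previousCount == null)
            previousCount = 0L;

        // Same check as the one quoted above: no new GC time, nothing to log.
        if (previousTotal.equals(totalTimeMs))
            return -1;

        // Where an assert on the count would sit; here the case is skipped instead.
        long newCollections = count - previousCount;
        if (newCollections <= 0)
            return -1;

        return (totalTimeMs - previousTotal) / newCollections;
    }
}
```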


 AssertionError in new GCInspector log
 -

 Key: CASSANDRA-3076
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3076
 Project: Cassandra
  Issue Type: Bug
 Environment: Lion OSX
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Minor
 Fix For: 0.7.9

 Attachments: 3076.txt


 Small regression from CASSANDRA-2868


[jira] [Commented] (CASSANDRA-3076) AssertionError in new GCInspector log


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091049#comment-13091049
 ] 

T Jake Luciani commented on CASSANDRA-3076:
---

Right, I think it's likely an OS X Lion thing.  Removing the assert works for me.

 AssertionError in new GCInspector log
 -

 Key: CASSANDRA-3076
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3076
 Project: Cassandra
  Issue Type: Bug
 Environment: Lion OSX
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Minor
 Fix For: 0.7.9

 Attachments: 3076.txt


 Small regression from CASSANDRA-2868


svn commit: r1161607 - in /cassandra/branches/cassandra-0.7: CHANGES.txt src/java/org/apache/cassandra/service/GCInspector.java

Author: jbellis
Date: Thu Aug 25 15:36:57 2011
New Revision: 1161607

URL: http://svn.apache.org/viewvc?rev=1161607&view=rev
Log:
r/m failing assert to match 1161167 in 0.8

Modified:
cassandra/branches/cassandra-0.7/CHANGES.txt

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/GCInspector.java

Modified: cassandra/branches/cassandra-0.7/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1161607&r1=1161606&r2=1161607&view=diff
==
--- cassandra/branches/cassandra-0.7/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.7/CHANGES.txt Thu Aug 25 15:36:57 2011
@@ -9,7 +9,7 @@
CompactionManager.estimatedCompactions (CASSANDRA-2708)
  * remove gossip state when a new IP takes over a token (CASSANDRA-3071)
  * work around native memory leak in com.sun.management.GarbageCollectorMXBean
-(CASSANDRA-2868)
+   (CASSANDRA-2868, 3076)
 
 
 0.7.8

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/GCInspector.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/GCInspector.java?rev=1161607&r1=1161606&r2=1161607&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/GCInspector.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/GCInspector.java
 Thu Aug 25 15:36:57 2011
@@ -106,7 +106,6 @@ public class GCInspector
 if (previousCount == null)
 previousCount = 0L;
 gccounts.put(gc.getName(), count);
-assert count > previousCount;
 
 MemoryUsage mu = membean.getHeapMemoryUsage();
 long memoryUsed = mu.getUsed();

svn commit: r1161608 - in /cassandra/branches/cassandra-0.8: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/

Author: jbellis
Date: Thu Aug 25 15:37:43 2011
New Revision: 1161608

URL: http://svn.apache.org/viewvc?rev=1161608&view=rev
Log:
merge from 0.7

Modified:
cassandra/branches/cassandra-0.8/   (props changed)
cassandra/branches/cassandra-0.8/contrib/   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)

Propchange: cassandra/branches/cassandra-0.8/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Aug 25 15:37:43 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7:1026516-1160444,1160825
+/cassandra/branches/cassandra-0.7:1026516-1160444,1160825,1161607
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
 /cassandra/branches/cassandra-0.8:1090934-1125013,1125041
 /cassandra/branches/cassandra-0.8.0:1125021-1130369

Propchange: cassandra/branches/cassandra-0.8/contrib/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Aug 25 15:37:43 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
-/cassandra/branches/cassandra-0.7/contrib:1026516-1160444,1160825
+/cassandra/branches/cassandra-0.7/contrib:1026516-1160444,1160825,1161607
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
 /cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125041
 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369

Propchange: 
cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Aug 25 15:37:43 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1160444,1160825
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1160444,1160825,1161607
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
 
/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125041
 
/cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1125021-1130369

Propchange: 
cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Aug 25 15:37:43 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1160444,1160825
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1160444,1160825,1161607
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1055654
 
/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1090934-1125013,1125041
 
/cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1125021-1130369

Propchange: 
cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Aug 25 15:37:43 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java:1026516-1160444,1160825

[jira] [Resolved] (CASSANDRA-3076) AssertionError in new GCInspector log


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-3076.
---

   Resolution: Fixed
Fix Version/s: 0.8.5
 Reviewer: jbellis

ok, done in 0.7.  (Brandon already did that in 0.8 in r1161167.)

 AssertionError in new GCInspector log
 -

 Key: CASSANDRA-3076
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3076
 Project: Cassandra
  Issue Type: Bug
 Environment: Lion OSX
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Minor
 Fix For: 0.7.9, 0.8.5

 Attachments: 3076.txt


 Small regression from CASSANDRA-2868


[jira] [Commented] (CASSANDRA-2380) Cassandra requires hostname is resolvable even when specifying IP's for listen and rpc addresses


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091123#comment-13091123
 ] 

Jonathan Ellis commented on CASSANDRA-2380:
---

"Exception thrown by the agent" usually refers to the RMI agent used by JMX.  
I'd try uncommenting and editing this line in cassandra-env.sh:

{noformat}
# JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=<public name>"

{noformat}
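
A small sketch of why a node configured purely with IP addresses can still trip
over name resolution: per the InetAddress Javadoc, an IP literal passed to
getByName is only checked for format, whereas getLocalHost() has to resolve the
machine's own hostname. That the agent's failure comes from a lookup of the
second kind is an assumption based on the error text, not something verified
against the JMX agent source. LocalHostCheck and its methods are illustrative
names.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Illustrative only: contrasts parsing an IP literal with resolving the local host name. */
final class LocalHostCheck
{
    /** Parses an IP literal; for literals only the address format is validated. */
    static InetAddress parseLiteral(String ipLiteral)
    {
        try
        {
            return InetAddress.getByName(ipLiteral);
        }
        catch (UnknownHostException e)
        {
            throw new IllegalArgumentException("not a valid address: " + ipLiteral, e);
        }
    }

    /** Resolves the local host name, reporting a failure as text instead of throwing. */
    static String describeLocalHost()
    {
        try
        {
            return "local host resolves to " + InetAddress.getLocalHost();
        }
        catch (UnknownHostException e)
        {
            return "local host name unknown: " + e.getMessage();
        }
    }

    public static void main(String[] args)
    {
        System.out.println(parseLiteral("127.0.0.1"));
        System.out.println(parseLiteral("::1"));
        System.out.println(describeLocalHost());
    }
}
```

On a machine whose hostname does not resolve, only the last line should report
a failure.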


 Cassandra requires hostname is resolvable even when specifying IP's for 
 listen and rpc addresses
 

 Key: CASSANDRA-2380
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2380
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.4
 Environment: open jdk 1.6.0_20 64-Bit 
Reporter: Eric Tamme
Priority: Trivial

 A strange-looking error is printed out, with no stack trace and no other log, 
 when the hostname is not resolvable, regardless of whether or not the hostname 
 is being used to specify a listen or rpc address.  I am specifically using IPv6 
 addresses but I have tested it with IPv4 and gotten the same result.
 Error: Exception thrown by the agent : java.net.MalformedURLException: Local 
 host name unknown: java.net.UnknownHostException
 I have spent several hours trying to track down what is happening and have 
 been unable to determine if this is down in the java 
 getByName -> getAllByName -> getAllByName0 set of methods that is invoked when  
 listenAddress = InetAddress.getByName(conf.listen_address);
 is called from DatabaseDescriptor.java
 I am not able to replicate the error in a standalone java program (see 
 below) so I am not sure what cassandra is doing to force name resolution.  
 Perhaps the issue is not in DatabaseDescriptor, but somewhere else?  I get 
 no log output, and no stack trace when this happens, only the single-line 
 error.
 import java.net.InetAddress;
 import java.net.UnknownHostException;
 class Test
 {
     public static void main(String args[])
     {
         try
         {
             // Same call DatabaseDescriptor makes for listen_address.
             InetAddress listenAddress = InetAddress.getByName("foo");
             System.out.println(listenAddress);
         }
         catch (UnknownHostException e)
         {
             System.out.println("Unable to parse address");
         }
     }
 }
 People have just said "oh, go put a line in your hosts file" and while that 
 does work, it is not right.  If I am not using my hostname for any reason 
 cassandra should not have to resolve it, and carrying around that 
 application-specific stuff in your hosts file is not correct.
 Regardless of whether this bug gets fixed, I want to better understand what the 
 heck is going on that makes cassandra crash and print out that exception.


[jira] [Resolved] (CASSANDRA-2380) Cassandra requires hostname is resolvable even when specifying IP's for listen and rpc addresses


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2380.
---

Resolution: Cannot Reproduce

 Cassandra requires hostname is resolvable even when specifying IP's for 
 listen and rpc addresses
 

 Key: CASSANDRA-2380
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2380
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.4
 Environment: open jdk 1.6.0_20 64-Bit 
Reporter: Eric Tamme
Priority: Trivial



[jira] [Updated] (CASSANDRA-1608) Redesigned Compaction

[
https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Benjamin Coverston updated CASSANDRA-1608:
--

Attachment: 1608-v5.txt

Fixed up the patch according to the comments given. Took a stab at culling some
of the SSTables from the locking mechanism.

Redesigned Compaction
-


[jira] [Commented] (CASSANDRA-2252) arena allocation for memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091196#comment-13091196
 ] 

Yang Yang commented on CASSANDRA-2252:
--

SlabAllocator:

private void tryRetireRegion(Region region)
{
    if (currentRegion.compareAndSet(region, null))
    {
        filledRegions.add(region);
    }
}


Could you please explain why we need to add them to filledRegions? When all 
the buffers that share the same region die or become unreachable, shouldn't we 
just let the region go and free the memory? If so, we should not pin the region 
in memory through the references held by filledRegions, no?

just to confirm my thoughts, I looked at the HBase implementation:
./src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreLAB.java


  /**
   * Try to retire the current chunk if it is still
   * <code>c</code>. Postcondition is that curChunk.get()
   * != c
   */
  private void tryRetireChunk(Chunk c) {
    @SuppressWarnings("unused")
    boolean weRetiredIt = curChunk.compareAndSet(c, null);
    // If the CAS succeeds, that means that we won the race
    // to retire the chunk. We could use this opportunity to
    // update metrics on external fragmentation.
    //
    // If the CAS fails, that means that someone else already
    // retired the chunk for us.
  }

It does not tie the chunk to a region list.



The current result of tying regions together through filledRegions is that 
all regions (even dead ones) still occupy memory. If the purpose is to count 
the size() held in the allocator, should we use weak references?
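
A minimal sketch of what the weak-reference variant could look like, assuming
regions are plain ByteBuffers and that the list exists only for size
accounting. WeakRegionTracker, retire and liveBytes are hypothetical names, not
the SlabAllocator API in the patch.

```java
import java.lang.ref.WeakReference;
import java.nio.ByteBuffer;
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical tracker that can count retired regions without keeping them alive. */
final class WeakRegionTracker
{
    private final Queue<WeakReference<ByteBuffer>> filledRegions =
        new ConcurrentLinkedQueue<WeakReference<ByteBuffer>>();

    /** Remembers a retired region through a weak reference only. */
    void retire(ByteBuffer region)
    {
        filledRegions.add(new WeakReference<ByteBuffer>(region));
    }

    /** Sums the capacity of regions that are still reachable, dropping cleared references. */
    long liveBytes()
    {
        long total = 0;
        for (Iterator<WeakReference<ByteBuffer>> it = filledRegions.iterator(); it.hasNext();)
        {
            ByteBuffer region = it.next().get();
            if (region == null)
                it.remove();       // region was collected; stop tracking it
            else
                total += region.capacity();
        }
        return total;
    }
}
```

With this shape, a region whose buffers have all become unreachable can be
collected, and it simply stops contributing to the total on the next call.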


 arena allocation for memtables
 --

 Key: CASSANDRA-2252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2252
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 1.0

 Attachments: 0001-add-MemtableAllocator.txt, 
 0002-add-off-heap-MemtableAllocator-support.txt, 2252-v3.txt, 2252-v4.txt, 
 merged-2252.tgz


 The memtable design practically actively fights Java's GC design.  Todd 
 Lipcon gave a good explanation over on HBASE-3455.


[jira] [Issue Comment Edited] (CASSANDRA-2252) arena allocation for memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091196#comment-13091196
 ] 

Yang Yang edited comment on CASSANDRA-2252 at 8/25/11 6:26 PM:
---

SlabAllocator:

    private void tryRetireRegion(Region region)
    {
        if (currentRegion.compareAndSet(region, null))
        {
            filledRegions.add(region);
        }
    }


Could you please explain why we need to add them to filledRegions? When all
the buffers that share the same region die or become unreachable, shouldn't we
just let the region go and free its memory? If so, we should not pin the region
in memory through the references held by filledRegions, no?

Just to confirm my thoughts, I looked at the HBase implementation:
./src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreLAB.java


  /**
   * Try to retire the current chunk if it is still
   * <code>c</code>. Postcondition is that curChunk.get()
   * != c
   */
  private void tryRetireChunk(Chunk c) {
    @SuppressWarnings("unused")
    boolean weRetiredIt = curChunk.compareAndSet(c, null);
    // If the CAS succeeds, that means that we won the race
    // to retire the chunk. We could use this opportunity to
    // update metrics on external fragmentation.
    //
    // If the CAS fails, that means that someone else already
    // retired the chunk for us.
  }

It does not tie the chunk to a region list.



The current result of tying regions together through filledRegions is that
all regions (even dead ones) always occupy memory. If the purpose is to count
the size() held in the allocator, should we use weak references?


 arena allocation for memtables
 --

 Key: CASSANDRA-2252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2252
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 1.0

 Attachments: 0001-add-MemtableAllocator.txt, 
 0002-add-off-heap-MemtableAllocator-support.txt, 2252-v3.txt, 2252-v4.txt, 
 merged-2252.tgz


 The memtable design practically actively fights Java's GC design.  Todd 
 Lipcon gave a good explanation over on HBASE-3455.


[jira] [Issue Comment Edited] (CASSANDRA-2252) arena allocation for memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091196#comment-13091196
 ] 

Yang Yang edited comment on CASSANDRA-2252 at 8/25/11 6:32 PM:
---

SlabAllocator:

    private void tryRetireRegion(Region region)
    {
        if (currentRegion.compareAndSet(region, null))
        {
            filledRegions.add(region);
        }
    }


Could you please explain why we need to add them to filledRegions? When all
the buffers that share the same region die or become unreachable, shouldn't we
just let the region go and free its memory? If so, we should not pin the region
in memory through the references held by filledRegions, no?

Just to confirm my thoughts, I looked at the HBase implementation:
./src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreLAB.java


  /**
   * Try to retire the current chunk if it is still
   * <code>c</code>. Postcondition is that curChunk.get()
   * != c
   */
  private void tryRetireChunk(Chunk c) {
    @SuppressWarnings("unused")
    boolean weRetiredIt = curChunk.compareAndSet(c, null);
    // If the CAS succeeds, that means that we won the race
    // to retire the chunk. We could use this opportunity to
    // update metrics on external fragmentation.
    //
    // If the CAS fails, that means that someone else already
    // retired the chunk for us.
  }

It does not tie the chunk to a region list.



The current result of tying regions together through filledRegions is that
all regions (even dead ones) always occupy memory. If the purpose is to count
the size() held in the allocator, should we just keep an int variable for the
total size, or use weak references?


 arena allocation for memtables
 --

 Key: CASSANDRA-2252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2252
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 1.0

 Attachments: 0001-add-MemtableAllocator.txt, 
 0002-add-off-heap-MemtableAllocator-support.txt, 2252-v3.txt, 2252-v4.txt, 
 merged-2252.tgz


 The memtable design practically actively fights Java's GC design.  Todd 
 Lipcon gave a good explanation over on HBASE-3455.


[jira] [Commented] (CASSANDRA-2252) arena allocation for memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091202#comment-13091202
 ] 

Jonathan Ellis commented on CASSANDRA-2252:
---

The purpose is that we need to keep them alive until flush, so weak references would not work.

 arena allocation for memtables
 --

 Key: CASSANDRA-2252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2252
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 1.0

 Attachments: 0001-add-MemtableAllocator.txt, 
 0002-add-off-heap-MemtableAllocator-support.txt, 2252-v3.txt, 2252-v4.txt, 
 merged-2252.tgz


 The memtable design practically actively fights Java's GC design.  Todd 
 Lipcon gave a good explanation over on HBASE-3455.


[jira] [Commented] (CASSANDRA-2252) arena allocation for memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091209#comment-13091209
 ] 

Yang Yang commented on CASSANDRA-2252:
--

Thanks Jonathan,
but why do we need them alive?
For example:

I create a 2MB region, which is carved into 100 ByteBuffers. Each of these
ByteBuffers points into the data of the Region, so as long as one of them
is live, the bytes pointed to by Region.data are still on the heap. If these 100
ByteBuffers all die, isn't it our goal to free the 2MB region, since no one is
using it?
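
The sharing described here can be illustrated with plain java.nio calls. This is a sketch under assumptions: the class name RegionSliceSketch and the carve() helper are hypothetical, and it assumes the region's data is an on-heap byte[] as the mention of Region.data suggests.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration; the real Region class is in the attached patches.
class RegionSliceSketch {
    // Carve `count` equal-sized slices out of one backing array.
    static ByteBuffer[] carve(byte[] regionData, int count) {
        int sliceSize = regionData.length / count;
        ByteBuffer[] slices = new ByteBuffer[count];
        for (int i = 0; i < count; i++)
            slices[i] = ByteBuffer.wrap(regionData, i * sliceSize, sliceSize).slice();
        return slices;
    }

    public static void main(String[] args) {
        byte[] regionData = new byte[2 * 1024 * 1024]; // one 2MB region
        ByteBuffer[] slices = carve(regionData, 100);

        // Every slice reports the same backing array, so a single live
        // slice is enough to keep the whole 2MB array reachable.
        System.out.println(slices[0].array() == slices[99].array());

        // Each slice only differs in where it starts inside that array.
        System.out.println(slices[1].arrayOffset());
    }
}
```

Since each slice holds a strong reference to the shared array, the array can only be collected once all 100 slices and any other holder of the region (such as a filledRegions list) are unreachable.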




 arena allocation for memtables
 --

 Key: CASSANDRA-2252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2252
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 1.0

 Attachments: 0001-add-MemtableAllocator.txt, 
 0002-add-off-heap-MemtableAllocator-support.txt, 2252-v3.txt, 2252-v4.txt, 
 merged-2252.tgz


 The memtable design practically actively fights Java's GC design.  Todd 
 Lipcon gave a good explanation over on HBASE-3455.


[jira] [Updated] (CASSANDRA-3074) comments and documentation for index_interval are misleading


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew F. Dennis updated CASSANDRA-3074:
-

Attachment: 3074-cassandra-0.8.patch

poor choice of words on my part.  new version attached.

 comments and documentation for index_interval are misleading
 

 Key: CASSANDRA-3074
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3074
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.4
Reporter: Matthew F. Dennis
Assignee: Matthew F. Dennis
Priority: Minor
 Attachments: 3074-cassandra-0.8.patch


 The comments and documentation for index_interval are misleading.  They state 
 the larger the *sampling* the more effective the index is, at the cost of
 space.  This is true, but in the context of the configuration variable it 
 implies the larger the *setting* is the larger the index is while in fact 
 it's the opposite of that.
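
To make the direction of the setting concrete, here is a minimal sketch of interval-based sampling. It is not Cassandra's index code: the class IndexSampleSketch, the sample() helper, and the key names are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of sampling a sorted index at a fixed interval.
class IndexSampleSketch {
    // Keep every `interval`-th key of the sorted index in memory.
    static List<String> sample(List<String> sortedKeys, int interval) {
        List<String> samples = new ArrayList<>();
        for (int i = 0; i < sortedKeys.size(); i += interval)
            samples.add(sortedKeys.get(i));
        return samples;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 1024; i++)
            keys.add(String.format("key%04d", i));

        // A larger interval keeps fewer samples, so the in-memory index is
        // smaller but each lookup has to scan more entries on disk.
        System.out.println(sample(keys, 128).size());
        System.out.println(sample(keys, 512).size());
    }
}
```

With 1024 keys, an interval of 128 keeps 8 samples and an interval of 512 keeps 2, which is the inverse relationship between the setting and the index size that the description points out.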


[jira] [Updated] (CASSANDRA-3074) comments and documentation for index_interval are misleading


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew F. Dennis updated CASSANDRA-3074:
-

Attachment: (was: 3074-cassandra-0.8.patch)

 comments and documentation for index_interval are misleading
 

 Key: CASSANDRA-3074
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3074
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.4
Reporter: Matthew F. Dennis
Assignee: Matthew F. Dennis
Priority: Minor
 Attachments: 3074-cassandra-0.8.patch


 The comments and documentation for index_interval are misleading.  They state 
 the larger the *sampling* the more effective the index is, at the cost of
 space.  This is true, but in the context of the configuration variable it 
 implies the larger the *setting* is the larger the index is while in fact 
 it's the opposite of that.


[jira] [Created] (CASSANDRA-3077) Support TTL option to be set for column family

2011-08-25 Thread Aleksey Vorona (JIRA)

Support TTL option to be set for column family
--

 Key: CASSANDRA-3077
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3077
 Project: Cassandra
  Issue Type: Wish
  Components: Core
Affects Versions: 0.8.4
Reporter: Aleksey Vorona
Priority: Minor


Use case: I want one of my CFs not to store any data older than two months. It 
is a notifications CF which is of no interest to user.

Currently I am setting TTL with each insert in the CF, but since it is a 
constant it makes sense to me to have it configured in CF definition to apply 
automatically to all rows in the CF.



[jira] [Updated] (CASSANDRA-3074) comments and documentation for index_interval are misleading


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew F. Dennis updated CASSANDRA-3074:
-

Attachment: (was: 3074-cassandra-0.8.patch)

 comments and documentation for index_interval are misleading
 

 Key: CASSANDRA-3074
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3074
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.4
Reporter: Matthew F. Dennis
Assignee: Matthew F. Dennis
Priority: Minor
 Attachments: 3074-cassandra-0.8.patch


 The comments and documentation for index_interval are misleading.  They state 
 the larger the *sampling* the more effective the index is, at the cost of
 space.  This is true, but in the context of the configuration variable it 
 implies the larger the *setting* is the larger the index is while in fact 
 it's the opposite of that.


[jira] [Updated] (CASSANDRA-3074) comments and documentation for index_interval are misleading


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew F. Dennis updated CASSANDRA-3074:
-

Attachment: 3074-cassandra-0.8.patch

 comments and documentation for index_interval are misleading
 

 Key: CASSANDRA-3074
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3074
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.4
Reporter: Matthew F. Dennis
Assignee: Matthew F. Dennis
Priority: Minor
 Attachments: 3074-cassandra-0.8.patch


 The comments and documentation for index_interval are misleading.  They state 
 the larger the *sampling* the more effective the index is, at the cost of
 space.  This is true, but in the context of the configuration variable it 
 implies the larger the *setting* is the larger the index is while in fact 
 it's the opposite of that.


[jira] [Commented] (CASSANDRA-3077) Support TTL option to be set for column family

2011-08-25 Thread Brandon Williams (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091211#comment-13091211
 ] 

Brandon Williams commented on CASSANDRA-3077:
-

I could see this as a default ttl option that is used when one is not 
specified, sort of like a default validator.
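
A minimal sketch of that fallback rule, assuming a hypothetical ColumnFamilyDef type with a defaultTtlSeconds field. Neither name exists in Cassandra at this point; they only illustrate the proposal.

```java
// Hypothetical illustration of a per-column-family default TTL.
class DefaultTtlSketch {
    // Hypothetical column family definition carrying a default TTL in seconds.
    static final class ColumnFamilyDef {
        final int defaultTtlSeconds;

        ColumnFamilyDef(int defaultTtlSeconds) {
            this.defaultTtlSeconds = defaultTtlSeconds;
        }
    }

    // An explicit per-insert TTL wins; a null request falls back to the
    // column family default.
    static int effectiveTtl(ColumnFamilyDef cf, Integer requestedTtlSeconds) {
        return requestedTtlSeconds != null ? requestedTtlSeconds : cf.defaultTtlSeconds;
    }

    public static void main(String[] args) {
        // Roughly two months, as in the use case from the issue description.
        ColumnFamilyDef notifications = new ColumnFamilyDef(60 * 24 * 60 * 60);
        System.out.println(effectiveTtl(notifications, null));
        System.out.println(effectiveTtl(notifications, 3600));
    }
}
```

Resolving the TTL at insert time this way would keep existing clients that pass an explicit TTL unaffected, while inserts without one would pick up the configured default.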

 Support TTL option to be set for column family
 --

 Key: CASSANDRA-3077
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3077
 Project: Cassandra
  Issue Type: Wish
  Components: Core
Affects Versions: 0.8.4
Reporter: Aleksey Vorona
Priority: Minor

 Use case: I want one of my CFs not to store any data older than two months. 
 It is a notifications CF which is of no interest to user.
 Currently I am setting TTL with each insert in the CF, but since it is a 
 constant it makes sense to me to have it configured in CF definition to apply 
 automatically to all rows in the CF.


svn commit: r1161701 - /cassandra/branches/cassandra-0.8/conf/cassandra.yaml

Author: jbellis
Date: Thu Aug 25 19:05:49 2011
New Revision: 1161701

URL: http://svn.apache.org/viewvc?rev=1161701&view=rev
Log:
clarify index_interval explanation
patch by mdennis for CASSANDRA-3074

Modified:
cassandra/branches/cassandra-0.8/conf/cassandra.yaml

Modified: cassandra/branches/cassandra-0.8/conf/cassandra.yaml
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/conf/cassandra.yaml?rev=1161701&r1=1161700&r2=1161701&view=diff
==
--- cassandra/branches/cassandra-0.8/conf/cassandra.yaml (original)
+++ cassandra/branches/cassandra-0.8/conf/cassandra.yaml Thu Aug 25 19:05:49 
2011
@@ -380,9 +380,16 @@ request_scheduler: org.apache.cassandra.
 # the request scheduling. Currently the only valid option is keyspace.
 # request_scheduler_id: keyspace
 
-# The Index Interval determines how large the sampling of row keys
-#  is for a given SSTable. The larger the sampling, the more effective
-#  the index is at the cost of space.
+# index_interval controls the sampling of entries from the primrary
+# row index in terms of space versus time.  The larger the interval,
+# the smaller and less effective the sampling will be.  In technicial
+# terms, the interval coresponds to the number of index entries that
+# are skipped between taking each sample.  All the sampled entries
+# must fit in memory.  Generally, a value between 128 and 512 here
+# coupled with a large key cache size on CFs results in the best trade
+# offs.  This value is not often changed, however if you have many
+# very small rows (many to an OS page), then increasing this will
+# often lower memory usage without a impact on performance.
 index_interval: 128
 
 # Enable or disable inter-node encryption

[jira] [Updated] (CASSANDRA-3074) comments and documentation for index_interval are misleading


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3074:
--

Affects Version/s: (was: 0.8.4)

 comments and documentation for index_interval are misleading
 

 Key: CASSANDRA-3074
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3074
 Project: Cassandra
  Issue Type: Bug
Reporter: Matthew F. Dennis
Assignee: Matthew F. Dennis
Priority: Minor
 Attachments: 3074-cassandra-0.8.patch


 The comments and documentation for index_interval are misleading.  They state 
 the larger the *sampling* the more effective the index is, at the cost of
 space.  This is true, but in the context of the configuration variable it 
 implies the larger the *setting* is the larger the index is while in fact 
 it's the opposite of that.


[jira] [Commented] (CASSANDRA-2252) arena allocation for memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091225#comment-13091225
 ] 

Yang Yang commented on CASSANDRA-2252:
--

My line of thought comes from update-heavy workloads (counters etc.). Conceivably
(putting the row key issue aside for a while) one region would contain
ByteBuffer values of similar age. As more updates come in, all the columns in
older regions are likely to have died out, thus allowing us to free the
entire region before flushing happens.


Coming back to the row key issue: in the original slab allocator paper (Jeff
Bonwick), a slab contains strictly objects of the same type, which implies that
they die at roughly the same time. If they don't, then yes, in our case the slab
has the disadvantage that an entire slab (2MB worth of memory) is held simply
because a row key in it is not dead yet. To overcome this disadvantage, we
probably need to further distinguish between the object types allocated in the
slab. This JIRA (same as the HBase code) separates the allocations of
different memtables; to work better with update-heavy traffic, we need
to distinguish between row keys and column values (they have different
lifetimes).




 arena allocation for memtables
 --

 Key: CASSANDRA-2252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2252
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 1.0

 Attachments: 0001-add-MemtableAllocator.txt, 
 0002-add-off-heap-MemtableAllocator-support.txt, 2252-v3.txt, 2252-v4.txt, 
 merged-2252.tgz


 The memtable design practically actively fights Java's GC design.  Todd 
 Lipcon gave a good explanation over on HBASE-3455.


[jira] [Issue Comment Edited] (CASSANDRA-2252) arena allocation for memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091225#comment-13091225
 ] 

Yang Yang edited comment on CASSANDRA-2252 at 8/25/11 7:09 PM:
---

My line of thought comes from update-heavy workloads (counters etc.). Conceivably
(putting the row key issue aside for a while) one region would contain
ByteBuffer values of similar age. As more updates come in, all the columns in
older regions are likely to have died out, thus allowing us to free the
entire region before flushing happens.


Coming back to the row key issue: in the original slab allocator paper (Jeff
Bonwick), a slab contains strictly objects of the same type, which implies that
they die at roughly the same time. If they don't, then yes, in our case the slab
has the disadvantage that an entire slab (2MB worth of memory) is held simply
because a row key in it is not dead yet. To overcome this disadvantage, we
probably need to further distinguish between the object types allocated in the
slab. This JIRA (same as the HBase code) separates the allocations of
different memtables; to work better with update-heavy traffic, we need
to *distinguish between row keys and column values (they have different
lifetimes)*.




 arena allocation for memtables
 --

 Key: CASSANDRA-2252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2252
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 1.0

 Attachments: 0001-add-MemtableAllocator.txt, 
 0002-add-off-heap-MemtableAllocator-support.txt, 2252-v3.txt, 2252-v4.txt, 
 merged-2252.tgz


 The memtable design practically actively fights Java's GC design.  Todd 
 Lipcon gave a good explanation over on HBASE-3455.


[jira] [Commented] (CASSANDRA-2252) arena allocation for memtables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091229#comment-13091229
 ] 

Yang Yang commented on CASSANDRA-2252:
--

Actually, even if we don't do the further optimization suggested in the last
comment (separating row keys and column values into different slab allocators),
it would still very likely work better and kill off some dead regions.

Let's say a row/column is updated 1000 times in a row, and 100 column
values fit into 2MB. To do these 1000 updates we allocate 10 regions. Only
the first region would contain the row key, so in the end all 8 regions in
the middle would die; the first one remains due to the row key, and the last
remains due to the latest (live) column value.
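
The arithmetic in this example can be checked with a small model. This is a simplification under stated assumptions: the class RegionLivenessSketch and its methods are hypothetical, the row key is assumed to sit in region 0 without taking a value slot, and each update is assumed to allocate exactly one value with regions filled in order.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of which regions still hold a live object.
class RegionLivenessSketch {
    // Number of regions needed for `updates` values at `valuesPerRegion` each.
    static int regionsAllocated(int updates, int valuesPerRegion) {
        return (updates + valuesPerRegion - 1) / valuesPerRegion;
    }

    // Indexes of regions that remain reachable after all updates: region 0
    // holds the row key, and the last region holds the newest column value.
    static Set<Integer> liveRegions(int updates, int valuesPerRegion) {
        Set<Integer> live = new TreeSet<>();
        live.add(0);
        live.add((updates - 1) / valuesPerRegion);
        return live;
    }

    public static void main(String[] args) {
        int allocated = regionsAllocated(1000, 100);
        Set<Integer> live = liveRegions(1000, 100);
        System.out.println("allocated=" + allocated
                + " live=" + live
                + " dead=" + (allocated - live.size()));
    }
}
```

Under these assumptions the model gives 10 regions allocated, regions 0 and 9 still live, and 8 regions collectable, matching the numbers in the comment.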

 arena allocation for memtables
 --

 Key: CASSANDRA-2252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2252
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 1.0

 Attachments: 0001-add-MemtableAllocator.txt, 
 0002-add-off-heap-MemtableAllocator-support.txt, 2252-v3.txt, 2252-v4.txt, 
 merged-2252.tgz


 The memtable design practically actively fights Java's GC design.  Todd 
 Lipcon gave a good explanation over on HBASE-3455.


svn commit: r1161709 - in /cassandra/trunk: ./ conf/ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/config/ src/java/org/apache/cassandra/db/ src/java/or

Author: jbellis
Date: Thu Aug 25 19:28:24 2011
New Revision: 1161709

URL: http://svn.apache.org/viewvc?rev=1161709&view=rev
Log:
merge from 0.8

Modified:
cassandra/trunk/   (props changed)
cassandra/trunk/CHANGES.txt
cassandra/trunk/build.xml
cassandra/trunk/conf/cassandra.yaml
cassandra/trunk/contrib/   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)
cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java
cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
cassandra/trunk/src/java/org/apache/cassandra/db/index/keys/KeysIndex.java
cassandra/trunk/src/java/org/apache/cassandra/io/sstable/SSTableReader.java
cassandra/trunk/src/java/org/apache/cassandra/service/GCInspector.java
cassandra/trunk/src/java/org/apache/cassandra/service/StorageService.java
cassandra/trunk/src/java/org/apache/cassandra/tools/SSTableExport.java

Propchange: cassandra/trunk/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Aug 25 19:28:24 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7:1026516-1160444,1160825
+/cassandra/branches/cassandra-0.7:1026516-1160444,1160825,1161607
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
-/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1133844,1133846-1133917,1133919-1135156,1135158-1160459,1160827
+/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1161708
 /cassandra/branches/cassandra-0.8.0:1125021-1130369
 /cassandra/branches/cassandra-0.8.1:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689

Modified: cassandra/trunk/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1161709&r1=1161708&r2=1161709&view=diff
==
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Thu Aug 25 19:28:24 2011
@@ -73,6 +73,8 @@
CompactionManager.estimatedCompactions (CASSANDRA-2708)
  * expose rpc timeouts per host in MessagingServiceMBean (CASSANDRA-2941)
  * avoid including cwd in classpath for deb and rpm packages (CASSANDRA-2881)
+ * remove gossip state when a new IP takes over a token (CASSANDRA-3071)
+ * allow sstable2json to work on index sstable files (CASSANDRA-3059)
 
 
 0.8.4

Modified: cassandra/trunk/build.xml
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/build.xml?rev=1161709&r1=1161708&r2=1161709&view=diff
==
--- cassandra/trunk/build.xml (original)
+++ cassandra/trunk/build.xml Thu Aug 25 19:28:24 2011
@@ -36,7 +36,6 @@
 <property name="build.src" value="${basedir}/src"/>
 <property name="build.src.java" value="${basedir}/src/java"/>
 <property name="build.src.resources" value="${basedir}/src/resources"/>
-<property name="build.src.driver" value="${basedir}/drivers/java/src" />
 <property name="avro.src" value="${basedir}/src/avro"/>
 <property name="build.src.gen-java" value="${basedir}/src/gen-java"/>
 <property name="build.lib" value="${basedir}/lib"/>
@@ -46,7 +45,6 @@
 <property name="build.classes" value="${build.dir}/classes"/>
 <property name="build.classes.main" value="${build.classes}/main" />
 <property name="build.classes.thrift" value="${build.classes}/thrift" />
-<property name="build.classes.cql" value="${build.classes}/cql" />
 <property name="javadoc.dir" value="${build.dir}/javadoc"/>
 <property name="javadoc.jars.dir" value="${build.dir}/javadocs"/>
 <property name="interface.dir" value="${basedir}/interface"/>
@@ -161,7 +159,6 @@
 message="Not a source artifact, stopping here." />
 <mkdir dir="${build.classes.main}"/>
 <mkdir dir="${build.classes.thrift}"/>
-<mkdir dir="${build.classes.cql}"/>
 <mkdir dir="${test.lib}"/>
 <mkdir dir="${test.classes}"/>
 <mkdir dir="${build.src.gen-java}"/>
@@ -396,7 +393,6 @@ url=${svn.entry.url}?pathrev=${svn.entry
   <dependency groupId="log4j" artifactId="log4j" version="1.2.16" />
   <dependency groupId="org.apache.cassandra" artifactId="cassandra-all" version="${version}" />
   <dependency groupId="org.apache.cassandra" artifactId="cassandra-thrift" version="${version}" />
-  <dependency groupId="org.apache.cassandra" artifactId="cassandra-cql" version="${version}" />
 </dependencyManagement>
 developer

buildbot failure in ASF Buildbot on cassandra-trunk

2011-08-25 Thread buildbot

The Buildbot has detected a new failure on builder cassandra-trunk while 
building ASF Buildbot.
Full details are available at:
 http://ci.apache.org/builders/cassandra-trunk/builds/1553

Buildbot URL: http://ci.apache.org/

Buildslave for this Build: isis_ubuntu

Build Reason: scheduler
Build Source Stamp: [branch cassandra/trunk] 1161709
Blamelist: jbellis

BUILD FAILED: failed compile

sincerely,
 -The Buildbot

svn commit: r1161712 - /cassandra/trunk/build.xml

Author: jbellis
Date: Thu Aug 25 19:33:04 2011
New Revision: 1161712

URL: http://svn.apache.org/viewvc?rev=1161712&view=rev
Log:
fix bad merge

Modified:
cassandra/trunk/build.xml

Modified: cassandra/trunk/build.xml
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/build.xml?rev=1161712&r1=1161711&r2=1161712&view=diff
==
--- cassandra/trunk/build.xml (original)
+++ cassandra/trunk/build.xml Thu Aug 25 19:33:04 2011
@@ -36,6 +36,7 @@
 <property name="build.src" value="${basedir}/src"/>
 <property name="build.src.java" value="${basedir}/src/java"/>
 <property name="build.src.resources" value="${basedir}/src/resources"/>
+<property name="build.src.driver" value="${basedir}/drivers/java/src" />
 <property name="avro.src" value="${basedir}/src/avro"/>
 <property name="build.src.gen-java" value="${basedir}/src/gen-java"/>
 <property name="build.lib" value="${basedir}/lib"/>
@@ -45,6 +46,7 @@
 <property name="build.classes" value="${build.dir}/classes"/>
 <property name="build.classes.main" value="${build.classes}/main" />
 <property name="build.classes.thrift" value="${build.classes}/thrift" />
+<property name="build.classes.cql" value="${build.classes}/cql" />
 <property name="javadoc.dir" value="${build.dir}/javadoc"/>
 <property name="javadoc.jars.dir" value="${build.dir}/javadocs"/>
 <property name="interface.dir" value="${basedir}/interface"/>
@@ -159,6 +161,7 @@
 message="Not a source artifact, stopping here." />
 <mkdir dir="${build.classes.main}"/>
 <mkdir dir="${build.classes.thrift}"/>
+<mkdir dir="${build.classes.cql}"/>
 <mkdir dir="${test.lib}"/>
 <mkdir dir="${test.classes}"/>
 <mkdir dir="${build.src.gen-java}"/>
@@ -393,6 +396,7 @@ url=${svn.entry.url}?pathrev=${svn.entry
   <dependency groupId="log4j" artifactId="log4j" version="1.2.16" />
   <dependency groupId="org.apache.cassandra" artifactId="cassandra-all" version="${version}" />
   <dependency groupId="org.apache.cassandra" artifactId="cassandra-thrift" version="${version}" />
+  <dependency groupId="org.apache.cassandra" artifactId="cassandra-cql" version="${version}" />
 </dependencyManagement>
 <developer id="alakshman" name="Avinash Lakshman"/>
 <developer id="antelder" name="Anthony Elder"/>
@@ -499,6 +503,22 @@ url=${svn.entry.url}?pathrev=${svn.entry
 <dependency groupId="org.slf4j" artifactId="slf4j-api"/>
 <dependency groupId="org.apache.thrift" artifactId="libthrift"/>
   </artifact:pom>
+  <artifact:pom id="cql-pom"
+artifactId="cassandra-cql"
+url="http://cassandra.apache.org"
+name="Apache Cassandra">
+<parent groupId="org.apache.cassandra"
+artifactId="cassandra-parent"
+version="${version}"/>
+<scm connection="${scm.connection}" developerConnection="${scm.developerConnection}" url="${scm.url}"/>
+<dependency groupId="com.google.guava" artifactId="guava"/>
+<dependency groupId="org.slf4j" artifactId="slf4j-api"/>
+<dependency groupId="org.apache.thrift" artifactId="libthrift"/>
+<dependency groupId="org.apache.cassandra" artifactId="cassandra-thrift"/>
+<dependency groupId="org.apache.cassandra" artifactId="cassandra-all"/>
+<!-- because cassandra-all uses log4j, and we need cassandra-all, consumers must use log4j, so force log4j version of slf4j -->
+<dependency groupId="org.slf4j" artifactId="slf4j-log4j12" scope="runtime"/>
+  </artifact:pom>
 
   <artifact:pom id="dist-pom"
 artifactId="apache-cassandra"
@@ -668,6 +688,11 @@ url=${svn.entry.url}?pathrev=${svn.entry
 <src path="${build.src.gen-java}"/>
 <classpath refid="cassandra.classpath"/>
 </javac>
+<javac debug="true" debuglevel="${debuglevel}"
+   destdir="${build.classes.cql}" includeantruntime="false">
+<src path="${build.src.driver}" />
+<classpath refid="cassandra.classpath"/>
+</javac>
 <copy todir="${build.classes.main}">
 <fileset dir="${build.src.resources}" />
 </copy>
@@ -725,6 +750,20 @@ url=${svn.entry.url}?pathrev=${svn.entry
 <!-- </section> -->
 </manifest>
   </jar>
+
+  <!-- CQL driver Jar -->
+  <artifact:writepom pomRefId="cql-pom"
+  file="${build.dir}/${ant.project.name}-cql-${cql.driver.version}.pom"/>
+  <jar jarfile="${build.dir}/${ant.project.name}-cql-${cql.driver.version}.jar"
+   basedir="${build.classes.cql}">
+<manifest>
+  <attribute name="Implementation-Title" value="Cassandra"/>
+  <attribute name="Implementation-Version" value="${version}"/>
+  <attribute name="Implementation-Vendor" value="Apache"/>
+  <attribute name="Class-Path"
+ value="${ant.project.name}-thrift-${version}.jar" />
+</manifest>
+  </jar>
 </target>
 
 <!--
@@ -750,11 +789,23 @@ url=${svn.entry.url}?pathrev=${svn.entry
 fileset dir=${build.src.gen-java} defaultexcludes=yes

[Cassandra Wiki] Update of ConfigurationNotes by PavelYaskevich

2011-08-25 Thread Apache Wiki

Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The ConfigurationNotes page has been changed by PavelYaskevich:
http://wiki.apache.org/cassandra/ConfigurationNotes?action=diff&rev1=9&rev2=10

  
  Per-node options are loaded from yaml and held in !DatabaseDescriptor.
  
- Per-KS, per-CF, and per-Column options are loaded from the !MigrationsTable 
at startup and are encapsulated with KSMetaData, CFMetaData, and 
!ColumnDefinition objects, which are held by !DatabaseDescriptor as well as 
!Tables and !ColumnFamilyStores respectively. When a migration arrives, it 
writes to the !MigrationsTable, then propogates the changes out to the KS/CFMD 
objects in the system.
+ Per-KS, per-CF, and per-Column options are loaded from the !MigrationsTable 
at startup and are encapsulated with !KSMetaData, !CFMetaData, and 
!ColumnDefinition objects, which are held by !Schema and !Table. When a 
migration arrives, it writes to the !MigrationsTable, then propogates the 
changes out to the KS/CFMD objects in the system.
  
  Configuration can be changed at runtime without a restart (excluding the ones 
that change on-disk format (which cannot be changed without clearing the 
cluster) and ones that change routing). For per-node options, poke 
!StorageService via JMX (which in turn pokes !DatabaseDescriptor). For per-KS 
options, poke the appropriate !Table. For per-CF and per-Column options, poke 
the appropriate !ColumnFamilyStore. These ephemeral changes are stronger than 
migrations (they stay set regardless of new config coming in), but do not 
persist between reboots.
  
@@ -22, +22 @@

  
   * define T getFoo() {return foo;} since all optional params are private
  
-  * update deflate() and inflate() to handle the new option -!CfDef and 
!CfDef-
+  * update to{Avro/Thrift}() and from{Avro/Thrift}() to handle the new option 
-!CfDef and !CfDef-
  
   * update equals(), hashcode(), and tostring() to build with the new prop
  
   * update applyImplicitDefaults()
  
-  * update convertTo{Thrift/Avro}()
- 
   * update apply() (a.k.a. applyAvroMigrationChangesToCurrentCFMD)
- 
-  * update convertToCFMetaData (a.k.a. convert thrift to CFMD and validate it)
  
   * if desired, add new option to CLI add/update CF

svn commit: r1161719 - /cassandra/trunk/src/java/org/apache/cassandra/tools/SSTableExport.java

Author: jbellis
Date: Thu Aug 25 19:49:52 2011
New Revision: 1161719

URL: http://svn.apache.org/viewvc?rev=1161719&view=rev
Log:
fix bad merge

Modified:
cassandra/trunk/src/java/org/apache/cassandra/tools/SSTableExport.java

Modified: cassandra/trunk/src/java/org/apache/cassandra/tools/SSTableExport.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/tools/SSTableExport.java?rev=1161719&r1=1161718&r2=1161719&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/tools/SSTableExport.java 
(original)
+++ cassandra/trunk/src/java/org/apache/cassandra/tools/SSTableExport.java Thu 
Aug 25 19:49:52 2011
@@ -29,6 +29,7 @@ import org.apache.cassandra.config.Colum
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.config.Schema;
 import org.apache.cassandra.db.*;
+import org.apache.cassandra.db.index.keys.KeysIndex;
 import org.apache.cassandra.db.marshal.AbstractType;
 import org.apache.cassandra.io.util.RandomAccessReader;
 import org.apache.cassandra.service.StorageService;
@@ -341,13 +342,13 @@ public class SSTableExport
 // look up index metadata from parent
 int i = descriptor.cfname.indexOf(".");
 String parentName = descriptor.cfname.substring(0, i);
-CFMetaData parent = DatabaseDescriptor.getCFMetaData(descriptor.ksname, parentName);
+CFMetaData parent = Schema.instance.getCFMetaData(descriptor.ksname, parentName);
 ColumnDefinition def = parent.getColumnDefinitionForIndex(descriptor.cfname.substring(i + 1));
-metadata = CFMetaData.newIndexMetadata(parent, def, ColumnFamilyStore.indexComparator());
+metadata = CFMetaData.newIndexMetadata(parent, def, KeysIndex.indexComparator());
 }
 else
 {
-metadata = DatabaseDescriptor.getCFMetaData(descriptor.ksname, descriptor.cfname);
+metadata = Schema.instance.getCFMetaData(descriptor.ksname, descriptor.cfname);
 }
 }
 
 export(SSTableReader.open(descriptor, metadata), outs, excludes);

buildbot success in ASF Buildbot on cassandra-trunk

2011-08-25 Thread buildbot

The Buildbot has detected a restored build on builder cassandra-trunk while 
building ASF Buildbot.
Full details are available at:
 http://ci.apache.org/builders/cassandra-trunk/builds/1555

Buildbot URL: http://ci.apache.org/

Buildslave for this Build: isis_ubuntu

Build Reason: scheduler
Build Source Stamp: [branch cassandra/trunk] 1161719
Blamelist: jbellis

Build succeeded!

sincerely,
 -The Buildbot

[jira] [Commented] (CASSANDRA-3074) comments and documentation for index_interval are misleading

2011-08-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091296#comment-13091296
 ] 

Hudson commented on CASSANDRA-3074:
---

Integrated in Cassandra-0.8 #295 (See 
[https://builds.apache.org/job/Cassandra-0.8/295/])
clarify index_interval explanation
patch by mdennis for CASSANDRA-3074

jbellis : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1161701
Files : 
* /cassandra/branches/cassandra-0.8/conf/cassandra.yaml


 comments and documentation for index_interval are misleading
 

 Key: CASSANDRA-3074
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3074
 Project: Cassandra
  Issue Type: Bug
Reporter: Matthew F. Dennis
Assignee: Matthew F. Dennis
Priority: Minor
 Fix For: 0.8.5

 Attachments: 3074-cassandra-0.8.patch


 The comments and documentation for index_interval are misleading.  They state 
 that the larger the *sampling*, the more effective the index is, at the cost of 
 space.  This is true, but in the context of the configuration variable it 
 implies that the larger the *setting* is, the larger the index is, while in 
 fact it's the opposite.
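
 The relationship described above can be sketched in a few lines. This is only
 an illustration under a simplifying assumption (one in-memory sample is kept
 for every index_interval-th key); the class and method names are hypothetical
 and are not taken from Cassandra's code.

```java
// Hypothetical illustration; not Cassandra's actual index sampling code.
final class IndexIntervalSketch {
    /** Number of in-memory samples if one sample is kept per `indexInterval` keys (assumption). */
    static long sampledKeys(long totalKeys, int indexInterval) {
        if (indexInterval <= 0) {
            throw new IllegalArgumentException("indexInterval must be positive");
        }
        // Ceiling division: a partial final interval still needs one sample.
        return (totalKeys + indexInterval - 1) / indexInterval;
    }

    public static void main(String[] args) {
        long keys = 1_000_000L;
        // A larger setting means fewer samples, i.e. a smaller in-memory index.
        for (int interval : new int[] {64, 128, 512}) {
            System.out.println("index_interval=" + interval
                    + " -> " + sampledKeys(keys, interval) + " samples");
        }
    }
}
```

 Under this assumption, doubling the setting roughly halves the number of
 samples, which is the opposite of what the old comments suggested.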

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Created] (CASSANDRA-3078) Make Secondary Indexes Pluggable

Make Secondary Indexes Pluggable 
-

 Key: CASSANDRA-3078
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3078
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: T Jake Luciani
Assignee: T Jake Luciani
 Fix For: 1.0


CASSANDRA-2982 got us most of the way there...

This ticket removes the IndexType enum (while keeping support for KEYS 
internally from old cf metadata).

You now specify an index_class rather than index_type.  index_class is the full 
classname of the SecondaryIndex impl.  This also adds an index_options map to 
pass extra info to the secondary index impl if needed.
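
As a rough sketch of what "specify a class name plus an options map" could look
like, the snippet below loads an implementation reflectively and hands it the
options. Everything here is hypothetical: SecondaryIndexLike, KeysLikeIndex and
createIndex are stand-ins invented for illustration, not the interfaces in the
attached patch.

```java
import java.util.HashMap;
import java.util.Map;

final class PluggableIndexSketch {
    /** Hypothetical stand-in for the SecondaryIndex contract. */
    interface SecondaryIndexLike {
        void init(Map<String, String> indexOptions);
        String describe();
    }

    /** Example implementation that a column definition could name in index_class. */
    public static final class KeysLikeIndex implements SecondaryIndexLike {
        private Map<String, String> options = new HashMap<>();

        public KeysLikeIndex() {}

        @Override public void init(Map<String, String> indexOptions) {
            options = new HashMap<>(indexOptions);
        }

        @Override public String describe() {
            return "KeysLikeIndex" + options;
        }
    }

    /** Instantiates the class named by indexClass and passes it the index_options map. */
    static SecondaryIndexLike createIndex(String indexClass, Map<String, String> indexOptions)
            throws ReflectiveOperationException {
        Object instance = Class.forName(indexClass).getDeclaredConstructor().newInstance();
        if (!(instance instanceof SecondaryIndexLike)) {
            throw new IllegalArgumentException(indexClass + " is not a secondary index implementation");
        }
        SecondaryIndexLike index = (SecondaryIndexLike) instance;
        index.init(indexOptions);
        return index;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> options = new HashMap<>();
        options.put("example_option", "value");
        SecondaryIndexLike index = createIndex(KeysLikeIndex.class.getName(), options);
        System.out.println(index.describe());
    }
}
```

Resolving the class by name keeps the core code free of a fixed enum of index
types, which is the point of dropping IndexType.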




[jira] [Updated] (CASSANDRA-3025) PHP/PDO driver for Cassandra CQL

2011-08-25 Thread Mikko Koppanen (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikko Koppanen updated CASSANDRA-3025:
--

Attachment: pdo_cassandra-0.1.2.tgz

Hi,

PDO doesn't seem to support a different number of columns per row, which is 
slightly problematic with sparse columns. I implemented the following 
workaround for now:

https://github.com/mkoppanen/php-pdo_cassandra/blob/master/tests/017-sparsecolumns.phpt

The columns that are not set for the row are named __column_not_set_%d, which 
I think is about the cleanest way to do it.
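
The padding idea can be illustrated as follows. This is a sketch of the general
technique only, written in Java rather than taken from the extension's C/C++
source; the assumptions are that every row is padded to the width of the widest
row and that %d is the column position. Only the placeholder name
__column_not_set_%d comes from the comment above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not the pdo_cassandra implementation.
final class SparseRowPadding {
    /** Pads every row to the widest row, naming the filler columns __column_not_set_<position>. */
    static List<Map<String, String>> pad(List<Map<String, String>> rows) {
        int width = 0;
        for (Map<String, String> row : rows) {
            width = Math.max(width, row.size());
        }
        List<Map<String, String>> padded = new ArrayList<>();
        for (Map<String, String> row : rows) {
            Map<String, String> copy = new LinkedHashMap<>(row); // keeps column order
            for (int i = row.size(); i < width; i++) {
                copy.put(String.format("__column_not_set_%d", i), null);
            }
            padded.add(copy);
        }
        return padded;
    }

    public static void main(String[] args) {
        Map<String, String> full = new LinkedHashMap<>();
        full.put("a", "1");
        full.put("b", "2");
        Map<String, String> sparse = new LinkedHashMap<>();
        sparse.put("a", "9");
        List<Map<String, String>> rows = new ArrayList<>();
        rows.add(full);
        rows.add(sparse);
        System.out.println(pad(rows));
    }
}
```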

The test for integers is updated here:
https://github.com/mkoppanen/php-pdo_cassandra/blob/master/tests/018-int.phpt

UUID behaviour is here:
https://github.com/mkoppanen/php-pdo_cassandra/blob/master/tests/019-uuid.phpt
For UUIDs I was thinking about adding an additional configuration option to 
automatically unparse them into their string representation.

Test with the available data types as values:
https://github.com/mkoppanen/php-pdo_cassandra/blob/master/tests/020-types.phpt

And a test using bigint comparator + sparse columns:
https://github.com/mkoppanen/php-pdo_cassandra/blob/master/tests/021-comparators.phpt


 PHP/PDO driver for Cassandra CQL
 

 Key: CASSANDRA-3025
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3025
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Mikko Koppanen
  Labels: php
 Attachments: pdo_cassandra-0.1.0.tgz, pdo_cassandra-0.1.1.tgz, 
 pdo_cassandra-0.1.2.tgz, php_test_results_20110818_2317.txt


 Hello,
 attached is the initial version of the PDO driver for the Cassandra CQL 
 language. This is a native PHP extension written in what I would call a 
 combination of C and C++, due to PHP being C. The Thrift API used is the C++ 
 one.
 The API looks roughly like the following:
 {code}
 <?php
 $db = new PDO('cassandra:host=127.0.0.1;port=9160');
 $db->exec ("CREATE KEYSPACE mytest with strategy_class = 'SimpleStrategy' and strategy_options:replication_factor=1;");
 $db->exec ("USE mytest");
 $db->exec ("CREATE COLUMNFAMILY users (
   my_key varchar PRIMARY KEY,
   full_name varchar );");
   
 $stmt = $db->prepare ("INSERT INTO users (my_key, full_name) VALUES (:key, :full_name);");
 $stmt->execute (array (':key' => 'mikko', ':full_name' => 'Mikko K' ));
 {code}
 Currently prepared statements are emulated on the client side, but I 
 understand that there is a plan to add prepared statements to the Cassandra 
 CQL API as well. I will add this feature to the extension as soon as they are 
 implemented.
 Additional documentation can be found on GitHub at 
 https://github.com/mkoppanen/php-pdo_cassandra, in the form of a rendered 
 Markdown file. Tests are currently not included in the package file; for now 
 they can be found on GitHub as well.
 I have created documentation in docbook format as well, but have not yet 
 rendered it.
 Comments and feedback are welcome.
 Thanks,
 Mikko


[jira] [Updated] (CASSANDRA-3078) Make Secondary Indexes Pluggable


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-3078:
--

Attachment: 3078.txt

 Make Secondary Indexes Pluggable 
 -

 Key: CASSANDRA-3078
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3078
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.0
Reporter: T Jake Luciani
Assignee: T Jake Luciani
  Labels: secondary_index
 Fix For: 1.0

 Attachments: 3078.txt



[jira] [Updated] (CASSANDRA-3078) Make Secondary Indexes Pluggable


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-3078:
--

Attachment: 3078_thrift.txt

 Make Secondary Indexes Pluggable 
 -

 Key: CASSANDRA-3078
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3078
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.0
Reporter: T Jake Luciani
Assignee: T Jake Luciani
  Labels: secondary_index
 Fix For: 1.0

 Attachments: 3078.txt, 3078_thrift.txt



[jira] [Commented] (CASSANDRA-957) convenience workflow for replacing dead node

2011-08-25 Thread Nick Bailey (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091364#comment-13091364
 ] 

Nick Bailey commented on CASSANDRA-957:
---

The hint rework looks good. The only comment I have there is that it would be 
nice if the logging statements for sending hints and creating hints indicated 
the IP as well as the token. Even though it's stored by token, it would be nice 
to immediately see the IP in the log without having to look it up.

I'm also unsure about the reasoning behind the last patch. Why increase the 
initial sleep in joinTokenRing?

 convenience workflow for replacing dead node
 

 Key: CASSANDRA-957
 URL: https://issues.apache.org/jira/browse/CASSANDRA-957
 Project: Cassandra
  Issue Type: Wish
  Components: Core, Tools
Affects Versions: 0.8.2
Reporter: Jonathan Ellis
Assignee: Vijay
 Fix For: 1.0

 Attachments: 0001-Support-Token-Replace.patch, 
 0001-Support-bringing-back-a-node-to-the-cluster-that-exi.patch, 
 0001-Support-token-replace.patch, 0001-support-for-replace-token-v3.patch, 
 0002-Do-not-include-local-node-when-computing-workMap.patch, 
 0002-Rework-Hints-to-be-on-token.patch, 
 0002-Rework-Hints-to-be-on-token.patch, 
 0002-upport-for-hints-on-token-v3.patch, 
 0003-Make-HintedHandoff-More-reliable.patch, 
 0003-Make-hints-More-reliable.patch, 0003-making-bootstrap-sleep-longer.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 Replacing a dead node with a new one is a common operation, but nodetool 
 removetoken followed by bootstrap is inefficient (re-replicating data first 
 to the remaining nodes, then to the new one) and manually bootstrapping to a 
 token just less than the old one's, followed by nodetool removetoken is 
 slightly painful and prone to manual errors.
 First question: how would you expose this in our tool ecosystem?  It needs to 
 be a startup-time option to the new node, so it can't be nodetool, and 
 messing with the config xml definitely takes the convenience out.  A 
 one-off -DreplaceToken=XXY argument?
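
 A startup-time option of this kind could be read as a JVM system property. The
 sketch below assumes the property name replaceToken from the suggestion above;
 the class and method names are hypothetical and this is not the code in the
 attached patches.

```java
// Hypothetical illustration of a one-off -DreplaceToken=... startup argument.
final class ReplaceTokenOption {
    /** Property name taken from the -DreplaceToken=XXY suggestion in the ticket. */
    static final String PROPERTY = "replaceToken";

    /** Returns the token to replace, or null when the node should start normally. */
    static String tokenToReplace() {
        String value = System.getProperty(PROPERTY);
        if (value == null || value.trim().isEmpty()) {
            return null;
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // e.g. java -DreplaceToken=12345 ReplaceTokenOption
        String token = tokenToReplace();
        System.out.println(token == null ? "normal startup" : "replacing token " + token);
    }
}
```

 Because the value lives only on the command line, nothing has to be removed
 from the node's configuration after the replacement completes.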


[jira] [Commented] (CASSANDRA-957) convenience workflow for replacing dead node

[
https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091376#comment-13091376
]

Vijay commented on CASSANDRA-957:
-

* In Gossiper.doStatusCheck() you made it ignore any state that is for the
local endpoint and is not a dead state. Shouldn't it just always ignore any
state about the local endpoint though? Basically what it was doing previously?
* Basically the same question about Gossiper.applyStateLocally(): the loop
continues if the state is for the local node and the state is dead. Why would
we want to apply a live local state?
- Fixed. The initial intention was to find the old state of the node; it seems
that is not possible now.

* Does the hibernate state need the true/false value? Seems like all we care
about is that it is set at all. Looks like when we are starting up right now we
automatically go into a hibernate state, then we go into a bootstrap state
afterwards if they specified a replace token. Seems like we shouldn't set a
state at all until we know we are doing one of replace/bootstrap/just joining.
- It will be either true or false (if not a replace, or overwritten with the
normal state). If you don't set it, Gossiper.applyStateLocally will mark the
node alive on all the other nodes.

* It looks like right now you could specify a replace token that isn't part of
the cluster. If that happens we should throw an exception and tell the user to
do the normal bootstrap process.
- As we are ignoring the local states, this information is hard to gather when
we are trying to replace the same node. The check is to see that no other live
node owns this token.
- We can document in the wiki the effects of replacing a token that is not
part of the ring (repair/decommission).

* Why use the last gossip time to determine if the node we are replacing is
alive? Why not just check gossip to see if the ring thinks it is alive?
- Because by default, when we hear about someone, we consider them to be alive.
The idea is to check whether we have heard back from them after the ring
delay; if not, it is more probable that the dead node really is dead. (That's
why we have to wait for 90 + delay.)

* We should update the message for the exception that is thrown when you
try to bootstrap to an existing token. It should indicate either remove the
dead node or follow this replacement process.
- I am not sure I parse that; I have added more to it, please check.

* I'm not sure why we are calling updateNormalToken() in the
StorageService.bootstrap() method when it's a token replacement.
- That's because you don't want the range request sent to a node that does
not exist.

* A little bit of doc on this would be good, maybe in cassandra.yaml? Just on
how to pass the argument to the startup process.
- YAML is a bad fit because this is a one-time thing. A wiki page, as for the
don't-join-ring property?
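
The "have we heard back after the ring delay" check discussed above could look
roughly like the sketch below. It is an illustration only: the method name, the
use of wall-clock milliseconds, and the 30-second delay in the example are
assumptions, not values or code from the patch.

```java
// Hypothetical illustration; not the Gossiper code from the patch.
final class ReplaceLivenessCheck {
    /**
     * Treats the node being replaced as dead only if nothing has been heard
     * from it for longer than the ring delay.
     */
    static boolean looksDead(long lastGossipMillis, long nowMillis, long ringDelayMillis) {
        return nowMillis - lastGossipMillis > ringDelayMillis;
    }

    public static void main(String[] args) {
        long ringDelay = 30_000L; // assumed value, for illustration only
        long now = System.currentTimeMillis();
        System.out.println("silent for 60s: " + looksDead(now - 60_000L, now, ringDelay));
        System.out.println("heard 5s ago:   " + looksDead(now - 5_000L, now, ringDelay));
    }
}
```

Waiting before running such a check matters because, as noted above, a node is
considered alive as soon as it is first heard about.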

convenience workflow for replacing dead node

Key: CASSANDRA-957
URL: https://issues.apache.org/jira/browse/CASSANDRA-957
Project: Cassandra
Issue Type: Wish
Components: Core, Tools
Affects Versions: 0.8.2
Reporter: Jonathan Ellis
Assignee: Vijay
Fix For: 1.0

Attachments: 0001-Support-Token-Replace.patch,
0001-Support-bringing-back-a-node-to-the-cluster-that-exi.patch,
0001-Support-token-replace.patch, 0001-support-for-replace-token-v3.patch,
0002-Do-not-include-local-node-when-computing-workMap.patch,
0002-Rework-Hints-to-be-on-token.patch,
0002-Rework-Hints-to-be-on-token.patch,
0002-upport-for-hints-on-token-v3.patch,
0003-Make-HintedHandoff-More-reliable.patch,
0003-Make-hints-More-reliable.patch, 0003-making-bootstrap-sleep-longer.patch

Original Estimate: 24h
Remaining Estimate: 24h


[jira] [Updated] (CASSANDRA-957) convenience workflow for replacing dead node


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-957:


Attachment: 0001-support-token-replace-v4.patch

Attaching newer version with fix and rebase.

 convenience workflow for replacing dead node
 

 Key: CASSANDRA-957
 URL: https://issues.apache.org/jira/browse/CASSANDRA-957
 Project: Cassandra
  Issue Type: Wish
  Components: Core, Tools
Affects Versions: 0.8.2
Reporter: Jonathan Ellis
Assignee: Vijay
 Fix For: 1.0

 Attachments: 0001-Support-Token-Replace.patch, 
 0001-Support-bringing-back-a-node-to-the-cluster-that-exi.patch, 
 0001-Support-token-replace.patch, 0001-support-for-replace-token-v3.patch, 
 0001-support-token-replace-v4.patch, 
 0002-Do-not-include-local-node-when-computing-workMap.patch, 
 0002-Rework-Hints-to-be-on-token.patch, 
 0002-Rework-Hints-to-be-on-token.patch, 
 0002-upport-for-hints-on-token-v3.patch, 
 0003-Make-HintedHandoff-More-reliable.patch, 
 0003-Make-hints-More-reliable.patch, 0003-making-bootstrap-sleep-longer.patch

   Original Estimate: 24h
  Remaining Estimate: 24h


[jira] [Commented] (CASSANDRA-957) convenience workflow for replacing dead node


[ 
https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091386#comment-13091386
 ] 

Vijay commented on CASSANDRA-957:
-

I'm also unsure about the reasoning behind the last patch. Why increase the 
initial sleep in joinTokenRing?
 -- Ring delay + extra time so we can check if there is any live server before 
actually replacing the node.

 convenience workflow for replacing dead node
 

 Key: CASSANDRA-957
 URL: https://issues.apache.org/jira/browse/CASSANDRA-957
 Project: Cassandra
  Issue Type: Wish
  Components: Core, Tools
Affects Versions: 0.8.2
Reporter: Jonathan Ellis
Assignee: Vijay
 Fix For: 1.0

 Attachments: 0001-Support-Token-Replace.patch, 
 0001-Support-bringing-back-a-node-to-the-cluster-that-exi.patch, 
 0001-Support-token-replace.patch, 0001-support-for-replace-token-v3.patch, 
 0001-support-token-replace-v4.patch, 
 0002-Do-not-include-local-node-when-computing-workMap.patch, 
 0002-Rework-Hints-to-be-on-token.patch, 
 0002-Rework-Hints-to-be-on-token.patch, 
 0002-upport-for-hints-on-token-v3.patch, 
 0003-Make-HintedHandoff-More-reliable.patch, 
 0003-Make-hints-More-reliable.patch, 0003-making-bootstrap-sleep-longer.patch

   Original Estimate: 24h
  Remaining Estimate: 24h


[jira] [Updated] (CASSANDRA-3079) Allow the client request scheduler to throw Timeouts

2011-08-25 Thread Stu Hood (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-3079:


Reviewer: lenn0x  (was: chrisg)

 Allow the client request scheduler to throw Timeouts
 

 Key: CASSANDRA-3079
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3079
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Reporter: Stu Hood
Assignee: Stu Hood
Priority: Minor
 Fix For: 1.0


 The RoundRobinScheduler prioritizes threads by allowing them to queue on a 
 SynchronousQueue per scheduling bucket. These queues currently do not use 
 timeouts, and we observed a cascading case where client retries caused the 
 scheduler queues to fill such that latency was way above the client timeout.
 By allowing the IRequestScheduler.queue method to throw a (per-call 
 configurable) timeout, we can avoid this cascading.


[jira] [Updated] (CASSANDRA-957) convenience workflow for replacing dead node


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-957:


Attachment: 0002-hints-on-token-than-ip-v4.patch

Added more logging; for RMV, I am not sure whether we have to parse the string 
to a token and then to an IP.

 convenience workflow for replacing dead node
 

 Key: CASSANDRA-957
 URL: https://issues.apache.org/jira/browse/CASSANDRA-957
 Project: Cassandra
  Issue Type: Wish
  Components: Core, Tools
Affects Versions: 0.8.2
Reporter: Jonathan Ellis
Assignee: Vijay
 Fix For: 1.0

 Attachments: 0001-Support-Token-Replace.patch, 
 0001-Support-bringing-back-a-node-to-the-cluster-that-exi.patch, 
 0001-Support-token-replace.patch, 0001-support-for-replace-token-v3.patch, 
 0001-support-token-replace-v4.patch, 
 0002-Do-not-include-local-node-when-computing-workMap.patch, 
 0002-Rework-Hints-to-be-on-token.patch, 
 0002-Rework-Hints-to-be-on-token.patch, 0002-hints-on-token-than-ip-v4.patch, 
 0002-upport-for-hints-on-token-v3.patch, 
 0003-Make-HintedHandoff-More-reliable.patch, 
 0003-Make-hints-More-reliable.patch, 0003-making-bootstrap-sleep-longer.patch

   Original Estimate: 24h
  Remaining Estimate: 24h


[jira] [Updated] (CASSANDRA-3079) Allow the client request scheduler to throw Timeouts

2011-08-25 Thread Stu Hood (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-3079:


Attachment: 0002-Fix-try-finally-nesting-for-scheduling.txt
0001-Add-timeouts-to-request-scheduling.txt

0001 Adds the timeouts mentioned above (currently all schedule() calls use 
RpcTimeout, pending commit of CASSANDRA-2819)
0002 Fixes the try-finally nesting of schedule calls to avoid spurious 
release() calls due to timeouts
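
The two changes can be sketched together. Note the substitutions: a
java.util.concurrent.Semaphore stands in for the per-bucket SynchronousQueue of
the real RoundRobinScheduler, and TimedSchedulerSketch, queue, release and
handle are hypothetical names, not the patched IRequestScheduler API. The point
shown is that queue() sits outside the try/finally, so a timeout never leads to
a release() for a slot that was never acquired.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration; a Semaphore replaces the real scheduler's queues.
final class TimedSchedulerSketch {
    private final Semaphore slots;

    TimedSchedulerSketch(int permits) {
        slots = new Semaphore(permits);
    }

    /** Waits for a slot, giving up with a TimeoutException after timeoutMillis. */
    void queue(long timeoutMillis) throws TimeoutException, InterruptedException {
        if (!slots.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
            throw new TimeoutException("timed out after " + timeoutMillis + " ms waiting to be scheduled");
        }
    }

    void release() {
        slots.release();
    }

    int available() {
        return slots.availablePermits();
    }

    /** Request handling with queue() outside the try/finally that releases the slot. */
    static String handle(TimedSchedulerSketch scheduler, long timeoutMillis) throws InterruptedException {
        try {
            scheduler.queue(timeoutMillis);
        } catch (TimeoutException e) {
            return "timed out"; // no slot was acquired, so nothing to release
        }
        try {
            return "done"; // the actual request work would go here
        } finally {
            scheduler.release();
        }
    }

    public static void main(String[] args) throws Exception {
        TimedSchedulerSketch scheduler = new TimedSchedulerSketch(1);
        scheduler.queue(10);                       // occupy the only slot
        System.out.println(handle(scheduler, 10)); // times out, slot count unchanged
        scheduler.release();
        System.out.println(handle(scheduler, 10)); // succeeds and releases afterwards
    }
}
```

If queue() were inside the same try block as the work, the finally clause would
call release() even after a timeout, which is the spurious release that 0002
describes.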

 Allow the client request scheduler to throw Timeouts
 

 Key: CASSANDRA-3079
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3079
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Reporter: Stu Hood
Assignee: Stu Hood
Priority: Minor
 Fix For: 1.0

 Attachments: 0001-Add-timeouts-to-request-scheduling.txt, 
 0002-Fix-try-finally-nesting-for-scheduling.txt



[jira] [Commented] (CASSANDRA-2434) node bootstrapping can violate consistency

2011-08-25 Thread paul cannon (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091418#comment-13091418
]

paul cannon commented on CASSANDRA-2434:

So, it looks like it will be possible for the node-that-will-be-removed to
change between starting a bootstrap and finishing it (other nodes being
bootstrapped/moved/decom'd during that time period); in some cases, that could
still lead to a consistency violation. Is that unlikely enough that we don't
care, here? At least the situation would be better with the proposed fix than
it is now.

Second question: what might the permission from the operator to choose a
replica that is closer/less dead look like? Maybe just a boolean flag saying
it's ok to stream from any node for any range you need to stream? Or would
we want to allow specifying precise source nodes for any/all affected address
ranges?

node bootstrapping can violate consistency
--

Key: CASSANDRA-2434
URL: https://issues.apache.org/jira/browse/CASSANDRA-2434
Project: Cassandra
Issue Type: Bug
Reporter: Peter Schuller
Assignee: paul cannon
Fix For: 1.1

My reading (a while ago) of the code indicates that there is no logic
involved during bootstrapping that avoids consistency level violations. If I
recall correctly it just grabs neighbors that are currently up.
There are at least two issues I have with this behavior:
* If I have a cluster where I have applications relying on QUORUM with RF=3,
and bootstrapping completes based on only one node, I have just violated the
supposedly guaranteed consistency semantics of the cluster.
* Nodes can flap up and down at any time, so even if a human takes care to
look at which nodes are up and thinks about it carefully before
bootstrapping, there's no guarantee.
A complication is that whether this is an issue depends on the use case
(if everything you do is at CL.ONE, it's fine); and even in a cluster
which is otherwise used for QUORUM operations you may wish to accept
less-than-quorum nodes during bootstrap in various emergency situations.
A potential easy fix is to have bootstrap take an argument which is the
number of hosts to bootstrap from, or to assume QUORUM if none is given.
(A related concern is bootstrapping across data centers. You may *want* to
bootstrap from a local node and then do a repair to avoid sending loads of data
across DCs while still achieving consistency. Or even if you don't care
about the consistency issues, I don't think there is currently a way to
bootstrap from local nodes only.)
Thoughts?
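
The "potential easy fix" in the description (an optional count of hosts to bootstrap from, defaulting to QUORUM) could be sketched roughly as follows. The class and method names are hypothetical and do not correspond to existing Cassandra bootstrap code; the only fact relied on is that QUORUM means a strict majority of the replication factor.

```java
// Hypothetical helper; Cassandra's bootstrap code does not expose these names.
final class BootstrapSources
{
    // QUORUM for a given replication factor: a strict majority of replicas.
    static int quorum(int replicationFactor)
    {
        return replicationFactor / 2 + 1;
    }

    // Number of live replicas bootstrap should insist on:
    // the operator's value if one was given, otherwise QUORUM.
    static int requiredSources(Integer requested, int replicationFactor)
    {
        return requested != null ? requested : quorum(replicationFactor);
    }

    // True if enough replicas are up to satisfy the requirement.
    static boolean canBootstrap(int liveReplicas, Integer requested, int replicationFactor)
    {
        return liveReplicas >= requiredSources(requested, replicationFactor);
    }
}
```

With RF=3 and no explicit argument, a single live neighbor is rejected; passing 1 explicitly covers the emergency case where less-than-quorum is acceptable.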

[jira] [Commented] (CASSANDRA-3079) Allow the client request scheduler to throw Timeouts

2011-08-25 Thread Melvin Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091421#comment-13091421
 ] 

Melvin Wang commented on CASSANDRA-3079:


looks good to me.

[jira] [Created] (CASSANDRA-3080) Add throttling for internode streaming

2011-08-25 Thread Stu Hood (JIRA)

Add throttling for internode streaming
--

 Key: CASSANDRA-3080
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3080
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
 Fix For: 1.0


Cassandra does (mostly) sequential reads from disk to send data to other nodes, 
which means that it is easily possible to stream upwards of 100 MB/s per source 
node.

To avoid affecting service, we should add streaming throttling across all 
streams in the outbound direction, preferably configurable from JMX, and with 
`nodetool netstats` integration.
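
A minimal sketch of what a node-wide outbound throttle might look like, assuming every stream reports the bytes it has written to one shared object. `StreamThrottle` and its methods are hypothetical names, not an existing Cassandra API; the rate must be positive, and JMX or `nodetool netstats` wiring is left out.

```java
// Hypothetical sketch of a shared outbound throttle; names are illustrative only.
final class StreamThrottle
{
    private volatile long bytesPerSecond;   // could be changed at runtime, e.g. through a JMX setter
    private long windowStartNanos = System.nanoTime();
    private long bytesInWindow = 0;

    StreamThrottle(long bytesPerSecond) { this.bytesPerSecond = bytesPerSecond; }

    void setBytesPerSecond(long v) { bytesPerSecond = v; }

    // Pure helper: how long sending `bytes` should take at `rate` bytes/s, in nanoseconds.
    static long minNanosFor(long bytes, long rate)
    {
        return bytes * 1000000000L / rate;
    }

    // Every outbound stream calls this after writing a chunk; it sleeps when the
    // node as a whole is ahead of the configured rate.
    synchronized void throttle(long bytesSent) throws InterruptedException
    {
        bytesInWindow += bytesSent;
        long elapsed = System.nanoTime() - windowStartNanos;
        long expected = minNanosFor(bytesInWindow, bytesPerSecond);
        if (expected > elapsed)
            Thread.sleep((expected - elapsed) / 1000000L);   // nanos to millis
        if (elapsed > 1000000000L)
        {
            // start a fresh accounting window roughly once per second
            windowStartNanos = System.nanoTime();
            bytesInWindow = 0;
        }
    }
}
```

Because the sleep happens while holding the lock, all streams share one budget rather than each getting the full rate, which matches the "across all streams" requirement.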

[jira] [Commented] (CASSANDRA-3025) PHP/PDO driver for Cassandra CQL


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091520#comment-13091520
 ] 

Jonathan Ellis commented on CASSANDRA-3025:
---

bq. The columns that are not set for the row are named _column_not_set%d, 
which I think is about the cleanest way to do it

I don't think I understand.  Can we just set them to null instead?

bq. For UUIDs I was thinking about adding additional configuration option to 
automatically unparse them into string representation

Why not unparse into objects as in the phpcassa link above?

 PHP/PDO driver for Cassandra CQL
 

 Key: CASSANDRA-3025
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3025
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Mikko Koppanen
  Labels: php
 Attachments: pdo_cassandra-0.1.0.tgz, pdo_cassandra-0.1.1.tgz, 
 pdo_cassandra-0.1.2.tgz, php_test_results_20110818_2317.txt


 Hello,
 attached is the initial version of the PDO driver for the Cassandra CQL language. 
 This is a native PHP extension written in what I would call a combination of 
 C and C++, because PHP itself is written in C. The Thrift API used is the C++ one.
 The API looks roughly as follows:
 {code}
 <?php
 $db = new PDO('cassandra:host=127.0.0.1;port=9160');
 $db->exec ("CREATE KEYSPACE mytest with strategy_class = 'SimpleStrategy' and strategy_options:replication_factor=1;");
 $db->exec ("USE mytest");
 $db->exec ("CREATE COLUMNFAMILY users (
   my_key varchar PRIMARY KEY,
   full_name varchar );");
   
 $stmt = $db->prepare ("INSERT INTO users (my_key, full_name) VALUES (:key, :full_name);");
 $stmt->execute (array (':key' => 'mikko', ':full_name' => 'Mikko K' ));
 {code}
 Currently prepared statements are emulated on the client side, but I 
 understand that there is a plan to add prepared statements to the Cassandra CQL 
 API as well. I will add this feature to the extension as soon as they are 
 implemented.
 Additional documentation can be found on GitHub at 
 https://github.com/mkoppanen/php-pdo_cassandra, in the form of a rendered 
 Markdown file. Tests are currently not included in the package file; they 
 can be found on GitHub for now as well.
 I have created documentation in DocBook format as well, but have not yet 
 rendered it.
 Comments and feedback are welcome.
 Thanks,
 Mikko

[jira] [Commented] (CASSANDRA-3078) Make Secondary Indexes Pluggable


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091522#comment-13091522
 ] 

Jonathan Ellis commented on CASSANDRA-3078:
---

skimmed:

- hashconstruct addition should be moved into a separate ticket and used for 
strategy_options as well
- we should default to a package (org.apache.cassandra.db.index) if none is 
specified in class name so using built-ins is that much less of a pita
- style: spaces not tabs, space after //, space before () in conditionals + 
loops
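
The second point (defaulting to a package when none is given in the class name) might look something like the sketch below. `IndexClassNames` and `resolve` are hypothetical names, and `MyIndex` in the usage note is a made-up class; only the package `org.apache.cassandra.db.index` comes from the comment above.

```java
// Hypothetical helper illustrating the "default package" suggestion; not code from the patch.
final class IndexClassNames
{
    static final String DEFAULT_PACKAGE = "org.apache.cassandra.db.index";

    // A name containing a dot is treated as already fully qualified;
    // a bare name is looked up in the default package.
    static String resolve(String indexClass)
    {
        return indexClass.indexOf('.') >= 0 ? indexClass : DEFAULT_PACKAGE + "." + indexClass;
    }
}
```

Under this assumption, `resolve("MyIndex")` yields `org.apache.cassandra.db.index.MyIndex`, while `resolve("com.example.MyIndex")` is returned unchanged.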

 Make Secondary Indexes Pluggable 
 -

 Key: CASSANDRA-3078
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3078
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.0
Reporter: T Jake Luciani
Assignee: T Jake Luciani
  Labels: secondary_index
 Fix For: 1.0

 Attachments: 3078.txt, 3078_thrift.txt


 CASSANDRA-2982 got us most of the way there...
 This ticket removes the IndexType enum (while keeping support for KEYS 
 internally from old cf metadata).
 You now specify an index_class rather than an index_type.  index_class is the 
 full class name of the SecondaryIndex impl.  This also adds an index_options 
 map to pass extra info to the secondary index impl if needed.

svn commit: r1161983 - in /cassandra/trunk: ./ src/java/org/apache/cassandra/scheduler/ src/java/org/apache/cassandra/thrift/