svn commit: r1042730 - in /cassandra/branches/cassandra-0.7: CHANGES.txt build.xml debian/changelog

2010-12-06 Thread eevans
Author: eevans
Date: Mon Dec  6 17:22:31 2010
New Revision: 1042730

URL: http://svn.apache.org/viewvc?rev=1042730&view=rev
Log:
update versioning for 0.7 rc2 release

Modified:
cassandra/branches/cassandra-0.7/CHANGES.txt
cassandra/branches/cassandra-0.7/build.xml
cassandra/branches/cassandra-0.7/debian/changelog

Modified: cassandra/branches/cassandra-0.7/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1042730&r1=1042729&r2=1042730&view=diff
==
--- cassandra/branches/cassandra-0.7/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.7/CHANGES.txt Mon Dec  6 17:22:31 2010
@@ -1,4 +1,4 @@
-dev
+0.7.0-rc2
  * fix live-column-count of slice ranges including tombstoned supercolumn 
with live subcolumn (CASSANDRA-1591)
  * rename o.a.c.internal.AntientropyStage -> AntiEntropyStage,

Modified: cassandra/branches/cassandra-0.7/build.xml
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/build.xml?rev=1042730&r1=1042729&r2=1042730&view=diff
==
--- cassandra/branches/cassandra-0.7/build.xml (original)
+++ cassandra/branches/cassandra-0.7/build.xml Mon Dec  6 17:22:31 2010
@@ -47,7 +47,7 @@
 <property name="test.unit.src" value="${test.dir}/unit"/>
 <property name="test.long.src" value="${test.dir}/long"/>
 <property name="dist.dir" value="${build.dir}/dist"/>
-<property name="base.version" value="0.7.0-rc1"/>
+<property name="base.version" value="0.7.0-rc2"/>
 <condition property="version" value="${base.version}">
   <isset property="release"/>
 </condition>

Modified: cassandra/branches/cassandra-0.7/debian/changelog
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/debian/changelog?rev=1042730&r1=1042729&r2=1042730&view=diff
==
--- cassandra/branches/cassandra-0.7/debian/changelog (original)
+++ cassandra/branches/cassandra-0.7/debian/changelog Mon Dec  6 17:22:31 2010
@@ -1,3 +1,9 @@
+cassandra (0.7.0~rc2) unstable; urgency=low
+
+  * Release candidate release.
+
+ -- Eric Evans eev...@apache.org  Mon, 06 Dec 2010 11:19:40 -0600
+
 cassandra (0.7.0~rc1) unstable; urgency=low
 
   * Release candidate release.




svn commit: r1042731 - /cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/service/ConsistencyLevelTest.java

2010-12-06 Thread eevans
Author: eevans
Date: Mon Dec  6 17:22:36 2010
New Revision: 1042731

URL: http://svn.apache.org/viewvc?rev=1042731&view=rev
Log:
prepend missing license blurb

Modified:

cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/service/ConsistencyLevelTest.java

Modified: 
cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/service/ConsistencyLevelTest.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/service/ConsistencyLevelTest.java?rev=1042731&r1=1042730&r2=1042731&view=diff
==
--- 
cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/service/ConsistencyLevelTest.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/service/ConsistencyLevelTest.java
 Mon Dec  6 17:22:36 2010
@@ -1,4 +1,25 @@
 package org.apache.cassandra.service;
+/*
+ * 
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ * 
+ */
+
 
 import java.net.InetAddress;
 import java.util.ArrayList;




[jira] Created: (CASSANDRA-1823) move init script from contrib/redhat to redhat/

2010-12-06 Thread Jonathan Ellis (JIRA)
move init script from contrib/redhat to redhat/
---

 Key: CASSANDRA-1823
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1823
 Project: Cassandra
  Issue Type: Sub-task
  Components: Contrib, Packaging
Reporter: Jonathan Ellis
Assignee: Nick Bailey
Priority: Minor
 Fix For: 0.8


(and update spec file for new location)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (CASSANDRA-1824) Schema only fully propagates from seeds

2010-12-06 Thread Brandon Williams (JIRA)
Schema only fully propagates from seeds
---

 Key: CASSANDRA-1824
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1824
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.7 beta 1
Reporter: Brandon Williams
 Fix For: 0.7.0


If you have nodes X, Y, and Z, and Y already has some schema, but X and Z do 
not, and X is the seed node for the cluster, X will pick up the schema from Y, 
but it will never propagate to Z.  If X has the schema, it will propagate to 
both Y and Z.




[jira] Commented: (CASSANDRA-1805) refactor and remove contrib/

2010-12-06 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12967461#action_12967461
 ] 

Jonathan Ellis commented on CASSANDRA-1805:
---

Now I remember why we put pig in contrib to begin with: it wasn't in a maven 
repo and we didn't want to add more manual dependencies to lib/ for it.  Looks 
like they finally added maven support for the forthcoming 0.8 in PIG-1334.

 refactor and remove contrib/
 

 Key: CASSANDRA-1805
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1805
 Project: Cassandra
  Issue Type: Task
Reporter: Jonathan Ellis
Priority: Minor
 Fix For: 0.8


 Contrib is a mix of examples, tools, and miscellanea that probably doesn't 
 belong in our source tree.




[jira] Assigned: (CASSANDRA-1824) Schema only fully propagates from seeds

2010-12-06 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-1824:
-

Assignee: Gary Dusbabek

 Schema only fully propagates from seeds
 ---

 Key: CASSANDRA-1824
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1824
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.7 beta 1
Reporter: Brandon Williams
Assignee: Gary Dusbabek
 Fix For: 0.7.0


 If you have nodes X, Y, and Z, and Y already has some schema, but X and Z do 
 not, and X is the seed node for the cluster, X will pick up the schema from 
 Y, but it will never propagate to Z.  If X has the schema, it will propagate 
 to both Y and Z.




[jira] Created: (CASSANDRA-1825) Separation of Data (Cached/Non-Cached)

2010-12-06 Thread Chris Goffinet (JIRA)
Separation of Data (Cached/Non-Cached)
--

 Key: CASSANDRA-1825
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1825
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
 Fix For: 0.8


At the moment Cassandra goes through the ROW-READ stage to fetch data from the 
page cache, and if it's not in the page cache, it goes to disk.

Data that is currently hot (in page cache) will block if all I/O threads are 
busy reading from disk. We should seriously look at implementing a buffer pool 
similar to MySQL's for storing data in memory, and dedicate our I/O threads 
to just going to disk.  I suggest studying how InnoDB does scheduling as well; 
it has good lessons to learn from.


Scaling I/O by threads isn't going to be a good solution here either. I would 
argue that going past 64 threads for I/O is just going to hurt overall 
performance because of context switching.




[jira] Commented: (CASSANDRA-1825) Separation of Data (Cached/Non-Cached)

2010-12-06 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12968393#action_12968393
 ] 

Jonathan Ellis commented on CASSANDRA-1825:
---

the easy solution that fits nicely w/ existing design is to just make the 
check-cache logic a separate stage (thread pool).
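
A minimal sketch of that idea, not Cassandra's actual stage implementation: the `cache` and `disk` dictionaries, the pool sizes, and the names `cache_stage`, `disk_stage`, and `read` are all assumptions made for illustration. Cache hits are answered by one thread pool, and only misses are queued on the pool that does disk reads.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-ins for in-memory data and on-disk sstables (illustrative only).
cache = {"hot": "cached-value"}
disk = {"hot": "cached-value", "cold": "disk-value"}

# Hypothetical stages: one pool only checks the cache, the other only reads disk.
cache_stage = ThreadPoolExecutor(max_workers=2)
disk_stage = ThreadPoolExecutor(max_workers=4)

_MISS = object()  # sentinel so a cached None is not mistaken for a miss


def check_cache(key):
    return cache.get(key, _MISS)


def read_from_disk(key):
    return disk.get(key)


def read(key):
    """Answer hits from the cache stage; send only misses to the disk stage."""
    hit = cache_stage.submit(check_cache, key).result()
    if hit is not _MISS:
        return hit
    return disk_stage.submit(read_from_disk, key).result()
```

With this split, a request for hot data never waits behind a queue of disk reads, because the two kinds of work use different threads.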

 Separation of Data (Cached/Non-Cached)
 --

 Key: CASSANDRA-1825
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1825
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
 Fix For: 0.8


 At the moment Cassandra goes through the ROW-READ stage to fetch data from 
 the page cache, and if it's not in the page cache, it goes to disk.
 Data that is currently hot (in page cache) will block if all I/O threads are 
 busy reading from disk. We should seriously look at implementing a buffer 
 pool similar to MySQL's for storing data in memory, and dedicate our I/O 
 threads to just going to disk.  I suggest studying how InnoDB does 
 scheduling as well; it has good lessons to learn from.
 Scaling I/O by threads isn't going to be a good solution here either. I 
 would argue that going past 64 threads for I/O is just going to hurt overall 
 performance because of context switching.




[jira] Commented: (CASSANDRA-1825) Separation of Data (Cached/Non-Cached)

2010-12-06 Thread Chris Goffinet (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12968395#action_12968395
 ] 

Chris Goffinet commented on CASSANDRA-1825:
---

When I talk about cache, I mean the page cache. If we had a stage that was 
responsible for checking the page cache, we would be forced to call 
mincore(), a system call, on every request. This could get expensive very 
quickly. It's worth prototyping to verify.

 Separation of Data (Cached/Non-Cached)
 --

 Key: CASSANDRA-1825
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1825
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
 Fix For: 0.8


 At the moment Cassandra goes through the ROW-READ stage to fetch data from 
 the page cache, and if it's not in the page cache, it goes to disk.
 Data that is currently hot (in page cache) will block if all I/O threads are 
 busy reading from disk. We should seriously look at implementing a buffer 
 pool similar to MySQL's for storing data in memory, and dedicate our I/O 
 threads to just going to disk.  I suggest studying how InnoDB does 
 scheduling as well; it has good lessons to learn from.
 Scaling I/O by threads isn't going to be a good solution here either. I 
 would argue that going past 64 threads for I/O is just going to hurt overall 
 performance because of context switching.




[jira] Updated: (CASSANDRA-1826) system_create_cf() makes a SuperCF when column_type is Standard and subcolumn_comparator_type is present

2010-12-06 Thread Tyler Hobbs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Hobbs updated CASSANDRA-1826:
---

Description: If you create a CF with system_create_column_family() and the 
CfDef has column_type = 'Standard' and subcomparator_type is present, it 
creates a SuperCF.  I would expect an InvalidRequestException, instead.  (was: 
If you create a CF with system_create_column_family() and the CfDef has 
column_type = 'Standard' and subcolumn_comparator_type is present, it creates a 
SuperCF.  I would expect an InvalidRequestException, instead.)

 system_create_cf() makes a SuperCF when column_type is Standard and 
 subcolumn_comparator_type is present
 

 Key: CASSANDRA-1826
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1826
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0 rc 1
Reporter: Tyler Hobbs

 If you create a CF with system_create_column_family() and the CfDef has 
 column_type = 'Standard' and subcomparator_type is present, it creates a 
 SuperCF.  I would expect an InvalidRequestException, instead.




[jira] Created: (CASSANDRA-1826) system_create_cf() makes a SuperCF when column_type is Standard and subcolumn_comparator_type is present

2010-12-06 Thread Tyler Hobbs (JIRA)
system_create_cf() makes a SuperCF when column_type is Standard and 
subcolumn_comparator_type is present


 Key: CASSANDRA-1826
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1826
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0 rc 1
Reporter: Tyler Hobbs


If you create a CF with system_create_column_family() and the CfDef has 
column_type = 'Standard' and subcolumn_comparator_type is present, it creates a 
SuperCF.  I would expect an InvalidRequestException, instead.
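
The expected behaviour could be sketched as below. This is an illustration only: `validate_cf_def` is a hypothetical helper, and the `InvalidRequestException` class here is a local stand-in for the Thrift exception named in the report, not the generated class.

```python
class InvalidRequestException(Exception):
    """Local stand-in for the Thrift exception of the same name."""


def validate_cf_def(column_type, subcomparator_type=None):
    """Reject a Standard column family that also names a subcomparator."""
    if column_type == "Standard" and subcomparator_type is not None:
        raise InvalidRequestException(
            "subcomparator_type is only valid for Super column families"
        )
    return column_type
```

Rejecting the request keeps the caller's stated `column_type` authoritative instead of silently creating a SuperCF.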




[jira] Assigned: (CASSANDRA-1083) Improvement to CompactionManger's submitMinorIfNeeded

2010-12-06 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-1083:
-

Assignee: Tyler Hobbs  (was: Ryan King)

Tyler, do you see any low-hanging fruit here?  We also have CASSANDRA-1608 open 
for more invasive changes, but if there is a free lunch lurking here let's grab 
that first.

 Improvement to CompactionManger's submitMinorIfNeeded
 -

 Key: CASSANDRA-1083
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1083
 Project: Cassandra
  Issue Type: Improvement
Reporter: Ryan King
Assignee: Tyler Hobbs
Priority: Minor
 Fix For: 0.7.1

 Attachments: 1083-configurable-compaction-thresholds.patch, 
 compaction_simulation.rb


 We've discovered that we are unable to tune compaction the way we want for 
 our production cluster. I think the current algorithm doesn't do this as well 
 as it could, since it doesn't sort the sstables by size before doing the 
 bucketing, which means the tuning parameters have unpredictable results.
 I looked at CASSANDRA-792, but it seems like overkill. Here's an alternative 
 proposal:
 config operations:
  minimumCompactionThreshold
  maximumCompactionThreshold
  targetSSTableCount
 The first two would mean what they currently mean: the bounds on how many 
 sstables to compact in one compaction operation. The 3rd is a target for how 
 many SSTables you'd like to have.
 Pseudo code algorithm for determining whether or not to do a minor compaction:
 {noformat} 
 if sstables.length + minimumCompactionThreshold - 1 > targetSSTableCount
   sort sstables from smallest to largest
   compact up to the maximumCompactionThreshold smallest tables
 {noformat} 
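
The pseudocode above can be sketched in Python as follows. This is an illustration under stated assumptions: `pick_minor_compaction` is a hypothetical name, sstables are modelled as a plain list of sizes, and the comparison in the condition is assumed to be `>`.

```python
def pick_minor_compaction(sstable_sizes, min_threshold, max_threshold, target_count):
    """Return the sizes of the sstables to compact, or [] if none are needed.

    Hypothetical helper: each sstable is represented only by its size.
    """
    # Condition taken from the pseudocode, assuming the operator is `>`.
    if len(sstable_sizes) + min_threshold - 1 > target_count:
        smallest_first = sorted(sstable_sizes)
        # Compact at most max_threshold of the smallest tables.
        return smallest_first[:max_threshold]
    return []
```

Sorting before selecting is the point of the proposal: the thresholds then always apply to the smallest tables, so their effect is predictable.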




[jira] Assigned: (CASSANDRA-1337) parallelize fetching rows for low-cardinality indexes

2010-12-06 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-1337:
-

Assignee: T Jake Luciani

 parallelize fetching rows for low-cardinality indexes
 -

 Key: CASSANDRA-1337
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1337
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: T Jake Luciani
Priority: Minor
 Fix For: 0.7.1


 currently, we read the indexed rows from the first node (in partitioner 
 order); if that does not have enough matching rows, we read the rows from the 
 next, and so forth.
 we should use the statistics from CASSANDRA-1155 to query multiple nodes in 
 parallel, such that we have a high chance of getting enough rows w/o having 
 to do another round of queries (but, if our estimate is incorrect, we do need 
 to loop and do more rounds until we have enough data or we have fetched from 
 each node).
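
A rough sketch of the proposed loop, assuming a single per-node row estimate: `fetch_indexed_rows`, `estimated_rows_per_node`, and `query_node` are hypothetical names, and the estimate is only a stand-in for whatever statistics CASSANDRA-1155 provides. The inner loop is sequential here; a real implementation would issue each round's queries in parallel.

```python
import math


def fetch_indexed_rows(nodes, wanted, estimated_rows_per_node, query_node):
    """Collect up to `wanted` rows, querying several nodes per round.

    nodes: node names in partitioner order.
    estimated_rows_per_node: assumed average number of matching rows per node.
    query_node: callable returning the matching rows held by one node.
    """
    rows, position = [], 0
    while len(rows) < wanted and position < len(nodes):
        missing = wanted - len(rows)
        # Size the round so that, if the estimate holds, one round is enough.
        batch = max(1, math.ceil(missing / max(estimated_rows_per_node, 1)))
        for node in nodes[position:position + batch]:
            rows.extend(query_node(node))
        position += batch
    return rows[:wanted]
```

If the estimate is too optimistic, the loop simply runs another round until enough rows are found or every node has been queried.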




[jira] Updated: (CASSANDRA-959) Allow different timeouts for different classes of operation

2010-12-06 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-959:
-

Fix Version/s: (was: 0.7.1)
   0.8
 Assignee: (was: Jonathan Ellis)

 Allow different timeouts for different classes of operation
 ---

 Key: CASSANDRA-959
 URL: https://issues.apache.org/jira/browse/CASSANDRA-959
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor
 Fix For: 0.8


 Currently we have one rpc timeout for intra-node operations.  But applying 
 the same timeout to read one row, read multiple rows [by key], and range 
 query multiple rows feels like an increasingly uncomfortable fit.  (See e.g. 
 CASSANDRA-919.)
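
One way to picture the request is a timeout table keyed by operation class. The sketch below is hypothetical: the operation names and millisecond values are placeholders, not Cassandra configuration options.

```python
# Placeholder values in milliseconds; none of these are real Cassandra settings.
DEFAULT_RPC_TIMEOUT_MS = 10000

TIMEOUTS_MS = {
    "read_row": 5000,
    "multiget": 10000,
    "range_query": 30000,
}


def timeout_for(operation):
    """Return the timeout for an operation class, else the single rpc timeout."""
    return TIMEOUTS_MS.get(operation, DEFAULT_RPC_TIMEOUT_MS)
```

Falling back to the existing single timeout keeps any operation class that has no explicit entry behaving as it does today.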




[jira] Assigned: (CASSANDRA-959) Allow different timeouts for different classes of operation

2010-12-06 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-959:


Assignee: T Jake Luciani

 Allow different timeouts for different classes of operation
 ---

 Key: CASSANDRA-959
 URL: https://issues.apache.org/jira/browse/CASSANDRA-959
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: T Jake Luciani
Priority: Minor
 Fix For: 0.8


 Currently we have one rpc timeout for intra-node operations.  But applying 
 the same timeout to read one row, read multiple rows [by key], and range 
 query multiple rows feels like an increasingly uncomfortable fit.  (See e.g. 
 CASSANDRA-919.)




[jira] Assigned: (CASSANDRA-1718) cassandra should chdir / when daemonizing

2010-12-06 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-1718:
-

Assignee: Eric Evans

 cassandra should chdir / when daemonizing
 -

 Key: CASSANDRA-1718
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1718
 Project: Cassandra
  Issue Type: Bug
  Components: Packaging
 Environment: Debian squeeze, Cassandra 0.7.0-beta3 and trunk 
 (r1032649)
Reporter: paul cannon
Assignee: Eric Evans
Priority: Minor
 Fix For: 0.7.1


 Common practice when daemonizing is to cd / to avoid pinning a filesystem.  
 For example, if the oper happens to start Cassandra (by itself, or with a 
 manual jsvc invocation, or with the initscript) in /mnt/usb-storage, and 
 there is something mounted there, then the oper will not be able to unmount 
 the usb device that was mounted at that location, since the cassandra process 
 has it open as its cwd.
 evidence that this isn't being done already:
 {noformat}
 ~% sudo lsof -p 9775 | awk '$4=="cwd"'
 jsvc    9775 cassandra  cwd    DIR    8,1     4096 147675 /home/paul/packages/cassandra/trunk
 {noformat}
 (That instance was invoked using the Debian initscript.)
 Obviously chdir("/") isn't necessary when not daemonizing, although it 
 shouldn't hurt either.
 If there are concerns about Cassandra having an ongoing ability to open 
 filenames relative to its original working directory, then it should be 
 sufficient just to do a cd / in the initscript before starting Cassandra.  
 That case, at least, is particularly important.
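
The practice described above amounts to a single call made before the daemon starts its real work. A minimal sketch in Python, where `prepare_daemon_cwd` is a hypothetical helper and not part of Cassandra or jsvc:

```python
import os


def prepare_daemon_cwd():
    """Move to the filesystem root so the process does not pin a mount point."""
    root = os.path.abspath(os.sep)
    os.chdir(root)
    return os.getcwd()
```

After this call the process no longer holds its starting directory open as its cwd, so the filesystem it was started from can be unmounted.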




[jira] Created: (CASSANDRA-1827) Batching across stages

2010-12-06 Thread Chris Goffinet (JIRA)
Batching across stages
--

 Key: CASSANDRA-1827
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1827
 Project: Cassandra
  Issue Type: Improvement
Reporter: Chris Goffinet
 Fix For: 0.8


We might be able to get some improvement if we start batching tasks for every 
stage.




[jira] Resolved: (CASSANDRA-1726) Update debian packaging to use alternatives

2010-12-06 Thread Nick Bailey (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Bailey resolved CASSANDRA-1726.


Resolution: Invalid

I misunderstood how some of the debian packaging works. Fortunately I 
discovered a problem with the rpm spec file. That will be addressed in 
CASSANDRA-1805.

 Update debian packaging to use alternatives
 ---

 Key: CASSANDRA-1726
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1726
 Project: Cassandra
  Issue Type: Improvement
  Components: Packaging
Affects Versions: 0.7.0 rc 1
Reporter: Nick Bailey
Priority: Minor
 Fix For: 0.7.1


 We should update the debian packaging to install configuration using 
 alternatives. Additionally we can probably get rid of the custom 
 cassandra.in.sh for debian packaging.




[jira] Created: (CASSANDRA-1828) Create a pig storefunc

2010-12-06 Thread Brandon Williams (JIRA)
Create a pig storefunc
--

 Key: CASSANDRA-1828
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1828
 Project: Cassandra
  Issue Type: New Feature
  Components: Contrib
Affects Versions: 0.7 beta 1
Reporter: Brandon Williams
Assignee: Brandon Williams
Priority: Minor
 Fix For: 0.7.1


Now that we have a ColumnFamilyOutputFormat, we can write data back to 
cassandra in mapreduce jobs, however we can only do this in java.  It would be 
nice if pig could also output to cassandra.




[jira] Resolved: (CASSANDRA-1250) add option to repair to perform a real compaction instead of read-only

2010-12-06 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-1250.
---

Resolution: Not A Problem

 add option to repair to perform a real compaction instead of read-only
 --

 Key: CASSANDRA-1250
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1250
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor
 Fix For: 0.7.1


 most clusters run repair on a regular basis, and most run compact on a 
 regular basis -- would be nice to save them the i/o of reading every row twice




[jira] Updated: (CASSANDRA-1132) Add min/max counter support on top of the incr/decr counters..

2010-12-06 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-1132:
--

Fix Version/s: (was: 0.7.1)

 Add min/max counter support on top of the incr/decr counters..
 --

 Key: CASSANDRA-1132
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1132
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Adam Samet
Assignee: Adam Samet
   Original Estimate: 10h
  Remaining Estimate: 10h

 I'd like to add support for min and max counters on top of Kelvin's incr/decr 
 counter implementation.  This will involve multiple resolution strategies for 
 clocks, and a bit of a refactoring to support multiple Reconciler / Context 
 classes.




[jira] Resolved: (CASSANDRA-1305) Slow query log

2010-12-06 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-1305.
---

   Resolution: Won't Fix
Fix Version/s: (was: 0.7.1)

closing since there does not seem to be any more interest in this

 Slow query log
 --

 Key: CASSANDRA-1305
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1305
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Daniel Kluesing
Priority: Minor
 Attachments: trunk-SlowQueryLog.txt


 If a query takes a long time, it's nice to know why




[jira] Commented: (CASSANDRA-1123) Allow tracing query details

2010-12-06 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12968409#action_12968409
 ] 

Jonathan Ellis commented on CASSANDRA-1123:
---

CASSANDRA-1305 explored storing times for different query steps in an in-memory 
structure for later retrieval (by jmx?)

 Allow tracing query details
 ---

 Key: CASSANDRA-1123
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1123
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
 Fix For: 0.8


 In the spirit of CASSANDRA-511, it would be useful to enable tracing on queries to 
 see where latency is coming from: how long did row cache lookup take?  key 
 search in the index?  merging the data from the sstables?  etc.
 The main difference vs setting debug logging is that debug logging is too big 
 of a hammer; by turning on the flood of logging for everyone, you actually 
 distort the information you're looking for.  This would be something you 
 could set per-query (or more likely per connection).
 We don't need to be as sophisticated as the techniques discussed in the 
 following papers but they are interesting reading:
 http://research.google.com/pubs/pub36356.html
 http://www.usenix.org/events/osdi04/tech/full_papers/barham/barham_html/
 http://www.usenix.org/event/nsdi07/tech/fonseca.html




[jira] Updated: (CASSANDRA-1555) Considerations for larger bloom filters

2010-12-06 Thread Ryan King (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan King updated CASSANDRA-1555:
-

Attachment: CASSANDRA-1555v3.patch.gz

New patch with several changes based on Stu's feedback:

* renamed BloomFilter to LegacyBloomFilter and BigBloomFilter to BloomFilter
* moved maxBucketsPerElement to BloomCalculations
* removed emptybuckets
* cleaned up formatting in SSTableReader and BigBloomFilter

Finally I changed the serialization to read and write the long[] directly, 
which saves a lot of space for small filters (column filter for a 10 item row 
goes from 120 bytes to 16).

 Considerations for larger bloom filters
 ---

 Key: CASSANDRA-1555
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1555
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Ryan King
 Fix For: 0.8

 Attachments: cassandra-1555.tgz, CASSANDRA-1555v2.patch, 
 CASSANDRA-1555v3.patch.gz


 To (optimally) support SSTables larger than 143 million keys, we need to 
 support bloom filters larger than 2^31 bits, which java.util.BitSet can't 
 handle directly.
 A few options:
 * Switch to a BitSet class which supports 2^31 * 64 bits (Lucene's OpenBitSet)
 * Partition the java.util.BitSet behind our current BloomFilter
 ** Straightforward bit partitioning: bit N is in bitset N // 2^31
 ** Separate equally sized complete bloom filters for member ranges, which can 
 be used independently or OR'd together under memory pressure.
 All of these options require new approaches to serialization.
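
The "straightforward bit partitioning" option can be sketched as follows. This is an illustration, not the patch attached to the ticket: `PartitionedBitSet` is a hypothetical class, and each partition is held in a Python integer rather than a java.util.BitSet.

```python
class PartitionedBitSet:
    """Bit N lives in partition N // partition_bits, at offset N % partition_bits."""

    def __init__(self, total_bits, partition_bits=2**31):
        self.partition_bits = partition_bits
        count = -(-total_bits // partition_bits)  # ceiling division
        # Each partition is a Python int used as a bit field (illustration only).
        self.partitions = [0] * count

    def set(self, n):
        index, offset = divmod(n, self.partition_bits)
        self.partitions[index] |= 1 << offset

    def get(self, n):
        index, offset = divmod(n, self.partition_bits)
        return bool((self.partitions[index] >> offset) & 1)
```

A small `partition_bits` value makes the mapping easy to check by hand, for example `PartitionedBitSet(20, partition_bits=8)` places bit 17 in partition 2 at offset 1.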




[jira] Created: (CASSANDRA-1829) Nodetool move is broken

2010-12-06 Thread Nick Bailey (JIRA)
Nodetool move is broken
---

 Key: CASSANDRA-1829
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1829
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0 rc 1
Reporter: Nick Bailey
Priority: Blocker
 Fix For: 0.7.0


The code from finishBootstrapping that finishes a move was removed. This means 
a move will leave a node stuck in a bootstrapping state forever.




[jira] Commented: (CASSANDRA-1825) Separation of Data (Cached/Non-Cached)

2010-12-06 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12968424#action_12968424
 ] 

Jonathan Ellis commented on CASSANDRA-1825:
---

CASSANDRA-1379 is a similar issue but for row cache / BF check.

 Separation of Data (Cached/Non-Cached)
 --

 Key: CASSANDRA-1825
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1825
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
 Fix For: 0.8


 At the moment Cassandra goes through the ROW-READ stage to fetch data from 
 the page cache, and if it's not in the page cache, it goes to disk.
 Data that is currently hot (in page cache) will block if all I/O threads are 
 busy reading from disk. We should seriously look at implementing a buffer 
 pool similar to MySQL's for storing data in memory, and dedicate our I/O 
 threads to just going to disk.  I suggest studying how InnoDB does 
 scheduling as well; it has good lessons to learn from.
 Scaling I/O by threads isn't going to be a good solution here either. I 
 would argue that going past 64 threads for I/O is just going to hurt overall 
 performance because of context switching.




[jira] Created: (CASSANDRA-1830) ReadResponseResolver might miss an inconsistency

2010-12-06 Thread Jonathan Ellis (JIRA)
ReadResponseResolver might miss an inconsistency


 Key: CASSANDRA-1830
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1830
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor
 Fix For: 0.7.0
 Attachments: 1830.txt

Rather than comparing the digests of all the digest requests to one another, 
the last one seen wins and is compared to the digest of each version seen 
from a data request.
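
To illustrate the difference, the sketch below compares every digest with every other one instead of keeping only the last digest seen. It is not the attached patch: `digest` and `has_mismatch` are hypothetical helpers, and SHA-256 is used only as a convenient hash for the example.

```python
import hashlib


def digest(value):
    """Hash a value the way a digest response would summarise it (illustrative)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def has_mismatch(data_versions, digests):
    """Return True if any replica disagrees with any other.

    data_versions: full values returned by data requests.
    digests: digests returned by digest requests.
    """
    seen = {digest(v) for v in data_versions} | set(digests)
    return len(seen) > 1
```

With digests `[digest("v2"), digest("v1")]` and data `["v1"]`, a last-one-wins check sees only the matching final digest, while this check still reports the inconsistency.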




[jira] Commented: (CASSANDRA-982) read repair on quorum consistencylevel

2010-12-06 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12968426#action_12968426
 ] 

Jonathan Ellis commented on CASSANDRA-982:
--

moved this to CASSANDRA-1830 to keep this ticket open for full fixing of RR on 
strongRead path.

 read repair on quorum consistencylevel
 --

 Key: CASSANDRA-982
 URL: https://issues.apache.org/jira/browse/CASSANDRA-982
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Matthew F. Dennis
Priority: Minor
 Fix For: 0.7.1

 Attachments: 
 0001-better-digest-checking-for-ReadResponseResolver.patch, 
 982-resolve-digests-v2.txt


 CASSANDRA-930 made read repair fuzzy optional, but this only helps with 
 ConsistencyLevel.ONE:
 - Quorum reads always send requests to all nodes
 - only the first Quorum's worth of responses get compared
 So we'd like to make two changes:
 - only send read requests to the closest R live nodes
 - if read repair is enabled, also compare results from the other nodes in the 
 background




[jira] Updated: (CASSANDRA-1830) ReadResponseResolver might miss an inconsistency

2010-12-06 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-1830:
--

Attachment: 1830.txt

Patch based on Randall Leeds's from CASSANDRA-982.

 ReadResponseResolver might miss an inconsistency
 

 Key: CASSANDRA-1830
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1830
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor
 Fix For: 0.7.0

 Attachments: 1830.txt


 Rather than comparing the digests of all the digest requests to one another, 
 the last one seen wins and is compared to the digest of each version seen 
 from a data request.




[jira] Resolved: (CASSANDRA-1826) system_create_cf() makes a SuperCF when column_type is Standard and subcolumn_comparator_type is present

2010-12-06 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-1826.
---

   Resolution: Fixed
Fix Version/s: 0.7.0

fixed in rc2 by patches for CASSANDRA-1773 and CASSANDRA-1813

 system_create_cf() makes a SuperCF when column_type is Standard and 
 subcolumn_comparator_type is present
 

 Key: CASSANDRA-1826
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1826
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0 rc 1
Reporter: Tyler Hobbs
 Fix For: 0.7.0


 If you create a CF with system_create_column_family() and the CfDef has 
 column_type = 'Standard' and subcomparator_type is present, it creates a 
 SuperCF.  I would expect an InvalidRequestException, instead.
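
The expected behaviour could look roughly like the sketch below. It is an
assumption-laden illustration, not the fix that went into rc2: CfDefCheck
and validate are hypothetical names, and IllegalArgumentException stands in
for InvalidRequestException so the example has no Cassandra dependencies.

```java
// Hypothetical validator; IllegalArgumentException stands in for
// Cassandra's InvalidRequestException so the sketch has no dependencies.
final class CfDefCheck
{
    private CfDefCheck() {}

    // Rejects a subcomparator on anything other than a Super column family
    // instead of silently creating a SuperCF.
    static void validate(String columnType, String subcomparatorType)
    {
        boolean isSuper = "Super".equals(columnType);
        if (!isSuper && subcomparatorType != null)
            throw new IllegalArgumentException(
                "subcomparator_type is only valid when column_type is Super");
    }
}
```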




[Cassandra Wiki] Update of FrontPage by PeterSchuller

2010-12-06 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The FrontPage page has been changed by PeterSchuller.
The comment on this change is: Move Thrift API link to client lib dev section.
http://wiki.apache.org/cassandra/FrontPage?action=diff&rev1=51&rev2=52

--

   * [[ClientOptions|Client options: ways to access Cassandra]] -- interfaces 
for Ruby, Python, Scala and more
   * [[RunningCassandra|Running Cassandra]]
   * [[ArchitectureOverview|Architecture Overview]]
-  * [[API|Thrift API Documentation]] (In progress)
   * [[UseCases|Simple Use Cases and Solutions]] -- please help complete
   * [[FAQ]]
  
@@ -36, +35 @@

   * [[CassandraHardware|Cassandra Hardware]]
   * [[CloudConfig|Configuration on Rackspace or Amazon Web Services]]
  
+ == Client library developer information ==
+  * [[API|Thrift API Documentation]] (In progress)
+ 
- == Developer Documentation ==
+ == Cassandra developer Documentation ==
   * ArchitectureInternals
   * [[CLI Design]]
   * [[HowToContribute|How To Contribute?]]


[Cassandra Wiki] Update of API by PeterSchuller

2010-12-06 Thread Apache Wiki
The API page has been changed by PeterSchuller.
The comment on this change is: Clarify low-level status of the thrift API; make 
more clear reference to higher-level clients.
http://wiki.apache.org/cassandra/API?action=diff&rev1=70&rev2=71

--

  == Overview ==
+ 
+ '''NOTE:''' This documents the low-level wire protocol used to communicate 
with Cassandra. This is not intended to be used directly in applications; 
rather it is highly recommended that application developers use one of the 
higher-level clients that are linked to from ClientOptions. That said, this 
page may still be useful for application developers wanting to better 
understand the data model.
+ 
  The Cassandra Thrift API changed substantially after [[API03|0.3]], with 
minor, backwards-compatible changes for [[API04|0.4]], 0.5 and [[API06|0.6]]; 
this document explains the 0.5 version with annotations for the changes in 0.6 
and 0.7.
  
- Cassandra's client API is built entirely on top of Thrift. It should be noted 
that these documents mention default values, but these are not generated in all 
of the languages that Thrift supports.  Full examples of using Cassandra from 
Thrift, including setup boilerplate, are found on ThriftExamples.  Higher-level 
clients are linked from ClientOptions.
+ Cassandra's client API is built entirely on top of Thrift. It should be noted 
that these documents mention default values, but these are not generated in all 
of the languages that Thrift supports.  Full examples of using Cassandra from 
Thrift, including setup boilerplate, are found on ThriftExamples.
  
  '''WARNING:''' Some SQL/RDBMS terms are used in this documentation for 
analogy purposes. They should be thought of as just that; analogies. There are 
few similarities between how data is managed in a traditional RDBMS and 
Cassandra. Please see DataModel for more information.
  


[Cassandra Wiki] Update of ClientOptions by PeterSchuller

2010-12-06 Thread Apache Wiki
The ClientOptions page has been changed by PeterSchuller.
http://wiki.apache.org/cassandra/ClientOptions?action=diff&rev1=118&rev2=119

--

  TableOfContents()
  
  = High level clients =
- Using one of these clients is strongly preferred to raw Thrift.  Here are the 
clients that support Cassandra 0.7.
+ Using one of these clients is strongly preferred to raw Thrift when 
developing applications (the Thrift API is primarily intended for client 
developers). What follows are clients that support Cassandra 0.7.
  
  If no high-level client exists for your environment, you may be able to 
update an [[ClientOptions06|older client]]; failing that, you'll have to use 
the raw Thrift [[API]].
  


[Cassandra Wiki] Update of ClientOptions by PeterSchuller

2010-12-06 Thread Apache Wiki
The ClientOptions page has been changed by PeterSchuller.
http://wiki.apache.org/cassandra/ClientOptions?action=diff&rev1=119&rev2=120

--

  Thrift is the Cassandra driver-level interface that the clients above build 
on.  You can use raw Thrift from just about any language, but it's not 
particularly idiomatic in any of them.  Some examples are given in 
ThriftExamples.
  
  = Internal API =
- The StorageProxy API is available to JVM-based clients, but you should use 
Thrift unless you have a very good reason not to.  (The most common reason is 
wanting to use the BinaryMemtable bulk-load interface.)
+ The StorageProxy API is available to JVM-based clients, but unless you really 
know that you need it you should probably be using a higher-level client listed 
above or, failing that, the Thrift API. The StorageProxy API is intended for 
internal use, and highly specialized use-cases. (The most common reason is 
wanting to use the BinaryMemtable bulk-load interface.)
  
  = Hadoop =
  Running Hadoop map/reduce jobs in Cassandra is described in HadoopSupport.


[jira] Assigned: (CASSANDRA-1791) Return name of snapshot directory after creating it

2010-12-06 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-1791:
-

Assignee: Nick Bailey

 Return name of snapshot directory after creating it
 ---

 Key: CASSANDRA-1791
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1791
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
 Environment: Debian Squeeze
Reporter: paul cannon
Assignee: Nick Bailey
Priority: Minor
 Fix For: 0.7.1


 When making a snapshot, the new directory is created with a timestamp and, 
 optionally, a user-supplied tag. For the sake of automated snapshot-creating 
 tools, it would be helpful to know unequivocally what the new snapshot 
 directory was named (otherwise, the tool must search for a directory similar 
 to what it expects the name to be, which could be both error-prone and maybe 
 susceptible to attack).
 Recommend making takeSnapshot and takeAllSnapshot return a string, which is 
 the base component of the new snapshot's directory name.
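
As a rough sketch of what the returned string might be: the helper below
builds a name from a timestamp and an optional tag. The
"<timestamp>-<tag>" layout and the SnapshotNames class are assumptions made
for illustration only; the actual directory naming is whatever Cassandra
uses internally.

```java
// Hypothetical naming helper; the "<timestamp>-<tag>" layout is an assumption.
final class SnapshotNames
{
    private SnapshotNames() {}

    // Builds the base component of a snapshot directory name so the caller
    // can be told exactly what was created instead of guessing.
    static String snapshotName(long timestampMillis, String tag)
    {
        String name = Long.toString(timestampMillis);
        if (tag != null && !tag.isEmpty())
            name += "-" + tag;
        return name;
    }
}
```

A tool that receives this string back from the snapshot call no longer has
to scan the data directory for a name that merely looks right.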




[Cassandra Wiki] Update of StorageProxy by PeterSchuller

2010-12-06 Thread Apache Wiki
The StorageProxy page has been changed by PeterSchuller.
http://wiki.apache.org/cassandra/StorageProxy?action=diff&rev1=2&rev2=3

--

- !StorageProxy is the API that Thrift calls get translated into, so using it 
directly provides some efficiency gains.  There is an example in contrib at 
https://svn.apache.org/repos/asf/cassandra/trunk/contrib/client_only/, but do 
note that while it is relatively stable, less effort is made to keep it 
entirely backwards compatible from release to release than the official 
ThriftInterface.
+ !StorageProxy is the API that Thrift calls get translated into, so using it 
directly provides some efficiency gains - but please think twice before using 
it, and be sure that you really need to before doing so. While it is relatively 
stable, less effort is made to keep it entirely backwards compatible from 
release to release than the official ThriftInterface (which in turn is 
recommended against if you can use a higher-level client (see ClientOptions) 
instead).
  
+ There is an example in contrib at 
https://svn.apache.org/repos/asf/cassandra/trunk/contrib/client_only/
+ 


[Cassandra Wiki] Update of GettingStarted by PeterSchuller

2010-12-06 Thread Apache Wiki
The GettingStarted page has been changed by PeterSchuller.
The comment on this change is: Thrift avoidance.
http://wiki.apache.org/cassandra/GettingStarted?action=diff&rev1=48&rev2=49

--

  
  Some people running OS X have trouble getting Java 6 to work. If you've kept 
up with Apple's updates, Java 6 should already be installed (it comes in Mac OS 
X 10.5 Update 1). Unfortunately, Apple does not default to using it. What you 
have to do is change your `JAVA_HOME` environment setting to 
`/System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home` and add 
`/System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home/bin` to the 
beginning of your `PATH`.
  
- And now for the moment of truth, start up Cassandra by invoking 
`bin/cassandra -f` from the command lineFootNote(To learn more about 
controlling the behavior of startup scripts, see RunningCassandra.). The 
service should start in the foreground and log gratuitously to standard-out. 
Assuming you don't see messages with scary words like error, or fatal, or 
anything that looks like a Java stack trace, then chances are you've succeeded. 
To be certain though, take some time to try out the examples in CassandraCli 
and ThriftInterface before moving on. Also, if you run into problems, Don't 
Panic, calmly proceed to [[#if_something_goes_wrong|If Something Goes Wrong]].
+ And now for the moment of truth, start up Cassandra by invoking 
`bin/cassandra -f` from the command lineFootNote(To learn more about 
controlling the behavior of startup scripts, see RunningCassandra.). The 
service should start in the foreground and log gratuitously to standard-out. 
Assuming you don't see messages with scary words like error, or fatal, or 
anything that looks like a Java stack trace, then chances are you've succeeded. 
To be certain though, take some time to try out the examples in CassandraCli 
before moving on. Also, if you run into problems, Don't Panic, calmly proceed 
to [[#if_something_goes_wrong|If Something Goes Wrong]].
  
  Users of recent Linux distributions and Mac OS X Snow Leopard should be able 
to start up Cassandra simply by untarring and invoking `bin/cassandra -f` with 
root privileges. Snow Leopard ships with Java 1.6.0 and does not require 
changing the `JAVA_HOME` environment variable or adding any directory to your 
`PATH`. On Linux just make sure you have a working Java JDK package installed 
such as the `openjdk-6-jdk` on Ubuntu Lucid Lynx.
  
@@ -69, +69 @@

  If you don't yet have access to hardware for a Cassandra cluster you can try 
it out on EC2 with [[CloudConfig]].
  
  == Step 4: Write your application ==
+ The recommended way to communicate with Cassandra in your application is to 
use a [[http://wiki.apache.org/cassandra/ClientOptions|higher-level client]]. 
These provide programming language specific API:s for talking to Cassandra in a 
variety of languages. The details will vary depending on programming language 
and client, but in general using a higher-level client will mean that you have 
to write less code and get several features for free that you would otherwise 
have to write yourself.
+ 
- Cassandra uses [[http://thrift.apache.org/|Thrift]] for its external 
client-facing API. Cassandra's main API/RPC/Thrift port is 9160. Thrift 
supports a [[http://svn.apache.org/viewvc/thrift/trunk/lib/|wide variety of 
languages]] so you can code your application to use Thrift directly, or use a 
[[http://wiki.apache.org/cassandra/ClientOptions|high-level client]] where 
available. Be sure to read the documentation on the 
[[http://wiki.apache.org/thrift|Thrift wiki]], and check out the 
Cassandra-specific examples in ThriftExamples before getting started.  
+ That said, it is useful to know that Cassandra uses 
[[http://thrift.apache.org/|Thrift]] for its external client-facing API. 
Cassandra's main API/RPC/Thrift port is 9160. Thrift supports a 
[[http://svn.apache.org/viewvc/thrift/trunk/lib/|wide variety of languages]] so 
you can code your application to use Thrift directly if you so chose (but again 
we recommend a [[http://wiki.apache.org/cassandra/ClientOptions|high-level 
client]] where available). Be sure to read the documentation on the 
[[http://wiki.apache.org/thrift|Thrift wiki]], and check out the 
Cassandra-specific examples in ThriftExamples before getting started.  
  
  Important note: you need to install the svn revision of thrift that matches 
the revision that your version of Cassandra uses. [[InstallThrift]]
  


svn commit: r1042824 - in /cassandra/branches/cassandra-0.6: ./ src/java/org/apache/cassandra/db/ src/java/org/apache/cassandra/service/ src/java/org/apache/cassandra/tools/

2010-12-06 Thread jbellis
Author: jbellis
Date: Mon Dec  6 22:32:09 2010
New Revision: 1042824

URL: http://svn.apache.org/viewvc?rev=1042824&view=rev
Log:
add support for per-CF compaction
patch by Jon Hermes; reviewed by jbellis for CASSANDRA-1812

Modified:
cassandra/branches/cassandra-0.6/CHANGES.txt

cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/db/ColumnFamilyStore.java

cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/service/StorageService.java

cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/service/StorageServiceMBean.java

cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/tools/NodeCmd.java

cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/tools/NodeProbe.java

Modified: cassandra/branches/cassandra-0.6/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.6/CHANGES.txt?rev=1042824&r1=1042823&r2=1042824&view=diff
==
--- cassandra/branches/cassandra-0.6/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.6/CHANGES.txt Mon Dec  6 22:32:09 2010
@@ -12,6 +12,7 @@
with live subcolumn (CASSANDRA-1591)
  * clean up log messages for gossip token notifications (CASSANDRA-1518)
  * fix range queries against wrapped range (CASSANDRA-1781)
+ * add support for per-CF compaction (CASSANDRA-1812)
  * reduce fat client timeout (CASSANDRA-1730)
 
 

Modified: 
cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/db/ColumnFamilyStore.java?rev=1042824&r1=1042823&r2=1042824&view=diff
==
--- 
cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
 (original)
+++ 
cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
 Mon Dec  6 22:32:09 2010
@@ -701,7 +701,7 @@ public class ColumnFamilyStore implement
 return maxFile;
 }
 
-void forceCleanup()
+public void forceCleanup()
 {
 CompactionManager.instance.submitCleanup(ColumnFamilyStore.this);
 }

Modified: 
cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/service/StorageService.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/service/StorageService.java?rev=1042824&r1=1042823&r2=1042824&view=diff
==
--- 
cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/service/StorageService.java
 (original)
+++ 
cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/service/StorageService.java
 Mon Dec  6 22:32:09 2010
@@ -554,7 +554,7 @@ public class StorageService implements I
  * Handle node move to normal state. That is, node is entering token ring 
and participating
  * in reads.
  *
- * @param endPoint node
+ * @param endpoint node
  * @param pieces STATE_NORMAL,token[,other_state,token]
  */
 private void handleStateNormal(InetAddress endpoint, String[] pieces)
@@ -1034,11 +1034,12 @@ public class StorageService implements I
 table.forceCleanup();
 }
 }
-
-public void forceTableCleanup(String tableName) throws IOException
+public void forceTableCleanup(String tableName, String... columnFamilies) 
throws IOException
 {
-Table table = getValidTable(tableName);
-table.forceCleanup();
+for (ColumnFamilyStore cfStore : getValidColumnFamilies(tableName, 
columnFamilies))
+{
+cfStore.forceCleanup();
+}
 }
 
 public void forceTableCompaction() throws IOException
@@ -1046,11 +1047,12 @@ public class StorageService implements I
 for (Table table : Table.all())
 table.forceCompaction();
 }
-
-public void forceTableCompaction(String tableName) throws IOException
+public void forceTableCompaction(String ks, String... columnFamilies) 
throws IOException
 {
-Table table = getValidTable(tableName);
-table.forceCompaction();
+for (ColumnFamilyStore cfStore : getValidColumnFamilies(ks, 
columnFamilies))
+{
+cfStore.forceMajorCompaction();
+}
 }
 
 /**

Modified: 
cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/service/StorageServiceMBean.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/service/StorageServiceMBean.java?rev=1042824&r1=1042823&r2=1042824&view=diff
==
--- 
cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/service/StorageServiceMBean.java
 (original)
+++ 
cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/service/StorageServiceMBean.java
 Mon Dec  6 

[jira] Commented: (CASSANDRA-1812) Allow per-CF compaction, repair, and cleanup

2010-12-06 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12968456#action_12968456
 ] 

Jonathan Ellis commented on CASSANDRA-1812:
---

also, fixed up help text

 Allow per-CF compaction, repair, and cleanup
 

 Key: CASSANDRA-1812
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1812
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Affects Versions: 0.6.8
Reporter: Tyler Hobbs
Assignee: Jon Hermes
Priority: Minor
 Fix For: 0.6.9, 0.7.0

 Attachments: 1812-all.txt, 1812-compact-2.txt, 1812-compact.txt


 It should be a pretty simple change to allow compaction, cleanup, or repair 
 of only one CF.




buildbot success in ASF Buildbot on cassandra-0.6

2010-12-06 Thread buildbot
The Buildbot has detected a restored build of cassandra-0.6 on ASF Buildbot.
Full details are available at:
 http://ci.apache.org/builders/cassandra-0.6/builds/242

Buildbot URL: http://ci.apache.org/

Buildslave for this Build: isis_ubuntu

Build Reason: 
Build Source Stamp: [branch cassandra/branches/cassandra-0.6] 1042824
Blamelist: jbellis

Build succeeded!

sincerely,
 -The Buildbot



[Cassandra Wiki] Update of GettingStarted by PeterSchuller

2010-12-06 Thread Apache Wiki
The GettingStarted page has been changed by PeterSchuller.
The comment on this change is: More thrift avoidance.
http://wiki.apache.org/cassandra/GettingStarted?action=diff&rev1=49&rev2=50

--

  == Step 4: Write your application ==
  The recommended way to communicate with Cassandra in your application is to 
use a [[http://wiki.apache.org/cassandra/ClientOptions|higher-level client]]. 
These provide programming language specific API:s for talking to Cassandra in a 
variety of languages. The details will vary depending on programming language 
and client, but in general using a higher-level client will mean that you have 
to write less code and get several features for free that you would otherwise 
have to write yourself.
  
- That said, it is useful to know that Cassandra uses 
[[http://thrift.apache.org/|Thrift]] for its external client-facing API. 
Cassandra's main API/RPC/Thrift port is 9160. Thrift supports a 
[[http://svn.apache.org/viewvc/thrift/trunk/lib/|wide variety of languages]] so 
you can code your application to use Thrift directly if you so chose (but again 
we recommend a [[http://wiki.apache.org/cassandra/ClientOptions|high-level 
client]] where available). Be sure to read the documentation on the 
[[http://wiki.apache.org/thrift|Thrift wiki]], and check out the 
Cassandra-specific examples in ThriftExamples before getting started.  
+ That said, it is useful to know that Cassandra uses 
[[http://thrift.apache.org/|Thrift]] for its external client-facing API. 
Cassandra's main API/RPC/Thrift port is 9160. Thrift supports a 
[[http://svn.apache.org/viewvc/thrift/trunk/lib/|wide variety of languages]] so 
you can code your application to use Thrift directly if you so chose (but again 
we recommend a [[http://wiki.apache.org/cassandra/ClientOptions|high-level 
client]] where available).
  
- Important note: you need to install the svn revision of thrift that matches 
the revision that your version of Cassandra uses. [[InstallThrift]]
+ Important note: If you intend to use thrift directly, you need to install a 
version of thrift that matches the revision that your version of Cassandra 
uses. [[InstallThrift]]
  
  Cassandra's main API/RPC/Thrift port is 9160. It is a common mistake for API 
clients to connect to the JMX port instead.
  


[Cassandra Wiki] Update of InstallThrift by PeterSchuller

2010-12-06 Thread Apache Wiki
The InstallThrift page has been changed by PeterSchuller.
The comment on this change is: Thrift avoidance.
http://wiki.apache.org/cassandra/InstallThrift?action=diff&rev1=8&rev2=9

--

+ '''NOTE:''' If you arrived here for the purpose of writing your first 
application, please consider using a [[ClientOptions|higher-level client]] 
instead of thrift directly.
+ 
  [[http://incubator.apache.org/thrift|Thrift]] expects to make an official 
release so that distributors can package it up nicely Real Soon Now.  In the 
meantime, installing thrift is a bit of a bitch.  We are sorry about that, but 
we don't know of a better way to support a vast number of clients mostly 
automagically.
  
  Important note: you need to install the svn revision of thrift that matches 
the revision that your version of Cassandra uses. This can be found in the 
Cassandra Home/lib directory - e.g. `libthrift-917130.jar` means that version 
of Cassandra uses svn revision 917130 of thrift.


[Cassandra Wiki] Update of FrontPage by PeterSchuller

2010-12-06 Thread Apache Wiki
The FrontPage page has been changed by PeterSchuller.
http://wiki.apache.org/cassandra/FrontPage?action=diff&rev1=52&rev2=53

--

   * [[DataModel|A description of the Cassandra data model]]
   * [[CassandraLimitations|Cassandra Limitations]]: where Cassandra is not a 
good fit
  
- == User Documentation ==
+ == Application developer and operator documentation ==
   * [[GettingStarted|Getting Started]]
   * [[http://www.riptano.com/docs|Riptano's Cassandra documentation]]
   * [[ClientOptions|Client options: ways to access Cassandra]] -- interfaces 
for Ruby, Python, Scala and more


[Cassandra Wiki] Update of API by PeterSchuller

2010-12-06 Thread Apache Wiki
The API page has been changed by PeterSchuller.
http://wiki.apache.org/cassandra/API?action=diff&rev1=71&rev2=72

--

  == Overview ==
  
- '''NOTE:''' This documents the low-level wire protocol used to communicate 
with Cassandra. This is not intended to be used directly in applications; 
rather it is highly recommended that application developers use one of the 
higher-level clients that are linked to from ClientOptions. That said, this 
page may still be useful for application developers wanting to better 
understand the data model.
+ '''NOTE:''' This documents the low-level wire protocol used to communicate 
with Cassandra. This is not intended to be used directly in applications; 
rather it is highly recommended that application developers use one of the 
higher-level clients that are linked to from ClientOptions. That said, this 
page may still be useful for application developers wanting to better 
understand the data model or the underlying operations that are available.
  
  The Cassandra Thrift API changed substantially after [[API03|0.3]], with 
minor, backwards-compatible changes for [[API04|0.4]], 0.5 and [[API06|0.6]]; 
this document explains the 0.5 version with annotations for the changes in 0.6 
and 0.7.
  


[Cassandra Wiki] Update of ThriftExamples by PeterSchuller

2010-12-06 Thread Apache Wiki
The ThriftExamples page has been changed by PeterSchuller.
http://wiki.apache.org/cassandra/ThriftExamples?action=diff&rev1=85&rev2=86

--

  
  
  
- This page shows examples of using the low-level 
[[http://incubator.apache.org/thrift/|Thrift]] interface.  
+ This page shows examples of using the low-level 
[[http://incubator.apache.org/thrift/|Thrift]] interface, primarily intended 
for client library developers.  
  
  
  


[Cassandra Wiki] Update of FAQ by PeterSchuller

2010-12-06 Thread Apache Wiki
The FAQ page has been changed by PeterSchuller.
The comment on this change is: Be slightly less thrift-centric.
http://wiki.apache.org/cassandra/FAQ?action=diff&rev1=96&rev2=97

--

   * [[#slows_down_after_lotso_inserts|Why does Cassandra slow down after doing 
a lot of inserts?]]
   * [[#existing_data_when_adding_new_nodes|What happens to existing data in my 
cluster when I add new nodes?]]
   * [[#modify_cf_config|Can I add/remove/rename Column Families on a working 
cluster?]]
-  * [[#node_clients_connect_to|Does it matter which node a Thrift client 
connects to?]]
+  * [[#node_clients_connect_to|Does it matter which node a Thrift or 
higher-level client connects to?]]
   * [[#what_kind_of_hardware_should_i_use|What kind of hardware should I run 
Cassandra on?]]
   * [[#architecture|What are SSTables and Memtables?]]
   * [[#working_with_timeuuid_in_java|Why is it so hard to work with 
TimeUUIDType in Java?]]
@@ -84, +84 @@

  
  Anchor(node_clients_connect_to)
  
- == Does it matter which node a Thrift client connects to? ==
+ == Does it matter which node a Thrift or higher-level client connects to? ==
- No, any node in the cluster will work; Cassandra nodes proxy your request as 
needed. This leaves you with a number of options for end point selection:
+ No, any node in the cluster will work; Cassandra nodes proxy your request as 
needed. This leaves the client with a number of options for end point selection:
  
   1. You can maintain a list of contact nodes (all or a subset of the nodes in 
the cluster), and configure your clients to choose among them.
   1. Use round-robin DNS and create a record that points to a set of contact 
nodes (recommended).
   1. Use the `get_string_property(token map)` RPC to obtain an 
update-to-date list of the nodes in the cluster and cycle through them.
   1. Deploy a load-balancer, proxy, etc.
+ 
+ When using a higher-level client you should investigate which, if any, 
options are implemented by your higher-level client to help you distribute your 
requests across nodes in a cluster.
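
For clients that implement the first option themselves, a client-side
rotation over a fixed contact list might look like the sketch below.
RoundRobinHosts is a hypothetical class, not part of Cassandra or of any
particular client library, and the addresses in the usage note are
placeholders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical client-side selector over a fixed list of contact nodes.
final class RoundRobinHosts
{
    private final List<String> hosts;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinHosts(List<String> contactNodes)
    {
        if (contactNodes.isEmpty())
            throw new IllegalArgumentException("need at least one contact node");
        this.hosts = new ArrayList<String>(contactNodes);
    }

    // Thread-safe; floorMod keeps the index valid after counter overflow.
    String next()
    {
        return hosts.get(Math.floorMod(counter.getAndIncrement(), hosts.size()));
    }
}
```

A caller would construct it once, for example with "10.0.0.1" and
"10.0.0.2", and call next() before opening each connection. It does not
detect dead nodes; that would need to be layered on top.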
  
  Anchor(what_kind_of_hardware_should_i_use)
  


[jira] Updated: (CASSANDRA-1567) Provide configurable encryption support for internode communication

2010-12-06 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-1567:
--

Fix Version/s: (was: 0.7.1)
   0.8

bumping to 0.8 for dependency on THRIFT-106 which is in (will be in?) Thrift 0.6

 Provide configurable encryption support for internode communication
 ---

 Key: CASSANDRA-1567
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1567
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Nirmal Ranganathan
Assignee: Nirmal Ranganathan
 Fix For: 0.8

 Attachments: 0002-Configurable-internode-encryption-option.patch, 
 0003-Default-Key-and-Certificate-for-internode-SSL.patch


 Provide the option to encrypt internode communication. The initial thought is 
 to use JSSE 
 (http://download.oracle.com/javase/6/docs/technotes/guides/security/jsse/JSSERefGuide.html)
  to wrap the existing ServerSocket & Sockets. This will only be an optional 
 configuration and not enabled by default. The defaults would be TLS V1, RSA 
 1024-bit keys for handshake and SSL_RSA_WITH_RC4_128_MD5 as the cipher suite. 
 Although this can be made configurable if the need arises. 
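
As a rough sketch of the JSSE approach described above (not the code in the attached patches), a listener could be opened through the standard javax.net.ssl factory and restricted to a requested protocol and cipher suite. The class and method names here are hypothetical. The sketch assumes the JVM default key store has been configured through the usual javax.net.ssl system properties, and it silently drops any requested name the running JVM does not support, which may include the RC4/MD5 suite on newer JVMs.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.net.ssl.SSLServerSocket;
import javax.net.ssl.SSLServerSocketFactory;

// Hypothetical sketch of an encrypted internode listener built on JSSE.
final class EncryptedListenerSketch {

    // Keeps only the requested names that the JVM supports, in request order.
    static String[] supportedSubset(String[] requested, String[] supported) {
        List<String> available = Arrays.asList(supported);
        List<String> kept = new ArrayList<>();
        for (String name : requested) {
            if (available.contains(name)) {
                kept.add(name);
            }
        }
        return kept.toArray(new String[0]);
    }

    // Opens an SSL server socket using the JVM default SSL configuration.
    static SSLServerSocket open(int port, String[] protocols, String[] cipherSuites)
            throws IOException {
        SSLServerSocketFactory factory =
                (SSLServerSocketFactory) SSLServerSocketFactory.getDefault();
        SSLServerSocket socket = (SSLServerSocket) factory.createServerSocket(port);
        String[] usableProtocols = supportedSubset(protocols, socket.getSupportedProtocols());
        String[] usableSuites = supportedSubset(cipherSuites, socket.getSupportedCipherSuites());
        // Leave the JVM defaults in place when nothing requested is supported.
        if (usableProtocols.length > 0) {
            socket.setEnabledProtocols(usableProtocols);
        }
        if (usableSuites.length > 0) {
            socket.setEnabledCipherSuites(usableSuites);
        }
        return socket;
    }
}
```

With the defaults proposed in the ticket, a caller would pass "TLSv1" as the protocol and "SSL_RSA_WITH_RC4_128_MD5" as the cipher suite; making both arguments configurable matches the last sentence of the description.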

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CASSANDRA-1072) Increment counters

2010-12-06 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12968476#action_12968476
 ] 

Jonathan Ellis commented on CASSANDRA-1072:
---

bq. CounterMutation is a struct w/ 2 optional fields: Counter, and Deletion

looks reasonable to me.

bq. Not in favor of CounterDeletion w/o a timestamp

I think it's confusing from a user's perspective to have it required on delete 
when it is not on write.  If there's no value to letting the user provide 
timestamps other than the current time then let's leave it out.

 Increment counters
 --

 Key: CASSANDRA-1072
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1072
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Johan Oskarsson
Assignee: Kelvin Kakugawa
 Attachments: CASSANDRA-1072.112210.patch, 
 CASSANDRA-1072.120110.patch, CASSANDRA-1072.patch, increment_test.py, 
 Partitionedcountersdesigndoc.pdf


 Break out the increment counters out of CASSANDRA-580. Classes are shared 
 between the two features but without the plain version vector code the 
 changeset becomes smaller and more manageable.




[jira] Updated: (CASSANDRA-1567) Provide configurable encryption support for internode communication

2010-12-06 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-1567:
--

Comment: was deleted

(was: bumping to 0.8 for dependency on THRIFT-106 which is in (will be in?) 
Thrift 0.6)

 Provide configurable encryption support for internode communication
 ---

 Key: CASSANDRA-1567
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1567
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Nirmal Ranganathan
Assignee: Nirmal Ranganathan
 Fix For: 0.7.1

 Attachments: 0002-Configurable-internode-encryption-option.patch, 
 0003-Default-Key-and-Certificate-for-internode-SSL.patch


 Provide the option to encrypt internode communication. The initial thought is 
 to use JSSE 
 (http://download.oracle.com/javase/6/docs/technotes/guides/security/jsse/JSSERefGuide.html)
  to wrap the existing ServerSocket & Sockets. This will only be an optional 
 configuration and not enabled by default. The defaults would be TLS V1, RSA 
 1024-bit keys for handshake and SSL_RSA_WITH_RC4_128_MD5 as the cipher suite. 
 Although this can be made configurable if the need arises. 




[jira] Updated: (CASSANDRA-1567) Provide configurable encryption support for internode communication

2010-12-06 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-1567:
--

Fix Version/s: (was: 0.8)
   0.7.1

 Provide configurable encryption support for internode communication
 ---

 Key: CASSANDRA-1567
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1567
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Nirmal Ranganathan
Assignee: Nirmal Ranganathan
 Fix For: 0.7.1

 Attachments: 0002-Configurable-internode-encryption-option.patch, 
 0003-Default-Key-and-Certificate-for-internode-SSL.patch


 Provide the option to encrypt internode communication. The initial thought is 
 to use JSSE 
 (http://download.oracle.com/javase/6/docs/technotes/guides/security/jsse/JSSERefGuide.html)
  to wrap the existing ServerSocket & Sockets. This will only be an optional 
 configuration and not enabled by default. The defaults would be TLS V1, RSA 
 1024-bit keys for handshake and SSL_RSA_WITH_RC4_128_MD5 as the cipher suite. 
 Although this can be made configurable if the need arises. 




svn commit: r1042851 - in /cassandra: branches/cassandra-0.7/contrib/word_count/src/WordCountSetup.java trunk/contrib/word_count/src/WordCountSetup.java

2010-12-06 Thread brandonwilliams
Author: brandonwilliams
Date: Mon Dec  6 23:24:17 2010
New Revision: 1042851

URL: http://svn.apache.org/viewvc?rev=1042851&view=rev
Log:
Switch word_count CFs to AsciiType.  Patch by brandonwilliams

Modified:
cassandra/branches/cassandra-0.7/contrib/word_count/src/WordCountSetup.java
cassandra/trunk/contrib/word_count/src/WordCountSetup.java

Modified: cassandra/branches/cassandra-0.7/contrib/word_count/src/WordCountSetup.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/contrib/word_count/src/WordCountSetup.java?rev=1042851&r1=1042850&r2=1042851&view=diff
==
--- cassandra/branches/cassandra-0.7/contrib/word_count/src/WordCountSetup.java (original)
+++ cassandra/branches/cassandra-0.7/contrib/word_count/src/WordCountSetup.java Mon Dec  6 23:24:17 2010
@@ -99,8 +99,14 @@ public class WordCountSetup
 private static void setupKeyspace(Cassandra.Iface client) throws TException, InvalidRequestException
 {
 List<CfDef> cfDefList = new ArrayList<CfDef>();
-cfDefList.add(new CfDef(WordCount.KEYSPACE, WordCount.COLUMN_FAMILY));
-cfDefList.add(new CfDef(WordCount.KEYSPACE, WordCount.OUTPUT_COLUMN_FAMILY));
+CfDef input = new CfDef(WordCount.KEYSPACE, WordCount.COLUMN_FAMILY);
+   input.setComparator_type("AsciiType");
+   input.setDefault_validation_class("AsciiType");
+   cfDefList.add(input);
+CfDef output = new CfDef(WordCount.KEYSPACE, WordCount.OUTPUT_COLUMN_FAMILY);
+   output.setComparator_type("AsciiType");
+   output.setDefault_validation_class("AsciiType");
+cfDefList.add(output);
 
 client.system_add_keyspace(new KsDef(WordCount.KEYSPACE, "org.apache.cassandra.locator.SimpleStrategy", 1, cfDefList));
 int magnitude = client.describe_ring(WordCount.KEYSPACE).size();

Modified: cassandra/trunk/contrib/word_count/src/WordCountSetup.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/contrib/word_count/src/WordCountSetup.java?rev=1042851&r1=1042850&r2=1042851&view=diff
==
--- cassandra/trunk/contrib/word_count/src/WordCountSetup.java (original)
+++ cassandra/trunk/contrib/word_count/src/WordCountSetup.java Mon Dec  6 23:24:17 2010
@@ -99,8 +99,14 @@ public class WordCountSetup
 private static void setupKeyspace(Cassandra.Iface client) throws TException, InvalidRequestException
 {
 List<CfDef> cfDefList = new ArrayList<CfDef>();
-cfDefList.add(new CfDef(WordCount.KEYSPACE, WordCount.COLUMN_FAMILY));
-cfDefList.add(new CfDef(WordCount.KEYSPACE, WordCount.OUTPUT_COLUMN_FAMILY));
+CfDef input = new CfDef(WordCount.KEYSPACE, WordCount.COLUMN_FAMILY);
+   input.setComparator_type("AsciiType");
+   input.setDefault_validation_class("AsciiType");
+   cfDefList.add(input);
+CfDef output = new CfDef(WordCount.KEYSPACE, WordCount.OUTPUT_COLUMN_FAMILY);
+   output.setComparator_type("AsciiType");
+   output.setDefault_validation_class("AsciiType");
+cfDefList.add(output);
 
 client.system_add_keyspace(new KsDef(WordCount.KEYSPACE, "org.apache.cassandra.locator.SimpleStrategy", 1, cfDefList));
 int magnitude = client.describe_ring(WordCount.KEYSPACE).size();




svn commit: r1042857 - in /cassandra: branches/cassandra-0.7/contrib/word_count/src/WordCount.java trunk/contrib/word_count/src/WordCount.java

2010-12-06 Thread brandonwilliams
Author: brandonwilliams
Date: Mon Dec  6 23:34:05 2010
New Revision: 1042857

URL: http://svn.apache.org/viewvc?rev=1042857&view=rev
Log:
word_count uses better ks/cf names now that we're not piggybacking off storage 
definitions in the config.  Patch by brandonwilliams

Modified:
cassandra/branches/cassandra-0.7/contrib/word_count/src/WordCount.java
cassandra/trunk/contrib/word_count/src/WordCount.java

Modified: cassandra/branches/cassandra-0.7/contrib/word_count/src/WordCount.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/contrib/word_count/src/WordCount.java?rev=1042857&r1=1042856&r2=1042857&view=diff
==
--- cassandra/branches/cassandra-0.7/contrib/word_count/src/WordCount.java (original)
+++ cassandra/branches/cassandra-0.7/contrib/word_count/src/WordCount.java Mon Dec  6 23:34:05 2010
@@ -54,11 +54,11 @@ public class WordCount extends Configure
 {
 private static final Logger logger = LoggerFactory.getLogger(WordCount.class);
 
-static final String KEYSPACE = "Keyspace1";
-static final String COLUMN_FAMILY = "Standard1";
+static final String KEYSPACE = "wordcount";
+static final String COLUMN_FAMILY = "input_words";
 
 static final String OUTPUT_REDUCER_VAR = "output_reducer";
-static final String OUTPUT_COLUMN_FAMILY = "Standard2";
+static final String OUTPUT_COLUMN_FAMILY = "output_words";
 private static final String OUTPUT_PATH_PREFIX = "/tmp/word_count";
 
 private static final String CONF_COLUMN_NAME = "columnname";

Modified: cassandra/trunk/contrib/word_count/src/WordCount.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/contrib/word_count/src/WordCount.java?rev=1042857&r1=1042856&r2=1042857&view=diff
==
--- cassandra/trunk/contrib/word_count/src/WordCount.java (original)
+++ cassandra/trunk/contrib/word_count/src/WordCount.java Mon Dec  6 23:34:05 2010
@@ -54,11 +54,11 @@ public class WordCount extends Configure
 {
 private static final Logger logger = LoggerFactory.getLogger(WordCount.class);
 
-static final String KEYSPACE = "Keyspace1";
-static final String COLUMN_FAMILY = "Standard1";
+static final String KEYSPACE = "wordcount";
+static final String COLUMN_FAMILY = "input_words";
 
 static final String OUTPUT_REDUCER_VAR = "output_reducer";
-static final String OUTPUT_COLUMN_FAMILY = "Standard2";
+static final String OUTPUT_COLUMN_FAMILY = "output_words";
 private static final String OUTPUT_PATH_PREFIX = "/tmp/word_count";
 
 private static final String CONF_COLUMN_NAME = "columnname";




svn commit: r1042863 - in /cassandra: branches/cassandra-0.7/contrib/word_count/README.txt trunk/contrib/word_count/README.txt

2010-12-06 Thread brandonwilliams
Author: brandonwilliams
Date: Mon Dec  6 23:46:44 2010
New Revision: 1042863

URL: http://svn.apache.org/viewvc?rev=1042863&view=rev
Log:
Update word_count README

Modified:
cassandra/branches/cassandra-0.7/contrib/word_count/README.txt
cassandra/trunk/contrib/word_count/README.txt

Modified: cassandra/branches/cassandra-0.7/contrib/word_count/README.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/contrib/word_count/README.txt?rev=1042863&r1=1042862&r2=1042863&view=diff
==
--- cassandra/branches/cassandra-0.7/contrib/word_count/README.txt (original)
+++ cassandra/branches/cassandra-0.7/contrib/word_count/README.txt Mon Dec  6 23:46:44 2010
@@ -13,15 +13,15 @@ contrib/word_count$ bin/word_count
 The output of the word count can now be configured. In the bin/word_count
 file, you can specify the OUTPUT_REDUCER. The two options are 'filesystem'
 and 'cassandra'. The filesystem option outputs to the /tmp/word_count*
-directories. The cassandra option outputs to the 'Standard2' column family.
+directories. The cassandra option outputs to the 'output_words' column family
+in the 'wordcount' keyspace.
 
-In order to view the results in Cassandra, one can use python/pycassa and
+In order to view the results in Cassandra, one can use bin/cassandra-cli and
 perform the following operations:
-$ python
- import pycassa
- con = pycassa.connect('Keyspace1')
- cf = pycassa.ColumnFamily(con, 'Standard2')
- list(cf.get_range())
+$ bin/cassandra-cli
+ connect localhost/9160
+ use wordcount;
+ list output_words;
 
 Read the code in src/ for more details.
 

Modified: cassandra/trunk/contrib/word_count/README.txt
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/contrib/word_count/README.txt?rev=1042863&r1=1042862&r2=1042863&view=diff
==
--- cassandra/trunk/contrib/word_count/README.txt (original)
+++ cassandra/trunk/contrib/word_count/README.txt Mon Dec  6 23:46:44 2010
@@ -13,15 +13,15 @@ contrib/word_count$ bin/word_count
 The output of the word count can now be configured. In the bin/word_count
 file, you can specify the OUTPUT_REDUCER. The two options are 'filesystem'
 and 'cassandra'. The filesystem option outputs to the /tmp/word_count*
-directories. The cassandra option outputs to the 'Standard2' column family.
+directories. The cassandra option outputs to the 'output_words' column family
+in the 'wordcount' keyspace.
 
-In order to view the results in Cassandra, one can use python/pycassa and
+In order to view the results in Cassandra, one can use bin/cassandra-cli and
 perform the following operations:
-$ python
- import pycassa
- con = pycassa.connect('Keyspace1')
- cf = pycassa.ColumnFamily(con, 'Standard2')
- list(cf.get_range())
+$ bin/cassandra-cli
+ connect localhost/9160
+ use wordcount;
+ list output_words;
 
 Read the code in src/ for more details.
 




[jira] Updated: (CASSANDRA-1072) Increment counters

2010-12-06 Thread Kelvin Kakugawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kelvin Kakugawa updated CASSANDRA-1072:
---

Attachment: CASSANDRA-1072.120610.patch

API update:
removed timestamp from remove_counter and CounterDeletion

 Increment counters
 --

 Key: CASSANDRA-1072
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1072
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Johan Oskarsson
Assignee: Kelvin Kakugawa
 Attachments: CASSANDRA-1072.120110.patch, 
 CASSANDRA-1072.120610.patch, CASSANDRA-1072.patch, increment_test.py, 
 Partitionedcountersdesigndoc.pdf


 Break out the increment counters out of CASSANDRA-580. Classes are shared 
 between the two features but without the plain version vector code the 
 changeset becomes smaller and more manageable.




[jira] Updated: (CASSANDRA-1072) Increment counters

2010-12-06 Thread Kelvin Kakugawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kelvin Kakugawa updated CASSANDRA-1072:
---

Attachment: (was: CASSANDRA-1072.112210.patch)

 Increment counters
 --

 Key: CASSANDRA-1072
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1072
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Johan Oskarsson
Assignee: Kelvin Kakugawa
 Attachments: CASSANDRA-1072.120110.patch, 
 CASSANDRA-1072.120610.patch, CASSANDRA-1072.patch, increment_test.py, 
 Partitionedcountersdesigndoc.pdf


 Break out the increment counters out of CASSANDRA-580. Classes are shared 
 between the two features but without the plain version vector code the 
 changeset becomes smaller and more manageable.




[jira] Updated: (CASSANDRA-1072) Increment counters

2010-12-06 Thread Kelvin Kakugawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kelvin Kakugawa updated CASSANDRA-1072:
---

Attachment: (was: CASSANDRA-1072.120110.patch)

 Increment counters
 --

 Key: CASSANDRA-1072
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1072
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Johan Oskarsson
Assignee: Kelvin Kakugawa
 Attachments: CASSANDRA-1072.120610.patch, CASSANDRA-1072.patch, 
 increment_test.py, Partitionedcountersdesigndoc.pdf


 Break out the increment counters out of CASSANDRA-580. Classes are shared 
 between the two features but without the plain version vector code the 
 changeset becomes smaller and more manageable.




[jira] Commented: (CASSANDRA-1812) Allow per-CF compaction, repair, and cleanup

2010-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12968496#action_12968496
 ] 

Hudson commented on CASSANDRA-1812:
---

Integrated in Cassandra-0.6 #17 (See 
[https://hudson.apache.org/hudson/job/Cassandra-0.6/17/])
add support for per-CF compaction
patch by Jon Hermes; reviewed by jbellis for CASSANDRA-1812


 Allow per-CF compaction, repair, and cleanup
 

 Key: CASSANDRA-1812
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1812
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Affects Versions: 0.6.8
Reporter: Tyler Hobbs
Assignee: Jon Hermes
Priority: Minor
 Fix For: 0.6.9, 0.7.0

 Attachments: 1812-all.txt, 1812-compact-2.txt, 1812-compact.txt


 It should be a pretty simple change to allow compaction, cleanup, or repair 
 of only one CF.




[jira] Updated: (CASSANDRA-1822) Row level coverage in LegacySSTableTest

2010-12-06 Thread Stu Hood (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-1822:


Attachment: legacy-sstables.tgz
1822.tgz

0001 - Refactors SSTableUtils to use the builder pattern to prevent 
proliferation of writeSSTable(*) method signatures
0002 - Uses a SSTableIterator to validate row contents
0003/0004 - Normalizes the 'b' and 'e' sstable formats (from git revisions 
12eb0571e2e65537ee17fbdab4859b429ad2189b and 
2d21f488cda970f6a595ed9ce2fbd799a6a019d9 specifically)

I'm also attaching a drop-in replacement for the test/data/legacy-sstables 
directory, in case the 0003/0004 binary patches cause trouble.

 Row level coverage in LegacySSTableTest
 ---

 Key: CASSANDRA-1822
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1822
 Project: Cassandra
  Issue Type: Improvement
Reporter: Stu Hood
Assignee: Stu Hood
Priority: Minor
 Fix For: 0.7.1

 Attachments: 1822.tgz, legacy-sstables.tgz


 LegacySSTableTest should check compatibility of content within rows.




[jira] Updated: (CASSANDRA-1555) Considerations for larger bloom filters

2010-12-06 Thread Stu Hood (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-1555:


Attachment: addendum-to-1555.txt

Almost there... I'm attaching an addendum that I needed to get the long-running 
unit tests building. Once I got them running, they were reporting an exception: 
run `ant clean long-test` to reproduce.

As is, this patch passes the row-level compatibility test on CASSANDRA-1822, so 
as soon as we figure out the false positive problem, I can give it a thumbs up.

 Considerations for larger bloom filters
 ---

 Key: CASSANDRA-1555
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1555
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Ryan King
 Fix For: 0.8

 Attachments: addendum-to-1555.txt, cassandra-1555.tgz, 
 CASSANDRA-1555v2.patch, CASSANDRA-1555v3.patch.gz


 To (optimally) support SSTables larger than 143 million keys, we need to 
 support bloom filters larger than 2^31 bits, which java.util.BitSet can't 
 handle directly.
 A few options:
 * Switch to a BitSet class which supports 2^31 * 64 bits (Lucene's OpenBitSet)
 * Partition the java.util.BitSet behind our current BloomFilter
 ** Straightforward bit partitioning: bit N is in bitset N // 2^31
 ** Separate equally sized complete bloom filters for member ranges, which can 
 be used independently or OR'd together under memory pressure.
 All of these options require new approaches to serialization.
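
The "straightforward bit partitioning" option above can be sketched as follows. This is only an illustration of the indexing rule (bit N lives in partition N / partitionSize), not the implementation in any of the attached patches. The class name PartitionedBitSet and its constructor parameters are hypothetical, and serialization is deliberately left out.

```java
import java.util.BitSet;

// Hypothetical class for illustration: addresses more bits than one BitSet can hold.
final class PartitionedBitSet {
    private final long totalBits;
    private final long bitsPerPartition;
    private final BitSet[] partitions;

    PartitionedBitSet(long totalBits, long bitsPerPartition) {
        // A java.util.BitSet index is an int, so one partition holds at most 2^31 bits.
        if (totalBits <= 0 || bitsPerPartition <= 0 || bitsPerPartition > (1L << 31)) {
            throw new IllegalArgumentException("invalid sizes");
        }
        this.totalBits = totalBits;
        this.bitsPerPartition = bitsPerPartition;
        long count = (totalBits + bitsPerPartition - 1) / bitsPerPartition; // ceiling division
        this.partitions = new BitSet[Math.toIntExact(count)];
        for (int i = 0; i < partitions.length; i++) {
            partitions[i] = new BitSet(); // grows lazily as bits are set
        }
    }

    // Index of the partition that holds bit n.
    int partitionOf(long n) {
        if (n < 0 || n >= totalBits) {
            throw new IndexOutOfBoundsException("bit " + n);
        }
        return (int) (n / bitsPerPartition);
    }

    void set(long n) {
        partitions[partitionOf(n)].set((int) (n % bitsPerPartition));
    }

    boolean get(long n) {
        return partitions[partitionOf(n)].get((int) (n % bitsPerPartition));
    }
}
```

Passing 1L << 31 as bitsPerPartition gives the 2^31-bit partitions named in the description; smaller values are convenient for tests.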




[jira] Updated: (CASSANDRA-1555) Considerations for larger bloom filters

2010-12-06 Thread Ryan King (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan King updated CASSANDRA-1555:
-

Attachment: CASSANDRA-1555v4.patch.gz

Another round to fix the long tests.

And on the FP rate, it seems that it's actually in the expected range based on the 
table here: http://pages.cs.wisc.edu/~cao/papers/summary-cache/node8.html, 
though we should probably double-check that math.
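
One way to double-check that math is the standard approximation for a bloom filter's false positive rate, (1 - e^(-k/b))^k, where b is the number of bits per key and k the number of hash functions. The sketch below is not taken from the patch; the class and method names are hypothetical. The figure of roughly 15 bits per key is an inference from the issue description (2^31 bits divided by 143 million keys), not a value confirmed in this thread.

```java
// Hypothetical helper for checking expected bloom filter false positive rates.
final class BloomFilterMath {

    // Standard approximation: (1 - e^(-k / bitsPerKey))^k.
    static double falsePositiveRate(double bitsPerKey, int hashCount) {
        if (bitsPerKey <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("arguments must be positive");
        }
        return Math.pow(1.0 - Math.exp(-hashCount / bitsPerKey), hashCount);
    }

    // Hash count that minimizes the approximation: bitsPerKey * ln 2, rounded.
    static int optimalHashCount(double bitsPerKey) {
        return Math.max(1, (int) Math.round(bitsPerKey * Math.log(2)));
    }

    public static void main(String[] args) {
        // Assumption: bits per key inferred from 2^31 bits / 143 million keys.
        double bitsPerKey = Math.pow(2, 31) / 143_000_000.0;
        int k = optimalHashCount(bitsPerKey);
        System.out.printf("bits/key=%.2f k=%d fp=%.6f%n",
                bitsPerKey, k, falsePositiveRate(bitsPerKey, k));
    }
}
```

Comparing the printed rate with the rate measured by the long tests would show whether the observed false positives are in the expected range.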

 Considerations for larger bloom filters
 ---

 Key: CASSANDRA-1555
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1555
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Ryan King
 Fix For: 0.8

 Attachments: addendum-to-1555.txt, cassandra-1555.tgz, 
 CASSANDRA-1555v2.patch, CASSANDRA-1555v3.patch.gz, CASSANDRA-1555v4.patch.gz


 To (optimally) support SSTables larger than 143 million keys, we need to 
 support bloom filters larger than 2^31 bits, which java.util.BitSet can't 
 handle directly.
 A few options:
 * Switch to a BitSet class which supports 2^31 * 64 bits (Lucene's OpenBitSet)
 * Partition the java.util.BitSet behind our current BloomFilter
 ** Straightforward bit partitioning: bit N is in bitset N // 2^31
 ** Separate equally sized complete bloom filters for member ranges, which can 
 be used independently or OR'd together under memory pressure.
 All of these options require new approaches to serialization.




[jira] Updated: (CASSANDRA-1763) NodeCmd should be able to view Compaction Statistics

2010-12-06 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated CASSANDRA-1763:
---

Attachment: cassandra-1763-2-branch-7.0.patch.txt

 NodeCmd should be able to view Compaction Statistics
 

 Key: CASSANDRA-1763
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1763
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Affects Versions: 0.7 beta 3
Reporter: Edward Capriolo
Assignee: Edward Capriolo
Priority: Minor
 Fix For: 0.7.1

 Attachments: cassandra-1763-2-branch-7.0.patch.txt, 
 cassandra-1763-patch.txt


 When joining a node, major compacting a node, running cleanup on a node, or 
 sometimes when a node seems slightly slow it would be helpful to get 
 compaction information with nodetool 
 nodetool compactionstats would produce:
 {noformat}
 compaction type: Major
 column family: standard1
 bytes compacted: 49925478
 bytes total in progress: 63555680
 pending tasks: 1
 {noformat}
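
A monitoring script could turn output in this shape into a progress figure. The sketch below is hypothetical and not part of the attached patch: it parses the "name: value" lines shown above and assumes that "bytes total in progress" is the total size of the running compaction, which the ticket does not state explicitly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for the proposed compactionstats output.
final class CompactionStatsSketch {

    // Splits each "name: value" line at the first colon; other lines are ignored.
    static Map<String, String> parse(String output) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : output.split("\n")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            fields.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
        }
        return fields;
    }

    // Percentage of bytes compacted so far, or 0 when the total is 0.
    static double percentDone(Map<String, String> fields) {
        long done = Long.parseLong(fields.get("bytes compacted"));
        long total = Long.parseLong(fields.get("bytes total in progress"));
        return total == 0 ? 0.0 : 100.0 * done / total;
    }
}
```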
