[jira] [Issue Comment Edited] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-08 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045620#comment-13045620
 ] 

Mck SembWever edited comment on CASSANDRA-2388 at 6/8/11 6:04 AM:
--

Initial attempt at a solution, although I'm a little apprehensive about the 
additions to cassandra.thrift
(describe_rack(..) isn't used anywhere; it just made sense to add 
describe_datacenter(..) and describe_rack(..) at the same time).

I've tested that existing hadoop jobs work, but the new functionality hasn't 
been tested (as I currently don't have any RF=2 data set up).

This patch does not include the required re-generated Cassandra.java

  was (Author: michaelsembwever):
Initial attempt at solution. Although I'm a little apprehensive to the 
additions to cassandra.thrift
(and describe_rack isn't used anywhere, it just made sense to add 
describe_datacenter(..) and describe_rack(..) at the same time).

I've tested that existing hadoop jobs work but the new functionality hasn't 
been tested (as i currently don't have any RF=2 data setup).

This patch does not include the required re-generated Cassandra.java
  
 ColumnFamilyRecordReader fails for a given split because a host is down, even 
 if records could reasonably be read from other replica.
 -

 Key: CASSANDRA-2388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Reporter: Eldon Stegall
Assignee: Mck SembWever
  Labels: hadoop, inputformat
 Fix For: 0.8.1

 Attachments: 0002_On_TException_try_next_split.patch, 
 CASSANDRA-2388.patch


 ColumnFamilyRecordReader only tries the first location for a given split. We 
 should try multiple locations for a given split.
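
A minimal sketch of the "try multiple locations" idea, assuming Java 8 or later. This is not the attached patch: the class name, the Connector interface, and the host addresses are all hypothetical stand-ins for whatever ColumnFamilyRecordReader uses to open a connection to a replica.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical names throughout; this only illustrates the failover loop.
final class SplitLocationFailover
{
    /** Stand-in for opening a connection to one replica of the split. */
    interface Connector<T>
    {
        T connect(String host) throws IOException;
    }

    /** Tries each location in order and returns the first connection that succeeds. */
    static <T> T connectToAny(List<String> locations, Connector<T> connector) throws IOException
    {
        IOException last = null;
        for (String host : locations)
        {
            try
            {
                return connector.connect(host);
            }
            catch (IOException e)
            {
                last = e; // remember the failure and move on to the next replica
            }
        }
        // Every location failed (or the list was empty).
        throw new IOException("failed to connect to any location: " + locations, last);
    }

    public static void main(String[] args) throws IOException
    {
        List<String> locations = Arrays.asList("10.0.0.1", "10.0.0.2");
        String used = connectToAny(locations, host -> {
            if (host.equals("10.0.0.1"))
                throw new IOException("host down"); // simulate the first replica being down
            return host;
        });
        System.out.println("connected to " + used); // prints "connected to 10.0.0.2"
    }
}
```

The job only fails once every location for the split has been tried, which is the behaviour the summary above asks for.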

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-08 Thread Mck SembWever (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mck SembWever updated CASSANDRA-2388:
-

Attachment: (was: CASSANDRA-2388.patch)

 ColumnFamilyRecordReader fails for a given split because a host is down, even 
 if records could reasonably be read from other replica.
 -

 Key: CASSANDRA-2388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Reporter: Eldon Stegall
Assignee: Mck SembWever
  Labels: hadoop, inputformat
 Fix For: 0.8.1

 Attachments: 0002_On_TException_try_next_split.patch


 ColumnFamilyRecordReader only tries the first location for a given split. We 
 should try multiple locations for a given split.



[jira] [Updated] (CASSANDRA-1610) Pluggable Compaction

2011-06-08 Thread Alan Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Liang updated CASSANDRA-1610:
--

Attachment: 0001-pluggable-compaction.patch

Removed updateEstimatedCompactions() from strategy since it is no longer called.

 Pluggable Compaction
 

 Key: CASSANDRA-1610
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1610
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
Assignee: Alan Liang
Priority: Minor
  Labels: compaction
 Fix For: 1.0

 Attachments: 0001-move-compaction-code-into-own-package.patch, 
 0001-move-compaction-code-into-own-package.patch, 
 0001-move-compaction-code-into-own-package.patch, 
 0001-move-compaction-code-into-own-package.patch, 
 0001-move-compaction-code-into-own-package.patch, 
 0001-move-compaction-code-into-own-package.patch, 
 0001-pluggable-compaction.patch, 
 0002-Pluggable-Compaction-and-Expiration.patch, 
 0002-pluggable-compaction.patch, 0002-pluggable-compaction.patch, 
 0002-pluggable-compaction.patch, 0002-pluggable-compaction.patch, 
 0002-pluggable-compaction.patch, 0002-pluggable-compaction.patch, 
 0002-pluggable-compaction.patch


 In CASSANDRA-1608, I proposed some changes on how compaction works. I think 
 it also makes sense to allow the ability to have pluggable compaction per CF. 
 There could be many types of workloads where this makes sense. One example we 
 had at Digg was to completely throw away certain SSTables after N days.
 This ticket addresses making compaction pluggable only.



[jira] [Updated] (CASSANDRA-2735) Timestamp Based Compaction Strategy

2011-06-08 Thread Alan Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Liang updated CASSANDRA-2735:
--

Attachment: 0002-timestamp-bucketed-compaction-strategy.patch

Rebased once again due to change from AbstractCompactionStrategy to 
ICompactionStrategy #1610

 Timestamp Based Compaction Strategy
 ---

 Key: CASSANDRA-2735
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2735
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Alan Liang
Assignee: Alan Liang
Priority: Minor
  Labels: compaction
 Attachments: 0002-timestamp-bucketed-compaction-strategy.patch, 
 0003-implemented-timestamp-bucketed-compaction-strategy-a.patch


 Compaction strategy implementation based on max timestamp ordering of the 
 sstables while satisfying max sstable size, min and max compaction 
 thresholds. It also handles expiration of sstables based on a timestamp.



[jira] [Updated] (CASSANDRA-2468) Clean up after failed compaction

2011-06-08 Thread Aaron Morton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Morton updated CASSANDRA-2468:


Attachment: 0001-clean-up-temp-files-after-failed-compaction-v08-3.patch

Version 3 for v0.8 modifies SSTable.delete() to throw an IOException so that 
cleanupIfNecessary() can catch it. It also changes componentsFor to accept an 
enum. 

Do we want this in 0.7?

 Clean up after failed compaction
 

 Key: CASSANDRA-2468
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2468
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jonathan Ellis
Assignee: Aaron Morton
Priority: Minor
 Fix For: 0.7.7

 Attachments: 
 0001-clean-up-temp-files-after-failed-compaction-v08-2.patch, 
 0001-clean-up-temp-files-after-failed-compaction-v08-3.patch, 
 0001-clean-up-temp-files-after-failed-compaction-v08.patch, 
 0001-cleanup-temp-files-after-failed-compaction-v07.patch


 (Started in CASSANDRA-2088.)



[jira] [Commented] (CASSANDRA-2733) nodetool ring with EC2Snitch, NPE checking for the zone and dc

2011-06-08 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045943#comment-13045943
 ] 

Jonathan Ellis commented on CASSANDRA-2733:
---

Ah, I see -- the test really isn't a separate patch and needs the awsApiCall 
refactor from the fix patch to work.

Committed with minor tweaks (made awsApiCall package local, added @Override to 
test version).

 nodetool ring with EC2Snitch, NPE checking for the zone and dc
 --

 Key: CASSANDRA-2733
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2733
 Project: Cassandra
  Issue Type: New Feature
  Components: Contrib
Affects Versions: 0.8.0
 Environment: Cassandra JVM
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Fix For: 0.8.1

 Attachments: EC2Snitch-Patch-2733.patch, EC2Snitch-test-2733.patch


 The existing EC2Snitch compares IPs with == instead of equals(): 
 (endpoint == FBUtilities.getLocalAddress())
 Comparing object references usually works, because most of the code uses 
 FBU.getLocalAddress() and it returns the same object everywhere, but it 
 breaks nodetool ring.
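
A small standalone illustration of the difference, using only java.net.InetAddress (the class and method names below are made up for the example). Two InetAddress instances built from the same bytes are distinct objects, so == is false even though equals() is true:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class AddressComparison
{
    /** Reference comparison: true only if both variables point at the same object. */
    static boolean sameObject(InetAddress a, InetAddress b)
    {
        return a == b;
    }

    /** Value comparison: true if both objects represent the same IP address. */
    static boolean sameAddress(InetAddress a, InetAddress b)
    {
        return a.equals(b);
    }

    public static void main(String[] args) throws UnknownHostException
    {
        byte[] raw = { 127, 0, 0, 1 };
        InetAddress first = InetAddress.getByAddress(raw);
        InetAddress second = InetAddress.getByAddress(raw);

        System.out.println(sameObject(first, second));  // false: two separate instances
        System.out.println(sameAddress(first, second)); // true: same IP address
    }
}
```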



svn commit: r1133389 - in /cassandra/branches/cassandra-0.8: CHANGES.txt src/java/org/apache/cassandra/locator/Ec2Snitch.java

2011-06-08 Thread jbellis
Author: jbellis
Date: Wed Jun  8 13:16:14 2011
New Revision: 1133389

URL: http://svn.apache.org/viewvc?rev=1133389&view=rev
Log:
fix nodetool ring use with Ec2Snitch
patch by Vijay; reviewed by jbellis for CASSANDRA-2733

Modified:
cassandra/branches/cassandra-0.8/CHANGES.txt

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/Ec2Snitch.java

Modified: cassandra/branches/cassandra-0.8/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1133389&r1=1133388&r2=1133389&view=diff
==
--- cassandra/branches/cassandra-0.8/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.8/CHANGES.txt Wed Jun  8 13:16:14 2011
@@ -41,6 +41,7 @@
  * workaround large resultsets causing large allocation retention
by nio sockets (CASSANDRA-2654)
  * restrict repair streaming to specific columnfamilies (CASSANDRA-2280)
+ * fix nodetool ring use with Ec2Snitch (CASSANDRA-2733)
 
 
 0.8.0-final

Modified: 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/Ec2Snitch.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/Ec2Snitch.java?rev=1133389&r1=1133388&r2=1133389&view=diff
==
--- 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/Ec2Snitch.java
 (original)
+++ 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/Ec2Snitch.java
 Wed Jun  8 13:16:14 2011
@@ -25,11 +25,13 @@ import java.net.HttpURLConnection;
 import java.net.InetAddress;
 import java.net.URL;
 
+import com.google.common.base.Charsets;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
 import org.apache.cassandra.config.ConfigurationException;
 import org.apache.cassandra.gms.ApplicationState;
+import org.apache.cassandra.gms.EndpointState;
 import org.apache.cassandra.gms.Gossiper;
 import org.apache.cassandra.service.StorageService;
 import org.apache.cassandra.utils.FBUtilities;
@@ -41,47 +43,54 @@ import org.apache.cassandra.utils.FBUtil
 public class Ec2Snitch extends AbstractNetworkTopologySnitch
 {
 protected static Logger logger = LoggerFactory.getLogger(Ec2Snitch.class);
+protected static final String ZONE_NAME_QUERY_URL = "http://169.254.169.254/latest/meta-data/placement/availability-zone";
 protected String ec2zone;
 protected String ec2region;
 
 public Ec2Snitch() throws IOException, ConfigurationException
 {
-// Populate the region and zone by introspection, fail if 404 on metadata
-HttpURLConnection conn = (HttpURLConnection) new URL("http://169.254.169.254/latest/meta-data/placement/availability-zone").openConnection();
-conn.setRequestMethod("GET");
-if (conn.getResponseCode() != 200)
-{
-throw new ConfigurationException("Ec2Snitch was unable to find region/zone data. Not an ec2 node?");
-}
-
-// Read the information. I wish I could say (String) conn.getContent() here...
-int cl = conn.getContentLength();
-byte[] b = new byte[cl];
-DataInputStream d = new DataInputStream((FilterInputStream)conn.getContent());
-d.readFully(b);
-
 // Split "us-east-1a" or "asia-1a" into "us-east"/"1a" and "asia"/"1a".
-String azone = new String(b ,"UTF-8");
-String[] splits = azone.split("-");
+String[] splits = awsApiCall(ZONE_NAME_QUERY_URL).split("-");
 ec2zone = splits[splits.length - 1];
-ec2region = splits.length < 3 ? splits[0] : splits[0]+"-"+splits[1];
+ec2region = splits.length < 3 ? splits[0] : splits[0] + "-" + splits[1];
 logger.info("EC2Snitch using region: " + ec2region + ", zone: " + ec2zone + ".");
 }
+
+String awsApiCall(String url) throws IOException, ConfigurationException
+{
+// Populate the region and zone by introspection, fail if 404 on metadata
+HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
+try
+{
+conn.setRequestMethod("GET");
+if (conn.getResponseCode() != 200)
+throw new ConfigurationException("Ec2Snitch was unable to execute the API call. Not an ec2 node?");
+
+// Read the information. I wish I could say (String) conn.getContent() here...
+int cl = conn.getContentLength();
+byte[] b = new byte[cl];
+DataInputStream d = new DataInputStream((FilterInputStream) conn.getContent());
+d.readFully(b);
+return new String(b, Charsets.UTF_8);
+}
+finally
+{
+conn.disconnect();
+}
+}
 
 public String getRack(InetAddress endpoint)
 {
-if (endpoint == FBUtilities.getLocalAddress())
+if (endpoint.equals(FBUtilities.getLocalAddress()))
 return ec2zone;
-else
-

svn commit: r1133390 - /cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/locator/EC2SnitchTest.java

2011-06-08 Thread jbellis
Author: jbellis
Date: Wed Jun  8 13:16:34 2011
New Revision: 1133390

URL: http://svn.apache.org/viewvc?rev=1133390&view=rev
Log:
add EC2SnitchTest.java

Added:

cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/locator/EC2SnitchTest.java

Added: 
cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/locator/EC2SnitchTest.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/locator/EC2SnitchTest.java?rev=1133390&view=auto
==
--- 
cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/locator/EC2SnitchTest.java
 (added)
+++ 
cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/locator/EC2SnitchTest.java
 Wed Jun  8 13:16:34 2011
@@ -0,0 +1,51 @@
+package org.apache.cassandra.locator;
+
+import static org.junit.Assert.assertEquals;
+
+import java.io.IOException;
+import java.net.InetAddress;
+import java.util.Map;
+
+import org.apache.cassandra.config.ConfigurationException;
+import org.apache.cassandra.gms.ApplicationState;
+import org.apache.cassandra.gms.Gossiper;
+import org.apache.cassandra.gms.VersionedValue;
+import org.apache.cassandra.service.StorageService;
+import org.junit.Test;
+
+public class EC2SnitchTest
+{
+
+private class TestEC2Snitch extends Ec2Snitch
+{
+public TestEC2Snitch() throws IOException, ConfigurationException
+{
+super();
+}
+
+@Override
+String awsApiCall(String url) throws IOException, ConfigurationException
+{
+return "us-east-1d";
+}
+}
+
+@Test
+public void testRac() throws IOException, ConfigurationException
+{
+Ec2Snitch snitch = new TestEC2Snitch();
+InetAddress local = InetAddress.getByName("127.0.0.1");
+InetAddress nonlocal = InetAddress.getByName("127.0.0.7");
+
+Gossiper.instance.addSavedEndpoint(nonlocal);
+Map<ApplicationState,VersionedValue> stateMap = Gossiper.instance.getEndpointStateForEndpoint(nonlocal).getApplicationStateMap();
+stateMap.put(ApplicationState.DC, StorageService.instance.valueFactory.datacenter("us-west"));
+stateMap.put(ApplicationState.RACK, StorageService.instance.valueFactory.datacenter("1a"));
+
+assertEquals("us-west", snitch.getDatacenter(nonlocal));
+assertEquals("1a", snitch.getRack(nonlocal));
+
+assertEquals("us-east", snitch.getDatacenter(local));
+assertEquals("1d", snitch.getRack(local));
+}
+}
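
The region/zone split that this test exercises can be tried on its own. The sketch below copies the splitting rule from the Ec2Snitch constructor in r1133389; the class name AvailabilityZoneParser and the parse() helper are made up for the example and do not exist in Cassandra.

```java
final class AvailabilityZoneParser
{
    /**
     * Splits an availability zone name into { region, zone }, following the
     * rule in the Ec2Snitch constructor: the last dash-separated token is the
     * zone, and the region is the first token, or the first two tokens when
     * the name has three or more tokens.
     */
    static String[] parse(String availabilityZone)
    {
        String[] splits = availabilityZone.split("-");
        String zone = splits[splits.length - 1];
        String region = splits.length < 3 ? splits[0] : splits[0] + "-" + splits[1];
        return new String[] { region, zone };
    }

    public static void main(String[] args)
    {
        String[] east = parse("us-east-1d");
        System.out.println(east[0] + " / " + east[1]); // prints "us-east / 1d"

        String[] asia = parse("asia-1a");
        System.out.println(asia[0] + " / " + asia[1]); // prints "asia / 1a"
    }
}
```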




svn commit: r1133391 - in /cassandra/branches/cassandra-0.7: ./ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/locator/ test/unit/org/apache/cassandra/locator/

2011-06-08 Thread jbellis
Author: jbellis
Date: Wed Jun  8 13:23:09 2011
New Revision: 1133391

URL: http://svn.apache.org/viewvc?rev=1133391&view=rev
Log:
merge #2733 from 0.8

Added:

cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/locator/EC2SnitchTest.java
  - copied unchanged from r1133390, 
cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/locator/EC2SnitchTest.java
Modified:
cassandra/branches/cassandra-0.7/   (props changed)
cassandra/branches/cassandra-0.7/CHANGES.txt

cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/locator/Ec2Snitch.java

Propchange: cassandra/branches/cassandra-0.7/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun  8 13:23:09 2011
@@ -1,6 +1,7 @@
 /cassandra/branches/cassandra-0.6:922689-1131291
 /cassandra/branches/cassandra-0.7:1026516,1035666,1050269
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
+/cassandra/branches/cassandra-0.8:1133389-1133390
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689
 /cassandra/trunk:1026516-1026734,1028929
 /incubator/cassandra/branches/cassandra-0.3:774578-796573

Modified: cassandra/branches/cassandra-0.7/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1133391&r1=1133390&r2=1133391&view=diff
==
--- cassandra/branches/cassandra-0.7/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.7/CHANGES.txt Wed Jun  8 13:23:09 2011
@@ -15,6 +15,7 @@
  * fix truncate/compaction race (CASSANDRA-2673)
  * workaround large resultsets causing large allocation retention
by nio sockets (CASSANDRA-2654)
+ * fix nodetool ring use with Ec2Snitch (CASSANDRA-2733)
 
 
 0.7.6

Propchange: 
cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun  8 13:23:09 2011
@@ -1,6 +1,7 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1131291
 
/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516,1035666,1050269
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
+/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1133389-1133390
 
/cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689
 
/cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1026734,1028929
 
/incubator/cassandra/branches/cassandra-0.3/interface/gen-java/org/apache/cassandra/service/Cassandra.java:774578-796573

Propchange: 
cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun  8 13:23:09 2011
@@ -1,6 +1,7 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1131291
 
/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516,1035666,1050269
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1055654
+/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1133389-1133390
 
/cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1051699-1053689
 
/cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1026734,1028929
 
/incubator/cassandra/branches/cassandra-0.3/interface/gen-java/org/apache/cassandra/service/column_t.java:774578-792198

Propchange: 
cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jun  8 13:23:09 2011
@@ -1,6 +1,7 @@
 

[jira] [Updated] (CASSANDRA-2733) nodetool ring with EC2Snitch, NPE checking for the zone and dc

2011-06-08 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2733:
--

Affects Version/s: (was: 0.8.0)
   0.7.1
Fix Version/s: 0.7.7

also committed to 0.7 branch

 nodetool ring with EC2Snitch, NPE checking for the zone and dc
 --

 Key: CASSANDRA-2733
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2733
 Project: Cassandra
  Issue Type: New Feature
  Components: Contrib
Affects Versions: 0.7.1
 Environment: Cassandra JVM
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Fix For: 0.7.7, 0.8.1

 Attachments: EC2Snitch-Patch-2733.patch, EC2Snitch-test-2733.patch


 The existing EC2Snitch compares IPs with == instead of equals(): 
 (endpoint == FBUtilities.getLocalAddress())
 Comparing object references usually works, because most of the code uses 
 FBU.getLocalAddress() and it returns the same object everywhere, but it 
 breaks nodetool ring.



[jira] [Created] (CASSANDRA-2749) fine-grained control over data directories

2011-06-08 Thread Jonathan Ellis (JIRA)
fine-grained control over data directories
--

 Key: CASSANDRA-2749
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2749
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor


Currently Cassandra supports multiple data directories but no way to control 
what sstables are placed where. Particularly for systems with mixed SSDs and 
rotational disks, it would be nice to pin frequently accessed columnfamilies to 
the SSDs.

Postgresql does this with tablespaces 
(http://www.postgresql.org/docs/9.0/static/manage-ag-tablespaces.html) but we 
should probably avoid using that name because of confusing similarity to 
keyspaces.



[jira] [Commented] (CASSANDRA-2749) fine-grained control over data directories

2011-06-08 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045953#comment-13045953
 ] 

Jonathan Ellis commented on CASSANDRA-2749:
---

We could also have a memory location that would be useful for temporary data.

 fine-grained control over data directories
 --

 Key: CASSANDRA-2749
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2749
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor

 Currently Cassandra supports multiple data directories but no way to control 
 what sstables are placed where. Particularly for systems with mixed SSDs and 
 rotational disks, it would be nice to pin frequently accessed columnfamilies 
 to the SSDs.
 Postgresql does this with tablespaces 
 (http://www.postgresql.org/docs/9.0/static/manage-ag-tablespaces.html) but we 
 should probably avoid using that name because of confusing similarity to 
 keyspaces.



[jira] [Commented] (CASSANDRA-2749) fine-grained control over data directories

2011-06-08 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045956#comment-13045956
 ] 

Jonathan Ellis commented on CASSANDRA-2749:
---

There's some tension between managing this cluster-wide and the actual data 
directory definitions being per-machine. Not sure what the best solution there 
is.

 fine-grained control over data directories
 --

 Key: CASSANDRA-2749
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2749
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor

 Currently Cassandra supports multiple data directories but no way to control 
 what sstables are placed where. Particularly for systems with mixed SSDs and 
 rotational disks, it would be nice to pin frequently accessed columnfamilies 
 to the SSDs.
 Postgresql does this with tablespaces 
 (http://www.postgresql.org/docs/9.0/static/manage-ag-tablespaces.html) but we 
 should probably avoid using that name because of confusing similarity to 
 keyspaces.



[jira] [Commented] (CASSANDRA-2500) Ruby dbi client (for CQL) that conforms to AR:ConnectionAdapter

2011-06-08 Thread Jon Hermes (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046015#comment-13046015
 ] 

Jon Hermes commented on CASSANDRA-2500:
---

The drivers should be tested functionally the same as the cli is currently. 
Ideally all the drivers, client libs, etc. should be hooked into a distributed 
testing framework, but for now it should be valid to just make sure it does the 
right thing on a single node system.

Regarding packaging and mixing in with the ruby cassandra gem (in this case), I 
see good reasons both to combine the gems and to keep them separate:

In the latter case, it's as jbellis pointed out, we don't want to confuse 
people about which gem they need for a certain feature set (there's little 
overlap in the usage between this and the ruby cassandra gem). Also, it's 
possible to commit this to trunk, whereas client libs have historically been 
kept out of the repo.

In the former, and from a dev perspective, there is a TON of reused code 
between the ruby cassandra gem and this driver, as there is likewise a TON of 
reused code between pycassa and the python cql driver. It would be nice to 
combine them/depend on them just for sanity purposes, and because it would 
force both gems/libs to be kept up to date and to be generally better.


Overall I'd call it an executive call, and it appears one has already been made.

 Ruby dbi client (for CQL) that conforms to AR:ConnectionAdapter
 ---

 Key: CASSANDRA-2500
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2500
 Project: Cassandra
  Issue Type: Task
Reporter: Jon Hermes
Assignee: Jon Hermes
 Attachments: 2500.txt, genthriftrb.txt


 Create a ruby driver for CQL.
 Lacking something standard (such as py-dbapi), going with something common 
 instead -- RoR ActiveRecord Connection Adapter 
 (http://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/AbstractAdapter.html).



[jira] [Commented] (CASSANDRA-833) fix consistencylevel during bootstrap

2011-06-08 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046017#comment-13046017
 ] 

Sylvain Lebresne commented on CASSANDRA-833:


+1

 fix consistencylevel during bootstrap
 -

 Key: CASSANDRA-833
 URL: https://issues.apache.org/jira/browse/CASSANDRA-833
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.5
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
 Fix For: 0.8.1

 Attachments: 0001-Increase-CL-with-boostrapping-leaving-node.patch, 
 833-v2.txt


 As originally designed, bootstrap nodes should *always* get *all* writes 
 under any consistencylevel, so when bootstrap finishes the operator can run 
 cleanup on the old nodes w/o fear of losing data.
 But if a bootstrap operation fails or is aborted, that means all writes will 
 fail until the ex-bootstrapping node is decommissioned.  So starting in 
 CASSANDRA-722, we just ignore dead nodes in consistencylevel calculations.
 But this breaks the original design.  CASSANDRA-822 adds a partial fix for 
 this (just adding bootstrap targets into the RF targets and hinting 
 normally), but this is still broken under certain conditions.  The real fix 
 is to consider consistencylevel for two sets of nodes:
   1. the RF targets as currently existing (no pending ranges)
   2.  the RF targets as they will exist after all movement ops are done
 If we satisfy CL for both sets then we will always be in good shape.
 I'm not sure if we can easily calculate 2. from the current TokenMetadata, 
 though.
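
The "satisfy CL for both sets" rule above can be sketched as follows. This is an illustration of the rule only, not the committed patch: it assumes QUORUM as the consistency level, represents nodes as plain strings, and every name in it (DualSetConsistencyCheck, satisfied, the node labels A to D) is hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DualSetConsistencyCheck
{
    /** Number of acks QUORUM needs over a replica set of the given size. */
    static int quorum(int replicas)
    {
        return replicas / 2 + 1;
    }

    /** Counts how many of the given targets have acknowledged the write. */
    static int countAcked(Set<String> acked, Collection<String> targets)
    {
        int n = 0;
        for (String target : targets)
            if (acked.contains(target))
                n++;
        return n;
    }

    /** A write succeeds only if QUORUM holds for the current AND the future replica set. */
    static boolean satisfied(Set<String> acked, Collection<String> currentTargets, Collection<String> futureTargets)
    {
        return countAcked(acked, currentTargets) >= quorum(currentTargets.size())
            && countAcked(acked, futureTargets) >= quorum(futureTargets.size());
    }

    public static void main(String[] args)
    {
        // RF=3; node D is bootstrapping and will take over A's range.
        List<String> current = Arrays.asList("A", "B", "C"); // set 1: no pending ranges
        List<String> future = Arrays.asList("B", "C", "D");  // set 2: after movement completes

        Set<String> acked = new HashSet<String>(Arrays.asList("A", "B"));
        System.out.println(satisfied(acked, current, future)); // false: only B acked in the future set

        acked.add("C");
        System.out.println(satisfied(acked, current, future)); // true: B and C cover both sets
    }
}
```

In the first case the write would have passed a check against the current replicas alone, which is exactly the window the description is worried about.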



svn commit: r1133443 - in /cassandra/branches/cassandra-0.8: ./ src/java/org/apache/cassandra/locator/ src/java/org/apache/cassandra/service/ test/unit/org/apache/cassandra/locator/ test/unit/org/apac

2011-06-08 Thread jbellis
Author: jbellis
Date: Wed Jun  8 15:45:54 2011
New Revision: 1133443

URL: http://svn.apache.org/viewvc?rev=1133443&view=rev
Log:
fix inconsistency window during bootstrap
patch by slebresne; reviewed by jbellis for CASSANDRA-833

Modified:
cassandra/branches/cassandra-0.8/CHANGES.txt

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/TokenMetadata.java

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/DatacenterSyncWriteResponseHandler.java

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/DatacenterWriteResponseHandler.java

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/WriteResponseHandler.java

cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/locator/SimpleStrategyTest.java

cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/service/ConsistencyLevelTest.java

cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/service/LeaveAndBootstrapTest.java

cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/service/MoveTest.java

Modified: cassandra/branches/cassandra-0.8/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1133443&r1=1133442&r2=1133443&view=diff
==
--- cassandra/branches/cassandra-0.8/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.8/CHANGES.txt Wed Jun  8 15:45:54 2011
@@ -42,6 +42,7 @@
by nio sockets (CASSANDRA-2654)
  * restrict repair streaming to specific columnfamilies (CASSANDRA-2280)
  * fix nodetool ring use with Ec2Snitch (CASSANDRA-2733)
+ * fix inconsistency window during bootstrap (CASSANDRA-833)
 
 
 0.8.0-final

Modified: 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java?rev=1133443&r1=1133442&r2=1133443&view=diff
==
--- 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
 (original)
+++ 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
 Wed Jun  8 15:45:54 2011
@@ -24,6 +24,7 @@ import java.net.InetAddress;
 import java.util.*;
 
 import com.google.common.collect.HashMultimap;
+import com.google.common.collect.Iterables;
 import com.google.common.collect.Multimap;
 import org.apache.cassandra.gms.Gossiper;
 import org.slf4j.Logger;
@@ -119,20 +120,21 @@ public abstract class AbstractReplicatio
  */
 public abstract List<InetAddress> calculateNaturalEndpoints(Token 
searchToken, TokenMetadata tokenMetadata) throws IllegalStateException;
 
-public IWriteResponseHandler 
getWriteResponseHandler(Collection<InetAddress> writeEndpoints,
+public IWriteResponseHandler getWriteResponseHandler(Iterable<InetAddress> 
writeEndpoints,
  Multimap<InetAddress, 
InetAddress> hintedEndpoints,
+ Iterable<InetAddress> 
pendingEndpoints,
  ConsistencyLevel 
consistency_level)
 {
 if (consistency_level == ConsistencyLevel.LOCAL_QUORUM)
 {
 // block for in this context will be localnodes block.
-return DatacenterWriteResponseHandler.create(writeEndpoints, 
hintedEndpoints, consistency_level, table);
+return DatacenterWriteResponseHandler.create(writeEndpoints, 
hintedEndpoints, pendingEndpoints, consistency_level, table);
 }
 else if (consistency_level == ConsistencyLevel.EACH_QUORUM)
 {
-return DatacenterSyncWriteResponseHandler.create(writeEndpoints, 
hintedEndpoints, consistency_level, table);
+return DatacenterSyncWriteResponseHandler.create(writeEndpoints, 
hintedEndpoints, pendingEndpoints, consistency_level, table);
 }
-return WriteResponseHandler.create(writeEndpoints, hintedEndpoints, 
consistency_level, table);
+return WriteResponseHandler.create(writeEndpoints, hintedEndpoints, 
pendingEndpoints, consistency_level, table);
 }
 
 /**
@@ -148,9 +150,10 @@ public abstract class AbstractReplicatio
  * as the destination, it is a hinted write, and will need to be sent to
  * the ultimate target when it becomes alive again.
  */
-public Multimap<InetAddress, InetAddress> 

[jira] [Updated] (CASSANDRA-2480) Named keys / virtual columns

2011-06-08 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2480:
---

Attachment: CASSANDRA-2480.patch

work branch: cassandra-0.8, the latest commit 
5bd1258a0c328dd3317e48fc2bf0281216830780
 
Key alias support for CREATE COLUMNFAMILY, INSERT, UPDATE, DELETE, SELECT 
statements.

Example:
{noformat}
CREATE COLUMNFAMILY KeyAliasCF ('id' varint PRIMARY KEY, 'username' text) WITH 
default_validation = ascii;

INSERT INTO KeyAliasCF (KEY, username) VALUES (1, jbellis);

UPDATE KeyAliasCF SET username = 'xedin' WHERE id = 2;

SELECT * FROM KeyAliasCF WHERE id = 2;
SELECT username, id FROM KeyAliasCF WHERE id = 2;

DELETE FROM KeyAliasCF WHERE id = 2;
{noformat}

CQL doc and tests are updated.


 Named keys / virtual columns
 

 Key: CASSANDRA-2480
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2480
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
Assignee: Pavel Yaskevich
  Labels: cql
 Fix For: 1.0

 Attachments: CASSANDRA-2480.patch


 With the completion of CASSANDRA-2396, it is now possible to attach a name to 
 keys (column family-wide).  This could be utilized to introduce the concept 
 of virtual columns in CQL. Here's how that would work:
 Typically you would use the CQL keyword {{KEY}} to specify a row key, for 
 example:
 {code:SQL|title=CQL 1.0}
 INSERT INTO cf (KEY, name1, name2) VALUES (key1, value1, value2)
 -- or alternately
 UPDATE cf SET name1 = value1, name2 = value2 WHERE KEY = key1
 SELECT name1,name2 FROM cf WHERE KEY = key1
 {code}
 For CQL 1.1, that syntax would continue to work, but upon the completion of 
 this issue it should also be possible to assign a name to the key and treat 
 as if it were another column.  For example:
 {code:SQL|title=CQL 1.1}
 INSERT INTO cf (keyname, name1, name2) VALUES (key1, value1, value2)
 -- or alternately
 UPDATE cf SET name1 = value1, name2 = value2 WHERE keyname = key1
 -- Note how the keyname can now be used in the projection
 SELECT keyname, name1, name2 FROM cf WHERE keyname = key1
 -- And, there is no restriction on the order
 SELECT name1, name2, keyname FROM cf WHERE keyname = key1 AND name2 = value2
 {code}
 The semantics will be such that the existing behavior is maintained (read: 
 when using the {{KEY}} keyword), but if the key is named, and the name is 
 used in a {{SELECT}}, the key's name and value will be returned in the column 
 results, sorted according to the comparator (_Note: we'll need to figure out 
 what that means with respect to differently typed keys_).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-2480) Named keys / virtual columns

2011-06-08 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2480:
---

Attachment: (was: CASSANDRA-2480.patch)

 Named keys / virtual columns
 

 Key: CASSANDRA-2480
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2480
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
Assignee: Pavel Yaskevich
  Labels: cql
 Fix For: 0.8.1, 1.0

 Attachments: CASSANDRA-2480.patch




[jira] [Updated] (CASSANDRA-2480) Named keys / virtual columns

2011-06-08 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2480:
---

Attachment: CASSANDRA-2480.patch

 Named keys / virtual columns
 

 Key: CASSANDRA-2480
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2480
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
Assignee: Pavel Yaskevich
  Labels: cql
 Fix For: 0.8.1, 1.0

 Attachments: CASSANDRA-2480.patch




[jira] [Commented] (CASSANDRA-2480) Named keys / virtual columns

2011-06-08 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046038#comment-13046038
 ] 

Jonathan Ellis commented on CASSANDRA-2480:
---

Shouldn't this result in an error?

+# try do insert/update
+cursor.execute("INSERT INTO KeyAliasCF (KEY, username) VALUES (1, jbellis)")


 Named keys / virtual columns
 

 Key: CASSANDRA-2480
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2480
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
Assignee: Pavel Yaskevich
  Labels: cql
 Fix For: 0.8.1, 1.0

 Attachments: CASSANDRA-2480.patch




[jira] [Commented] (CASSANDRA-2480) Named keys / virtual columns

2011-06-08 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046041#comment-13046041
 ] 

Pavel Yaskevich commented on CASSANDRA-2480:


Everything should keep working as before; that was in the task description. 
The key alias is optional, so you can use KEY even if you have an alias set up. 
This is how I understand the given description...

 Named keys / virtual columns
 

 Key: CASSANDRA-2480
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2480
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
Assignee: Pavel Yaskevich
  Labels: cql
 Fix For: 0.8.1, 1.0

 Attachments: CASSANDRA-2480.patch




[jira] [Commented] (CASSANDRA-2749) fine-grained control over data directories

2011-06-08 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046045#comment-13046045
 ] 

Héctor Izquierdo commented on CASSANDRA-2749:
-

What about making this configurable in a separate file, like the network 
topology? Could that work as a first approximation?

 fine-grained control over data directories
 --

 Key: CASSANDRA-2749
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2749
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor

 Currently Cassandra supports multiple data directories but no way to control 
 what sstables are placed where. Particularly for systems with mixed SSDs and 
 rotational disks, it would be nice to pin frequently accessed columnfamilies 
 to the SSDs.
 Postgresql does this with tablespaces 
 (http://www.postgresql.org/docs/9.0/static/manage-ag-tablespaces.html) but we 
 should probably avoid using that name because of confusing similarity to 
 keyspaces.



[jira] [Commented] (CASSANDRA-2749) fine-grained control over data directories

2011-06-08 Thread Ryan King (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046069#comment-13046069
 ] 

Ryan King commented on CASSANDRA-2749:
--

Since each keyspace is stored in a different sub-directory of the 
DataDirectories, you can already split the storage of different keyspaces with 
some clever mount options. Maybe we could give column families the same 
treatment?

 fine-grained control over data directories
 --

 Key: CASSANDRA-2749
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2749
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor



[jira] [Commented] (CASSANDRA-2749) fine-grained control over data directories

2011-06-08 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046093#comment-13046093
 ] 

Jonathan Ellis commented on CASSANDRA-2749:
---

Ryan's idea sounds like the simplest way to get something good enough to me.

 fine-grained control over data directories
 --

 Key: CASSANDRA-2749
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2749
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor



[jira] [Commented] (CASSANDRA-2749) fine-grained control over data directories

2011-06-08 Thread Peter Schuller (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046102#comment-13046102
 ] 

Peter Schuller commented on CASSANDRA-2749:
---

+1 on that. We have been discussing the same thing, for the same purpose. The 
only kink is that you don't want to do something like having a per-cf setting 
that is tied to local node details like paths. But simply placing CF:s in a 
named subdirectory (similar to the pg tablespace) which can, on a per-node 
basis, be a symlink or a mountpoint, avoids that.

This means there's no problem doing a rolling re-configuration of a cluster, 
and there is no need to realize beforehand that you might want to move some 
particular CF and do something like assign it to a tablespace (to get the level 
of indirection). It all just works by default, and you can move CF:s at any 
time on any node without co-ordination other than the node being down for a bit.

I can foresee it being easier to accidentally start a node which seems to work 
but has some CF:s be completely empty, because Cassandra won't be able to 
distinguish between an actual empty CF and a directory that wasn't mounted (or 
a symlink pointing to a non-mounted directory). Something simple like creating 
a marker of some kind on CF creation might help with that; on start-up CF:s 
that are missing the marker could be rejected. But - I suppose this is overkill 
at least initially.
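
The marker idea above could be sketched roughly as follows. This is only an 
illustration: the class name, method names and the marker file name are 
hypothetical and do not come from Cassandra, and it assumes one directory per 
CF as proposed in this thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for this sketch; none of these names exist in Cassandra. */
final class CfDirectoryMarker
{
    // Assumed marker file name, chosen only for illustration.
    static final String MARKER_NAME = ".cf-created";

    /** Called once when a CF is created: makes the directory and drops the marker into it. */
    static void markCreated(Path cfDir) throws IOException
    {
        Files.createDirectories(cfDir);
        Path marker = cfDir.resolve(MARKER_NAME);
        if (Files.notExists(marker))
            Files.createFile(marker);
    }

    /**
     * Called at start-up: a CF directory without the marker is treated as
     * "not mounted" rather than as a genuinely empty CF.
     */
    static boolean isUsable(Path cfDir)
    {
        // isRegularFile follows symlinks, so a symlinked CF directory works too.
        return Files.isRegularFile(cfDir.resolve(MARKER_NAME));
    }
}
```

A start-up check would call isUsable for every known CF and refuse to start 
(or skip the CF) when it returns false.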


 fine-grained control over data directories
 --

 Key: CASSANDRA-2749
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2749
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor



[jira] [Assigned] (CASSANDRA-2749) fine-grained control over data directories

2011-06-08 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-2749:
-

Assignee: Pavel Yaskevich

 fine-grained control over data directories
 --

 Key: CASSANDRA-2749
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2749
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
Priority: Minor



[jira] [Commented] (CASSANDRA-2480) Named keys / virtual columns

2011-06-08 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046107#comment-13046107
 ] 

Jonathan Ellis commented on CASSANDRA-2480:
---

Key alias should be optional in the sense that the default is KEY, not in the 
sense that you can use either one.  (Agreed, the description gets this wrong -- 
sorry for not looking at that closer.)

 Named keys / virtual columns
 

 Key: CASSANDRA-2480
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2480
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
Assignee: Pavel Yaskevich
  Labels: cql
 Fix For: 0.8.1, 1.0

 Attachments: CASSANDRA-2480.patch




[jira] [Created] (CASSANDRA-2750) Slightly better Localhost lookup in GuidGenerator.java

2011-06-08 Thread Davanum Srinivas (JIRA)
Slightly better Localhost lookup in GuidGenerator.java
--

 Key: CASSANDRA-2750
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2750
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Davanum Srinivas
Priority: Minor


We have a box where InetAddress.getLocalHost() fails but 
InetAddress.getByName(null) succeeds. Can you please consider this patch, which 
adds a bit of code to the existing catch block to try the alternate lookup?
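
For readers without the patch at hand, the fallback described above might look 
roughly like this. It is a sketch, not the attached patch: the class and method 
names are hypothetical, and only the two InetAddress calls are taken from the 
description.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper; GuidGenerator itself is not reproduced here. */
final class LocalAddressLookup
{
    /** Tries getLocalHost() first and falls back to getByName(null) if that fails. */
    static InetAddress resolve() throws UnknownHostException
    {
        try
        {
            return InetAddress.getLocalHost();
        }
        catch (UnknownHostException e)
        {
            // getByName(null) is documented to return an address of the
            // loopback interface, so it does not depend on the host name
            // being resolvable.
            return InetAddress.getByName(null);
        }
    }
}
```

Note that the fallback yields a loopback address, which is the same on every 
host, so anything derived from it no longer distinguishes machines.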



[jira] [Commented] (CASSANDRA-2750) Slightly better Localhost lookup in GuidGenerator.java

2011-06-08 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046121#comment-13046121
 ] 

Jonathan Ellis commented on CASSANDRA-2750:
---

Weird -- what is the difference in semantics between those two?

Do you know what kind of configuration causes getLocalHost to throw 
UnknownHostException (UHE)?

 Slightly better Localhost lookup in GuidGenerator.java
 --

 Key: CASSANDRA-2750
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2750
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Davanum Srinivas
Priority: Minor



[jira] [Created] (CASSANDRA-2751) Improved Metrics collection

2011-06-08 Thread Ryan King (JIRA)
Improved Metrics collection
---

 Key: CASSANDRA-2751
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2751
 Project: Cassandra
  Issue Type: Improvement
Reporter: Ryan King
Assignee: Ryan King


Collecting metrics in Cassandra needs to be easier. Currently the amount of 
work required to expose one new metric in the server and consume it outside the 
server is way too high.

In my mind, collecting a new metric in the server should be a single line of 
code and consuming it should be easily doable from any programming language.
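
To make the "single line of code" goal concrete, here is a minimal sketch of 
what such an API could look like. It is purely illustrative: SimpleMetrics and 
its methods are hypothetical names and are not the API of either library 
mentioned in this issue.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical counter registry; not taken from Cassandra, ostrich or metrics. */
final class SimpleMetrics
{
    private static final ConcurrentHashMap<String, AtomicLong> COUNTERS =
        new ConcurrentHashMap<String, AtomicLong>();

    /** Server side: recording a new metric is one call, with no registration step. */
    static void incr(String name)
    {
        COUNTERS.computeIfAbsent(name, k -> new AtomicLong()).incrementAndGet();
    }

    /** Current value of a counter, or 0 if it was never incremented. */
    static long get(String name)
    {
        AtomicLong counter = COUNTERS.get(name);
        return counter == null ? 0L : counter.get();
    }

    /** Consumer side: plain "name value" lines, sorted by name, parseable from any language. */
    static String dump()
    {
        Map<String, AtomicLong> sorted = new TreeMap<String, AtomicLong>(COUNTERS);
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, AtomicLong> e : sorted.entrySet())
            sb.append(e.getKey()).append(' ').append(e.getValue().get()).append('\n');
        return sb.toString();
    }
}
```

With something like this, instrumenting a code path is a single 
SimpleMetrics.incr("...") call, and dump() could be served over any transport.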

There are several options for better metrics collection on the JVM:

https://github.com/twitter/ostrich
https://github.com/codahale/metrics/

We should look at these.



[jira] [Updated] (CASSANDRA-2480) Named keys / virtual columns

2011-06-08 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2480:
---

Attachment: CASSANDRA-2480-v2.patch

v2 introduces the following behavior: when a key alias is set, you won't be 
able to use the KEY keyword in any operation on that CF (UPDATE, INSERT, 
SELECT, DELETE).
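
The rule can be summarized by the following sketch. It is not code from the 
v2 patch: KeyNameRule and isAccepted are hypothetical names, and the case 
handling (KEY matched case-insensitively, aliases matched exactly) is an 
assumption made for illustration.

```java
/** Hypothetical illustration of the v2 rule; not taken from the patch. */
final class KeyNameRule
{
    static final String DEFAULT_KEY_NAME = "KEY";

    /**
     * @param keyAlias        alias configured for the column family, or null if none is set
     * @param nameInStatement the key name used in the CQL statement
     * @return true if the statement may use that name to refer to the row key
     */
    static boolean isAccepted(String keyAlias, String nameInStatement)
    {
        if (keyAlias == null)
        {
            // No alias: only the KEY keyword is valid (assumed case-insensitive).
            return DEFAULT_KEY_NAME.equalsIgnoreCase(nameInStatement);
        }
        // Alias set: only the alias is valid, and KEY is rejected.
        return keyAlias.equals(nameInStatement);
    }
}
```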

 Named keys / virtual columns
 

 Key: CASSANDRA-2480
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2480
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
Assignee: Pavel Yaskevich
  Labels: cql
 Fix For: 0.8.1, 1.0

 Attachments: CASSANDRA-2480-v2.patch, CASSANDRA-2480.patch




[jira] [Commented] (CASSANDRA-1610) Pluggable Compaction

2011-06-08 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046177#comment-13046177
 ] 

Jonathan Ellis commented on CASSANDRA-1610:
---

- I think Ben's selection of methods for the CompactionStrategy is an 
improvement, but I do like having an abstract class so it's obvious what the 
contract is for us vs having to inject parameters post-construction.
- I'd like to move away from minor/major terms as too tied to the old 
compaction internals. Perhaps background/maximal instead?
- We should also make user defined compactions part of ACS -- for some 
strategies (e.g. leveldb) we want to be able to reject user requests that would 
break strategy invariants.  Note that this should probably return a single 
Task, rather than a list.  (Maximal will also usually return a single task, 
but it's cleaner to represent nothing to do as an empty list, than as null.)
- handleInsufficientSpaceForCompaction is a bad encapsulation; it means both it 
and its caller have to deal with finding a place for an sstable.  I suggest 
leaving it up to CT.execute to deal with.

Here's what I think ACS should end up looking like with these changes:
{code}
/**
 * Pluggable compaction strategy determines how SSTables get merged.
 *
 * There are two main goals:
 *  - perform background compaction constantly as needed; this typically makes a tradeoff between
 *    i/o done by compaction, and merging done at read time.
 *  - perform a full (maximum possible) compaction if requested by the user
 */
public abstract class AbstractCompactionStrategy
{
    protected final ColumnFamilyStore cfs;
    protected final Map<String, String> options;

    protected AbstractCompactionStrategy(ColumnFamilyStore cfs, Map<String, String> options)
    {
        this.cfs = cfs;
        this.options = options;
    }

    /**
     * @return a list of compaction tasks that should run in the background to get the sstable
     * count down to desired parameters.
     * @param gcBefore throw away tombstones older than this
     */
    public abstract List<AbstractCompactionTask> getBackgroundTasks(final int gcBefore);

    /**
     * @return the number of background tasks estimated to still be needed for this columnfamilystore
     */
    public abstract int getEstimatedRemainingTasks();

    /**
     * @return a list of compaction tasks that should be run to compact this columnfamilystore
     * as much as possible.
     * @param gcBefore throw away tombstones older than this
     */
    public abstract List<AbstractCompactionTask> getMaximalTasks(final int gcBefore);

    /**
     * @return a compaction task corresponding to the requested sstables
     * @param gcBefore throw away tombstones older than this
     */
    public abstract AbstractCompactionTask getUserDefinedTasks(List<SSTableReader> sstables, final int gcBefore);
}
{code}

- Finally, can you update to conform with 
http://wiki.apache.org/cassandra/CodeStyle, especially the part about multiline 
statements?
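
To make the empty-list and reject-user-request points concrete, here is a rough, self-contained sketch. It does not use the real Cassandra classes: Task, Strategy, SingleSSTableStrategy and getUserDefinedTask are hypothetical stand-ins, sstables are plain names, and rejecting by throwing IllegalArgumentException is only one assumed way a strategy could refuse a request.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for AbstractCompactionTask; illustration only.
final class Task
{
    final List<String> sstables;

    Task(List<String> sstables)
    {
        this.sstables = sstables;
    }
}

// Trimmed-down version of the contract discussed above, with sstables as plain names.
abstract class Strategy
{
    abstract List<Task> getBackgroundTasks(int gcBefore);

    abstract List<Task> getMaximalTasks(int gcBefore);

    abstract Task getUserDefinedTask(List<String> sstables, int gcBefore);
}

// A strategy with no background work that only accepts single-sstable user requests.
final class SingleSSTableStrategy extends Strategy
{
    List<Task> getBackgroundTasks(int gcBefore)
    {
        // "nothing to do" is an empty list, never null
        return Collections.emptyList();
    }

    List<Task> getMaximalTasks(int gcBefore)
    {
        return Collections.emptyList();
    }

    Task getUserDefinedTask(List<String> sstables, int gcBefore)
    {
        // assumed rejection mechanism: refuse requests that break the strategy's invariant
        if (sstables.size() != 1)
            throw new IllegalArgumentException("this strategy compacts exactly one sstable per request");
        return new Task(sstables);
    }
}
```

Callers can then loop over the returned list without a null check, and a user request the strategy cannot honor fails before any work starts.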

 Pluggable Compaction
 

 Key: CASSANDRA-1610
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1610
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
Assignee: Alan Liang
Priority: Minor
  Labels: compaction
 Fix For: 1.0

 Attachments: 0001-move-compaction-code-into-own-package.patch, 
 0001-move-compaction-code-into-own-package.patch, 
 0001-move-compaction-code-into-own-package.patch, 
 0001-move-compaction-code-into-own-package.patch, 
 0001-move-compaction-code-into-own-package.patch, 
 0001-move-compaction-code-into-own-package.patch, 
 0001-pluggable-compaction.patch, 
 0002-Pluggable-Compaction-and-Expiration.patch, 
 0002-pluggable-compaction.patch, 0002-pluggable-compaction.patch, 
 0002-pluggable-compaction.patch, 0002-pluggable-compaction.patch, 
 0002-pluggable-compaction.patch, 0002-pluggable-compaction.patch, 
 0002-pluggable-compaction.patch


 In CASSANDRA-1608, I proposed some changes on how compaction works. I think 
 it also makes sense to allow the ability to have pluggable compaction per CF. 
 There could be many types of workloads where this makes sense. One example we 
 had at Digg was to completely throw away certain SSTables after N days.
 This ticket addresses making compaction pluggable only.



[jira] [Commented] (CASSANDRA-1805) refactor and remove contrib/

2011-06-08 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13046185#comment-13046185
 ] 

Jonathan Ellis commented on CASSANDRA-1805:
---

Consensus seems to be just move it into the main cassandra jar because you 
can't do much without the AbstractType classes there anyway.

 refactor and remove contrib/
 

 Key: CASSANDRA-1805
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1805
 Project: Cassandra
  Issue Type: Task
Reporter: Jonathan Ellis
Priority: Minor
 Fix For: 0.8.1

 Attachments: 1805-sstabledebug.txt


 Contrib is a mix of examples, tools, and miscellanea that probably doesn't 
 belong in our source tree.



[jira] [Commented] (CASSANDRA-1610) Pluggable Compaction

2011-06-08 Thread Alan Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13046199#comment-13046199
 ] 

Alan Liang commented on CASSANDRA-1610:
---

bq. I think Ben's selection of methods for the CompactionStrategy is an 
improvement, but I do like having an abstract class so it's obvious what the 
contract is for us vs having to inject parameters post-construction.

I agree, I'll go back to the Abstract class approach.

bq. I'd like to move away from minor/major terms as too tied to the old 
compaction internals. Perhaps background/maximal instead?

Sounds good to me.

bq. We should also make user defined compactions part of ACS – for some 
strategies (e.g. leveldb) we want to be able to reject user requests that would 
break strategy invariants. Note that this should probably return a single Task, 
rather than a list. (Maximal will also usually return a single task, but it's 
cleaner to represent nothing to do as an empty list, than as null.)

Sounds good to me.

bq. handleInsufficientSpaceForCompaction is a bad encapsulation; it means both 
it and its caller have to deal with finding a place for an sstable. Suggest 
leaving it up to CT.execute to deal with.

Sounds good to me.


I'll resubmit a patch with all these suggestions. Thanks!



[jira] [Issue Comment Edited] (CASSANDRA-1610) Pluggable Compaction

2011-06-08 Thread Alan Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13046199#comment-13046199
 ] 

Alan Liang edited comment on CASSANDRA-1610 at 6/8/11 9:10 PM:
---

bq. I think Ben's selection of methods for the CompactionStrategy is an 
improvement, but I do like having an abstract class so it's obvious what the 
contract is for us vs having to inject parameters post-construction.

I agree, I'll go back to the Abstract class approach.

bq. I'd like to move away from minor/major terms as too tied to the old 
compaction internals. Perhaps background/maximal instead?

Sounds good to me.

bq. We should also make user defined compactions part of ACS – for some 
strategies (e.g. leveldb) we want to be able to reject user requests that would 
break strategy invariants. Note that this should probably return a single Task, 
rather than a list. (Maximal will also usually return a single task, but it's 
cleaner to represent nothing to do as an empty list, than as null.)

Sounds good to me.

bq. handleInsufficientSpaceForCompaction is a bad encapsulation; it means both 
it and its caller have to deal with finding a place for an sstable. Suggest 
leaving it up to CT.execute to deal with.

Sounds good to me. So if a strategy wants to customize how insufficient space 
is handled, it would have to implement its own CompactionTask (or override the 
existing one). What do you think about that? Another thing: since checking for 
space is always subject to a race condition, I could leave it up to the strategy 
to ensure the sstable it has selected has a reasonable amount of space for 
compaction.


I'll resubmit a patch with all these suggestions. Thanks!
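
A rough sketch of what "override the existing CompactionTask" could look like. Everything here is hypothetical: SizeBoundedTask and reduceScope are made-up names, sstables are modelled only by their size in bytes, and the default policy (drop the largest sstable until the rest fits) is an assumption for illustration, not the behavior of any existing patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical task: each sstable is represented only by its size in bytes.
class SizeBoundedTask
{
    private final List<Long> sstableSizes;

    SizeBoundedTask(List<Long> sstableSizes)
    {
        this.sstableSizes = new ArrayList<>(sstableSizes);
    }

    // Assumed default policy: drop the largest sstable until the remainder fits.
    // A strategy-specific subclass could override this hook with its own policy.
    protected List<Long> reduceScope(List<Long> candidates, long availableBytes)
    {
        List<Long> remaining = new ArrayList<>(candidates);
        while (!remaining.isEmpty() && total(remaining) > availableBytes)
        {
            Long largest = Collections.max(remaining);
            remaining.remove(largest); // remove(Object), not remove(int)
        }
        return remaining;
    }

    // Returns the sizes of the sstables that would actually be compacted.
    List<Long> execute(long availableBytes)
    {
        return reduceScope(sstableSizes, availableBytes);
    }

    private static long total(List<Long> sizes)
    {
        long sum = 0;
        for (long size : sizes)
            sum += size;
        return sum;
    }
}
```

With this shape the caller only ever calls execute(), and the space handling lives in one overridable place instead of being split between the task and its caller.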



[jira] [Commented] (CASSANDRA-1805) refactor and remove contrib/

2011-06-08 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13046259#comment-13046259
 ] 

Brandon Williams commented on CASSANDRA-1805:
-

Also some db-specifics, like Column, etc.  So +1



[jira] [Updated] (CASSANDRA-2590) row delete breaks read repair

2011-06-08 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2590:
--

 Reviewer: jbellis
Affects Version/s: (was: 0.7.5)
   (was: 0.8 beta 1)
Fix Version/s: 0.8.1
   0.7.7

 row delete breaks read repair 
 --

 Key: CASSANDRA-2590
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2590
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Aaron Morton
Assignee: Aaron Morton
Priority: Minor
 Fix For: 0.7.7, 0.8.1

 Attachments: 0001-2590-v3.patch, 
 0001-cf-resolve-test-and-possible-solution-for-read-repai.patch, 2590-v2.txt


 related to CASSANDRA-2589 
 Working at CL ALL can get inconsistent reads after row deletion. Reproduced 
 on the 0.7 and 0.8 source. 
 Steps to reproduce:
 # two node cluster with rf 2 and HH turned off
 # insert rows via cli 
 # flush both nodes 
 # shutdown node 1
 # connect to node 2 via cli and delete one row
 # bring up node 1
 # connect to node 1 via cli and issue get with CL ALL 
 # first get returns the deleted row, second get returns zero rows.
 RowRepairResolver.resolveSuperSet() resolves a local CF with the old row 
 columns, and the remote CF which is marked for deletion. CF.resolve() does 
 not pay attention to the deletion flags and the resolved CF has both 
 markedForDeletion set and a column with a lower timestamp. The return from 
 resolveSuperSet() is used as the return for the read without checking if the 
 cols are relevant. 
 Also when RowRepairResolver.maybeScheduleRepairs() runs it sends two 
 mutations. Node 1 is given the row level deletion, and Node 2 is given a 
 mutation to write the old (and now deleted) column from node 2. I have some 
 log traces for this if needed. 
 A quick fix is to check for relevant columns in the RowRepairResolver, will 
 attach shortly.
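
 The failure mode and the proposed check can be illustrated with a toy model. This is not Cassandra's ColumnFamily or RowRepairResolver code: RowVersion is a hypothetical class, columns are reduced to name -> timestamp, and the rule that a column is only relevant when its timestamp is newer than the row deletion time is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of one replica's view of a row; not a Cassandra class.
final class RowVersion
{
    static final long LIVE = Long.MIN_VALUE; // row has never been deleted

    final long markedForDeleteAt;
    final Map<String, Long> columnTimestamps;

    RowVersion(long markedForDeleteAt, Map<String, Long> columnTimestamps)
    {
        this.markedForDeleteAt = markedForDeleteAt;
        this.columnTimestamps = columnTimestamps;
    }

    // Merge two versions, then drop columns shadowed by the row-level deletion.
    static RowVersion resolve(RowVersion a, RowVersion b)
    {
        long deletedAt = Math.max(a.markedForDeleteAt, b.markedForDeleteAt);

        Map<String, Long> merged = new HashMap<>(a.columnTimestamps);
        for (Map.Entry<String, Long> e : b.columnTimestamps.entrySet())
            merged.merge(e.getKey(), e.getValue(), Long::max);

        // the "check for relevant columns" step: without it, the old column
        // would be returned next to the deletion marker
        merged.values().removeIf(ts -> ts <= deletedAt);
        return new RowVersion(deletedAt, merged);
    }
}
```

 In the scenario above, node 1 holds the column written at an older timestamp and node 2 holds only the row deletion; resolving the two should yield a deleted row with no columns, while a column written after the deletion would survive.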



svn commit: r1133610 - in /cassandra/branches/cassandra-0.8: ./ src/java/org/apache/cassandra/locator/ src/java/org/apache/cassandra/service/ test/unit/org/apache/cassandra/locator/ test/unit/org/apac

2011-06-08 Thread jbellis
Author: jbellis
Date: Thu Jun  9 00:16:27 2011
New Revision: 1133610

URL: http://svn.apache.org/viewvc?rev=1133610&view=rev
Log:
revert 1133443

Modified:
cassandra/branches/cassandra-0.8/CHANGES.txt

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/TokenMetadata.java

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/DatacenterSyncWriteResponseHandler.java

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/DatacenterWriteResponseHandler.java

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/WriteResponseHandler.java

cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/locator/SimpleStrategyTest.java

cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/service/ConsistencyLevelTest.java

cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/service/LeaveAndBootstrapTest.java

cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/service/MoveTest.java

Modified: cassandra/branches/cassandra-0.8/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1133610&r1=1133609&r2=1133610&view=diff
==
--- cassandra/branches/cassandra-0.8/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.8/CHANGES.txt Thu Jun  9 00:16:27 2011
@@ -42,7 +42,6 @@
by nio sockets (CASSANDRA-2654)
  * restrict repair streaming to specific columnfamilies (CASSANDRA-2280)
  * fix nodetool ring use with Ec2Snitch (CASSANDRA-2733)
- * fix inconsistency window during bootstrap (CASSANDRA-833)
 
 
 0.8.0-final

Modified: cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java?rev=1133610&r1=1133609&r2=1133610&view=diff
==
--- cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java (original)
+++ cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java Thu Jun  9 00:16:27 2011
@@ -24,7 +24,6 @@ import java.net.InetAddress;
 import java.util.*;
 
 import com.google.common.collect.HashMultimap;
-import com.google.common.collect.Iterables;
 import com.google.common.collect.Multimap;
 import org.apache.cassandra.gms.Gossiper;
 import org.slf4j.Logger;
@@ -120,21 +119,20 @@ public abstract class AbstractReplicatio
      */
     public abstract List<InetAddress> calculateNaturalEndpoints(Token searchToken, TokenMetadata tokenMetadata) throws IllegalStateException;
 
-    public IWriteResponseHandler getWriteResponseHandler(Iterable<InetAddress> writeEndpoints,
+    public IWriteResponseHandler getWriteResponseHandler(Collection<InetAddress> writeEndpoints,
                                                          Multimap<InetAddress, InetAddress> hintedEndpoints,
-                                                         Iterable<InetAddress> pendingEndpoints,
                                                          ConsistencyLevel consistency_level)
     {
         if (consistency_level == ConsistencyLevel.LOCAL_QUORUM)
         {
             // block for in this context will be localnodes block.
-            return DatacenterWriteResponseHandler.create(writeEndpoints, hintedEndpoints, pendingEndpoints, consistency_level, table);
+            return DatacenterWriteResponseHandler.create(writeEndpoints, hintedEndpoints, consistency_level, table);
         }
         else if (consistency_level == ConsistencyLevel.EACH_QUORUM)
         {
-            return DatacenterSyncWriteResponseHandler.create(writeEndpoints, hintedEndpoints, pendingEndpoints, consistency_level, table);
+            return DatacenterSyncWriteResponseHandler.create(writeEndpoints, hintedEndpoints, consistency_level, table);
         }
-        return WriteResponseHandler.create(writeEndpoints, hintedEndpoints, pendingEndpoints, consistency_level, table);
+        return WriteResponseHandler.create(writeEndpoints, hintedEndpoints, consistency_level, table);
     }
 
     /**
@@ -150,10 +148,9 @@ public abstract class AbstractReplicatio
      * as the destination, it is a hinted write, and will need to be sent to
      * the ultimate target when it becomes alive again.
      */
-    public Multimap<InetAddress, InetAddress> getHintedEndpoints(Iterable<InetAddress> targets)
+    public Multimap<InetAddress, InetAddress> 

[jira] [Commented] (CASSANDRA-2386) sstable2json does not work on snapshot without moving the files

2011-06-08 Thread Patricio Echague (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13046297#comment-13046297
 ] 

Patricio Echague commented on CASSANDRA-2386:
-

The cause of this failure is in Descriptor.java:
{code}
public static Pair<Descriptor,String> fromFilename(File directory, String name)
{
    // name of parent directory is keyspace name
    String ksname = directory.getName();
{code}
For a snapshot path like this:
{code}/var/lib/cassandra/data/Keyspace1/snapshots/1307575216104{code}

produces an output: ksname == 1307575216104 (which is wrong).
It should be Keyspace1
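
One possible direction, sketched outside of Descriptor.java: if the directory sits under a "snapshots" folder, walk up to the keyspace directory instead of using the snapshot name. SnapshotPaths and keyspaceNameFor are hypothetical names for this sketch, and the assumed layout is <data>/<keyspace>/snapshots/<snapshot name> as in the path above; this is not the actual fix.

```java
import java.io.File;

// Hypothetical helper, not part of Cassandra's Descriptor class.
final class SnapshotPaths
{
    private static final String SNAPSHOT_DIR = "snapshots";

    // Returns the keyspace name for a data directory or a snapshot directory.
    static String keyspaceNameFor(File directory)
    {
        File parent = directory.getParentFile();
        // <keyspace>/snapshots/<snapshot name>: the keyspace is two levels up
        if (parent != null && SNAPSHOT_DIR.equals(parent.getName()) && parent.getParentFile() != null)
            return parent.getParentFile().getName();
        // regular layout: the directory itself is named after the keyspace
        return directory.getName();
    }
}
```

For the snapshot path above this would return Keyspace1, and for a plain keyspace directory it behaves like the existing directory.getName() call.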

 sstable2json does not work on snapshot without moving the files
 ---

 Key: CASSANDRA-2386
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2386
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
 Environment: Redhat Linux
Reporter: Aslak Dirdal
Assignee: Patricio Echague
Priority: Minor
 Fix For: 0.8.1


 sstable2json 
 ../data/MyKeyspace/snapshots/1301066898131-mysnapshot/dockeys-10-Data.db
 {
 Exception in thread "main" java.lang.NullPointerException: Unknown ColumnFamily dockeys in keyspace 1301066898131-mysnapshot
     at org.apache.cassandra.config.DatabaseDescriptor.getComparator(DatabaseDescriptor.java:1169)
     at org.apache.cassandra.db.ColumnFamily.create(ColumnFamily.java:68)
     at org.apache.cassandra.io.SSTableReader.makeColumnFamily(SSTableReader.java:582)
     at org.apache.cassandra.db.ColumnFamilySerializer.deserializeFromSSTable(ColumnFamilySerializer.java:158)
     at org.apache.cassandra.io.IteratingRow.getColumnFamily(IteratingRow.java:79)
     at org.apache.cassandra.tools.SSTableExport.serializeRow(SSTableExport.java:110)
     at org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:270)
     at org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:302)
     at org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:326)
     at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:370)
 sstable2json seems to think that the folder name 1301066898131-mysnapshot is 
 the Keyspace name.
 Moving the *.db files to a folder with the same name as the Keyspace is a 
 workaround.



[jira] [Commented] (CASSANDRA-833) fix consistencylevel during bootstrap

2011-06-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13046315#comment-13046315
 ] 

Hudson commented on CASSANDRA-833:
--

Integrated in Cassandra-0.8 #158 (See 
[https://builds.apache.org/job/Cassandra-0.8/158/])
fix inconsistency window during bootstrap
patch by slebresne; reviewed by jbellis for CASSANDRA-833

jbellis : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1133443
Files : 
* /cassandra/branches/cassandra-0.8/CHANGES.txt
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/DatacenterSyncWriteResponseHandler.java
* /cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/locator/SimpleStrategyTest.java
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java
* /cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/service/LeaveAndBootstrapTest.java
* /cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/service/MoveTest.java
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java
* /cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/service/ConsistencyLevelTest.java
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/WriteResponseHandler.java
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/DatacenterWriteResponseHandler.java
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/TokenMetadata.java


 fix consistencylevel during bootstrap
 -

 Key: CASSANDRA-833
 URL: https://issues.apache.org/jira/browse/CASSANDRA-833
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.5
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
 Fix For: 0.8.1

 Attachments: 0001-Increase-CL-with-boostrapping-leaving-node.patch, 
 833-v2.txt


 As originally designed, bootstrap nodes should *always* get *all* writes 
 under any consistencylevel, so when bootstrap finishes the operator can run 
 cleanup on the old nodes without fear of losing data.
 But if a bootstrap operation fails or is aborted, that means all writes will 
 fail until the ex-bootstrapping node is decommissioned.  So starting in 
 CASSANDRA-722, we just ignore dead nodes in consistencylevel calculations.
 But this breaks the original design.  CASSANDRA-822 adds a partial fix for 
 this (just adding bootstrap targets into the RF targets and hinting 
 normally), but this is still broken under certain conditions.  The real fix 
 is to consider consistencylevel for two sets of nodes:
   1. the RF targets as currently existing (no pending ranges)
   2.  the RF targets as they will exist after all movement ops are done
 If we satisfy CL for both sets then we will always be in good shape.
 I'm not sure if we can easily calculate 2. from the current TokenMetadata, 
 though.
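
 As a rough illustration of "satisfy CL for both sets", assuming QUORUM and ignoring how set 2 would actually be derived from TokenMetadata: the names DualQuorum, quorum, ackedIn and satisfied are hypothetical, and endpoints are plain strings rather than InetAddress.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper; endpoints are plain strings for the sake of the sketch.
final class DualQuorum
{
    static int quorum(int replicas)
    {
        return replicas / 2 + 1;
    }

    // Number of targets that have acknowledged the write.
    static int ackedIn(Set<String> acked, Set<String> targets)
    {
        Set<String> hit = new HashSet<>(targets);
        hit.retainAll(acked);
        return hit.size();
    }

    // The write counts as successful only if quorum is met in both replica sets:
    // the targets as they exist now and the targets after all movement is done.
    static boolean satisfied(Set<String> acked, Set<String> current, Set<String> future)
    {
        return ackedIn(acked, current) >= quorum(current.size())
            && ackedIn(acked, future) >= quorum(future.size());
    }
}
```

 With current targets {A, B, C} and future targets {B, C, D}, acks from A and B meet quorum only for the current set, so the write would not yet count; acks from B and C satisfy both.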
