[jira] [Created] (CASSANDRA-4869) repair cannot complete
Zenek Kraweznik created CASSANDRA-4869:
--------------------------------------

             Summary: repair cannot complete
                 Key: CASSANDRA-4869
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4869
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 1.1.6
         Environment: oracle java 1.6u35, linux 2.6.32
            Reporter: Zenek Kraweznik
            Priority: Blocker


ERROR [ValidationExecutor:3] 2012-10-24 16:41:24,310 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[ValidationExecutor:3,1,main]
java.lang.AssertionError: row DecoratedKey(73897609397945816944009475942793437831, 37663364336664642d353532642d343736392d613437662d376339656339623637386138) received out of order wrt DecoratedKey(104298997089014216307657576199722210661, 65396135326632372d343065352d343864372d616362372d326665373738616434366334)
        at org.apache.cassandra.service.AntiEntropyService$Validator.add(AntiEntropyService.java:349)
        at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:716)
        at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:69)
        at org.apache.cassandra.db.compaction.CompactionManager$8.call(CompactionManager.java:442)
        at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)

I've also found this in the log file:

ERROR [CompactionExecutor:2974] 2012-10-26 02:59:21,384 LeveledManifest.java (line 223) At level 3, SSTableReader(path='/var/lib/cassandra/data/Archive/Messages/Archive-Messages-hf-97756-Data.db') [DecoratedKey(72089347517688459378407444327185359483, 31343563626135612d323439342d346633312d616130382d653133623336303032363464), DecoratedKey(104298997089014216307657576199722210661, 65396135326632372d343065352d343864372d616362372d326665373738616434366334)] overlaps SSTableReader(path='/var/lib/cassandra/data/Archive/Messages/Archive-Messages-hf-97519-Data.db') [DecoratedKey(73897609397945816944009475942793437831, 37663364336664642d353532642d343736392d613437662d376339656339623637386138), DecoratedKey(104617537605721927860229960625668693343, 39613736316364352d313361392d343163322d383361642d623635366163623032376430)]. This is caused by a bug in Cassandra 1.1.0 .. 1.1.3. Sending back to L0. If you have not yet run scrub, you should do so since you may also have rows out-of-order within an sstable

but I had already run scrub on all nodes before the repair.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CASSANDRA-4870) CQL3 documentation does not use double quotes for case sensitivity
Christoph Werres created CASSANDRA-4870:
----------------------------------------

             Summary: CQL3 documentation does not use double quotes for case sensitivity
                 Key: CASSANDRA-4870
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4870
             Project: Cassandra
          Issue Type: Bug
          Components: Documentation & website
    Affects Versions: 1.1.6, 1.1.5
            Reporter: Christoph Werres
            Priority: Minor


The CQL3 documentation is misleading in its example for CREATE KEYSPACE. If data center names are not enclosed in double quotes (as in the example), they are treated as case-insensitive and folded to lower case. The example topology file cassandra-topology.properties, shipped in the conf directory of each Cassandra distribution, uses capital letters for data center names. This leads to a cluster where no data center seems to be available for keyspaces that were created according to the example. Data center names are case-sensitive within cassandra-topology.properties, but appear all lower case in the metadata of column families created with CQL3 without double quotes.

See: http://cassandra.apache.org/doc/cql3/CQL.html#createKeyspaceStmt
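The mismatch in the report above can be illustrated with a hedged sketch (keyspace and data center names are invented, and the exact option syntax differs between the 1.1 and 1.2 CQL3 dialects; the point is only the identifier case-folding):

```sql
-- Unquoted, the identifier DC1 is folded to lower case, so the stored
-- option key "dc1" never matches the DC1 entry defined in
-- cassandra-topology.properties and no replicas are placed there:
CREATE KEYSPACE demo
  WITH strategy_class = 'NetworkTopologyStrategy'
  AND strategy_options:DC1 = 2;        -- recorded as dc1

-- Double quotes preserve the case defined in the topology file:
CREATE KEYSPACE demo2
  WITH strategy_class = 'NetworkTopologyStrategy'
  AND strategy_options:"DC1" = 2;      -- recorded as DC1
```

The same quoting rule applies to any CQL3 identifier: quoting opts out of case folding.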
[jira] [Updated] (CASSANDRA-4870) CQL3 documentation does not use double quotes for case sensitivity
[ https://issues.apache.org/jira/browse/CASSANDRA-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christoph Werres updated CASSANDRA-4870:
----------------------------------------
    Description:
The CQL3 documentation is misleading considering the example for CREATE KEYSPACE. If not enclosed by double quotes (as in the example), data center names are interpreted as case insensitive. The example topology file cassandra-topology.properties within the conf directory of each Cassandra distribution uses capital letters for data center names. This leads to a cluster where no data center seems to be available for keyspaces that were created according to the example. Data center names are defined case sensitive within cassandra-topology.properties, but are all lower case in created keyspaces metadata using CQL3 without using double quotes.

See: http://cassandra.apache.org/doc/cql3/CQL.html#createKeyspaceStmt

  was:
CQL3 documentation is misleading considering the example for CREATE KEYSPACE. If not enclosed by double quotes (as in the example), data center names are interpreted as case insensitive. The example topology file cassandra-topology.properties within the conf directory of each Cassandra distribution uses capital letters for data center names. This leads to a cluster where no data center seems to be available for keyspaces that were created according to the example. Data center names are defined case sensitive within cassandra-topology.properties, but are all lower case in created column families metadata using CQL3 without using double quotes.

See: http://cassandra.apache.org/doc/cql3/CQL.html#createKeyspaceStmt


CQL3 documentation does not use double quotes for case sensitivity
------------------------------------------------------------------

                 Key: CASSANDRA-4870
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4870
             Project: Cassandra
          Issue Type: Bug
          Components: Documentation & website
    Affects Versions: 1.1.5, 1.1.6
            Reporter: Christoph Werres
            Priority: Minor
              Labels: documentation
[jira] [Commented] (CASSANDRA-4468) Temporally unreachable Dynamic Composite column names.
[ https://issues.apache.org/jira/browse/CASSANDRA-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485927#comment-13485927 ]

Sylvain Lebresne commented on CASSANDRA-4468:
---------------------------------------------

bq. If I'm right, everything you do with CQL could be made, with more effort, with Thrift but not the opposite, isn't?

No. In theory, there is nothing you can do with thrift that cannot be done with CQL3 (and nothing you can do with CQL3 that cannot be done with thrift). I say in theory because I won't pretend to have tested everything, and there may be a few bugs here and there that limit things (typically, I haven't tested using DynamicCompositeType with CQL3, but bugs excluded, this should be possible. What is true is that CQL3 doesn't provide any help whatsoever to work with DynamicCompositeType, so using it for that won't have any advantage over thrift: you will have to encode/decode the composite names manually, or have your client library do it for you, as would be the case with hector). The difference is that for a large number of use cases, CQL3 provides a much more convenient API.


Temporally unreachable Dynamic Composite column names.
------------------------------------------------------

                 Key: CASSANDRA-4468
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4468
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 1.1.0, 1.1.1, 1.1.2
         Environment: linux
            Reporter: Cesare Cugnasco
            Priority: Minor
              Labels: persistence
         Attachments: BugFinder.java

I was working on a column family with a DynamicComposite column sorter when I noticed that sometimes, after inserting a column whose name is composed of a single string (e.g. 's@step'), the column could be reached only by a slice query, not by direct access. For example, using the cassandra-cli it is possible to query:

get frame[int(26)];
RowKey: 001a
=> (column=i@19, value=0013, timestamp=1343495134729000)
=> (column=s@step, value=746573742076616c7565, timestamp=134349513468)

but typing 'get frame[int(26)]['s@step']' I got no result.
I tested this behavior also with other clients such as Hector, Astyanax, Pycassa, and Thrift directly. I wrote this Java code with hector-core-1.1.0 to reproduce the bug:

public static void main(String[] args) {
    String kname = "testspace3";
    Cluster myCluster = HFactory.getOrCreateCluster("Test-cluster",
            System.getProperty("location", "localhost:9160"));

    // creating the keyspace and column family
    if (myCluster.describeKeyspace(kname) != null) {
        myCluster.dropKeyspace(kname, true);
    }
    ColumnFamilyDefinition cfd = HFactory.createColumnFamilyDefinition(kname, "frame",
            ComparatorType.DYNAMICCOMPOSITETYPE);
    cfd.setComparatorTypeAlias(DynamicComposite.DEFAULT_DYNAMIC_COMPOSITE_ALIASES);
    KeyspaceDefinition kdf = HFactory.createKeyspaceDefinition(kname, "SimpleStrategy", 1,
            Arrays.asList(cfd));
    myCluster.addKeyspace(kdf, true);
    Keyspace ksp = HFactory.createKeyspace(kname, myCluster);

    // Hector template definition
    ColumnFamilyTemplate<Integer, DynamicComposite> template =
            new ThriftColumnFamilyTemplate<Integer, DynamicComposite>(ksp, "frame",
                    IntegerSerializer.get(), DynamicCompositeSerializer.get());

    DynamicComposite dc = new DynamicComposite();
    dc.addComponent("step", StringSerializer.get());
    DynamicComposite numdc = new DynamicComposite();
    numdc.addComponent(BigInteger.valueOf(62), BigIntegerSerializer.get());

    ColumnFamilyUpdater<Integer, DynamicComposite> cf = template.createUpdater(26);
    cf.setString(dc, "test value");
    cf.setString(numdc, "altro valore");
    template.update(cf);

    // without this part it works. It works also with less than 4 insertions
    cf = template.createUpdater(26);
    for (int i = 0; i < 4; i++) {
        DynamicComposite num = new DynamicComposite();
        num.addComponent(BigInteger.valueOf(i), BigIntegerSerializer.get());
        cf.setInteger(num, i);
    }
    template.update(cf);
    // end part

    HColumn<DynamicComposite, String> res = template.querySingleColumn(26, dc, StringSerializer.get());
    if (res == null) {
        System.out.println("[FAIL] Row not found");
    } else {
        System.out.println("[SUCCESS] Returned name " + res.getName().get(0).toString()
                + " - with value: " + res.getValue());
    }
}

The code performs three tasks: it configures the keyspace and CF, inserts the data, and tries to retrieve it. After running the code the data are visible (by list, for example)
git commit: Fix DynamicCompositeType same type comparison
Updated Branches:
  refs/heads/cassandra-1.1 d51643cad -> 5e15927ff


Fix DynamicCompositeType same type comparison

patch by slebresne; reviewed by jbellis for CASSANDRA-4711


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5e15927f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5e15927f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5e15927f

Branch: refs/heads/cassandra-1.1
Commit: 5e15927b9aa47b69a7089b1aa2d8f3f1f093
Parents: d51643c
Author: Sylvain Lebresne <sylv...@datastax.com>
Authored: Mon Oct 29 11:12:48 2012 +0100
Committer: Sylvain Lebresne <sylv...@datastax.com>
Committed: Mon Oct 29 11:12:48 2012 +0100

----------------------------------------------------------------------
 CHANGES.txt                                        |  1 +
 .../cassandra/db/marshal/DynamicCompositeType.java | 20 ++++++++++++------
 2 files changed, 15 insertions(+), 6 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5e15927f/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 191c935..05b7ef3 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -9,6 +9,7 @@
  * fix potential infinite loop in get_count (CASSANDRA-4833)
  * fix compositeType.{get/from}String methods (CASSANDRA-4842)
  * (CQL) fix CREATE COLUMNFAMILY permissions check (CASSANDRA-4864)
+ * Fix DynamicCompositeType same type comparison (CASSANDRA-4711)
 1.1.6


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5e15927f/src/java/org/apache/cassandra/db/marshal/DynamicCompositeType.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/marshal/DynamicCompositeType.java b/src/java/org/apache/cassandra/db/marshal/DynamicCompositeType.java
index 0f0127a..06ecdfc 100644
--- a/src/java/org/apache/cassandra/db/marshal/DynamicCompositeType.java
+++ b/src/java/org/apache/cassandra/db/marshal/DynamicCompositeType.java
@@ -120,15 +120,15 @@ public class DynamicCompositeType extends AbstractCompositeType
          * We compare component of different types by comparing the
          * comparator class names. We start with the simple classname
          * first because that will be faster in almost all cases, but
-         * allback on the full name if necessary
-        */
+         * fallback on the full name if necessary
+         */
         int cmp = comp1.getClass().getSimpleName().compareTo(comp2.getClass().getSimpleName());
         if (cmp != 0)
-            return cmp < 0 ? FixedValueComparator.instance : ReversedType.getInstance(FixedValueComparator.instance);
+            return cmp < 0 ? FixedValueComparator.alwaysLesserThan : FixedValueComparator.alwaysGreaterThan;
         cmp = comp1.getClass().getName().compareTo(comp2.getClass().getName());
         if (cmp != 0)
-            return cmp < 0 ? FixedValueComparator.instance : ReversedType.getInstance(FixedValueComparator.instance);
+            return cmp < 0 ? FixedValueComparator.alwaysLesserThan : FixedValueComparator.alwaysGreaterThan;

         // if cmp == 0, we're actually having the same type, but one that
         // did not have a singleton instance. It's ok (though inefficient).
@@ -307,11 +307,19 @@ public class DynamicCompositeType extends AbstractCompositeType
      */
     private static class FixedValueComparator extends AbstractType<Void>
     {
-        public static final FixedValueComparator instance = new FixedValueComparator();
+        public static final FixedValueComparator alwaysLesserThan = new FixedValueComparator(-1);
+        public static final FixedValueComparator alwaysGreaterThan = new FixedValueComparator(1);
+
+        private final int cmp;
+
+        public FixedValueComparator(int cmp)
+        {
+            this.cmp = cmp;
+        }

         public int compare(ByteBuffer v1, ByteBuffer v2)
         {
-            return -1;
+            return cmp;
         }

         public Void compose(ByteBuffer bytes)
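The patch above replaces a single fixed-value comparator with a "lesser" and a "greater" singleton. The reason can be shown outside Cassandra: a comparator that answers -1 regardless of argument order violates antisymmetry, so the caller must pick a direction once (from the stable class-name comparison) and use opposite singletons for the two orders. A minimal illustrative Java sketch (class and variable names invented, not Cassandra code):

```java
import java.util.Comparator;

public class FixedComparatorDemo {
    // Pre-patch behaviour: one "fixed value" comparator returning -1 for
    // every pair. compare(a, b) and compare(b, a) are then both negative,
    // contradicting the antisymmetry a total order requires.
    static final Comparator<String> broken = (a, b) -> -1;

    // Post-patch behaviour: two consistent singletons; the caller picks one
    // per direction based on the class-name comparison.
    static final Comparator<String> alwaysLesserThan = (a, b) -> -1;
    static final Comparator<String> alwaysGreaterThan = (a, b) -> 1;

    public static void main(String[] args) {
        // Broken: both orders claim "less than".
        System.out.println(broken.compare("x", "y")); // -1
        System.out.println(broken.compare("y", "x")); // -1, should be positive

        // Fixed: direction chosen once from the stable name comparison.
        int cmp = "IntegerType".compareTo("UTF8Type"); // negative: I < U
        Comparator<String> dir = cmp < 0 ? alwaysLesserThan : alwaysGreaterThan;
        System.out.println(dir.compare("x", "y")); // -1
        System.out.println(alwaysGreaterThan.compare("y", "x")); // 1, consistent
    }
}
```

With the old single instance, sorting components of mixed types could disagree depending on which operand came first, which is exactly the "same type comparison" inconsistency the commit fixes.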
[jira] [Commented] (CASSANDRA-4781) Sometimes Cassandra starts compacting system-shema_columns cf repeatedly until the node is killed
[ https://issues.apache.org/jira/browse/CASSANDRA-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485940#comment-13485940 ]

Sylvain Lebresne commented on CASSANDRA-4781:
---------------------------------------------

Looks overall ok, but a few small remaining remarks:

* I'm good with making the tombstone compaction interval configurable, but I would have gone with a longer default (say a day, or at least a few hours). Tombstone compactions can be fairly unproductive if we do them too often: compacting an sstable for tombstones, even if you do collect tombstones, is kinda bad if you're going to compact the sstable a short time later. In other words, I don't think this interval is only useful for avoiding an infinite loop. Besides, if it's configurable, the few people that have very heavy delete/expiring workloads can set it lower if that helps them.
* It's probably not worth exposing an interval in milliseconds; seconds would be more than good enough. I also don't dislike putting the unit in the option name, so tombstone_compaction_interval_seconds, though maybe it's too long a name.
* Could be nice to validate the user input for the new option (should be > 0).


Sometimes Cassandra starts compacting system-shema_columns cf repeatedly until the node is killed
-------------------------------------------------------------------------------------------------

                 Key: CASSANDRA-4781
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4781
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 1.2.0 beta 1
         Environment: Ubuntu 12.04, single-node Cassandra cluster
            Reporter: Aleksey Yeschenko
            Assignee: Yuki Morishita
             Fix For: 1.2.0 beta 2
         Attachments: 4781.txt, 4781-v2.txt

Cassandra starts flushing the system-schema_columns cf in a seemingly infinite loop:

INFO [CompactionExecutor:7] 2012-10-09 17:55:46,804 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32107-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.202762MB/s. Time: 18ms.
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,804 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32107-Data.db')]
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,824 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32108-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.182486MB/s. Time: 20ms.
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,825 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32108-Data.db')]
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,864 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32109-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.096045MB/s. Time: 38ms.
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,864 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32109-Data.db')]
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,894 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32110-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.121657MB/s. Time: 30ms.
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,894 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32110-Data.db')]
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,914 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32111-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.202762MB/s. Time: 18ms.
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,914 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32111-Data.db')]
...

Don't know what's causing it. Don't know a way to predictably trigger this behaviour. It just happens sometimes.
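For reference, a per-table knob along the lines discussed in the comment above did ship as tombstone_compaction_interval, expressed in seconds and defaulting to one day. A hedged CQL3 sketch of setting it (keyspace and table names are invented, and the strategy shown is just one choice):

```sql
-- Raise the minimum age an sstable must reach before being considered for a
-- single-sstable tombstone compaction (value in seconds, 86400 = one day):
ALTER TABLE demo.events
  WITH compaction = { 'class': 'SizeTieredCompactionStrategy',
                      'tombstone_compaction_interval': 86400 };
```

A heavy delete/expiry workload can lower this value, at the cost of potentially re-compacting the same sstable repeatedly, which is the trade-off the comment describes.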
[jira] [Updated] (CASSANDRA-4865) Off-heap bloom filters
[ https://issues.apache.org/jira/browse/CASSANDRA-4865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vijay updated CASSANDRA-4865:
-----------------------------
    Attachment: 0001-CASSANDRA-4865.patch

Good news: the new implementation uses bytes instead of longs, which gives us approximately 20% better performance. In practice it should save some GC time too.

Time taken for approx 200 million iterations each:

|Open BS sets|Open BS gets|Offheap BS sets|Offheap BS gets|
|502|371|311|64|
|507|444|257|366|
|496|478|310|367|
|504|473|306|369|
|490|481|305|367|
|502|472|314|363|
|489|476|305|367|
|486|474|303|364|
|489|474|307|365|
|492|477|305|365|
|490|475|307|367|

The attached patch enables the offheap BS only for SSTable BFs and leaves the promoted indexes alone. The attached patch will break existing 1.2-beta users, since the new BF serialization is changed. (Is that expected or ok to do?)

Pending/TODOs:
1) Have to figure out why the Scrub test tries to free memory twice.
2) Have to regenerate corrupted SSTs (unit test failure).


Off-heap bloom filters
----------------------

                 Key: CASSANDRA-4865
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4865
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Jonathan Ellis
            Assignee: Vijay
             Fix For: 1.2.1
         Attachments: 0001-CASSANDRA-4865.patch

Bloom filters are the major user of heap as dataset grows. It's probably worth it to move these off heap. No extra refcounting needs to be done since we already refcount SSTableReader.
[jira] [Resolved] (CASSANDRA-4869) repair cannot complete
[ https://issues.apache.org/jira/browse/CASSANDRA-4869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-4869.
---------------------------------------
    Resolution: Invalid

You need to run offline scrub to fix overlapping-sstables-within-a-level.


repair cannot complete
----------------------

                 Key: CASSANDRA-4869
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4869
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 1.1.6
         Environment: oracle java 1.6u35, linux 2.6.32
            Reporter: Zenek Kraweznik
            Priority: Blocker

(The quoted description and stack trace are identical to the original report above.)
[jira] [Created] (CASSANDRA-4871) get_paged_slice does not obey SlicePredicate
Scott Fines created CASSANDRA-4871:
-----------------------------------

             Summary: get_paged_slice does not obey SlicePredicate
                 Key: CASSANDRA-4871
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4871
             Project: Cassandra
          Issue Type: Improvement
          Components: API, Hadoop
    Affects Versions: 1.1.6
            Reporter: Scott Fines


When experimenting with WideRow support, I noticed that it is not possible to specify a bounding SlicePredicate. This means that, no matter what you may wish, the entire column family will be used during a get_paged_slice call. This is unfortunate if (for example) you are attempting to do MapReduce over a subset of your column range.

get_paged_slice should support a SlicePredicate, which will bound the column range over which data is returned. It seems like this SlicePredicate should be optional, so that existing code is not broken: when the SlicePredicate is not specified, have it default to going over the entire column range.
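The request above amounts to adding one optional field to the RPC signature. A hypothetical Thrift IDL sketch, paraphrasing the 1.1-era get_paged_slice parameters from memory (field ids and exact parameter names may differ from the real cassandra.thrift):

```thrift
// Hypothetical sketch only -- not the actual cassandra.thrift definition.
// Making the predicate an optional trailing field keeps existing callers
// compatible: when unset, the server scans the entire column range as today.
list<KeySlice> get_paged_slice(1:required string column_family,
                               2:required KeyRange range,
                               3:required binary start_column,
                               4:required ConsistencyLevel consistency_level,
                               5:optional SlicePredicate predicate)
```

Appending an optional field at the end is the usual backward-compatible way to evolve a Thrift method, since old clients simply never set it.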
[jira] [Commented] (CASSANDRA-4865) Off-heap bloom filters
[ https://issues.apache.org/jira/browse/CASSANDRA-4865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486022#comment-13486022 ]

Jonathan Ellis commented on CASSANDRA-4865:
-------------------------------------------

Nit: could use unsafe.setMemory as a faster clear


Off-heap bloom filters
----------------------

                 Key: CASSANDRA-4865
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4865
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Jonathan Ellis
            Assignee: Vijay
             Fix For: 1.2.1
         Attachments: 0001-CASSANDRA-4865.patch

Bloom filters are the major user of heap as dataset grows. It's probably worth it to move these off heap. No extra refcounting needs to be done since we already refcount SSTableReader.
[Cassandra Wiki] Update of ClientOptions by RaulRaja
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification.

The ClientOptions page has been changed by RaulRaja:
http://wiki.apache.org/cassandra/ClientOptions?action=diff&rev1=158&rev2=159

Comment:
Added Firebrand client library

   * Pycassa: http://github.com/pycassa/pycassa
   * Telephus: http://github.com/driftx/Telephus (Twisted)
  * Java:
+  * Firebrand:
+    * Site: http://firebrandocm.org
+    * Docs: http://firebrandocm.org/
+    * Sources: https://github.com/47deg/firebrand
   * PlayOrm: https://github.com/deanhiller/playorm
   * Astyanax: https://github.com/Netflix/astyanax/wiki/Getting-Started
   * Hector:
[jira] [Commented] (CASSANDRA-4784) Create separate sstables for each token range handled by a node
[ https://issues.apache.org/jira/browse/CASSANDRA-4784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486081#comment-13486081 ]

Benjamin Coverston commented on CASSANDRA-4784:
-----------------------------------------------

I have a working implementation of this for STCS. One issue is that it has the unfortunate (or fortunate) side effect of also partitioning up the SSTables for LCS, because I put the implementation inside CompactionTask, making the currently (small) SSTables much smaller.

I feel like this puts us at a crossroads: should we create a completely partitioned data strategy for vnodes (a directory per vnode), or should we continue to mix the data files in a single data directory? L0 to L1 compactions become particularly hairy if we do that, unless we first partition the L0 SSTables and then subsequently compact the partitioned L0 with L1 for the vnode.


Create separate sstables for each token range handled by a node
---------------------------------------------------------------

                 Key: CASSANDRA-4784
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4784
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: sankalp kohli
            Assignee: Benjamin Coverston
            Priority: Minor
              Labels: performance

Currently, each sstable has data for all the ranges that node is handling. If we change that and instead have separate sstables for each range the node is handling, it can lead to some improvements.

Improvements:
1) Node rebuild will be very fast, as sstables can be directly copied over to the bootstrapping node. It will minimize any application-level logic. We can directly use Linux native methods to transfer sstables without using CPU and putting less pressure on the serving node. I think in theory it will be the fastest way to transfer data.
2) Backup can transfer only the sstables for a node which belong to its primary key range.
3) An ETL process can copy only one replica of the data and will be much faster.

Changes: We can split the writes into multiple memtables for each range the node is handling. The sstables flushed from these can record which range of data they are handling. There will be no change, I think, for any reads, as they work with interleaved data anyway. But maybe we can improve there as well?

Complexities: The change does not look very complicated. I am not taking into account how it will work when ranges are being changed for nodes. Vnodes might make this work more complicated. We can also have a bit on each sstable which says whether it is primary data or not.
[jira] [Created] (CASSANDRA-4872) Move manifest into sstable metadata
Jonathan Ellis created CASSANDRA-4872:
--------------------------------------

             Summary: Move manifest into sstable metadata
                 Key: CASSANDRA-4872
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4872
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Jonathan Ellis
            Priority: Minor
             Fix For: 1.3


Now that we have a metadata component, it would be better to keep the sstable level there than in a separate manifest. With the information stored per-sstable, we don't need to do a full re-level if there is corruption.
[jira] [Commented] (CASSANDRA-4872) Move manifest into sstable metadata
[ https://issues.apache.org/jira/browse/CASSANDRA-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486100#comment-13486100 ]

Jonathan Ellis commented on CASSANDRA-4872:
-------------------------------------------

Possible complexity:
- loading external or from-snapshot sstables will need to ignore the level and start at L0 instead
- there may be code paths where we promote an sstable, unmodified. We'd need to update the level then, making the metadata not-quite-immutable.


Move manifest into sstable metadata
-----------------------------------

                 Key: CASSANDRA-4872
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4872
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Jonathan Ellis
            Priority: Minor
             Fix For: 1.3

Now that we have a metadata component it would be better to keep sstable level there, than in a separate manifest. With information per-sstable we don't need to do a full re-level if there is corruption.
[jira] [Updated] (CASSANDRA-4699) Add TRACE support to binary protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-4699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-4699:
----------------------------------------
    Attachment: 0002-Add-tracing-to-protocol.txt
                0001-Separate-ClientState-and-QueryState.txt

Attaching 2 patches for this. The first one is not directly related to tracing but is preparatory. The thing is that tracing uses ClientState to store the tracing UUID if the next query must be traced. However, the binary protocol is asynchronous and you can have more than one query per connection at the same time. In other words, ClientState applies to the connection, but that's too coarse-grained for the binary protocol. In fact, that's a problem even before tracing, because ClientState.getTimestamp() is kind of broken for the binary protocol (indeed, that getTimestamp() bug is probably the main reason for fixing this; tracing could do without it, tbh). Anyway, that first patch splits ClientState into a ClientState per-connection and a QueryState per-query (again, the two are the same for thrift, but not for the binary protocol). This also allows keeping a few things that are thrift-only (like the CQL2 prepared statements) out of the binary protocol path, which is a bonus.

The 2nd patch adds the tracing bits.

I'll suggest putting this in 1.2 (rather than 1.2.1) because I'm not looking forward to breaking the protocol in the first minor release of 1.2.


Add TRACE support to binary protocol
------------------------------------

                 Key: CASSANDRA-4699
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4699
             Project: Cassandra
          Issue Type: Sub-task
          Components: API
    Affects Versions: 1.2.0 beta 1
            Reporter: Jonathan Ellis
            Assignee: Sylvain Lebresne
             Fix For: 1.2.0
         Attachments: 0001-Separate-ClientState-and-QueryState.txt, 0002-Add-tracing-to-protocol.txt
[jira] [Commented] (CASSANDRA-4815) Make CQL work naturally with wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13486115#comment-13486115 ] Edward Capriolo commented on CASSANDRA-4815: I do not think set and get are syntactic features that should be left out of this discussion. I was doing some blogging this weekend and came to the re-realization that BigTable just provides a simple low-level API. So it's fairly hard for us to argue that Cassandra should not have a simple set and get. Thinking further into this, I think the new transport only being able to execute CQL queries is a huge defect. We are going to continually have these discussions about what we can and can't do in CQL that we can do in thrift. We should not have to spend time designing CQL features to solve impedance mismatches between RPC and query languages, and we should not be redesigning Cassandra so every operation fits into a CQL language. We have to face a reality: it is going to be quite awkward for clients to maintain multiple connection pools for client requests, one for thrift, one for cql2, one for cql3, one for cql4, etc. The new transport should be able to piggyback thrift requests somehow; this way a user only needs to maintain a single client connection. Make CQL work naturally with wide rows -- Key: CASSANDRA-4815 URL: https://issues.apache.org/jira/browse/CASSANDRA-4815 Project: Cassandra Issue Type: Wish Reporter: Edward Capriolo Attachments: cql feature set updated.png, table.png I find that CQL3 is quite obtuse and does not provide me a language useful for accessing my data. First, let's point out how we should design Cassandra data: 1) Denormalize 2) Eliminate seeks 3) Design for read 4) Optimize for blind writes. So here is a schema that abides by these tried and tested rules that large production users are employing today. 
Say we have a table of movie objects: Movie Name Description - tags (string) - credits composite(role string, name string ) -1 likesToday -1 blacklisted The above structure is a movie; notice it holds a mix of static and dynamic columns, but the overall number of columns is not very large. (Even if it were larger this would be OK as well.) Notice this table is not just a single one-to-many relationship; it has 1-to-1 data and it has two sets of 1-to-many data. The schema today is declared something like this: create column family movies with default_comparator=UTF8Type and column_metadata = [ {column_name: blacklisted, validation_class: int}, {column_name: likestoday, validation_class: long}, {column_name: description, validation_class: UTF8Type} ]; We should be able to insert data like this: set ['Cassandra Database, not looking for a seQL']['blacklisted']=1; set ['Cassandra Database, not looking for a seQL']['likesToday']=34; set ['Cassandra Database, not looking for a seQL']['credits-dir']='director:asf'; set ['Cassandra Database, not looking for a seQL']['credits-jir']='jiraguy:bob'; set ['Cassandra Database, not looking for a seQL']['tags-action']=''; set ['Cassandra Database, not looking for a seQL']['tags-adventure']=''; set ['Cassandra Database, not looking for a seQL']['tags-romance']=''; set ['Cassandra Database, not looking for a seQL']['tags-programming']=''; This is the correct way to do it: 1 seek to find all the information related to a movie. As long as this row does not get large there is no reason to optimize by breaking data into other column families. (Notice you cannot transpose this because movies is two 1-to-many relationships of potentially different types.) Let's look at the CQL3 way to do this design: First, contrary to the original design of Cassandra, CQL does not like wide rows. It also does not have a good way of dealing with dynamic columns together with static columns either. 
You have two options: Option 1: lose all schema create table movies ( name string, column blob, value blob, primary key(name)) with compact storage. This method is not so hot: we have now lost all our validators, and by the way you have to physically shut down everything, rename files, and recreate your schema if you want to inform Cassandra that a current table should be compact. This could at the very least be just a metadata change. Also you cannot add column schema either. Option 2: Normalize (even worse) create table movie (name String, description string, likestoday int, blacklisted int); create table movecredits( name string, role string, personname string, primary key(name,role) ); create table movetags( name string, tag string, primary key (name,tag) ); This is a terrible design; of the 4 key characteristics of how Cassandra data should be designed, it fails 3: It does
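As an editorial aside (not part of the original report): neither option above is valid CQL3 as written, since CQL3 has no `string` type. A hedged cleanup of the two sketches, substituting `text` and giving Option 1 the clustering column (`column1` here, matching the name CQL3 itself exposes for transposed compact tables) it needs to actually hold a wide row:

```sql
-- Option 1: schemaless wide row. The clustering column is required;
-- with PRIMARY KEY (name) alone the table could hold only a single
-- column/value pair per movie.
CREATE TABLE movies (
    name text,
    column1 blob,
    value blob,
    PRIMARY KEY (name, column1)
) WITH COMPACT STORAGE;

-- Option 2: normalized into three tables, turning the single seek
-- per movie into three.
CREATE TABLE movie (
    name text PRIMARY KEY,
    description text,
    likestoday int,
    blacklisted int
);

CREATE TABLE movecredits (
    name text,
    role text,
    personname text,
    PRIMARY KEY (name, role)
);

CREATE TABLE movetags (
    name text,
    tag text,
    PRIMARY KEY (name, tag)
);
```

Table and column names follow the comment's own sketch (including its `movecredits`/`movetags` spellings).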
[jira] [Updated] (CASSANDRA-4781) Sometimes Cassandra starts compacting system-schema_columns cf repeatedly until the node is killed
[ https://issues.apache.org/jira/browse/CASSANDRA-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita updated CASSANDRA-4781: -- Attachment: 4781-v3.txt v3 attached. Default interval is now 1 day, expressed in seconds, with validation. Sometimes Cassandra starts compacting system-schema_columns cf repeatedly until the node is killed - Key: CASSANDRA-4781 URL: https://issues.apache.org/jira/browse/CASSANDRA-4781 Project: Cassandra Issue Type: Bug Affects Versions: 1.2.0 beta 1 Environment: Ubuntu 12.04, single-node Cassandra cluster Reporter: Aleksey Yeschenko Assignee: Yuki Morishita Fix For: 1.2.0 beta 2 Attachments: 4781.txt, 4781-v2.txt, 4781-v3.txt Cassandra starts compacting the system-schema_columns cf in a seemingly infinite loop: INFO [CompactionExecutor:7] 2012-10-09 17:55:46,804 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32107-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.202762MB/s. Time: 18ms. INFO [CompactionExecutor:7] 2012-10-09 17:55:46,804 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32107-Data.db')] INFO [CompactionExecutor:7] 2012-10-09 17:55:46,824 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32108-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.182486MB/s. Time: 20ms. INFO [CompactionExecutor:7] 2012-10-09 17:55:46,825 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32108-Data.db')] INFO [CompactionExecutor:7] 2012-10-09 17:55:46,864 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32109-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.096045MB/s. Time: 38ms. 
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,864 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32109-Data.db')] INFO [CompactionExecutor:7] 2012-10-09 17:55:46,894 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32110-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.121657MB/s. Time: 30ms. INFO [CompactionExecutor:7] 2012-10-09 17:55:46,894 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32110-Data.db')] INFO [CompactionExecutor:7] 2012-10-09 17:55:46,914 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32111-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.202762MB/s. Time: 18ms. INFO [CompactionExecutor:7] 2012-10-09 17:55:46,914 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32111-Data.db')] . Don't know what's causing it. Don't know a way to predictably trigger this behaviour. It just happens sometimes.
[jira] [Comment Edited] (CASSANDRA-4815) Make CQL work naturally with wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13486115#comment-13486115 ] Edward Capriolo edited comment on CASSANDRA-4815 at 10/29/12 4:18 PM: -- I do not think set and get are syntactic features that should be left out of this discussion. I was doing some blogging this weekend and came to the re-realization that BigTable just provides a simple low-level API. Being that Cassandra is based on BigTable, it is strange to argue that a simple set has no place and that everything needs to be a query. Thinking further into this, I think the new transport only being able to execute CQL queries is a huge defect. We are going to continually have these discussions about what we can and can't do in CQL that we can do in thrift. We should not have to spend time designing CQL features to solve impedance mismatches between RPC and query languages, and we should not be redesigning Cassandra so every operation fits into a CQL language. We have to face a reality: it is going to be quite awkward for clients to maintain multiple connection pools for client requests, one for thrift, one for cql2, one for cql3, one for cql4, etc. The new transport should be able to piggyback thrift requests somehow; this way a user only needs to maintain a single client connection. was (Author: appodictic): I do not think set and get are syntactic features that should be left out of this discussion. I was doing some blogging this weekend and came to the re-realization that BigTable just provides a simple low-level API. So it's fairly hard for us to argue that Cassandra should not have a simple set and get. Thinking further into this, I think the new transport only being able to execute CQL queries is a huge defect. We are going to continually have these discussions about what we can and can't do in CQL that we can do in thrift. 
We should not have to spend time designing CQL features to solve impedance mismatches between RPC and query languages, and we should not be redesigning Cassandra so every operation fits into a CQL language. We have to face a reality: it is going to be quite awkward for clients to maintain multiple connection pools for client requests, one for thrift, one for cql2, one for cql3, one for cql4, etc. The new transport should be able to piggyback thrift requests somehow; this way a user only needs to maintain a single client connection. Make CQL work naturally with wide rows -- Key: CASSANDRA-4815 URL: https://issues.apache.org/jira/browse/CASSANDRA-4815 Project: Cassandra Issue Type: Wish Reporter: Edward Capriolo Attachments: cql feature set updated.png, table.png I find that CQL3 is quite obtuse and does not provide me a language useful for accessing my data. First, let's point out how we should design Cassandra data: 1) Denormalize 2) Eliminate seeks 3) Design for read 4) Optimize for blind writes. So here is a schema that abides by these tried and tested rules that large production users are employing today. Say we have a table of movie objects: Movie Name Description - tags (string) - credits composite(role string, name string ) -1 likesToday -1 blacklisted The above structure is a movie; notice it holds a mix of static and dynamic columns, but the overall number of columns is not very large. (Even if it were larger this would be OK as well.) Notice this table is not just a single one-to-many relationship; it has 1-to-1 data and it has two sets of 1-to-many data. 
The schema today is declared something like this: create column family movies with default_comparator=UTF8Type and column_metadata = [ {column_name: blacklisted, validation_class: int}, {column_name: likestoday, validation_class: long}, {column_name: description, validation_class: UTF8Type} ]; We should be able to insert data like this: set ['Cassandra Database, not looking for a seQL']['blacklisted']=1; set ['Cassandra Database, not looking for a seQL']['likesToday']=34; set ['Cassandra Database, not looking for a seQL']['credits-dir']='director:asf'; set ['Cassandra Database, not looking for a seQL']['credits-jir']='jiraguy:bob'; set ['Cassandra Database, not looking for a seQL']['tags-action']=''; set ['Cassandra Database, not looking for a seQL']['tags-adventure']=''; set ['Cassandra Database, not looking for a seQL']['tags-romance']=''; set ['Cassandra Database, not looking for a seQL']['tags-programming']=''; This is the correct way to do it: 1 seek to find all the information related to a movie. As long as this row does not get large there is no reason to optimize by breaking data into other column families. (Notice you can
[jira] [Commented] (CASSANDRA-4872) Move manifest into sstable metadata
[ https://issues.apache.org/jira/browse/CASSANDRA-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13486124#comment-13486124 ] Sylvain Lebresne commented on CASSANDRA-4872: - bq. With information per-sstable we don't need to do a full re-level if there is corruption. I'm not completely sure what you mean by that? Even if we do that ticket, we will need to reconstruct the manifest in-memory, and in doing that we might get corruption as well. I'm not saying I'm completely opposed to the idea, but I'm not sure I understand the benefits, and it does seem to introduce some complexity (you'd have to reconstruct the manifest info from spread-out sources). Move manifest into sstable metadata --- Key: CASSANDRA-4872 URL: https://issues.apache.org/jira/browse/CASSANDRA-4872 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Priority: Minor Fix For: 1.3 Now that we have a metadata component it would be better to keep sstable level there, than in a separate manifest. With information per-sstable we don't need to do a full re-level if there is corruption.
[jira] [Commented] (CASSANDRA-4679) Fix binary protocol NEW_NODE event
[ https://issues.apache.org/jira/browse/CASSANDRA-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13486132#comment-13486132 ] Yuki Morishita commented on CASSANDRA-4679: --- Patch looks good, though AntiEntropyServiceStandardTest/AntiEntropyServiceCounterTest is failing after applying this patch. Fix binary protocol NEW_NODE event -- Key: CASSANDRA-4679 URL: https://issues.apache.org/jira/browse/CASSANDRA-4679 Project: Cassandra Issue Type: Bug Affects Versions: 1.2.0 beta 1 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Fix For: 1.2.0 beta 2 Attachments: 0001-4679.txt, 0002-Start-RPC-binary-protocol-before-gossip.txt As discussed on CASSANDRA-4480, the NEW_NODE/REMOVED_NODE of the binary protocol are not correctly fired (NEW_NODE is fired on node UP basically). This ticket is to fix that.
[jira] [Commented] (CASSANDRA-4872) Move manifest into sstable metadata
[ https://issues.apache.org/jira/browse/CASSANDRA-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13486142#comment-13486142 ] Jonathan Ellis commented on CASSANDRA-4872: --- bq. I'm not completely sure what you mean by that? What I mean is, if you lose or corrupt the .manifest right now you're SOL and have to put everything in L0 and start over. If it's per-sstable then you don't have this extra non-sstable component causing fragility. Move manifest into sstable metadata --- Key: CASSANDRA-4872 URL: https://issues.apache.org/jira/browse/CASSANDRA-4872 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Priority: Minor Fix For: 1.3 Now that we have a metadata component it would be better to keep sstable level there, than in a separate manifest. With information per-sstable we don't need to do a full re-level if there is corruption.
[jira] [Commented] (CASSANDRA-4815) Make CQL work naturally with wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13486158#comment-13486158 ] Jonathan Ellis commented on CASSANDRA-4815: --- I'm still not sure we're on the same page as far as GET and SET go. I'm saying that functionally, if you have {code} create column family test; {code} (all cli defaults -- everything is bytes), then {code} set test['ff']['dd'] = 'cc'; {code} in the cli (translation to Thrift left as an exercise for the reader) is EXACTLY the same as {code} insert into test(key, column1, value) values ('ff', 'dd', 'cc'); {code} in cql. If you think we're missing functionality here then let's clear that up. But if you're hung up on the syntax then we'll have to agree to disagree. Make CQL work naturally with wide rows -- Key: CASSANDRA-4815 URL: https://issues.apache.org/jira/browse/CASSANDRA-4815 Project: Cassandra Issue Type: Wish Reporter: Edward Capriolo Attachments: cql feature set updated.png, table.png I find that CQL3 is quite obtuse and does not provide me a language useful for accessing my data. First, let's point out how we should design Cassandra data: 1) Denormalize 2) Eliminate seeks 3) Design for read 4) Optimize for blind writes. So here is a schema that abides by these tried and tested rules that large production users are employing today. Say we have a table of movie objects: Movie Name Description - tags (string) - credits composite(role string, name string ) -1 likesToday -1 blacklisted The above structure is a movie; notice it holds a mix of static and dynamic columns, but the overall number of columns is not very large. (Even if it were larger this would be OK as well.) Notice this table is not just a single one-to-many relationship; it has 1-to-1 data and it has two sets of 1-to-many data. 
The schema today is declared something like this: create column family movies with default_comparator=UTF8Type and column_metadata = [ {column_name: blacklisted, validation_class: int}, {column_name: likestoday, validation_class: long}, {column_name: description, validation_class: UTF8Type} ]; We should be able to insert data like this: set ['Cassandra Database, not looking for a seQL']['blacklisted']=1; set ['Cassandra Database, not looking for a seQL']['likesToday']=34; set ['Cassandra Database, not looking for a seQL']['credits-dir']='director:asf'; set ['Cassandra Database, not looking for a seQL']['credits-jir']='jiraguy:bob'; set ['Cassandra Database, not looking for a seQL']['tags-action']=''; set ['Cassandra Database, not looking for a seQL']['tags-adventure']=''; set ['Cassandra Database, not looking for a seQL']['tags-romance']=''; set ['Cassandra Database, not looking for a seQL']['tags-programming']=''; This is the correct way to do it: 1 seek to find all the information related to a movie. As long as this row does not get large there is no reason to optimize by breaking data into other column families. (Notice you cannot transpose this because movies is two 1-to-many relationships of potentially different types.) Let's look at the CQL3 way to do this design: First, contrary to the original design of Cassandra, CQL does not like wide rows. It also does not have a good way of dealing with dynamic columns together with static columns either. You have two options: Option 1: lose all schema create table movies ( name string, column blob, value blob, primary key(name)) with compact storage. This method is not so hot: we have now lost all our validators, and by the way you have to physically shut down everything, rename files, and recreate your schema if you want to inform Cassandra that a current table should be compact. This could at the very least be just a metadata change. Also you cannot add column schema either. 
Option 2 Normalize (is even worse) create table movie (name String, description string, likestoday int, blacklisted int); create table movecredits( name string, role string, personname string, primary key(name,role) ); create table movetags( name string, tag string, primary key (name,tag) ); This is a terrible design, of the 4 key characteristics how cassandra data should be designed it fails 3: It does not: 1) Denormalize 2) Eliminate seeks 3) Design for read Why is Cassandra steering toward this course, by making a language that does not understand wide rows? So what can be done? My suggestions: Cassandra needs to lose the COMPACT STORAGE conversions. Each table needs a virtual view that is compact storage with no work to migrate data and recreate schemas. Every table should have a compact view for the schemaless, or a simple query hint like /*transposed*/ should make this change. Metadata
[jira] [Updated] (CASSANDRA-4679) Fix binary protocol NEW_NODE event
[ https://issues.apache.org/jira/browse/CASSANDRA-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-4679: Attachment: 0003-Remove-hardcoded-initServer-from-AntiEntropyServiceTes.txt Interesting. This is due to the hardcoded call to StorageServer.initServer() in AntiEntropyServiceTestAbstract. But I have absolutely no clue what we have that call. In fact, removing that call (patch 3 attached) fixes the test. I'm not totally sure why the test was working previously, maybe the 2 patch of this ticket just changed the timing of the server initialization triggering that issue? Fix binary protocol NEW_NODE event -- Key: CASSANDRA-4679 URL: https://issues.apache.org/jira/browse/CASSANDRA-4679 Project: Cassandra Issue Type: Bug Affects Versions: 1.2.0 beta 1 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Fix For: 1.2.0 beta 2 Attachments: 0001-4679.txt, 0002-Start-RPC-binary-protocol-before-gossip.txt, 0003-Remove-hardcoded-initServer-from-AntiEntropyServiceTes.txt As discussed on CASSANDRA-4480, the NEW_NODE/REMOVED_NODE of the binary protocol are not correctly fired (NEW_NODE is fired on node UP basically). This ticket is to fix that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4417) invalid counter shard detected
[ https://issues.apache.org/jira/browse/CASSANDRA-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13486232#comment-13486232 ] Jonathan Ellis commented on CASSANDRA-4417: --- On a bootstrap, this sounds more like CASSANDRA-4071. invalid counter shard detected --- Key: CASSANDRA-4417 URL: https://issues.apache.org/jira/browse/CASSANDRA-4417 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.1.1 Environment: Amazon Linux Reporter: Senthilvel Rangaswamy Seeing errors like these: 2012-07-06_07:00:27.22662 ERROR 07:00:27,226 invalid counter shard detected; (17bfd850-ac52-11e1--6ecd0b5b61e7, 1, 13) and (17bfd850-ac52-11e1--6ecd0b5b61e7, 1, 1) differ only in count; will pick highest to self-heal; this indicates a bug or corruption generated a bad counter shard What does it mean?
[jira] [Created] (CASSANDRA-4873) Source side retries for streaming
Nick Bailey created CASSANDRA-4873: -- Summary: Source side retries for streaming Key: CASSANDRA-4873 URL: https://issues.apache.org/jira/browse/CASSANDRA-4873 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Nick Bailey This should help make streaming more robust, but it is also necessary for tools like the bulk loader to be able to retry streaming. Currently, if streaming fails, the Cassandra nodes will attempt to initiate a retry with the node that is bulk loading, but since that node is not a ring/gossip member, it will not succeed.
[jira] [Resolved] (CASSANDRA-4873) Source side retries for streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-4873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Bailey resolved CASSANDRA-4873. Resolution: Invalid Bad analysis of the problem. I'm actually seeing CASSANDRA-4813. Source side retries for streaming - Key: CASSANDRA-4873 URL: https://issues.apache.org/jira/browse/CASSANDRA-4873 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Nick Bailey This should help make streaming more robust, but it is also necessary for tools like the bulk loader to be able to retry streaming. Currently, if streaming fails, the Cassandra nodes will attempt to initiate a retry with the node that is bulk loading, but since that node is not a ring/gossip member, it will not succeed.
[jira] [Updated] (CASSANDRA-3306) Error in LeveledCompactionStrategy
[ https://issues.apache.org/jira/browse/CASSANDRA-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita updated CASSANDRA-3306: -- Attachment: 0002-fail-stream-session-for-invalid-request.patch 0001-change-DataTracker.View-s-sstables-from-List-to-Set.patch Attaching a first attempt. I changed DataTracker.View's sstables to a Set, and made the stream fail when a file arrives after StreamInSession has failed. Changing List to Set for sstables sometimes makes CollationControllerTest fail. It was introduced in CASSANDRA-4116, and I think the test and CollationController#collectAllData expect sstables to be ordered by timestamp. I'm not sure if the test is obsolete or if we really need sstables to be sorted all the time. The 0002 patch alone will fix the issue, so we can apply that for now. Error in LeveledCompactionStrategy -- Key: CASSANDRA-3306 URL: https://issues.apache.org/jira/browse/CASSANDRA-3306 Project: Cassandra Issue Type: Bug Affects Versions: 1.0.0 Reporter: Radim Kolar Assignee: Yuki Morishita Fix For: 1.1.7 Attachments: 0001-CASSANDRA-3306-test.patch, 0001-change-DataTracker.View-s-sstables-from-List-to-Set.patch, 0002-fail-stream-session-for-invalid-request.patch During stress testing, I always get this error, making LeveledCompactionStrategy unusable. Should be easy to reproduce - just write fast. 
ERROR [CompactionExecutor:6] 2011-10-04 15:48:52,179 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[CompactionExecutor:6,5,main] java.lang.AssertionError at org.apache.cassandra.db.DataTracker$View.newSSTables(DataTracker.java:580) at org.apache.cassandra.db.DataTracker$View.replace(DataTracker.java:546) at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:268) at org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:232) at org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:960) at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:199) at org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:47) at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:131) at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:114) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) and this is in json data for table: { generations : [ { generation : 0, members : [ 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484 ] }, { generation : 1, members : [ ] }, { generation : 2, members : [ ] }, { generation : 3, members : [ ] }, { generation : 4, members : [ ] }, { generation : 5, members : [ ] }, { generation : 6, members : [ ] }, { generation : 7, members : [ ] } ] }
[jira] [Commented] (CASSANDRA-4679) Fix binary protocol NEW_NODE event
[ https://issues.apache.org/jira/browse/CASSANDRA-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13486354#comment-13486354 ] Yuki Morishita commented on CASSANDRA-4679: --- I found the difference between patched and trunk. Your initServer tries to join the ring even after the server is initialized, whilst in trunk it is guarded by an initialized check. I think it is better to check if initialized before calling maybeJoinRing in your patch. Fix binary protocol NEW_NODE event -- Key: CASSANDRA-4679 URL: https://issues.apache.org/jira/browse/CASSANDRA-4679 Project: Cassandra Issue Type: Bug Affects Versions: 1.2.0 beta 1 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Fix For: 1.2.0 beta 2 Attachments: 0001-4679.txt, 0002-Start-RPC-binary-protocol-before-gossip.txt, 0003-Remove-hardcoded-initServer-from-AntiEntropyServiceTes.txt As discussed on CASSANDRA-4480, the NEW_NODE/REMOVED_NODE of the binary protocol are not correctly fired (NEW_NODE is fired on node UP basically). This ticket is to fix that.
[jira] [Comment Edited] (CASSANDRA-4865) Off-heap bloom filters
[ https://issues.apache.org/jira/browse/CASSANDRA-4865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13486429#comment-13486429 ] Vijay edited comment on CASSANDRA-4865 at 10/29/12 10:20 PM: - Attached patch includes the nit and updated tests. re-ran the test on 32 cores and shows bigger diffrence. ||Open BS set's|Open BS get's|Open BS clear's|Offheap BS set's|Offheap BS get's|Offheap BS clear's| ||593|495|658|408|132|392| ||606|479|749|394|130|556| ||542|452|736|394|130|554| ||543|460|736|395|129|555| ||542|452|736|391|129|554| ||541|465|735|391|129|555| ||542|454|737|392|129|555| ||541|452|736|391|130|554| ||542|452|735|391|129|554| ||542|452|736|391|130|554| ||542|519|736|391|130|554| ran stress just for fun before: aprox 2750 after: aprox 2800 Its a big win for a small change i guess :) was (Author: vijay2...@yahoo.com): Attached patch includes the nit and updated tests. re-ran the test on 32 cores and shows bigger diffrence. ||593|495|658|408|132|392| ||606|479|749|394|130|556| ||542|452|736|394|130|554| ||543|460|736|395|129|555| ||542|452|736|391|129|554| ||541|465|735|391|129|555| ||542|454|737|392|129|555| ||541|452|736|391|130|554| ||542|452|735|391|129|554| ||542|452|736|391|130|554| ||542|519|736|391|130|554| ran stress just for fun before: aprox 2750 after: aprox 2800 Its a big win for a small change i guess :) Off-heap bloom filters -- Key: CASSANDRA-4865 URL: https://issues.apache.org/jira/browse/CASSANDRA-4865 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: Vijay Fix For: 1.2.1 Attachments: 0001-CASSANDRA-4865.patch, 0001-CASSANDRA-4865-v2.patch Bloom filters are the major user of heap as dataset grows. It's probably worth it to move these off heap. No extra refcounting needs to be done since we already refcount SSTableReader. -- This message is automatically generated by JIRA. 
[jira] [Updated] (CASSANDRA-4865) Off-heap bloom filters
[ https://issues.apache.org/jira/browse/CASSANDRA-4865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vijay updated CASSANDRA-4865:
-
Attachment: 0001-CASSANDRA-4865-v2.patch

Attached patch includes the nit and updated tests. Re-ran the test on 32 cores; it shows a bigger difference.

|593|495|658|408|132|392|
|606|479|749|394|130|556|
|542|452|736|394|130|554|
|543|460|736|395|129|555|
|542|452|736|391|129|554|
|541|465|735|391|129|555|
|542|454|737|392|129|555|
|541|452|736|391|130|554|
|542|452|735|391|129|554|
|542|452|736|391|130|554|
|542|519|736|391|130|554|

Ran stress just for fun. Before: approx. 2750; after: approx. 2800. It's a big win for a small change, I guess :)

Off-heap bloom filters
--

Key: CASSANDRA-4865 URL: https://issues.apache.org/jira/browse/CASSANDRA-4865 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: Vijay Fix For: 1.2.1 Attachments: 0001-CASSANDRA-4865.patch, 0001-CASSANDRA-4865-v2.patch

Bloom filters are the major user of heap as the dataset grows. It's probably worth it to move these off heap. No extra refcounting needs to be done, since we already refcount SSTableReader.
[jira] [Comment Edited] (CASSANDRA-4865) Off-heap bloom filters
[ https://issues.apache.org/jira/browse/CASSANDRA-4865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486429#comment-13486429 ]

Vijay edited comment on CASSANDRA-4865 at 10/29/12 10:23 PM:
-

Attached patch includes the nit and updated tests. Re-ran the test on 32 cores; it shows a bigger difference.

||Open BS sets||Open BS gets||Open BS clears||Offheap BS sets||Offheap BS gets||Offheap BS clears||
|593|495|658|408|132|392|
|606|479|749|394|130|556|
|542|452|736|394|130|554|
|543|460|736|395|129|555|
|542|452|736|391|129|554|
|541|465|735|391|129|555|
|542|454|737|392|129|555|
|541|452|736|391|130|554|
|542|452|735|391|129|554|
|542|452|736|391|130|554|
|542|519|736|391|130|554|

Ran stress just for fun. Before: approx. 2750/sec; after: approx. 2800/sec. It's a big win for a small change, I guess :)

was (Author: vijay2...@yahoo.com): the same comment, with the stress figures given without the /sec suffix.

Off-heap bloom filters
--

Key: CASSANDRA-4865 URL: https://issues.apache.org/jira/browse/CASSANDRA-4865 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: Vijay Fix For: 1.2.1 Attachments: 0001-CASSANDRA-4865.patch, 0001-CASSANDRA-4865-v2.patch

Bloom filters are the major user of heap as the dataset grows. It's probably worth it to move these off heap. No extra refcounting needs to be done, since we already refcount SSTableReader.
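The thread above discusses moving bloom filter bit sets off the Java heap but does not show the patch itself. As a rough illustration of the underlying idea, a bit set backed by a direct ByteBuffer keeps its storage in native memory rather than as a long[] on the heap, so it no longer counts against GC. The class and method names here are illustrative and are not the API of the actual 0001-CASSANDRA-4865 patches.

```java
import java.nio.ByteBuffer;

// Sketch of an off-heap bit set: a direct ByteBuffer's backing memory lives
// outside the Java heap, which is the memory-saving idea behind the ticket.
public class OffHeapBitSetSketch
{
    private final ByteBuffer bits; // direct buffer: allocated off-heap

    public OffHeapBitSetSketch(long numBits)
    {
        int numBytes = (int) ((numBits + 7) / 8); // round up to whole bytes
        bits = ByteBuffer.allocateDirect(numBytes);
    }

    public void set(long index)
    {
        int byteIndex = (int) (index >>> 3);
        byte b = bits.get(byteIndex);
        bits.put(byteIndex, (byte) (b | (1 << (index & 7))));
    }

    public boolean get(long index)
    {
        int byteIndex = (int) (index >>> 3);
        return (bits.get(byteIndex) & (1 << (index & 7))) != 0;
    }
}
```

A real bloom filter would hash each key to several such bit positions; only the bit-set backing store is sketched here.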
[jira] [Commented] (CASSANDRA-4781) Sometimes Cassandra starts compacting system-schema_columns cf repeatedly until the node is killed
[ https://issues.apache.org/jira/browse/CASSANDRA-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486461#comment-13486461 ]

Mina Naguib commented on CASSANDRA-4781:
---

FWIW, I've just hit this bug on the very first startup of a node after it was upgraded from 1.1.2 to 1.1.6.

Sometimes Cassandra starts compacting system-schema_columns cf repeatedly until the node is killed
-

Key: CASSANDRA-4781 URL: https://issues.apache.org/jira/browse/CASSANDRA-4781 Project: Cassandra Issue Type: Bug Affects Versions: 1.2.0 beta 1 Environment: Ubuntu 12.04, single-node Cassandra cluster Reporter: Aleksey Yeschenko Assignee: Yuki Morishita Fix For: 1.2.0 beta 2 Attachments: 4781.txt, 4781-v2.txt, 4781-v3.txt

Cassandra starts flushing the system-schema_columns cf in a seemingly infinite loop:

INFO [CompactionExecutor:7] 2012-10-09 17:55:46,804 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32107-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.202762MB/s. Time: 18ms.
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,804 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32107-Data.db')]
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,824 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32108-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.182486MB/s. Time: 20ms.
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,825 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32108-Data.db')]
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,864 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32109-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.096045MB/s. Time: 38ms.
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,864 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32109-Data.db')]
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,894 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32110-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.121657MB/s. Time: 30ms.
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,894 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32110-Data.db')]
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,914 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32111-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.202762MB/s. Time: 18ms.
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,914 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32111-Data.db')]
...

Don't know what's causing it. Don't know a way to predictably trigger this behaviour. It just happens sometimes.
[jira] [Commented] (CASSANDRA-4781) Sometimes Cassandra starts compacting system-schema_columns cf repeatedly until the node is killed
[ https://issues.apache.org/jira/browse/CASSANDRA-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486606#comment-13486606 ]

Mina Naguib commented on CASSANDRA-4781:
---

Restarting the node produced the same endless loop. Downgrading back to 1.1.2 fixed it.

Sometimes Cassandra starts compacting system-schema_columns cf repeatedly until the node is killed
-

Key: CASSANDRA-4781 URL: https://issues.apache.org/jira/browse/CASSANDRA-4781 Project: Cassandra Issue Type: Bug Affects Versions: 1.2.0 beta 1 Environment: Ubuntu 12.04, single-node Cassandra cluster Reporter: Aleksey Yeschenko Assignee: Yuki Morishita Fix For: 1.2.0 beta 2 Attachments: 4781.txt, 4781-v2.txt, 4781-v3.txt

Cassandra starts flushing the system-schema_columns cf in a seemingly infinite loop:

INFO [CompactionExecutor:7] 2012-10-09 17:55:46,804 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32107-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.202762MB/s. Time: 18ms.
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,804 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32107-Data.db')]
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,824 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32108-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.182486MB/s. Time: 20ms.
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,825 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32108-Data.db')]
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,864 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32109-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.096045MB/s. Time: 38ms.
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,864 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32109-Data.db')]
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,894 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32110-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.121657MB/s. Time: 30ms.
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,894 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32110-Data.db')]
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,914 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32111-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.202762MB/s. Time: 18ms.
INFO [CompactionExecutor:7] 2012-10-09 17:55:46,914 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32111-Data.db')]
...

Don't know what's causing it. Don't know a way to predictably trigger this behaviour. It just happens sometimes.
[jira] [Created] (CASSANDRA-4874) Possible authorization handling improvements
Aleksey Yeschenko created CASSANDRA-4874:
-

Summary: Possible authorization handling improvements
Key: CASSANDRA-4874 URL: https://issues.apache.org/jira/browse/CASSANDRA-4874 Project: Cassandra Issue Type: Improvement Affects Versions: 1.2.0 beta 1, 1.1.6 Reporter: Aleksey Yeschenko

I'll create another issue with my suggestions about fixing/improving the IAuthority interfaces. This one lists possible improvements that aren't related to grant/revoke methods.

Inconsistencies:
- CREATE COLUMNFAMILY: P.CREATE on the KS in CQL2 vs. P.CREATE on the CF in CQL3 and Thrift
- BATCH: P.UPDATE or P.DELETE on the CF in CQL2 vs. P.UPDATE in CQL3 and Thrift (despite remove* in Thrift asking for P.DELETE)
- DELETE: P.DELETE in CQL2 and Thrift vs. P.UPDATE in CQL3
- DROP INDEX: no checks in CQL2 vs. P.ALTER on the CF in CQL3

Other issues/suggestions:
- CQL2 DROP INDEX should require authorization.
- Current permission checks are inconsistent, since they are performed separately by the CQL2 query processor, Thrift CassandraServer, and the CQL3 statement classes. We should move them to one place: SomeClassWithABetterName.authorize(Operation, KS, CF, User), where Operation would be an enum (ALTER_KEYSPACE, ALTER_TABLE, CREATE_TABLE, CREATE, USE, UPDATE, etc.) and CF would be nullable.
- We don't respect the hierarchy when checking for permissions, or, to be more specific, we are doing it wrong. Take CQL3 INSERT as an example: we require P.UPDATE on the CF or FULL_ACCESS on either the KS or the CF. However, having P.UPDATE on the KS won't allow you to perform the statement; only FULL_ACCESS will do. I doubt this was intentional, and if it was, I say it's wrong. P.UPDATE on the KS should allow you to do updates on the KS's CFs. Examples in http://www.datastax.com/dev/blog/dynamic-permission-allocation-in-cassandra-1-1 point to it being a bug, since REVOKE UPDATE ON ks FROM omega is there.
- Currently we lack a way to set permissions on the cassandra/keyspaces resource. I think we should be able to; see the following point on why.
- Currently, to create a keyspace you must have a P.CREATE permission on that keyspace THAT DOESN'T EVEN EXIST YET. So only a superuser can create a keyspace, or a superuser must first grant you a permission to create it, which doesn't look right to me. P.CREATE on cassandra/keyspaces should allow you to create new keyspaces without an explicit permission for each of them.
- The same goes for CREATE TABLE: you need P.CREATE on that not-yet-existing CF or FULL_ACCESS on the whole KS. P.CREATE on the KS won't do. This is wrong.
- Since permissions don't map directly to statements, we should describe clearly in the documentation what permissions are required by each CQL statement/Thrift method. Full list of current permission requirements: https://gist.github.com/3978182
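The centralized check Aleksey proposes — one authorize(Operation, KS, CF, User) entry point that CQL2, CQL3 and Thrift would all call, with keyspace-level grants covering the keyspace's CFs — can be sketched as below. Everything here (class name, grant storage, resource strings) is illustrative, not the eventual Cassandra API; the ticket itself only names the method shape and the Operation enum.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;

// Sketch of a single authorization entry point replacing the three separate
// permission-check code paths. Grants are stored per user, per resource
// ("ks" or "ks/cf"); names are hypothetical.
public class AuthorizerSketch
{
    public enum Operation { ALTER_KEYSPACE, ALTER_TABLE, CREATE_TABLE, CREATE, USE, UPDATE, SELECT }

    private final Map<String, Map<String, EnumSet<Operation>>> grants = new HashMap<>();

    public void grant(String user, String resource, Operation op)
    {
        grants.computeIfAbsent(user, u -> new HashMap<>())
              .computeIfAbsent(resource, r -> EnumSet.noneOf(Operation.class))
              .add(op);
    }

    // cf is nullable, as the ticket suggests. A grant on the KS covers its
    // CFs -- the hierarchy behaviour the ticket argues should hold.
    public boolean authorize(Operation op, String ks, String cf, String user)
    {
        Map<String, EnumSet<Operation>> byResource =
            grants.getOrDefault(user, Collections.emptyMap());
        if (byResource.getOrDefault(ks, EnumSet.noneOf(Operation.class)).contains(op))
            return true;
        return cf != null
            && byResource.getOrDefault(ks + "/" + cf, EnumSet.noneOf(Operation.class)).contains(op);
    }
}
```

Under this shape, granting UPDATE on a keyspace authorizes INSERTs on its CFs without needing a FULL_ACCESS special case.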
[jira] [Updated] (CASSANDRA-4874) Possible authorization handling improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-4874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Yeschenko updated CASSANDRA-4874:
-
Assignee: Aleksey Yeschenko

Possible authorization handling improvements
--

Key: CASSANDRA-4874 URL: https://issues.apache.org/jira/browse/CASSANDRA-4874 Project: Cassandra Issue Type: Improvement Affects Versions: 1.1.6, 1.2.0 beta 1 Reporter: Aleksey Yeschenko Assignee: Aleksey Yeschenko Labels: security

I'll create another issue with my suggestions about fixing/improving the IAuthority interfaces. This one lists possible improvements that aren't related to grant/revoke methods.

Inconsistencies:
- CREATE COLUMNFAMILY: P.CREATE on the KS in CQL2 vs. P.CREATE on the CF in CQL3 and Thrift
- BATCH: P.UPDATE or P.DELETE on the CF in CQL2 vs. P.UPDATE in CQL3 and Thrift (despite remove* in Thrift asking for P.DELETE)
- DELETE: P.DELETE in CQL2 and Thrift vs. P.UPDATE in CQL3
- DROP INDEX: no checks in CQL2 vs. P.ALTER on the CF in CQL3

Other issues/suggestions:
- CQL2 DROP INDEX should require authorization.
- Current permission checks are inconsistent, since they are performed separately by the CQL2 query processor, Thrift CassandraServer, and the CQL3 statement classes. We should move them to one place: SomeClassWithABetterName.authorize(Operation, KS, CF, User), where Operation would be an enum (ALTER_KEYSPACE, ALTER_TABLE, CREATE_TABLE, CREATE, USE, UPDATE, etc.) and CF would be nullable.
- We don't respect the hierarchy when checking for permissions, or, to be more specific, we are doing it wrong. Take CQL3 INSERT as an example: we require P.UPDATE on the CF or FULL_ACCESS on either the KS or the CF. However, having P.UPDATE on the KS won't allow you to perform the statement; only FULL_ACCESS will do. I doubt this was intentional, and if it was, I say it's wrong. P.UPDATE on the KS should allow you to do updates on the KS's CFs. Examples in http://www.datastax.com/dev/blog/dynamic-permission-allocation-in-cassandra-1-1 point to it being a bug, since REVOKE UPDATE ON ks FROM omega is there.
- Currently we lack a way to set permissions on the cassandra/keyspaces resource. I think we should be able to; see the following point on why.
- Currently, to create a keyspace you must have a P.CREATE permission on that keyspace THAT DOESN'T EVEN EXIST YET. So only a superuser can create a keyspace, or a superuser must first grant you a permission to create it, which doesn't look right to me. P.CREATE on cassandra/keyspaces should allow you to create new keyspaces without an explicit permission for each of them.
- The same goes for CREATE TABLE: you need P.CREATE on that not-yet-existing CF or FULL_ACCESS on the whole KS. P.CREATE on the KS won't do. This is wrong.
- Since permissions don't map directly to statements, we should describe clearly in the documentation what permissions are required by each CQL statement/Thrift method. Full list of current permission requirements: https://gist.github.com/3978182
[jira] [Created] (CASSANDRA-4875) Possible improvements to IAuthority[2] interface
Aleksey Yeschenko created CASSANDRA-4875:
-

Summary: Possible improvements to IAuthority[2] interface
Key: CASSANDRA-4875 URL: https://issues.apache.org/jira/browse/CASSANDRA-4875 Project: Cassandra Issue Type: Improvement Affects Versions: 1.2.0 beta 1, 1.1.6 Reporter: Aleksey Yeschenko Assignee: Aleksey Yeschenko

CASSANDRA-4874 is about general improvements to authorization handling; this one is about IAuthority[2] in particular.

- 'LIST GRANTS OF user' should become 'LIST PERMISSIONS [on resource] [of user]'. Currently there is no way to see all the permissions on a resource, only all the permissions of a particular user.
- IAuthority2.listPermissions() should return a generic collection of ResourcePermission or something similar, not CQLResult or ResultMessage. That's the wrong level of abstraction. I know this issue has been raised here: https://issues.apache.org/jira/browse/CASSANDRA-4490?focusedCommentId=13449732&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13449732, but I think it's possible to change this. Returning a list of {resource, user, permission, grant_option} tuples should be possible.
- We should get rid of Permission.NO_ACCESS. An empty list of permissions should mean the absence of any permission, not some magical Permission.NO_ACCESS value. It's insecure, error-prone, and also ambiguous (what if a user has both FULL_ACCESS and NO_ACCESS permissions?). If it's meant to be a way to strip a user of all permissions on a resource, then it should be replaced with some form of REVOKE statement. Something like 'REVOKE ALL PERMISSIONS' sounds more logical than GRANT NO_ACCESS to me.
- The previous point will probably require adding a revokeAllPermissions() method to make it explicit; special-casing IAuthority2.revoke() won't do.
- IAuthority2.grant() and IAuthority2.revoke() accept a CFName instance for the resource, which has its ks and cf fields swapped if cf is omitted. This may cause a real security issue if the IAuthority2 implementer doesn't know about the problem. We should pass the resource as a collection of strings ([cassandra, keyspaces[, ks_name][, cf_name]]) instead, the way we pass it to IAuthority.authorize().
- We should probably get rid of FULL_ACCESS as well, at least as a valid permission value (but maybe allow it in the CQL statement), and add an equivalent IAuthority2.grantAllPermissions() separately. Why? Imagine the following sequence: GRANT FULL_ACCESS ON resource FOR user; REVOKE SELECT ON resource FROM user. Should the user be allowed to SELECT anymore? I say no, he shouldn't. Full access should be represented by a list of all permissions, not by a magical special value.
- P.DELETE should probably go in favour of P.UPDATE, even for TRUNCATE. The presence of P.DELETE will definitely confuse users, who might think that it is somehow required to delete data, when it isn't: you can overwrite every value with P.UPDATE and TTL=1 and get the same result. We should also drop P.INSERT and leave P.UPDATE (or rename it to P.MODIFY). P.MODIFY_DATA + P.READ_DATA should replace P.UPDATE, P.SELECT and P.DELETE.
- I suggest new syntax to allow setting permissions on the cassandra/keyspaces resource: GRANT permission ON * FOR user.

The interface has to change because of the CFName argument to grant() and revoke(), and since it's going to be broken anyway (and has been introduced recently), I think we are in a position to make some other improvements while at it.
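The FULL_ACCESS argument above — represent full access as the explicit set of all permissions, not a magical value — has a simple consequence worth spelling out: REVOKE SELECT after GRANT ALL then behaves the way the ticket wants, removing SELECT while keeping everything else. The following is a minimal sketch of that semantics; the Permission names and method names are illustrative, not Cassandra's.

```java
import java.util.EnumSet;

// Sketch: full access stored as an explicit permission set. Revoking one
// permission after granting all leaves the rest intact, with no FULL_ACCESS
// special value to reason about. Hypothetical names throughout.
public class PermissionSetSketch
{
    public enum Permission { CREATE, ALTER, DROP, SELECT, UPDATE }

    private final EnumSet<Permission> granted = EnumSet.noneOf(Permission.class);

    public void grantAllPermissions()
    {
        granted.addAll(EnumSet.allOf(Permission.class)); // "GRANT ALL"
    }

    public void revoke(Permission p)
    {
        granted.remove(p); // "REVOKE SELECT" just removes one element
    }

    public boolean isAuthorized(Permission p)
    {
        return granted.contains(p); // empty set == no access, no NO_ACCESS value
    }
}
```

With a magical FULL_ACCESS value, the same REVOKE would have to either silently do nothing or strip everything; the explicit-set representation avoids that ambiguity.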
[jira] [Updated] (CASSANDRA-4239) Support Thrift SSL socket
[ https://issues.apache.org/jira/browse/CASSANDRA-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-4239:
--
Fix Version/s: (was: 1.2.1) 1.2.0 beta 2

Support Thrift SSL socket
-

Key: CASSANDRA-4239 URL: https://issues.apache.org/jira/browse/CASSANDRA-4239 Project: Cassandra Issue Type: New Feature Components: API Reporter: Jonathan Ellis Assignee: Jason Brown Priority: Minor Fix For: 1.2.0 beta 2 Attachments: 0001-CASSANDRA-4239-Support-Thrift-SSL-socket-both-to-the.patch, 0001-CASSANDRA-4239-v3.patch, 0001-Fix-for-IDE-alert-on-SSLTransportFactory.patch, 0001-Fix-for-IDE-alert-on-SSLTransportFactory.patch, 0002-CASSANDRA-4239-Support-Thrift-SSL.patch

Thrift has supported SSL encryption for a while now (THRIFT-106); we should allow configuring that in cassandra.yaml.
git commit: cleanup
Updated Branches: refs/heads/trunk ab2614ca1 -> 171c661e7

cleanup

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/171c661e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/171c661e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/171c661e

Branch: refs/heads/trunk
Commit: 171c661e773bcc0f50dc5ad09290eba700e4d00d
Parents: ab2614c
Author: Jonathan Ellis jbel...@apache.org
Authored: Sat Oct 27 09:48:22 2012 -0700
Committer: Jonathan Ellis jbel...@apache.org
Committed: Mon Oct 29 23:11:59 2012 -0500
--
 .../org/apache/cassandra/db/ColumnFamilyStore.java |  2 +-
 .../db/compaction/AbstractCompactionStrategy.java  | 15 +--
 2 files changed, 6 insertions(+), 11 deletions(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/171c661e/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 3b1df99..e981a9c 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -187,7 +187,7 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
     private void maybeReloadCompactionStrategy()
     {
         // Check if there is a need for reloading
-        if (metadata.compactionStrategyClass.equals(compactionStrategy.getClass()) && metadata.compactionStrategyOptions.equals(compactionStrategy.getOptions()))
+        if (metadata.compactionStrategyClass.equals(compactionStrategy.getClass()) && metadata.compactionStrategyOptions.equals(compactionStrategy.options))
             return;
         // TODO is there a way to avoid locking here?

http://git-wip-us.apache.org/repos/asf/cassandra/blob/171c661e/src/java/org/apache/cassandra/db/compaction/AbstractCompactionStrategy.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/AbstractCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/AbstractCompactionStrategy.java
index c2271bb..ec328e5 100644
--- a/src/java/org/apache/cassandra/db/compaction/AbstractCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/AbstractCompactionStrategy.java
@@ -35,12 +35,12 @@ import org.apache.cassandra.io.sstable.SSTableReader;
 public abstract class AbstractCompactionStrategy
 {
     protected static final float DEFAULT_TOMBSTONE_THRESHOLD = 0.2f;
-    protected static final String TOMBSTONE_THRESHOLD_KEY = "tombstone_threshold";
+    protected static final String TOMBSTONE_THRESHOLD_OPTION = "tombstone_threshold";

-    protected final ColumnFamilyStore cfs;
-    protected final Map<String, String> options;
+    public final Map<String, String> options;

-    protected float tombstoneThreshold;
+    protected final ColumnFamilyStore cfs;
+    protected final float tombstoneThreshold;

     protected AbstractCompactionStrategy(ColumnFamilyStore cfs, Map<String, String> options)
     {
@@ -48,15 +48,10 @@ public abstract class AbstractCompactionStrategy
         this.cfs = cfs;
         this.options = options;

-        String optionValue = options.get(TOMBSTONE_THRESHOLD_KEY);
+        String optionValue = options.get(TOMBSTONE_THRESHOLD_OPTION);
         tombstoneThreshold = optionValue == null ? DEFAULT_TOMBSTONE_THRESHOLD : Float.parseFloat(optionValue);
     }

-    public Map<String, String> getOptions()
-    {
-        return options;
-    }
-
     /**
      * Releases any resources if this strategy is shutdown (when the CFS is reloaded after a schema change).
      * Default is to do nothing.
[jira] [Created] (CASSANDRA-4876) Make bloom filters optional by default for LCS
Jonathan Ellis created CASSANDRA-4876:
-

Summary: Make bloom filters optional by default for LCS
Key: CASSANDRA-4876 URL: https://issues.apache.org/jira/browse/CASSANDRA-4876 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: Jonathan Ellis Priority: Minor Fix For: 1.2.0

Since LCS gives us only a 10% chance that we will need to merge data from multiple sstables, bloom filters are a waste of memory (unless you have a workload where you request lots of keys that don't exist).
[jira] [Updated] (CASSANDRA-4850) RuntimeException when bootstrapping a node without an explicitly set token
[ https://issues.apache.org/jira/browse/CASSANDRA-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-4850:
--
Affects Version/s: (was: 1.2.0 beta 1) 1.2.0 beta 2
Assignee: Pavel Yaskevich

IMO we should not include system tables in serializedSchema at all, for the version or for sending to other nodes, since they are hardcoded. (Sending a diff to another node will not have the desired effects.)

RuntimeException when bootstrapping a node without an explicitly set token
---

Key: CASSANDRA-4850 URL: https://issues.apache.org/jira/browse/CASSANDRA-4850 Project: Cassandra Issue Type: Bug Affects Versions: 1.2.0 beta 2 Reporter: Sylvain Lebresne Assignee: Pavel Yaskevich Fix For: 1.2.0 beta 2

Trying to bootstrap a node for which no initial token has been set results in:

{noformat}
java.lang.RuntimeException: No other nodes seen! Unable to bootstrap. If you intended to start a single-node cluster, you should make sure your broadcast_address (or listen_address) is listed as a seed. Otherwise, you need to determine why the seed being contacted has no knowledge of the rest of the cluster. Usually, this can be solved by giving all nodes the same seed list.
    at org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:154)
    at org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:135)
    at org.apache.cassandra.dht.BootStrapper.getBootstrapTokens(BootStrapper.java:115)
    at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:603)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:490)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:386)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:305)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:393)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:436)
{noformat}

This was broken by CASSANDRA-4416. More specifically, now that we store the system metadata in the schema on startup, the check

{noformat}
// if we see schema, we can proceed to the next check directly
if (!Schema.instance.getVersion().equals(Schema.emptyVersion))
{
    logger.debug("got schema: {}", Schema.instance.getVersion());
    break;
}
{noformat}

in StorageService.joinTokenRing is broken. This results in the node trying to check the Load map to pick a token before any gossip state has been received. Not sure what the best fix is (an easy one would be to always wait RING_DELAY before attempting to pick the token, at least in the case where an initial token isn't set, but that's a big hammer).
[jira] [Updated] (CASSANDRA-4861) Consider separating tracing from log4j
[ https://issues.apache.org/jira/browse/CASSANDRA-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-4861: -- Fix Version/s: (was: 1.2.0 beta 2) 1.2.0 Tagging 1.2.0; don't want to block beta2 for this. Consider separating tracing from log4j -- Key: CASSANDRA-4861 URL: https://issues.apache.org/jira/browse/CASSANDRA-4861 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Fix For: 1.2.0 Currently, (as far as I understand) tracing is implemented as a log4j appender that intercepts all log messages and write them to a system table. I'm sorry to not have bring that up during the initial review (it's hard to follow every ticket) but before we release this I'd like to have a serious discussion on that choice because I'm not convinced (at all) that it's a good idea. Namely, I can see the following drawbacks: # the main one is that this *forces* every debug messages to be traced and conversely, every traced message to be logged at debug. But I strongly think that debug logging and query tracing are not the same thing. Don't get me wrong, there is clearly a large intersection between those two things (which is fine), but I do think that *identifying* them is a mistake. More concretely: ** Consider some of the messages we log at debug in CFS: {noformat} logger.debug(memtable is already frozen; another thread must be flushing it); logger.debug(forceFlush requested but everything is clean in {}, columnFamily); logger.debug(Checking for sstables overlapping {}, sstables); {noformat} Those messages are useful for debugging and have a place in the log at debug, but they are noise as far as query tracing is concerned (None have any concrete impact on query performance, they just describe what the code has done). Or take the following ones from CompactionManager: {noformat} logger.debug(Background compaction is still running for {}.{} ({} remaining). 
Skipping, new Object[] {cfs.table.name, cfs.columnFamily, count}); logger.debug(Scheduling a background task check for {}.{} with {}, new Object[] {cfs.table.name, cfs.columnFamily, cfs.getCompactionStrategy().getClass().getSimpleName()}); logger.debug(Checking {}.{}, cfs.table.name, cfs.columnFamily); logger.debug(Aborting compaction for dropped CF); logger.debug(No tasks available); logger.debug(Expected bloom filter size : + expectedBloomFilterSize); logger.debug(Cache flushing was already in progress: skipping {}, writer.getCompactionInfo()); {noformat} It is useful to have that in the debug log, but how is any of that useful to users in query tracing? (it may be useful to trace if a new compaction start or stop, because that does influence query performance, but those message do not). Also take the following message logged when a compaction is user interrupted: {noformat} if (t instanceof CompactionInterruptedException) { logger.info(t.getMessage()); logger.debug(Full interruption stack trace:, t); } {noformat} I can buy that you may want the first log message in the query tracing, but the second one is definitively something that only make sense for debug logging but not for query tracing (and as a side note, the current implementation don't do something sensible as it traces Full interruption stack trace: but completely ignore the throwable). Lastly, and though that's arguably more a detail (but why would we settle for something good enough if we can do better) I believe that in some cases you want an event to be both logged at debug and traced but having different messages could make sense. 
For instance, in CFS we have {noformat} logger.debug("Snapshot for " + table + " keyspace data file " + ssTable.getFilename() + " created in " + snapshotDirectory); {noformat} I'm not convinced snapshots should be part of query tracing, given that they don't really have an impact on queries; but even if we do trace them, we probably don't care about having one event for each snapshotted file (2 events, one for the start of the snapshot and one for the end, would be enough). As it stands, I think query tracing will have a lot of random noise, which will not only be annoying but will also, I'm sure, make users spend time worrying about events that have no impact whatsoever. And I've only looked at the debug messages of 2 classes ... ** I also think there could be cases where we would want to trace something but not have it in the debug log. For instance, it makes sense in the query trace to know how long parsing the query took. But logging too much info per query like that will make the debug log
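The separation argued for above can be illustrated with a minimal sketch: a dedicated trace call that records query-relevant events independently of the debug log. The `Tracing` class and `sessionEvents` list here are purely hypothetical illustrations of the idea, not Cassandra's actual API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical trace recorder, separate from the log4j debug log.
final class Tracing
{
    // Events recorded for the current query's trace session.
    static final List<String> sessionEvents = new ArrayList<>();

    static void trace(String message)
    {
        // A real system would write these to a system trace table.
        sessionEvents.add(message);
    }
}

final class CompactionInterruptExample
{
    static void onInterrupt(Throwable t)
    {
        // User-visible event: belongs in the query trace.
        Tracing.trace(t.getMessage());
        // Debug-only detail: the stack trace stays out of the trace.
        System.out.println("DEBUG Full interruption stack trace: " + t);
    }
}
```

With separate calls, the interrupt message can be traced while the stack trace remains a debug-log-only concern, instead of both being forced through the same appender.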
[jira] [Updated] (CASSANDRA-4049) Add generic way of adding SSTable components required by custom compaction strategy
[ https://issues.apache.org/jira/browse/CASSANDRA-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-4049: -- Fix Version/s: (was: 1.2.0 beta 2) 1.2.0 Add generic way of adding SSTable components required by custom compaction strategy Key: CASSANDRA-4049 URL: https://issues.apache.org/jira/browse/CASSANDRA-4049 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Piotr Kołaczkowski Assignee: Piotr Kołaczkowski Priority: Minor Labels: compaction Fix For: 1.2.0 Attachments: 4049-v3.txt, 4049-v4.txt, pluggable_custom_components-1.1.5-2.patch, pluggable_custom_components-1.1.5.patch The CFS compaction strategy coming up in the next DSE release needs to store some important information in Tombstones.db and RemovedKeys.db files, one per sstable. However, Cassandra currently issues warnings when these files are found in the data directory. Additionally, when switching to SizeTieredCompactionStrategy, the files are left in the data directory after compaction. The attached patch adds new components to the Component class so that Cassandra knows about those files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
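The mechanism the patch describes can be sketched as a registry of known sstable component suffixes that a custom compaction strategy extends at load time, so files like Tombstones.db stop triggering warnings. All names below are illustrative assumptions, not Cassandra's actual Component class.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical registry of sstable component file suffixes.
final class ComponentRegistry
{
    private static final Set<String> KNOWN = new LinkedHashSet<>(
        Set.of("Data.db", "Index.db", "Filter.db", "Statistics.db"));

    // A custom compaction strategy would call this when it is loaded,
    // before the data directory is scanned.
    static void registerCustom(String suffix)
    {
        KNOWN.add(suffix);
    }

    // Startup scanning and post-compaction cleanup consult this instead
    // of warning about (or orphaning) unrecognized files.
    static boolean isKnown(String sstableFileName)
    {
        return KNOWN.stream().anyMatch(sstableFileName::endsWith);
    }
}
```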
[jira] [Updated] (CASSANDRA-4868) When authorizing actions, check for NO_ACCESS permission first instead of FULL_ACCESS
[ https://issues.apache.org/jira/browse/CASSANDRA-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-4868: -- Fix Version/s: (was: 1.2.0 beta 2) 1.2.0 When authorizing actions, check for NO_ACCESS permission first instead of FULL_ACCESS - Key: CASSANDRA-4868 URL: https://issues.apache.org/jira/browse/CASSANDRA-4868 Project: Cassandra Issue Type: Improvement Affects Versions: 1.1.6, 1.2.0 beta 1 Reporter: Aleksey Yeschenko Assignee: Aleksey Yeschenko Priority: Minor Fix For: 1.1.7, 1.2.0 Attachments: CASSANDRA-4868-1.1.txt, CASSANDRA-4868-1.2.txt When authorizing actions, check for the NO_ACCESS permission first instead of FULL_ACCESS (ClientState.hasAccess). This seems like a safer order to me.
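The check order being proposed can be sketched as: test the deny permission before the grant-all permission, so an explicit denial always wins even if both are somehow granted. The enum values and method shape below are illustrative assumptions, not ClientState's actual code.

```java
import java.util.Set;

// Hypothetical permission set; only the ordering logic matters here.
enum Perm { NO_ACCESS, READ, WRITE, FULL_ACCESS }

final class AccessChecker
{
    static boolean hasAccess(Set<Perm> granted, Perm needed)
    {
        if (granted.contains(Perm.NO_ACCESS))
            return false;                       // deny checked first: the safer order
        if (granted.contains(Perm.FULL_ACCESS))
            return true;                        // grant-all only after deny is ruled out
        return granted.contains(needed);        // otherwise, the specific permission
    }
}
```

Checking FULL_ACCESS first would let a conflicting grant silently override an explicit NO_ACCESS, which is the failure mode the reordering guards against.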
[jira] [Updated] (CASSANDRA-4803) CFRR wide row iterators improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-4803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-4803: -- Fix Version/s: (was: 1.2.0 beta 2) 1.2.0 CFRR wide row iterators improvements Key: CASSANDRA-4803 URL: https://issues.apache.org/jira/browse/CASSANDRA-4803 Project: Cassandra Issue Type: Bug Components: Hadoop Affects Versions: 1.1.0 Reporter: Piotr Kołaczkowski Assignee: Piotr Kołaczkowski Fix For: 1.1.7, 1.2.0 Attachments: 0001-Wide-row-iterator-counts-rows-not-columns.patch, 0002-Fixed-bugs-in-describe_splits.-CFRR-uses-row-counts-.patch, 0003-Fixed-get_paged_slice-memtable-and-sstable-column-it.patch, 0004-Better-token-range-wrap-around-handling-in-CFIF-CFRR.patch, 0005-Fixed-handling-of-start_key-end_token-in-get_range_s.patch, 0006-Code-cleanup-refactoring-in-CFRR.-Fixed-bug-with-mis.patch {code} public float getProgress() { // TODO this is totally broken for wide rows // the progress is likely to be reported slightly off the actual, but close enough float progress = ((float) iter.rowsRead() / totalRowCount); return progress > 1.0F ? 1.0F : progress; } {code} The problem is that iter.rowsRead() does not return the number of rows read from the wide row iterator, but the number of *columns* (every row is counted multiple times).
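The row-vs-column miscount described above can be sketched as follows: increment the row counter only when the partition key changes, not on every column, so progress is computed against rows actually consumed. The class and method names are illustrative assumptions, not CFRR's actual implementation.

```java
// Hypothetical wide-row iterator that counts rows, not columns.
final class RowCountingIterator
{
    private String lastKey;
    private long rowsRead;

    // Called once per (row key, column) pair the underlying iterator yields.
    void consume(String rowKey, String columnName)
    {
        if (!rowKey.equals(lastKey))   // a new row starts only when the key changes
        {
            rowsRead++;
            lastKey = rowKey;
        }
    }

    long rowsRead()
    {
        return rowsRead;
    }

    // Progress based on rows read, clamped to 1.0 as in the original code.
    float getProgress(long totalRowCount)
    {
        float progress = (float) rowsRead / totalRowCount;
        return progress > 1.0F ? 1.0F : progress;
    }
}
```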