[jira] [Issue Comment Deleted] (CASSANDRA-6674) TombstoneOverwhelmingException during/after batch insert
[ https://issues.apache.org/jira/browse/CASSANDRA-6674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Machiel Groeneveld updated CASSANDRA-6674:
------------------------------------------
    Comment: was deleted
(was: Is there a way to make the tombstones go away? Can I force a cleanup, for instance?)

TombstoneOverwhelmingException during/after batch insert
--------------------------------------------------------

                 Key: CASSANDRA-6674
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6674
             Project: Cassandra
          Issue Type: Bug
         Environment: 2.0.4; 2.0.5, Mac OS X
            Reporter: Machiel Groeneveld
            Priority: Critical

A SELECT on a table I am inserting into fails with a tombstone exception. The database is clean/empty before the inserts, and the first query runs after a few thousand records have been inserted. I don't understand where the tombstones are coming from, as I am not doing any deletes.

ERROR [ReadStage:41] 2014-02-07 12:16:42,169 SliceQueryFilter.java (line 200) Scanned over 10 tombstones in visits.visits; query aborted (see tombstone_fail_threshold)
ERROR [ReadStage:41] 2014-02-07 12:16:42,171 CassandraDaemon.java (line 192) Exception in thread Thread[ReadStage:41,5,main]
java.lang.RuntimeException: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1935)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
Caused by: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
    at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:202)
    at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
    at org.apache.cassandra.db.RowIteratorFactory$2.getReduced(RowIteratorFactory.java:101)
    at org.apache.cassandra.db.RowIteratorFactory$2.getReduced(RowIteratorFactory.java:75)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at org.apache.cassandra.db.ColumnFamilyStore$9.computeNext(ColumnFamilyStore.java:1607)
    at org.apache.cassandra.db.ColumnFamilyStore$9.computeNext(ColumnFamilyStore.java:1603)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:1754)
    at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1718)
    at org.apache.cassandra.db.RangeSliceCommand.executeLocally(RangeSliceCommand.java:137)
    at org.apache.cassandra.service.StorageProxy$LocalRangeSliceRunnable.runMayThrow(StorageProxy.java:1418)
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1931)
    ... 3 more

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
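The reporter's confusion ("I'm not doing any deletes") has a common explanation that a small sketch can make concrete: in Cassandra's storage model, an INSERT that binds null for a column writes a tombstone for that column, so an insert-only workload can still accumulate tombstones. The helper below is hypothetical (it is not a driver or server API), purely to illustrate the classification:

```python
# Hypothetical helper, not a Cassandra API: in Cassandra, binding None/null
# for a column in an INSERT stores a tombstone for that column rather than
# a live cell, which is the usual source of tombstones in an insert-only
# workload.
def cells_written(bound_row):
    """Classify each bound column as a live cell or a tombstone."""
    return {col: ("tombstone" if val is None else "live")
            for col, val in bound_row.items()}

# An insert with an unset/None field still writes a tombstone:
print(cells_written({"visit_id": 42, "referrer": None}))
# -> {'visit_id': 'live', 'referrer': 'tombstone'}
```

If the batch inserts in this report bound null for absent fields, every such insert would add tombstones that the later range query then has to scan past.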
[jira] [Commented] (CASSANDRA-6674) TombstoneOverwhelmingException during/after batch insert
[ https://issues.apache.org/jira/browse/CASSANDRA-6674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895870#comment-13895870 ]

Machiel Groeneveld commented on CASSANDRA-6674:
-----------------------------------------------
Thanks for taking the time to explain.
[jira] [Commented] (CASSANDRA-6332) Cassandra startup failure: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 706167655f74616773 is not defined as a collection
[ https://issues.apache.org/jira/browse/CASSANDRA-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895873#comment-13895873 ]

Mck SembWever commented on CASSANDRA-6332:
------------------------------------------
Being a dev environment, it's pretty much an open playground, so that could well have happened without us knowing about it. (I've been away the past 2 months but will try to chase up whether this was the case…)

Cassandra startup failure: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 706167655f74616773 is not defined as a collection
------------------------------------------------------------------------------------------------------------------------------------------------

                 Key: CASSANDRA-6332
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6332
             Project: Cassandra
          Issue Type: Bug
         Environment: Ubuntu 12.04, Cassandra 2.0.1
            Reporter: Prateek
            Priority: Critical

The Cassandra node fails to start up with the following error. This is currently impacting the availability of our production cluster, so a quick response would be highly appreciated.

ERROR 22:58:26,046 Exception encountered during startup
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 706167655f74616773 is not defined as a collection
    at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:411)
    at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:400)
    at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:273)
    at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:96)
    at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:146)
    at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:126)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:299)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:442)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:485)
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 706167655f74616773 is not defined as a collection
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:188)
    at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:407)
    ... 8 more
Caused by: java.lang.RuntimeException: 706167655f74616773 is not defined as a collection
    at org.apache.cassandra.db.marshal.ColumnToCollectionType.compareCollectionMembers(ColumnToCollectionType.java:72)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:85)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:35)
    at edu.stanford.ppl.concurrent.SnapTreeMap$1.compareTo(SnapTreeMap.java:538)
    at edu.stanford.ppl.concurrent.SnapTreeMap.attemptUpdate(SnapTreeMap.java:1108)
    at edu.stanford.ppl.concurrent.SnapTreeMap.attemptUpdate(SnapTreeMap.java:1192)
    at edu.stanford.ppl.concurrent.SnapTreeMap.updateUnderRoot(SnapTreeMap.java:1059)
    at edu.stanford.ppl.concurrent.SnapTreeMap.update(SnapTreeMap.java:1023)
    at edu.stanford.ppl.concurrent.SnapTreeMap.putIfAbsent(SnapTreeMap.java:985)
    at org.apache.cassandra.db.AtomicSortedColumns$Holder.addColumn(AtomicSortedColumns.java:323)
    at org.apache.cassandra.db.AtomicSortedColumns.addAllWithSizeDelta(AtomicSortedColumns.java:195)
    at org.apache.cassandra.db.Memtable.resolve(Memtable.java:196)
    at org.apache.cassandra.db.Memtable.put(Memtable.java:160)
    at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:842)
    at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:373)
    at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:338)
    at org.apache.cassandra.db.commitlog.CommitLogReplayer$1.runMayThrow(CommitLogReplayer.java:265)
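The opaque name in the error message is the raw column name, hex-encoded; decoding it shows which column the replayed commit-log mutation refers to:

```python
# The "706167655f74616773" in the error is a hex-encoded column name.
# Decoding it reveals the column the commit-log mutation is touching:
column = bytes.fromhex("706167655f74616773").decode("ascii")
print(column)  # -> page_tags
```

That makes the failure mode easier to read: commit-log replay hit a cell for a column named page_tags that the current schema no longer defines as a collection.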
[jira] [Commented] (CASSANDRA-6404) Tools emit ERRORs and WARNINGs about missing javaagent
[ https://issues.apache.org/jira/browse/CASSANDRA-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895900#comment-13895900 ]

Daniel Smedegaard Buus commented on CASSANDRA-6404:
---------------------------------------------------
I know this reads "Fixed", but I'm using 2.0.4 on Ubuntu 14.04, and I see this when running nodetool ring. Should I be concerned about that, or? Thanks :)

Tools emit ERRORs and WARNINGs about missing javaagent
------------------------------------------------------

                 Key: CASSANDRA-6404
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6404
             Project: Cassandra
          Issue Type: Improvement
          Components: Tools
            Reporter: Sam Tunnicliffe
            Assignee: Sam Tunnicliffe
            Priority: Minor
             Fix For: 1.2.14, 2.0.4
         Attachments: 0001-Set-javaagent-when-running-tools-in-bin.patch

The combination of CASSANDRA-6107 and CASSANDRA-6293 has led to a number of the tools shipped in bin/ displaying the following warnings when run:

{code}
ERROR 15:21:47,337 Unable to initialize MemoryMeter (jamm not specified as javaagent). This means Cassandra will be unable to measure object sizes accurately and may consequently OOM.
WARN 15:21:47,506 MemoryMeter uninitialized (jamm not specified as java agent); KeyCache size in JVM Heap will not be calculated accurately. Usually this means cassandra-env.sh disabled jamm because you are u
{code}

Although harmless, these are a bit disconcerting. The simplest fix seems to be to set the javaagent switch as we do for the main C* launch.
[jira] [Commented] (CASSANDRA-6404) Tools emit ERRORs and WARNINGs about missing javaagent
[ https://issues.apache.org/jira/browse/CASSANDRA-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895901#comment-13895901 ]

Brandon Williams commented on CASSANDRA-6404:
---------------------------------------------
You probably aren't running quite what you think you're running; it should be easy to tell if you look at the shell script, though. That said, as the description says, this is harmless.
[jira] [Created] (CASSANDRA-6682) Loss of precision under concurrent access on o.a.c.utils.EstimatedHistogram methods
Paulo Gaspar created CASSANDRA-6682:
---------------------------------------

             Summary: Loss of precision under concurrent access on o.a.c.utils.EstimatedHistogram methods
                 Key: CASSANDRA-6682
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6682
             Project: Cassandra
          Issue Type: Bug
            Reporter: Paulo Gaspar
            Priority: Minor
             Fix For: 1.2.16, 2.0.6

In getBuckets(), under concurrent use of an instance, the value of a buckets element might change between its read in the first for loop and its access in the second loop, where it is reset to 0. This means that if one collects metrics by repeatedly calling estHist.getBuckets(true) (e.g. to sum its values), it will miss any values added to buckets entries between that first and second access.

In mean(), if the i-th entry of buckets changes value between the first and second read inside the for loop, then the elements and sum accumulators are not working with the same value for that entry. It is more precise (and faster) to read the value just once into a local variable.

Not an error, but a semantic improvement: on my initial read of this class, I thought the buckets and bucketOffsets fields could change length. That perception can be avoided by making the bucketOffsets field final.
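The getBuckets race described above can be made concrete with a deterministic sketch. The Histogram class here is hypothetical, standing in for o.a.c.utils.EstimatedHistogram: the flawed version copies the buckets in one pass and zeroes them in a second pass, so an increment landing between the passes is silently discarded; a per-slot read-and-reset (what AtomicLongArray.getAndSet provides in the Java code) cannot lose it.

```python
# Deterministic sketch of the CASSANDRA-6682 getBuckets race.
# Histogram is hypothetical, not the Cassandra class.
class Histogram:
    def __init__(self, n):
        self.buckets = [0] * n

    def get_buckets_racy(self, interleave=None):
        rv = list(self.buckets)              # pass 1: copy current values
        if interleave:
            interleave(self)                 # a concurrent writer lands here
        for i in range(len(self.buckets)):   # pass 2: reset -- wipes that write
            self.buckets[i] = 0
        return rv

    def get_buckets_fixed(self):
        # Swap out the whole array in one step (mirroring a per-slot
        # getAndSet): no window between "read" and "reset".
        rv, self.buckets = self.buckets, [0] * len(self.buckets)
        return rv

h = Histogram(3)
h.buckets[1] = 5
snap = h.get_buckets_racy(
    interleave=lambda hist: hist.buckets.__setitem__(1, hist.buckets[1] + 1))
# The snapshot reports 5, and the concurrent +1 was wiped by the reset:
print(snap[1], h.buckets[1])  # -> 5 0
```

A caller summing successive getBuckets(True) snapshots therefore undercounts by exactly the writes that fell into that window, which is the loss of precision the ticket describes.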
[jira] [Updated] (CASSANDRA-6682) Loss of precision under concurrent access on o.a.c.utils.EstimatedHistogram methods
[ https://issues.apache.org/jira/browse/CASSANDRA-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paulo Gaspar updated CASSANDRA-6682:
------------------------------------
    Attachment: 6682.txt

The attached file contains all the necessary changes.
[jira] [Comment Edited] (CASSANDRA-6663) Connecting to a Raspberry PI Cassandra Cluster crashes the node being connected to
[ https://issues.apache.org/jira/browse/CASSANDRA-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895833#comment-13895833 ]

ian mccrae edited comment on CASSANDRA-6663 at 2/9/14 11:02 PM:
----------------------------------------------------------------
Snappy has been turned off, or at least should have been, as I have:
# disabled internode compression in the cassandra.yaml files on the nodes, with internode_compression: none
# called the Python driver with compression = None

However, these configuration options don't appear to be working. Also, to my knowledge Snappy hasn't been ported to ARM, so it is not an option for the Raspberry Pi. Perhaps LZ4 can be used as an alternative. So this is a bug.

was (Author: ianm):
Snappy has been turned off, or at least should have been, as I have:
# disabled internode compression in the cassandra.yaml files on the nodes, with internode_compression: none
# called the Python driver with compression = None

However, these configuration options don't appear to be working. Also, to my knowledge Snappy hasn't been ported to ARM, so it is NOT an option for the Raspberry Pi. So this is a bug.

Connecting to a Raspberry PI Cassandra Cluster crashes the node being connected to
----------------------------------------------------------------------------------

                 Key: CASSANDRA-6663
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6663
             Project: Cassandra
          Issue Type: Bug
          Components: Drivers (now out of tree)
         Environment: 4x node Raspberry Pi cluster; MacBook running IDLE 2.7
            Reporter: ian mccrae
         Attachments: Python Client Log.txt, hs_err_pid6327.log

I have a working 4x-node Raspberry Pi cluster, and
# DevCenter happily connects to it (and has an option to turn Snappy compression off)
# …however the Python driver fails to connect, and crashes the node being connected to with the errors in the error log below.

There appears to be a problem with Snappy compression (not supported on the Raspberry Pi), so I also tried compression = None, with the same result. How might I fix this?

*Python Code*
{noformat}
from cassandra.cluster import Cluster
cluster = Cluster(['192.168.200.151'], compression = None)
session = cluster.connect()
{noformat}

*Error Log*
{noformat}
Traceback (most recent call last):
  File "<pyshell#58>", line 1, in <module>
    session = cluster.connect()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/cassandra/cluster.py", line 471, in connect
    self.control_connection.connect()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/cassandra/cluster.py", line 1351, in connect
    self._set_new_connection(self._reconnect_internal())
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/cassandra/cluster.py", line 1386, in _reconnect_internal
    raise NoHostAvailable("Unable to connect to any servers", errors)
NoHostAvailable: ('Unable to connect to any servers', {'192.168.200.151': ConnectionShutdown('Connection to 192.168.200.151 is closed',)})
{noformat}

*A Dump of the cluster class attributes*
{noformat}
pprint(vars(cluster))
{'_core_connections_per_host': {0: 2, 1: 1},
 '_is_setup': True,
 '_is_shutdown': True,
 '_listener_lock': <thread.lock object at 0x10616d230>,
 '_listeners': set([]),
 '_lock': <_RLock owner=None count=0>,
 '_max_connections_per_host': {0: 8, 1: 2},
 '_max_requests_per_connection': {0: 100, 1: 100},
 '_min_requests_per_connection': {0: 5, 1: 5},
 '_prepared_statements': <WeakValueDictionary at 4396942904>,
 'compression': None,
 'contact_points': ['192.168.200.151'],
 'control_connection': <cassandra.cluster.ControlConnection object at 0x106168cd0>,
 'control_connection_timeout': 2.0,
 'cql_version': None,
 'executor': <concurrent.futures.thread.ThreadPoolExecutor object at 0x106148410>,
 'load_balancing_policy': <cassandra.policies.RoundRobinPolicy object at 0x104adae50>,
 'max_schema_agreement_wait': 10,
 'metadata': <cassandra.metadata.Metadata object at 0x1061481d0>,
 'metrics_enabled': False,
 'port': 9042,
 'scheduler': <cassandra.cluster._Scheduler object at 0x106148550>,
 'sessions': <_weakrefset.WeakSet object at 0x106148750>,
 'sockopts': None,
 'ssl_options': None}
{noformat}
[1/3] git commit: Fix EstimatedHistogram races patch by Paulo Gaspar; reviewed by jbellis for CASSANDRA-6682
Updated Branches:
  refs/heads/cassandra-2.0 800e45f48 -> 16f99c5a2
  refs/heads/trunk 319877fda -> de9be79d5

Fix EstimatedHistogram races

patch by Paulo Gaspar; reviewed by jbellis for CASSANDRA-6682

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/16f99c5a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/16f99c5a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/16f99c5a

Branch: refs/heads/cassandra-2.0
Commit: 16f99c5a2edd2df3ed9512d3598d60f845e58d19
Parents: 800e45f
Author: Jonathan Ellis <jbel...@apache.org>
Authored: Mon Feb 10 00:43:14 2014 -0600
Committer: Jonathan Ellis <jbel...@apache.org>
Committed: Mon Feb 10 00:43:14 2014 -0600

----------------------------------------------------------------------
 CHANGES.txt                                     |  2 ++
 .../cassandra/utils/EstimatedHistogram.java     | 31 ++++++++++----------
 2 files changed, 20 insertions(+), 13 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/16f99c5a/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index d32490e..b98dec7 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.6
+ * Fix EstimatedHistogram races (CASSANDRA-6682)
  * Failure detector correctly converts initial value to nanos (CASSANDRA-6658)
  * Add nodetool taketoken to relocate vnodes (CASSANDRA-4445)
  * Fix upgradesstables NPE for non-CF-based indexes (CASSANDRA-6645)
@@ -13,6 +14,7 @@ Merged from 1.2:
  * Fix mean cells and mean row size per sstable calculations (CASSANDRA-6667)
  * Compact hints after partial replay to clean out tombstones (CASSANDRA-)
+
 2.0.5
  * Reduce garbage generated by bloom filter lookups (CASSANDRA-6609)
  * Add ks.cf names to tombstone logging (CASSANDRA-6597)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/16f99c5a/src/java/org/apache/cassandra/utils/EstimatedHistogram.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/utils/EstimatedHistogram.java b/src/java/org/apache/cassandra/utils/EstimatedHistogram.java
index 1b8ffe1..5941057 100644
--- a/src/java/org/apache/cassandra/utils/EstimatedHistogram.java
+++ b/src/java/org/apache/cassandra/utils/EstimatedHistogram.java
@@ -44,7 +44,7 @@ public class EstimatedHistogram
      *
      * Each bucket represents values from (previous bucket offset, current offset].
      */
-    private long[] bucketOffsets;
+    private final long[] bucketOffsets;

     // buckets is one element longer than bucketOffsets -- the last element is values greater than the last offset
     final AtomicLongArray buckets;
@@ -56,7 +56,7 @@ public class EstimatedHistogram
     public EstimatedHistogram(int bucketCount)
     {
-        makeOffsets(bucketCount);
+        bucketOffsets = newOffsets(bucketCount);
         buckets = new AtomicLongArray(bucketOffsets.length + 1);
     }
@@ -67,19 +67,21 @@ public class EstimatedHistogram
         buckets = new AtomicLongArray(bucketData);
     }

-    private void makeOffsets(int size)
+    private static long[] newOffsets(int size)
     {
-        bucketOffsets = new long[size];
+        long[] result = new long[size];
         long last = 1;
-        bucketOffsets[0] = last;
+        result[0] = last;
         for (int i = 1; i < size; i++)
         {
             long next = Math.round(last * 1.2);
             if (next == last)
                 next++;
-            bucketOffsets[i] = next;
+            result[i] = next;
             last = next;
         }
+
+        return result;
     }
@@ -120,13 +122,15 @@ public class EstimatedHistogram
      */
     public long[] getBuckets(boolean reset)
     {
-        long[] rv = new long[buckets.length()];
-        for (int i = 0; i < buckets.length(); i++)
-            rv[i] = buckets.get(i);
+        final int len = buckets.length();
+        long[] rv = new long[len];

         if (reset)
-            for (int i = 0; i < buckets.length(); i++)
-                buckets.set(i, 0L);
+            for (int i = 0; i < len; i++)
+                rv[i] = buckets.getAndSet(i, 0L);
+        else
+            for (int i = 0; i < len; i++)
+                rv[i] = buckets.get(i);

         return rv;
     }
@@ -201,8 +205,9 @@ public class EstimatedHistogram
         long sum = 0;
         for (int i = 0; i < lastBucket; i++)
         {
-            elements += buckets.get(i);
-            sum += buckets.get(i) * bucketOffsets[i];
+            long bCount = buckets.get(i);
+            elements += bCount;
+            sum += bCount * bucketOffsets[i];
         }
         return (long) Math.ceil((double) sum / elements);
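The mean() part of the patch can also be illustrated with a deterministic sketch (the read function below is a hypothetical stand-in for AtomicLongArray.get, not the Cassandra API): if a bucket is read twice per iteration, a concurrent update landing between the reads makes the element count and the weighted sum disagree; reading once into a local keeps the two accumulators consistent, which is exactly what the bCount local in the diff does.

```python
import math

# Hypothetical stand-in for AtomicLongArray.get: a "concurrent writer"
# deterministically bumps the bucket between the first and second read.
def make_reader(values, bump_on_second_read):
    state = {"calls": 0}
    def read(i):
        state["calls"] += 1
        if bump_on_second_read and state["calls"] == 2:
            values[i] += 10        # the increment lands between the two reads
        return values[i]
    return read

def mean_racy(offsets, read):
    elements = total = 0
    for i, off in enumerate(offsets):
        elements += read(i)        # first read
        total += read(i) * off     # second read may observe a newer value
    return math.ceil(total / elements)

def mean_fixed(offsets, read):
    elements = total = 0
    for i, off in enumerate(offsets):
        count = read(i)            # single read into a local, as in the patch
        elements += count
        total += count * off
    return math.ceil(total / elements)

print(mean_racy([4], make_reader([2], True)))   # -> 24  (counts 2, sums 12*4)
print(mean_fixed([4], make_reader([2], True)))  # -> 4   (counts 2, sums 2*4)
```

The single-read version is also cheaper, since it halves the number of volatile reads per iteration.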
[2/3] git commit: Fix EstimatedHistogram races patch by Paulo Gaspar; reviewed by jbellis for CASSANDRA-6682
Fix EstimatedHistogram races

patch by Paulo Gaspar; reviewed by jbellis for CASSANDRA-6682

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/16f99c5a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/16f99c5a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/16f99c5a

Branch: refs/heads/trunk
Commit: 16f99c5a2edd2df3ed9512d3598d60f845e58d19
Parents: 800e45f
Author: Jonathan Ellis <jbel...@apache.org>
Authored: Mon Feb 10 00:43:14 2014 -0600
Committer: Jonathan Ellis <jbel...@apache.org>
Committed: Mon Feb 10 00:43:14 2014 -0600

----------------------------------------------------------------------
 CHANGES.txt                                     |  2 ++
 .../cassandra/utils/EstimatedHistogram.java     | 31 ++++++++++----------
 2 files changed, 20 insertions(+), 13 deletions(-)
----------------------------------------------------------------------
[3/3] git commit: Merge branch 'cassandra-2.0' into trunk
Merge branch 'cassandra-2.0' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/de9be79d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/de9be79d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/de9be79d

Branch: refs/heads/trunk
Commit: de9be79d55ce36a23d9795faa37d792835404982
Parents: 319877f 16f99c5
Author: Jonathan Ellis <jbel...@apache.org>
Authored: Mon Feb 10 00:43:24 2014 -0600
Committer: Jonathan Ellis <jbel...@apache.org>
Committed: Mon Feb 10 00:43:24 2014 -0600

----------------------------------------------------------------------
 CHANGES.txt                                     |  2 ++
 .../cassandra/utils/EstimatedHistogram.java     | 31 ++++++++++----------
 2 files changed, 20 insertions(+), 13 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/de9be79d/CHANGES.txt
----------------------------------------------------------------------
diff --cc CHANGES.txt
index 802f515,b98dec7..7f9f9d2
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,35 -1,5 +1,36 @@@
 +2.1
 + * add listsnapshots command to nodetool (CASSANDRA-5742)
 + * Introduce AtomicBTreeColumns (CASSANDRA-6271)
 + * Multithreaded commitlog (CASSANDRA-3578)
 + * allocate fixed index summary memory pool and resample cold index summaries
 +   to use less memory (CASSANDRA-5519)
 + * Removed multithreaded compaction (CASSANDRA-6142)
 + * Parallelize fetching rows for low-cardinality indexes (CASSANDRA-1337)
 + * change logging from log4j to logback (CASSANDRA-5883)
 + * switch to LZ4 compression for internode communication (CASSANDRA-5887)
 + * Stop using Thrift-generated Index* classes internally (CASSANDRA-5971)
 + * Remove 1.2 network compatibility code (CASSANDRA-5960)
 + * Remove leveled json manifest migration code (CASSANDRA-5996)
 + * Remove CFDefinition (CASSANDRA-6253)
 + * Use AtomicIntegerFieldUpdater in RefCountedMemory (CASSANDRA-6278)
 + * User-defined types for CQL3 (CASSANDRA-5590)
 + * Use of o.a.c.metrics in nodetool (CASSANDRA-5871, 6406)
 + * Batch read from OTC's queue and cleanup (CASSANDRA-1632)
 + * Secondary index support for collections (CASSANDRA-4511, 6383)
 + * SSTable metadata(Stats.db) format change (CASSANDRA-6356)
 + * Push composites support in the storage engine
 +   (CASSANDRA-5417, CASSANDRA-6520)
 + * Add snapshot space used to cfstats (CASSANDRA-6231)
 + * Add cardinality estimator for key count estimation (CASSANDRA-5906)
 + * CF id is changed to be non-deterministic. Data dir/key cache are created
 +   uniquely for CF id (CASSANDRA-5202)
 + * New counters implementation (CASSANDRA-6504)
 + * Replace UnsortedColumns usage with ArrayBackedSortedColumns (CASSANDRA-6630)
 + * Add option to use row cache with a given amount of rows (CASSANDRA-5357)
 + * Avoid repairing already repaired data (CASSANDRA-5351)
 +
  2.0.6
+  * Fix EstimatedHistogram races (CASSANDRA-6682)
   * Failure detector correctly converts initial value to nanos (CASSANDRA-6658)
   * Add nodetool taketoken to relocate vnodes (CASSANDRA-4445)
   * Fix upgradesstables NPE for non-CF-based indexes (CASSANDRA-6645)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/de9be79d/src/java/org/apache/cassandra/utils/EstimatedHistogram.java
----------------------------------------------------------------------
[jira] [Resolved] (CASSANDRA-6682) Loss of precision under concurrent access on o.a.c.utils.EstimatedHistogram methods
[ https://issues.apache.org/jira/browse/CASSANDRA-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-6682.
---------------------------------------
    Resolution: Fixed

Committed; thanks!
[jira] [Updated] (CASSANDRA-6682) Loss of precision under concurrent access on o.a.c.utils.EstimatedHistogram methods
[ https://issues.apache.org/jira/browse/CASSANDRA-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-6682:
--------------------------------------
       Reviewer: Jonathan Ellis
    Component/s: Core
  Fix Version/s:     (was: 1.2.16)
       Assignee: Paulo Gaspar