cassandra git commit: remove dead code
Repository: cassandra
Updated Branches:
  refs/heads/trunk 74647a8cc -> da995b72e

remove dead code

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/da995b72
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/da995b72
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/da995b72

Branch: refs/heads/trunk
Commit: da995b72e3577a058606afbbb82a873089c47a80
Parents: 74647a8
Author: Dave Brosius
Authored: Wed Jul 13 01:04:03 2016 -0400
Committer: Dave Brosius
Committed: Wed Jul 13 01:04:34 2016 -0400

----------------------------------------------------------------------
 .../apache/cassandra/db/filter/ClusteringIndexSliceFilter.java | 5 -----
 1 file changed, 5 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/da995b72/src/java/org/apache/cassandra/db/filter/ClusteringIndexSliceFilter.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/filter/ClusteringIndexSliceFilter.java b/src/java/org/apache/cassandra/db/filter/ClusteringIndexSliceFilter.java
index ba30dcf..02a44d7 100644
--- a/src/java/org/apache/cassandra/db/filter/ClusteringIndexSliceFilter.java
+++ b/src/java/org/apache/cassandra/db/filter/ClusteringIndexSliceFilter.java
@@ -94,11 +94,6 @@ public class ClusteringIndexSliceFilter extends AbstractClusteringIndexFilter
         // the range extend) and it's harmless to leave them.
         class FilterNotIndexed extends Transformation
         {
-            public boolean isDoneForPartition()
-            {
-                return tester.isDone();
-            }
-
             @Override
             public Row applyToRow(Row row)
             {
[jira] [Updated] (CASSANDRA-7384) Collect metrics on queries by consistency level
[ https://issues.apache.org/jira/browse/CASSANDRA-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sankalp kohli updated CASSANDRA-7384:
-------------------------------------
    Attachment:     (was: CASSANDRA-7384_3.0.txt)

> Collect metrics on queries by consistency level
> -----------------------------------------------
>
>                 Key: CASSANDRA-7384
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7384
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Vishy Kasar
>            Assignee: sankalp kohli
>            Priority: Minor
>             Fix For: 3.x
>
>         Attachments: CASSANDRA-7384_3.0_v2.txt
>
> We had cases where Cassandra client users thought that they were doing
> queries at one consistency level, but that turned out not to be correct. It
> would be good to collect metrics on the number of queries done at each
> consistency level on the server. See the equivalent JIRA on the Java driver:
> https://datastax-oss.atlassian.net/browse/JAVA-354

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7384) Collect metrics on queries by consistency level
[ https://issues.apache.org/jira/browse/CASSANDRA-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15374336#comment-15374336 ]

sankalp kohli commented on CASSANDRA-7384:
------------------------------------------

Changed it to EnumMap.

> Collect metrics on queries by consistency level
> -----------------------------------------------
>
>                 Key: CASSANDRA-7384
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7384
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Vishy Kasar
>            Assignee: sankalp kohli
>            Priority: Minor
>             Fix For: 3.x
>
>         Attachments: CASSANDRA-7384_3.0.txt, CASSANDRA-7384_3.0_v2.txt
>
> We had cases where Cassandra client users thought that they were doing
> queries at one consistency level, but that turned out not to be correct. It
> would be good to collect metrics on the number of queries done at each
> consistency level on the server. See the equivalent JIRA on the Java driver:
> https://datastax-oss.atlassian.net/browse/JAVA-354

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
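The attached patch is Java and, per the comment above, keys the metrics by an EnumMap of consistency levels to counters. A minimal Python sketch of the same idea follows; all names here are hypothetical illustrations, not the code from the ticket:

```python
from collections import Counter
from enum import Enum

class ConsistencyLevel(Enum):
    # A reduced stand-in for Cassandra's consistency levels.
    ONE = 1
    QUORUM = 2
    ALL = 3

class ConsistencyMetrics:
    """Counts queries per consistency level, analogous to a Java
    EnumMap<ConsistencyLevel, Counter> on the server side."""

    def __init__(self):
        # Pre-populate every level so unused levels report 0, like an EnumMap
        # initialized over all enum constants.
        self.counts = Counter({cl: 0 for cl in ConsistencyLevel})

    def record(self, cl: ConsistencyLevel) -> None:
        self.counts[cl] += 1

# Usage: record each incoming query's consistency level.
metrics = ConsistencyMetrics()
for cl in (ConsistencyLevel.QUORUM, ConsistencyLevel.QUORUM, ConsistencyLevel.ONE):
    metrics.record(cl)
```

Keying by the enum (rather than, say, a string) keeps lookups O(1) over a fixed domain, which is the reason an EnumMap is the natural choice in the Java patch.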
[jira] [Updated] (CASSANDRA-7384) Collect metrics on queries by consistency level
[ https://issues.apache.org/jira/browse/CASSANDRA-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sankalp kohli updated CASSANDRA-7384:
-------------------------------------
    Attachment: CASSANDRA-7384_3.0_v2.txt

> Collect metrics on queries by consistency level
> -----------------------------------------------
>
>                 Key: CASSANDRA-7384
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7384
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Vishy Kasar
>            Assignee: sankalp kohli
>            Priority: Minor
>             Fix For: 3.x
>
>         Attachments: CASSANDRA-7384_3.0.txt, CASSANDRA-7384_3.0_v2.txt
>
> We had cases where Cassandra client users thought that they were doing
> queries at one consistency level, but that turned out not to be correct. It
> would be good to collect metrics on the number of queries done at each
> consistency level on the server. See the equivalent JIRA on the Java driver:
> https://datastax-oss.atlassian.net/browse/JAVA-354

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12158) dtest failure in thrift_tests.TestMutations.test_describe_keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-12158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip Thompson updated CASSANDRA-12158:
----------------------------------------
    Status: Patch Available  (was: Open)

https://github.com/riptano/cassandra-dtest/pull/1091

> dtest failure in thrift_tests.TestMutations.test_describe_keyspace
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-12158
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12158
>             Project: Cassandra
>          Issue Type: Test
>            Reporter: Sean McCarthy
>            Assignee: Philip Thompson
>              Labels: dtest
>         Attachments: node1.log
>
> example failure:
> http://cassci.datastax.com/job/cassandra-2.1_dtest/492/testReport/thrift_tests/TestMutations/test_describe_keyspace
> Failed on CassCI build cassandra-2.1_dtest #492
> {code}
> Stacktrace
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
>     testMethod()
>   File "/home/automaton/cassandra-dtest/thrift_tests.py", line 1507, in test_describe_keyspace
>     assert len(kspaces) == 4, [x.name for x in kspaces]  # ['Keyspace2', 'Keyspace1', 'system', 'system_traces']
> AssertionError: ['Keyspace2', 'system', 'Keyspace1', 'ValidKsForUpdate', 'system_traces']
> {code}
> Related failures:
> http://cassci.datastax.com/job/cassandra-2.2_novnode_dtest/304/testReport/thrift_tests/TestMutations/test_describe_keyspace/
> http://cassci.datastax.com/job/cassandra-3.0_dtest/767/testReport/thrift_tests/TestMutations/test_describe_keyspace/
> http://cassci.datastax.com/job/cassandra-3.0_novnode_dtest/264/testReport/thrift_tests/TestMutations/test_describe_keyspace/
> http://cassci.datastax.com/job/trunk_dtest/1301/testReport/thrift_tests/TestMutations/test_describe_keyspace/
> http://cassci.datastax.com/job/trunk_novnode_dtest/421/testReport/thrift_tests/TestMutations/test_describe_keyspace/
> http://cassci.datastax.com/job/cassandra-3.9_dtest/6/testReport/thrift_tests/TestMutations/test_describe_keyspace/

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
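The assertion error above shows the underlying cause: a ValidKsForUpdate keyspace leaked from another test in the same run, so the bare length check fails with only a list dump as the message. A small illustrative helper (hypothetical, not the actual dtest patch) that compares sets instead, which both ignores ordering and names exactly which keyspaces leaked:

```python
# The keyspaces this test expects to exist after setup.
EXPECTED = {"Keyspace1", "Keyspace2", "system", "system_traces"}

def leftover_keyspaces(kspaces):
    """Return keyspaces present that the test did not create --
    leakage from an earlier test sharing the cluster."""
    return set(kspaces) - EXPECTED

# The listing from the failing run quoted above.
observed = ["Keyspace2", "system", "Keyspace1", "ValidKsForUpdate", "system_traces"]
extras = leftover_keyspaces(observed)
```

A set comparison like `assert not extras, extras` would make the flaky failure self-explanatory, instead of forcing the reader to diff two lists by eye.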
[jira] [Updated] (CASSANDRA-12150) cqlsh does not automatically downgrade CQL version
[ https://issues.apache.org/jira/browse/CASSANDRA-12150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefania updated CASSANDRA-12150:
---------------------------------
       Resolution: Fixed
         Assignee: Yusuke Takata
    Fix Version/s: 3.10
           Status: Resolved  (was: Patch Available)

> cqlsh does not automatically downgrade CQL version
> --------------------------------------------------
>
>                 Key: CASSANDRA-12150
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12150
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Yusuke Takata
>            Assignee: Yusuke Takata
>            Priority: Minor
>              Labels: cqlsh
>             Fix For: 3.10
>
>         Attachments: patch.txt
>
> Cassandra drivers such as the Python driver can automatically connect with a
> supported version, but I found that cqlsh does not automatically downgrade
> the CQL version, as shown below:
> {code}
> $ cqlsh
> Connection error: ('Unable to connect to any servers', {'127.0.0.1':
> ProtocolError("cql_version '3.4.2' is not supported by remote (w/ native
> protocol). Supported versions: [u'3.4.0']",)})
> {code}
> I think that this function would be useful for cqlsh too.
> Could someone review the attached patch?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12150) cqlsh does not automatically downgrade CQL version
[ https://issues.apache.org/jira/browse/CASSANDRA-12150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefania updated CASSANDRA-12150:
---------------------------------
    Component/s:     (was: CQL)
                 Tools

> cqlsh does not automatically downgrade CQL version
> --------------------------------------------------
>
>                 Key: CASSANDRA-12150
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12150
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Yusuke Takata
>            Priority: Minor
>              Labels: cqlsh
>         Attachments: patch.txt
>
> Cassandra drivers such as the Python driver can automatically connect with a
> supported version, but I found that cqlsh does not automatically downgrade
> the CQL version, as shown below:
> {code}
> $ cqlsh
> Connection error: ('Unable to connect to any servers', {'127.0.0.1':
> ProtocolError("cql_version '3.4.2' is not supported by remote (w/ native
> protocol). Supported versions: [u'3.4.0']",)})
> {code}
> I think that this function would be useful for cqlsh too.
> Could someone review the attached patch?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12150) cqlsh does not automatically downgrade CQL version
[ https://issues.apache.org/jira/browse/CASSANDRA-12150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15374177#comment-15374177 ]

Stefania commented on CASSANDRA-12150:
--------------------------------------

Thanks for the review. Committed to trunk as 74647a8cc5b18b86ee5cfdb071a8644630a1e7f3 with the following in NEWS.txt:

{code}
   - cqlsh can now connect to older Cassandra versions by downgrading the native
     protocol version. Please note that this is currently not part of our release
     testing and, as a consequence, it is not guaranteed to work in all cases.
     See CASSANDRA-12150 for more details.
{code}

We can consider adding a warning later on if, as you said, it causes frequent problems.

> cqlsh does not automatically downgrade CQL version
> --------------------------------------------------
>
>                 Key: CASSANDRA-12150
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12150
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Yusuke Takata
>            Priority: Minor
>              Labels: cqlsh
>         Attachments: patch.txt
>
> Cassandra drivers such as the Python driver can automatically connect with a
> supported version, but I found that cqlsh does not automatically downgrade
> the CQL version, as shown below:
> {code}
> $ cqlsh
> Connection error: ('Unable to connect to any servers', {'127.0.0.1':
> ProtocolError("cql_version '3.4.2' is not supported by remote (w/ native
> protocol). Supported versions: [u'3.4.0']",)})
> {code}
> I think that this function would be useful for cqlsh too.
> Could someone review the attached patch?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
cassandra git commit: cqlsh does not automatically downgrade CQL version
Repository: cassandra
Updated Branches:
  refs/heads/trunk adffb3602 -> 74647a8cc

cqlsh does not automatically downgrade CQL version

patch by Yusuke Takata and Stefania Alborghetti; reviewed by Stefania Alborghetti and Tyler Hobbs for CASSANDRA-12150

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/74647a8c
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/74647a8c
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/74647a8c

Branch: refs/heads/trunk
Commit: 74647a8cc5b18b86ee5cfdb071a8644630a1e7f3
Parents: adffb36
Author: Yusuke Takata
Authored: Mon Jul 11 09:56:23 2016 +0800
Committer: Stefania Alborghetti
Committed: Wed Jul 13 10:00:21 2016 +0800

----------------------------------------------------------------------
 CHANGES.txt                                  |  1 +
 NEWS.txt                                     |  5 ++
 bin/cqlsh.py                                 | 22 +++---
 pylib/cqlshlib/test/cassconnect.py           |  8 +--
 pylib/cqlshlib/test/test_cqlsh_completion.py |  2 +-
 pylib/cqlshlib/test/test_cqlsh_output.py     | 86 ++-
 6 files changed, 60 insertions(+), 64 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/74647a8c/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index df07ba0..f63bd2b 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.10
+ * cqlsh does not automatically downgrade CQL version (CASSANDRA-12150)
  * Omit (de)serialization of state variable in UDAs (CASSANDRA-9613)
  * Create a system table to expose prepared statements (CASSANDRA-8831)
  * Reuse DataOutputBuffer from ColumnIndex (CASSANDRA-11970)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/74647a8c/NEWS.txt
----------------------------------------------------------------------
diff --git a/NEWS.txt b/NEWS.txt
index 52eee1a..fd6f005 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -24,6 +24,11 @@ New features
      previously prepared statements - i.e. in many cases clients do not need to
      re-prepare statements against restarted nodes.
+   - cqlsh can now connect to older Cassandra versions by downgrading the native
+     protocol version. Please note that this is currently not part of our release
+     testing and, as a consequence, it is not guaranteed to work in all cases.
+     See CASSANDRA-12150 for more details.
+
 Upgrading
 ---------
    - Nothing specific to 3.10 but please see previous versions upgrading section,

http://git-wip-us.apache.org/repos/asf/cassandra/blob/74647a8c/bin/cqlsh.py
----------------------------------------------------------------------
diff --git a/bin/cqlsh.py b/bin/cqlsh.py
index 3e03767..7877800 100644
--- a/bin/cqlsh.py
+++ b/bin/cqlsh.py
@@ -177,7 +177,6 @@
 DEFAULT_HOST = '127.0.0.1'
 DEFAULT_PORT = 9042
 DEFAULT_SSL = False
-DEFAULT_CQLVER = '3.4.2'
 DEFAULT_PROTOCOL_VERSION = 4
 DEFAULT_CONNECT_TIMEOUT_SECONDS = 5
 DEFAULT_REQUEST_TIMEOUT_SECONDS = 10
@@ -219,8 +218,9 @@
 parser.add_option('--debug', action='store_true',
 parser.add_option("--encoding", help="Specify a non-default encoding for output." +
                   " (Default: %s)" % (UTF8,))
 parser.add_option("--cqlshrc", help="Specify an alternative cqlshrc file location.")
-parser.add_option('--cqlversion', default=DEFAULT_CQLVER,
-                  help='Specify a particular CQL version (default: %default).'
+parser.add_option('--cqlversion', default=None,
+                  help='Specify a particular CQL version, '
+                       'by default the highest version supported by the server will be used.'
                        ' Examples: "3.0.3", "3.1.0"')
 parser.add_option("-e", "--execute", help='Execute the statement and quit.')
 parser.add_option("--connect-timeout", default=DEFAULT_CONNECT_TIMEOUT_SECONDS, dest='connect_timeout',
@@ -662,7 +662,7 @@ class Shell(cmd.Cmd):
     def __init__(self, hostname, port, color=False,
                  username=None, password=None, encoding=None, stdin=None, tty=True,
                  completekey=DEFAULT_COMPLETEKEY, browser=None, use_conn=None,
-                 cqlver=DEFAULT_CQLVER, keyspace=None,
+                 cqlver=None, keyspace=None,
                  tracing_enabled=False, expand_enabled=False,
                  display_nanotime_format=DEFAULT_NANOTIME_FORMAT,
                  display_timestamp_format=DEFAULT_TIMESTAMP_FORMAT,
@@ -701,7 +701,6 @@ class Shell(cmd.Cmd):
                                 control_connection_timeout=connect_timeout,
                                 connect_timeout=connect_timeout)
         self.owns_connection = not use_conn
-        self.set_expanded_cql_version(cqlver)
         if
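The downgrade behavior this commit adds boils down to trying the highest version first and falling back to lower ones on a protocol error. A hedged Python sketch of that retry loop follows; `connect`, the version tuple, and the error type are illustrative stand-ins, not the actual cqlsh or driver API:

```python
class ProtocolError(Exception):
    """Stand-in for the driver error raised on a version mismatch."""

def connect(host, protocol_version, server_supports=3):
    # Stand-in for a real driver connection attempt; raises if the server
    # does not speak the requested native protocol version.
    if protocol_version > server_supports:
        raise ProtocolError("version %d not supported" % protocol_version)
    return {"host": host, "protocol_version": protocol_version}

def connect_with_downgrade(host, versions=(4, 3, 2, 1)):
    # Try the highest version first; on a protocol error, retry with the
    # next lower one. Re-raise the last error if nothing succeeds.
    last_err = None
    for v in versions:
        try:
            return connect(host, v)
        except ProtocolError as e:
            last_err = e
    raise last_err

session = connect_with_downgrade("127.0.0.1")
```

As the NEWS.txt entry warns, this kind of negotiation is best-effort: it only helps when the failure is a clean version-mismatch error, which is why the feature is shipped without a hard guarantee.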
[jira] [Commented] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator
[ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15374158#comment-15374158 ]

Stefania commented on CASSANDRA-9318:
-------------------------------------

bq. Right, but there isn't much we can do without way more invasive changes. Anyway, I don't think that's actually a problem, as if the coordinator is overloaded we'll end up generating too many hints and fail with OverloadedException (this time with its original meaning), so we should be covered.

I tend to agree that it is an approximation we can live with; I would also rather not change the lower levels of the messaging service for this.

bq. Does it mean we should advance the protocol version in this issue, or delegate to a new issue?

We have a number of issues waiting for protocol V5; they are labeled {{protocolv5}}. Either we make this issue dependent on V5 as well or, since we are committing this as disabled, we delegate to a new issue that is dependent on V5.

bq. Do you see any complexity I'm missing there?

A new flag would involve a new version, and it would need to be handled during rolling upgrades. Even if on its own it is not too complex, the system in its entirety becomes even more complex (different versions, compression, cross-node timeouts, some verbs are droppable, others aren't, and the list goes on). Unless it solves a problem, I don't think we should consider it; and we are saying in other parts of this conversation that hints are no longer a problem.

bq. as the advantage would be increased consistency at the expense of more resource consumption,

We don't increase consistency if the client has been told the mutation failed, IMO. If we are instead referring to replicas that were out of the CL pool and temporarily overloaded, I think they are better off dropping mutations and handling them later on through hints. Basically, I see dropping mutations replica-side as a self-defense mechanism for replicas. I don't think we should remove it; rather, we should focus on a backpressure strategy such that replicas don't need to drop mutations. Also, for the time being, I'd rather focus on the major issue, which is that we haven't reached consensus on how to apply backpressure yet, and propose this new idea in a follow-up ticket if backpressure is successful.

bq. These are valid concerns of course, and given similar concerns from Jonathan Ellis, I'm working on some changes to avoid write timeouts due to healthy replicas unnaturally throttled by unhealthy ones, and depending on Jonathan Ellis answer to my last comment above, maybe only actually back-pressure if the CL is not met.

OK, so we are basically trying to address the 3 scenarios by throttling/failing only if the system as a whole cannot handle the mutations (that is, at least CL replicas are slow/overloaded), whereas if fewer than CL replicas are slow/overloaded, those replicas get hinted?

> Bound the number of in-flight requests at the coordinator
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9318
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths, Streaming and Messaging
>            Reporter: Ariel Weisberg
>            Assignee: Sergio Bossa
>         Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, limit.btm, no_backpressure.png
>
> It's possible to somewhat bound the amount of load accepted into the cluster
> by bounding the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding
> bytes and requests and, if it reaches a high watermark, disable read on
> client connections until it goes back below some low watermark.
> Need to make sure that disabling read on the client connection won't
> introduce other issues.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
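The watermark scheme described in the ticket (track outstanding requests; stop reading from client connections at a high watermark, resume once they drain below a low watermark) can be sketched as follows. This is an illustration of the idea only, with hypothetical names, not Cassandra's implementation:

```python
import threading

class InFlightLimiter:
    """Bounds in-flight requests with high/low watermarks: intake is paused
    at `high` outstanding requests and resumed once they drain to `low`."""

    def __init__(self, high, low):
        assert low < high
        self.high, self.low = high, low
        self.outstanding = 0
        self.reads_enabled = True
        self.lock = threading.Lock()

    def on_request(self):
        with self.lock:
            self.outstanding += 1
            if self.outstanding >= self.high:
                self.reads_enabled = False  # stop reading from client sockets

    def on_response(self):
        with self.lock:
            self.outstanding -= 1
            if not self.reads_enabled and self.outstanding <= self.low:
                self.reads_enabled = True   # drained enough: resume reading

# Usage: three requests hit the high watermark, two responses drain past low.
limiter = InFlightLimiter(high=3, low=1)
for _ in range(3):
    limiter.on_request()
paused = not limiter.reads_enabled
limiter.on_response()
limiter.on_response()
resumed = limiter.reads_enabled
```

The gap between the two watermarks is deliberate hysteresis: with a single threshold the limiter would flap between paused and resumed on every request/response pair at the boundary.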
[jira] [Updated] (CASSANDRA-11698) dtest failure in materialized_views_test.TestMaterializedViews.clustering_column_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joshua McKenzie updated CASSANDRA-11698:
----------------------------------------
    Assignee: Carl Yeksigian

> dtest failure in materialized_views_test.TestMaterializedViews.clustering_column_test
> -------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-11698
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11698
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Russ Hatch
>            Assignee: Carl Yeksigian
>              Labels: dtest
>         Attachments: node1.log, node1_debug.log, node2.log, node2_debug.log, node3.log, node3_debug.log
>
> Recent failure; the test has also flapped once a while back.
> {noformat}
> Expecting 2 users, got 1
> {noformat}
> http://cassci.datastax.com/job/cassandra-3.0_dtest/688/testReport/materialized_views_test/TestMaterializedViews/clustering_column_test
> Failed on CassCI build cassandra-3.0_dtest #688

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12184) incorrect compaction log information on totalSourceRows in C* pre-3.8 versions
[ https://issues.apache.org/jira/browse/CASSANDRA-12184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremiah Jordan updated CASSANDRA-12184:
----------------------------------------
    Fix Version/s: 3.0.x

> incorrect compaction log information on totalSourceRows in C* pre-3.8 versions
> ------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-12184
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12184
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction, Observability
>            Reporter: Wei Deng
>            Priority: Minor
>             Fix For: 3.0.x
>
> I was looking at some confusing compaction log information on C* 3.0.7 and realized that we have a bug that was trivially fixed in C* 3.8.
> Here are the log entries in debug.log (most compaction-related logging has been moved to debug.log by the adjustment in CASSANDRA-10241):
> {noformat}
> DEBUG [CompactionExecutor:6] 2016-07-12 15:38:28,471 CompactionTask.java:217 - Compacted (96aa1ba6-4846-11e6-adb7-17866fa8ddfd) 4 sstables to [/var/lib/cassandra/data/keyspace1/standard1-713f7920484411e6adb717866fa8ddfd/mb-5-big,] to level=0. 267,974,735 bytes to 78,187,400 (~29% of original) in 39,067ms = 1.908652MB/s. 0 total partitions merged to 332,904. Partition merge counts were {1:9008, 2:34822, 3:74505, 4:214569, }
> DEBUG [CompactionExecutor:4] 2016-07-12 20:51:56,578 CompactionTask.java:217 - Compacted (786cd9d0-4872-11e6-8755-79a37e6d8141) 4 sstables to [/var/lib/cassandra/data/system_schema/indexes-0feb57ac311f382fba6d9024d305702f/mb-25-big,] to level=0. 620 bytes to 498 (~80% of original) in 51ms = 0.009312MB/s. 0 total partitions merged to 6. Partition merge counts were {1:4, 3:2, }
> DEBUG [CompactionExecutor:4] 2016-07-12 20:51:58,345 CompactionTask.java:217 - Compacted (79771de0-4872-11e6-8755-79a37e6d8141) 4 sstables to [/var/lib/cassandra/data/system_schema/columns-24101c25a2ae3af787c1b40ee1aca33f/mb-65-big,] to level=0. 14,113 bytes to 9,553 (~67% of original) in 70ms = 0.130149MB/s. 0 total partitions merged to 16. Partition merge counts were {1:13, 2:2, 3:1, }
> DEBUG [CompactionExecutor:3] 2016-07-12 20:52:00,415 CompactionTask.java:217 - Compacted (7ab6a2c0-4872-11e6-8755-79a37e6d8141) 4 sstables to [/var/lib/cassandra/data/system_schema/keyspaces-abac5682dea631c5b535b3d6cffd0fb6/mb-85-big,] to level=0. 1,066 bytes to 611 (~57% of original) in 48ms = 0.012139MB/s. 0 total partitions merged to 16. Partition merge counts were {1:13, 2:2, 4:1, }
> DEBUG [CompactionExecutor:4] 2016-07-12 20:52:00,442 CompactionTask.java:217 - Compacted (7abae880-4872-11e6-8755-79a37e6d8141) 4 sstables to [/var/lib/cassandra/data/system_schema/tables-afddfb9dbc1e30688056eed6c302ba09/mb-77-big,] to level=0. 6,910 bytes to 4,396 (~63% of original) in 48ms = 0.087341MB/s. 0 total partitions merged to 16. Partition merge counts were {1:13, 2:2, 3:1, }
> {noformat}
> Note that no matter whether it's a system table or a user table, the log always shows "0 total partitions merged to xx", which is incorrect. The cause is this code segment: https://github.com/apache/cassandra/blob/cassandra-3.0.7/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L215-217. It initializes totalSourceRows to 0 and never assigns the real calculated value to it. The latest [commit|https://github.com/tjake/cassandra/blob/dc2951d1684777cf70aab401515d755699af99bc/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L225-226] from CASSANDRA-12080 fixed this problem, but it only went into the 3.8 branch.
> Since this can make people doubt the accuracy of compaction-related log entries, and the changes made in CASSANDRA-12080 are only log/metrics-related, low-impact changes, I'd advocate backporting the change from CASSANDRA-12080 to the C*-3.0 branch: many people's production C*-3.0 clusters would benefit from the bug fix, along with better compaction log information in general.
> I realize that CASSANDRA-12080 may be based on the C*-3.6 changes in CASSANDRA-10805, so we may have to bring in changes from CASSANDRA-10805 as well if CASSANDRA-12080 cannot be cleanly rebased on the C*-3.0 branch, but both would benefit compaction observability in production on C*-3.0.x versions, so both should be welcome changes in the C*-3.0 branch.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
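The log lines above make the symptom easy to check: each entry reports "0 total partitions merged" even though the merge-count histogram it prints implies a non-zero total. Recomputing the source total from such a histogram shows what the counter should have said; this is a sketch of the quantity involved, not the actual CompactionTask code:

```python
def total_source_partitions(merge_counts):
    """merge_counts maps 'number of input sstables a partition appeared in'
    to 'how many output partitions had that merge count'; each output
    partition with merge count k consumed k source copies."""
    return sum(k * n for k, n in merge_counts.items())

# Histogram from one of the debug.log lines above: {1:13, 2:2, 3:1, }
# The entry reads "0 total partitions merged to 16", but 13 partitions came
# from 1 sstable, 2 from 2, and 1 from 3 -- so the source total is non-zero.
total = total_source_partitions({1: 13, 2: 2, 3: 1})
```

In other words, a correctly accumulated counter would print a positive source total "merged to 16" for that compaction, which is exactly the value the 3.8-era fix restores by actually assigning the accumulator.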
[jira] [Updated] (CASSANDRA-12133) Failed to load Java8 implementation ohc-core-j8
[ https://issues.apache.org/jira/browse/CASSANDRA-12133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Stupp updated CASSANDRA-12133:
-------------------------------------
    Fix Version/s:     (was: 3.x)
                   3.0.x

> Failed to load Java8 implementation ohc-core-j8
> -----------------------------------------------
>
>                 Key: CASSANDRA-12133
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12133
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Ubuntu 14.04, Java 1.8.0_91
>            Reporter: Mike
>            Assignee: Robert Stupp
>            Priority: Trivial
>             Fix For: 3.0.x
>
> After enabling the row cache in cassandra.yaml by setting row_cache_size_in_mb, I receive this warning in system.log during startup:
> {noformat}
> WARN [main] 2016-07-05 13:36:14,671 Uns.java:169 - Failed to load Java8 implementation ohc-core-j8 : java.lang.NoSuchMethodException: org.caffinitas.ohc.linked.UnsExt8.<init>(java.lang.Class)
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12184) incorrect compaction log information on totalSourceRows in C* pre-3.8 versions
[ https://issues.apache.org/jira/browse/CASSANDRA-12184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei Deng updated CASSANDRA-12184:
---------------------------------
    Description:

I was looking at some confusing compaction log information on C* 3.0.7 and realized that we have a bug that was trivially fixed in C* 3.8.

Here are the log entries in debug.log (most compaction-related logging has been moved to debug.log by the adjustment in CASSANDRA-10241):

{noformat}
DEBUG [CompactionExecutor:6] 2016-07-12 15:38:28,471 CompactionTask.java:217 - Compacted (96aa1ba6-4846-11e6-adb7-17866fa8ddfd) 4 sstables to [/var/lib/cassandra/data/keyspace1/standard1-713f7920484411e6adb717866fa8ddfd/mb-5-big,] to level=0. 267,974,735 bytes to 78,187,400 (~29% of original) in 39,067ms = 1.908652MB/s. 0 total partitions merged to 332,904. Partition merge counts were {1:9008, 2:34822, 3:74505, 4:214569, }
DEBUG [CompactionExecutor:4] 2016-07-12 20:51:56,578 CompactionTask.java:217 - Compacted (786cd9d0-4872-11e6-8755-79a37e6d8141) 4 sstables to [/var/lib/cassandra/data/system_schema/indexes-0feb57ac311f382fba6d9024d305702f/mb-25-big,] to level=0. 620 bytes to 498 (~80% of original) in 51ms = 0.009312MB/s. 0 total partitions merged to 6. Partition merge counts were {1:4, 3:2, }
DEBUG [CompactionExecutor:4] 2016-07-12 20:51:58,345 CompactionTask.java:217 - Compacted (79771de0-4872-11e6-8755-79a37e6d8141) 4 sstables to [/var/lib/cassandra/data/system_schema/columns-24101c25a2ae3af787c1b40ee1aca33f/mb-65-big,] to level=0. 14,113 bytes to 9,553 (~67% of original) in 70ms = 0.130149MB/s. 0 total partitions merged to 16. Partition merge counts were {1:13, 2:2, 3:1, }
DEBUG [CompactionExecutor:3] 2016-07-12 20:52:00,415 CompactionTask.java:217 - Compacted (7ab6a2c0-4872-11e6-8755-79a37e6d8141) 4 sstables to [/var/lib/cassandra/data/system_schema/keyspaces-abac5682dea631c5b535b3d6cffd0fb6/mb-85-big,] to level=0. 1,066 bytes to 611 (~57% of original) in 48ms = 0.012139MB/s. 0 total partitions merged to 16. Partition merge counts were {1:13, 2:2, 4:1, }
DEBUG [CompactionExecutor:4] 2016-07-12 20:52:00,442 CompactionTask.java:217 - Compacted (7abae880-4872-11e6-8755-79a37e6d8141) 4 sstables to [/var/lib/cassandra/data/system_schema/tables-afddfb9dbc1e30688056eed6c302ba09/mb-77-big,] to level=0. 6,910 bytes to 4,396 (~63% of original) in 48ms = 0.087341MB/s. 0 total partitions merged to 16. Partition merge counts were {1:13, 2:2, 3:1, }
{noformat}

Note that no matter whether it's a system table or a user table, the log always shows "0 total partitions merged to xx", which is incorrect. The cause is this code segment: https://github.com/apache/cassandra/blob/cassandra-3.0.7/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L215-217. It initializes totalSourceRows to 0 and never assigns the real calculated value to it. The latest [commit|https://github.com/tjake/cassandra/blob/dc2951d1684777cf70aab401515d755699af99bc/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L225-226] from CASSANDRA-12080 fixed this problem, but it only went into the 3.8 branch.

Since this can make people doubt the accuracy of compaction-related log entries, and the changes made in CASSANDRA-12080 are only log/metrics-related, low-impact changes, I'd advocate backporting the change from CASSANDRA-12080 to the C*-3.0 branch: many people's production C*-3.0 clusters would benefit from the bug fix, along with better compaction log information in general.

I realize that CASSANDRA-12080 may be based on the C*-3.6 changes in CASSANDRA-10805, so we may have to bring in changes from CASSANDRA-10805 as well if CASSANDRA-12080 cannot be cleanly rebased on the C*-3.0 branch, but both would benefit compaction observability in production on C*-3.0.x versions, so both should be welcome changes in the C*-3.0 branch.
[jira] [Created] (CASSANDRA-12184) incorrect compaction log information on totalSourceRows in C* pre-3.8 versions
Wei Deng created CASSANDRA-12184: Summary: incorrect compaction log information on totalSourceRows in C* pre-3.8 versions Key: CASSANDRA-12184 URL: https://issues.apache.org/jira/browse/CASSANDRA-12184 Project: Cassandra Issue Type: Bug Components: Compaction, Observability Reporter: Wei Deng Priority: Minor I was looking at some confusing compaction log information on C* 3.0.7 and realized that we have a bug that was trivially fixed in C* 3.8. Basically here is the log entry in debug.log (as most of the compaction related log have been moved to debug.log due to adjustment in CASSANDRA-10241). {noformat} DEBUG [CompactionExecutor:6] 2016-07-12 15:38:28,471 CompactionTask.java:217 - Compacted (96aa1ba6-4846-11e6-adb7-17866fa8ddfd) 4 sstables to [/var/lib/cassandra/data/keyspace1/standard1-713f7920484411e6adb717866fa8ddfd/mb-5-big,] to level=0. 267,974,735 bytes to 78,187,400 (~29% of original) in 39,067ms = 1.908652MB/s. 0 total partitions merged to 332,904. Partition merge counts were {1:9008, 2:34822, 3:74505, 4:214569, } DEBUG [CompactionExecutor:4] 2016-07-12 20:51:56,578 CompactionTask.java:217 - Compacted (786cd9d0-4872-11e6-8755-79a37e6d8141) 4 sstables to [/var/lib/cassandra/data/system_schema/indexes-0feb57ac311f382fba6d9024d305702f/mb-25-big,] to level=0. 620 bytes to 498 (~80% of original) in 51ms = 0.009312MB/s. 0 total partitions merged to 6. Partition merge counts were {1:4, 3:2, } DEBUG [CompactionExecutor:4] 2016-07-12 20:51:58,345 CompactionTask.java:217 - Compacted (79771de0-4872-11e6-8755-79a37e6d8141) 4 sstables to [/var/lib/cassandra/data/system_schema/columns-24101c25a2ae3af787c1b40ee1aca33f/mb-65-big,] to level=0. 14,113 bytes to 9,553 (~67% of original) in 70ms = 0.130149MB/s. 0 total partitions merged to 16. 
Partition merge counts were {1:13, 2:2, 3:1, } DEBUG [CompactionExecutor:3] 2016-07-12 20:52:00,415 CompactionTask.java:217 - Compacted (7ab6a2c0-4872-11e6-8755-79a37e6d8141) 4 sstables to [/var/lib/cassandra/data/system_schema/keyspaces-abac5682dea631c5b535b3d6cffd0fb6/mb-85-big,] to level=0. 1,066 bytes to 611 (~57% of original) in 48ms = 0.012139MB/s. 0 total partitions merged to 16. Partition merge counts were {1:13, 2:2, 4:1, } DEBUG [CompactionExecutor:4] 2016-07-12 20:52:00,442 CompactionTask.java:217 - Compacted (7abae880-4872-11e6-8755-79a37e6d8141) 4 sstables to [/var/lib/cassandra/data/system_schema/tables-afddfb9dbc1e30688056eed6c302ba09/mb-77-big,] to level=0. 6,910 bytes to 4,396 (~63% of original) in 48ms = 0.087341MB/s. 0 total partitions merged to 16. Partition merge counts were {1:13, 2:2, 3:1, } {noformat} Note no matter if it's system table or user table, it's always showing "0 total partitions merged to xx", which is incorrect information due to this code segment https://github.com/apache/cassandra/blob/cassandra-3.0.7/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L215-217. Basically it only initialized the totalSourceRows value with 0 and never assigned a real calculated value to it. Looks like the latest commit from CASSANDRA-12080 fixed this problem, but it only got checked into 3.8 branch. Since this can make people doubt the accuracy of compaction related log entries, and the changes made in CASSANDRA-12080 are only log metrics related, low impact changes, I'd advocate we backport the change from CASSANDRA-12080 into C*-3.0 branch as many people's production C*-3.0 version can benefit from the bug fix, along with better compaction log information in general. 
I realize that CASSANDRA-12080 may be based on the C*-3.6 changes in CASSANDRA-10805, so this means we may have to bring in changes from CASSANDRA-10805 as well if CASSANDRA-12080 cannot be cleanly rebased on C*-3.0 branch, but both are going to benefit compaction observability in production on C*-3.0.x versions, so both should be welcomed changes in C*-3.0 branch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
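The pattern behind this bug is easy to sketch. The following is a simplified, hypothetical stand-in for the CompactionTask logic (names and the weighting are illustrative, not the actual code): the fixed variant, in the spirit of CASSANDRA-12080, sums the per-merge partition counts weighted by how many source sstable partitions contributed to each merge. For the first log entry's merge counts {1:9008, 2:34822, 3:74505, 4:214569}, the output total of 332,904 matches the log, while the source total should have been 1,160,443 rather than 0.

```java
// Simplified, hypothetical sketch of the totalSourceRows bug; names are
// stand-ins for the real CompactionTask fields, not the actual patch.
public class TotalSourceRowsSketch
{
    // Buggy variant: totalSourceRows is initialized to 0, never accumulated,
    // and then logged, so the log always reads "0 total partitions merged".
    static long totalSourceRowsBuggy(long[] mergeCounts)
    {
        long totalSourceRows = 0;
        // ... the merge loop runs here but never touches totalSourceRows ...
        return totalSourceRows;
    }

    // Fixed variant: mergeCounts[i] output partitions were each produced by
    // merging (i + 1) source partitions, so the source total is the
    // weighted sum of the merge-count histogram.
    static long totalSourceRowsFixed(long[] mergeCounts)
    {
        long totalSourceRows = 0;
        for (int i = 0; i < mergeCounts.length; i++)
            totalSourceRows += (i + 1) * mergeCounts[i];
        return totalSourceRows;
    }
}
```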
[jira] [Updated] (CASSANDRA-9613) Omit (de)serialization of state variable in UDAs
[ https://issues.apache.org/jira/browse/CASSANDRA-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-9613: Resolution: Fixed Fix Version/s: (was: 3.x) 3.10 Status: Resolved (was: Ready to Commit) Thank you! Committed as [adffb3602033273efdbb8b5303c62dbf33c36903|https://github.com/apache/cassandra/commit/adffb3602033273efdbb8b5303c62dbf33c36903] to [trunk|https://github.com/apache/cassandra/tree/trunk] > Omit (de)serialization of state variable in UDAs > > > Key: CASSANDRA-9613 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9613 > Project: Cassandra > Issue Type: Improvement >Reporter: Robert Stupp >Assignee: Robert Stupp >Priority: Minor > Fix For: 3.10 > > > Currently the result of each UDA's state function call is serialized and then > deserialized for the next state-function invocation and optionally final > function invocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
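To make the improvement concrete, here is an illustrative sketch (not Cassandra's UDF API) contrasting the two evaluation strategies: round-tripping the aggregation state through its serialized form between every state-function call, versus keeping it as a plain Java object until the final result is needed.

```java
import java.nio.ByteBuffer;

// Illustrative sketch of the UDA change described above; the state function,
// serialization format, and class names are hypothetical stand-ins.
public class UdaStateSketch
{
    // stand-in "state function": fold one value into a running sum
    static long stateFn(long state, long value) { return state + value; }

    // pre-9613 style: the state is serialized after every state-function
    // call and deserialized again before the next one
    static long aggregateWithRoundTrips(long[] values)
    {
        ByteBuffer state = serialize(0L);
        for (long v : values)
            state = serialize(stateFn(deserialize(state), v));
        return deserialize(state);
    }

    // post-9613 style: the state stays a plain Java value between calls
    static long aggregateDirect(long[] values)
    {
        long state = 0L;
        for (long v : values)
            state = stateFn(state, v);
        return state;
    }

    static ByteBuffer serialize(long v)
    {
        return (ByteBuffer) ByteBuffer.allocate(8).putLong(v).flip();
    }

    static long deserialize(ByteBuffer b) { return b.duplicate().getLong(); }
}
```

Both variants produce the same result; the second simply skips one serialize/deserialize pair per input row, which is the cost the ticket eliminates.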
[jira] [Created] (CASSANDRA-12183) compaction_history system table does not capture all historical compaction sessions
Wei Deng created CASSANDRA-12183: Summary: compaction_history system table does not capture all historical compaction sessions Key: CASSANDRA-12183 URL: https://issues.apache.org/jira/browse/CASSANDRA-12183 Project: Cassandra Issue Type: Bug Components: Compaction, Observability Reporter: Wei Deng It appears that some compaction sessions are not recorded in system.compaction_history table after the compaction session successfully finishes. The following is an example (test by simply running +cassandra-stress write n=1+): {noformat} automaton@wdengdse50google-98425b985-3:~$ nodetool compactionstats pending tasks: 46 id compaction typekeyspace tablecompletedtotalunit progress fa8a4f30-4884-11e6-b916-1dbd340a212fCompaction keyspace1 standard1 4233184044 4774209875 bytes 88.67% 91e30d21-4887-11e6-b916-1dbd340a212fCompaction keyspace1 standard1836983029889773060 bytes 94.07% Active compaction remaining time : 0h00m35s automaton@wdengdse50google-98425b985-3:~$ nodetool compactionstats pending tasks: 47 id compaction typekeyspace tablecompletedtotalunit progress fa8a4f30-4884-11e6-b916-1dbd340a212fCompaction keyspace1 standard1 4353251539 4774209875 bytes 91.18% 28359094-4888-11e6-b916-1dbd340a212fCompaction keyspace1 standard1 49732274 4071652280 bytes 1.22% Active compaction remaining time : 0h04m24s {noformat} At this point you know the previous compaction session 91e30d21-4887-11e6-b916-1dbd340a212f finished and confirmation can be found from debug.log {noformat} automaton@wdengdse50google-98425b985-3:~$ grep 91e30d21-4887-11e6-b916-1dbd340a212f /var/log/cassandra/debug.log DEBUG [CompactionExecutor:4] 2016-07-12 23:22:58,674 CompactionTask.java:146 - Compacting (91e30d21-4887-11e6-b916-1dbd340a212f) [/var/lib/cassandra/data/keyspace1/standard1-9c02e9c1487c11e6b9161dbd340a212f/mb-290-big-Data.db:level=0, /var/lib/cassandra/data/keyspace1/standard1-9c02e9c1487c11e6b9161dbd340a212f/mb-279-big-Data.db:level=0, 
/var/lib/cassandra/data/keyspace1/standard1-9c02e9c1487c11e6b9161dbd340a212f/mb-281-big-Data.db:level=0, /var/lib/cassandra/data/keyspace1/standard1-9c02e9c1487c11e6b9161dbd340a212f/mb-280-big-Data.db:level=0, /var/lib/cassandra/data/keyspace1/standard1-9c02e9c1487c11e6b9161dbd340a212f/mb-284-big-Data.db:level=0, /var/lib/cassandra/data/keyspace1/standard1-9c02e9c1487c11e6b9161dbd340a212f/mb-283-big-Data.db:level=0, /var/lib/cassandra/data/keyspace1/standard1-9c02e9c1487c11e6b9161dbd340a212f/mb-287-big-Data.db:level=0, /var/lib/cassandra/data/keyspace1/standard1-9c02e9c1487c11e6b9161dbd340a212f/mb-292-big-Data.db:level=0, /var/lib/cassandra/data/keyspace1/standard1-9c02e9c1487c11e6b9161dbd340a212f/mb-286-big-Data.db:level=0, /var/lib/cassandra/data/keyspace1/standard1-9c02e9c1487c11e6b9161dbd340a212f/mb-289-big-Data.db:level=0, ] DEBUG [CompactionExecutor:4] 2016-07-12 23:26:56,054 CompactionTask.java:217 - Compacted (91e30d21-4887-11e6-b916-1dbd340a212f) 10 sstables to [/var/lib/cassandra/data/keyspace1/standard1-9c02e9c1487c11e6b9161dbd340a212f/mb-293-big,] to level=0. 889,773,060 bytes to 890,473,350 (~100% of original) in 237,365ms = 3.577703MB/s. 0 total partitions merged to 3,871,921. Partition merge counts were {1:3871921, } {noformat} However, if you query system.compaction_history table or run "nodetool compactionhistory | grep 91e30d21-4887-11e6-b916-1dbd340a212f" you will get nothing: {noformat} automaton@wdengdse50google-98425b985-3:~$ cqlsh -u cassandra Password: Connected to dse50 at 127.0.0.1:9042. [cqlsh 5.0.1 | Cassandra 3.0.7.1158 | DSE 5.0.0 | CQL spec 3.4.0 | Native protocol v4] Use HELP for help. 
cassandra@cqlsh> select * from system.compaction_history where id=91e30d21-4887-11e6-b916-1dbd340a212f; id | bytes_in | bytes_out | columnfamily_name | compacted_at | keyspace_name | rows_merged +--+---+---+--+---+- (0 rows) automaton@wdengdse50google-98425b985-3:~$ nodetool flush system automaton@wdengdse50google-98425b985-3:~$ nodetool compactionhistory | grep 91e30d21-4887-11e6-b916-1dbd340a212f {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
cassandra git commit: Omit (de)serialization of state variable in UDAs
Repository: cassandra Updated Branches: refs/heads/trunk 7751588f7 -> adffb3602 Omit (de)serialization of state variable in UDAs patch by Robert Stupp; reviewed by Tyler Hobbs for CASSANDRA-9613 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/adffb360 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/adffb360 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/adffb360 Branch: refs/heads/trunk Commit: adffb3602033273efdbb8b5303c62dbf33c36903 Parents: 7751588 Author: Robert StuppAuthored: Wed Jul 13 09:43:12 2016 +1000 Committer: Robert Stupp Committed: Wed Jul 13 09:43:12 2016 +1000 -- CHANGES.txt | 1 + .../cql3/functions/JavaBasedUDFunction.java | 49 ++-- .../cassandra/cql3/functions/JavaUDF.java | 2 + .../cql3/functions/ScriptBasedUDFunction.java | 25 +- .../cassandra/cql3/functions/UDAggregate.java | 52 - .../cql3/functions/UDFByteCodeVerifier.java | 7 ++ .../cassandra/cql3/functions/UDFunction.java| 82 ++-- .../cassandra/cql3/functions/JavaSourceUDF.txt | 8 ++ .../entities/udfverify/CallClone.java | 5 ++ .../entities/udfverify/CallComDatastax.java | 5 ++ .../entities/udfverify/CallFinalize.java| 5 ++ .../entities/udfverify/CallOrgApache.java | 5 ++ .../entities/udfverify/ClassWithField.java | 5 ++ .../udfverify/ClassWithInitializer.java | 5 ++ .../udfverify/ClassWithInitializer2.java| 5 ++ .../udfverify/ClassWithInitializer3.java| 5 ++ .../entities/udfverify/ClassWithInnerClass.java | 5 ++ .../udfverify/ClassWithInnerClass2.java | 5 ++ .../udfverify/ClassWithStaticInitializer.java | 5 ++ .../udfverify/ClassWithStaticInnerClass.java| 5 ++ .../entities/udfverify/GoodClass.java | 5 ++ .../entities/udfverify/UseOfSynchronized.java | 5 ++ .../udfverify/UseOfSynchronizedWithNotify.java | 5 ++ .../UseOfSynchronizedWithNotifyAll.java | 5 ++ .../udfverify/UseOfSynchronizedWithWait.java| 5 ++ .../udfverify/UseOfSynchronizedWithWaitL.java | 5 ++ 
.../udfverify/UseOfSynchronizedWithWaitLI.java | 5 ++ .../entities/udfverify/UsingMapEntry.java | 5 ++ .../validation/operations/AggregationTest.java | 59 +- 29 files changed, 349 insertions(+), 36 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/adffb360/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 6b0a118..df07ba0 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.10 + * Omit (de)serialization of state variable in UDAs (CASSANDRA-9613) * Create a system table to expose prepared statements (CASSANDRA-8831) * Reuse DataOutputBuffer from ColumnIndex (CASSANDRA-11970) * Remove DatabaseDescriptor dependency from SegmentedFile (CASSANDRA-11580) http://git-wip-us.apache.org/repos/asf/cassandra/blob/adffb360/src/java/org/apache/cassandra/cql3/functions/JavaBasedUDFunction.java -- diff --git a/src/java/org/apache/cassandra/cql3/functions/JavaBasedUDFunction.java b/src/java/org/apache/cassandra/cql3/functions/JavaBasedUDFunction.java index 87f5019..34c6cc9 100644 --- a/src/java/org/apache/cassandra/cql3/functions/JavaBasedUDFunction.java +++ b/src/java/org/apache/cassandra/cql3/functions/JavaBasedUDFunction.java @@ -191,7 +191,7 @@ public final class JavaBasedUDFunction extends UDFunction // javaParamTypes is just the Java representation for argTypes resp. argDataTypes TypeToken[] javaParamTypes = UDHelper.typeTokens(argCodecs, calledOnNullInput); -// javaReturnType is just the Java representation for returnType resp. returnDataType +// javaReturnType is just the Java representation for returnType resp. 
returnTypeCodec TypeToken javaReturnType = returnCodec.getJavaType(); // put each UDF in a separate package to prevent cross-UDF code access @@ -222,7 +222,10 @@ public final class JavaBasedUDFunction extends UDFunction s = body; break; case "arguments": -s = generateArguments(javaParamTypes, argNames); +s = generateArguments(javaParamTypes, argNames, false); +break; +case "arguments_aggregate": +s = generateArguments(javaParamTypes, argNames, true); break; case "argument_list": s = generateArgumentList(javaParamTypes, argNames); @@ -326,7
[jira] [Commented] (CASSANDRA-7384) Collect metrics on queries by consistency level
[ https://issues.apache.org/jira/browse/CASSANDRA-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373960#comment-15373960 ] Robert Stupp commented on CASSANDRA-7384: - Just looked briefly at the patch: can you make the two maps of type {{EnumMap}} instead of CHM? > Collect metrics on queries by consistency level > --- > > Key: CASSANDRA-7384 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7384 > Project: Cassandra > Issue Type: Improvement >Reporter: Vishy Kasar >Assignee: sankalp kohli >Priority: Minor > Fix For: 3.x > > Attachments: CASSANDRA-7384_3.0.txt > > > We had cases where cassandra client users thought that they were doing > queries at one consistency level but turned out to be not correct. It will be > good to collect metrics on number of queries done at various consistency > level on the server. See the equivalent JIRA on java driver: > https://datastax-oss.atlassian.net/browse/JAVA-354 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
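The reviewer's EnumMap suggestion works because the key space is a small, fixed enum: if every key is populated once at construction, subsequent lookups are read-only and thread-safe without ConcurrentHashMap's overhead. A hedged sketch of that idea (the enum below is a reduced stand-in for Cassandra's ConsistencyLevel; this is not the attached patch):

```java
import java.util.EnumMap;
import java.util.concurrent.atomic.LongAdder;

// Sketch of per-consistency-level query counters using an EnumMap
// pre-populated with thread-safe counters. Hypothetical, simplified enum.
public class CLMetricsSketch
{
    enum CL { ONE, LOCAL_QUORUM, QUORUM, ALL }

    private final EnumMap<CL, LongAdder> reads = new EnumMap<>(CL.class);

    CLMetricsSketch()
    {
        // populate every key up front; after this the map is never mutated,
        // so concurrent get() calls need no locking
        for (CL cl : CL.values())
            reads.put(cl, new LongAdder());
    }

    void recordRead(CL cl) { reads.get(cl).increment(); }

    long readCount(CL cl) { return reads.get(cl).sum(); }
}
```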
[jira] [Commented] (CASSANDRA-11393) dtest failure in upgrade_tests.upgrade_through_versions_test.ProtoV3Upgrade_2_1_UpTo_3_0_HEAD.rolling_upgrade_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373953#comment-15373953 ] Russ Hatch commented on CASSANDRA-11393: Sorry I missed this earlier. I updated the dtest branch to point to your 3.0/3.9 branches, and I've got another test run kicked off over here: http://cassci.datastax.com/view/Upgrades/job/upgrade_tests-all-custom_branch_runs/43/ > dtest failure in > upgrade_tests.upgrade_through_versions_test.ProtoV3Upgrade_2_1_UpTo_3_0_HEAD.rolling_upgrade_test > -- > > Key: CASSANDRA-11393 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11393 > Project: Cassandra > Issue Type: Bug > Components: Coordination, Streaming and Messaging >Reporter: Philip Thompson >Assignee: Benjamin Lerer > Labels: dtest > Fix For: 3.0.x, 3.x > > Attachments: 11393-3.0.txt > > > We are seeing a failure in the upgrade tests that go from 2.1 to 3.0 > {code} > node2: ERROR [SharedPool-Worker-2] 2016-03-10 20:05:17,865 Message.java:611 - > Unexpected exception during request; channel = [id: 0xeb79b477, > /127.0.0.1:39613 => /127.0.0.2:9042] > java.lang.AssertionError: null > at > org.apache.cassandra.db.ReadCommand$LegacyReadCommandSerializer.serializedSize(ReadCommand.java:1208) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$LegacyReadCommandSerializer.serializedSize(ReadCommand.java:1155) > ~[main/:na] > at org.apache.cassandra.net.MessageOut.payloadSize(MessageOut.java:166) > ~[main/:na] > at > org.apache.cassandra.net.OutboundTcpConnectionPool.getConnection(OutboundTcpConnectionPool.java:72) > ~[main/:na] > at > org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:609) > ~[main/:na] > at > org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:758) > ~[main/:na] > at > org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:701) > ~[main/:na] > at > org.apache.cassandra.net.MessagingService.sendRRWithFailure(MessagingService.java:684) 
> ~[main/:na] > at > org.apache.cassandra.service.AbstractReadExecutor.makeRequests(AbstractReadExecutor.java:110) > ~[main/:na] > at > org.apache.cassandra.service.AbstractReadExecutor.makeDataRequests(AbstractReadExecutor.java:85) > ~[main/:na] > at > org.apache.cassandra.service.AbstractReadExecutor$AlwaysSpeculatingReadExecutor.executeAsync(AbstractReadExecutor.java:330) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy$SinglePartitionReadLifecycle.doInitialQueries(StorageProxy.java:1699) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1654) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1601) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1520) > ~[main/:na] > at > org.apache.cassandra.db.SinglePartitionReadCommand.execute(SinglePartitionReadCommand.java:302) > ~[main/:na] > at > org.apache.cassandra.service.pager.AbstractQueryPager.fetchPage(AbstractQueryPager.java:67) > ~[main/:na] > at > org.apache.cassandra.service.pager.SinglePartitionPager.fetchPage(SinglePartitionPager.java:34) > ~[main/:na] > at > org.apache.cassandra.cql3.statements.SelectStatement$Pager$NormalPager.fetchPage(SelectStatement.java:297) > ~[main/:na] > at > org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:333) > ~[main/:na] > at > org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:209) > ~[main/:na] > at > org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:76) > ~[main/:na] > at > org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:206) > ~[main/:na] > at > org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:472) > ~[main/:na] > at > org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:449) > ~[main/:na] > at > 
org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:130) > ~[main/:na] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507) > [main/:na] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401) > [main/:na] > at > io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) > [netty-all-4.0.23.Final.jar:4.0.23.Final] > at >
[jira] [Updated] (CASSANDRA-12178) Add prefixes to the name of snapshots created before a truncate or drop
[ https://issues.apache.org/jira/browse/CASSANDRA-12178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-12178: Attachment: 12178-3.0.txt > Add prefixes to the name of snapshots created before a truncate or drop > --- > > Key: CASSANDRA-12178 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12178 > Project: Cassandra > Issue Type: Improvement >Reporter: Geoffrey Yu >Assignee: Geoffrey Yu >Priority: Minor > Fix For: 3.0.x > > Attachments: 12178-3.0.txt, 12178-trunk.txt > > > It would be useful to be able to identify snapshots that are taken because a > table was truncated or dropped. We can do this by prepending a prefix to > snapshot names for snapshots that are created before a truncate/drop. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
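The naming idea is simple enough to sketch. The concrete prefixes are whatever the attached patches define; "truncated" and "dropped" below are illustrative placeholders, not confirmed by the ticket:

```java
// Hypothetical sketch of prefixed snapshot names for automatic snapshots
// taken before a truncate or drop; the real prefixes are in the patch.
public class SnapshotNames
{
    static String beforeTruncate(long timestampMillis) { return "truncated-" + timestampMillis; }

    static String beforeDrop(long timestampMillis) { return "dropped-" + timestampMillis; }

    // operators can then tell automatic snapshots apart by prefix
    static boolean isAutoSnapshot(String name)
    {
        return name.startsWith("truncated-") || name.startsWith("dropped-");
    }
}
```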
[jira] [Updated] (CASSANDRA-12178) Add prefixes to the name of snapshots created before a truncate or drop
[ https://issues.apache.org/jira/browse/CASSANDRA-12178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-12178: Fix Version/s: 3.0.x > Add prefixes to the name of snapshots created before a truncate or drop > --- > > Key: CASSANDRA-12178 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12178 > Project: Cassandra > Issue Type: Improvement >Reporter: Geoffrey Yu >Assignee: Geoffrey Yu >Priority: Minor > Fix For: 3.0.x > > Attachments: 12178-3.0.txt, 12178-trunk.txt > > > It would be useful to be able to identify snapshots that are taken because a > table was truncated or dropped. We can do this by prepending a prefix to > snapshot names for snapshots that are created before a truncate/drop. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12182) redundant StatusLogger print out when both dropped message and long GC event happen
Wei Deng created CASSANDRA-12182: Summary: redundant StatusLogger print out when both dropped message and long GC event happen Key: CASSANDRA-12182 URL: https://issues.apache.org/jira/browse/CASSANDRA-12182 Project: Cassandra Issue Type: Bug Reporter: Wei Deng Priority: Minor I was stress testing a C* 3.0 environment and it appears that when the CPU is running low, HINT and MUTATION messages will start to get dropped, and the GC thread can also get some really long-running GC, and I'd get some redundant log entries in system.log like the following: {noformat} WARN [Service Thread] 2016-07-12 22:48:45,748 GCInspector.java:282 - G1 Young Generation GC in 522ms. G1 Eden Space: 68157440 -> 0; G1 Old Gen: 3376113224 -> 3468387912; G1 Survivor Space: 24117248 -> 0; INFO [Service Thread] 2016-07-12 22:48:45,763 StatusLogger.java:52 - Pool NameActive Pending Completed Blocked All Time Blocked INFO [ScheduledTasks:1] 2016-07-12 22:48:45,775 MessagingService.java:983 - MUTATION messages were dropped in last 5000 ms: 419 for internal timeout and 0 for cross node timeout INFO [ScheduledTasks:1] 2016-07-12 22:48:45,776 MessagingService.java:983 - HINT messages were dropped in last 5000 ms: 89 for internal timeout and 0 for cross node timeout INFO [ScheduledTasks:1] 2016-07-12 22:48:45,776 StatusLogger.java:52 - Pool NameActive Pending Completed Blocked All Time Blocked INFO [ScheduledTasks:1] 2016-07-12 22:48:45,798 StatusLogger.java:56 - MutationStage32 4194 32997234 0 0 INFO [ScheduledTasks:1] 2016-07-12 22:48:45,798 StatusLogger.java:56 - ViewMutationStage 0 0 0 0 0 INFO [ScheduledTasks:1] 2016-07-12 22:48:45,799 StatusLogger.java:56 - ReadStage 0 0940 0 0 INFO [Service Thread] 2016-07-12 22:48:45,800 StatusLogger.java:56 - MutationStage32 4363 32997333 0 0 INFO [Service Thread] 2016-07-12 22:48:45,801 StatusLogger.java:56 - ViewMutationStage 0 0 0 0 0 INFO [Service Thread] 2016-07-12 22:48:45,801 StatusLogger.java:56 - ReadStage 0 0940 0 0 INFO [Service Thread] 2016-07-12 
22:48:45,802 StatusLogger.java:56 - RequestResponseStage 0 0 11094437 0 0 INFO [Service Thread] 2016-07-12 22:48:45,802 StatusLogger.java:56 - ReadRepairStage 0 0 5 0 0 INFO [ScheduledTasks:1] 2016-07-12 22:48:45,803 StatusLogger.java:56 - RequestResponseStage 4 0 11094509 0 0 INFO [ScheduledTasks:1] 2016-07-12 22:48:45,807 StatusLogger.java:56 - ReadRepairStage 0 0 5 0 0 INFO [ScheduledTasks:1] 2016-07-12 22:48:45,808 StatusLogger.java:56 - CounterMutationStage 0 0 0 0 0 INFO [ScheduledTasks:1] 2016-07-12 22:48:45,809 StatusLogger.java:56 - MiscStage 0 0 0 0 0 INFO [ScheduledTasks:1] 2016-07-12 22:48:45,809 StatusLogger.java:56 - CompactionExecutor262 1234 0 0 INFO [ScheduledTasks:1] 2016-07-12 22:48:45,810 StatusLogger.java:56 - MemtableReclaimMemory 0 0 79 0 0 INFO [ScheduledTasks:1] 2016-07-12 22:48:45,810 StatusLogger.java:56 - PendingRangeCalculator0 0 3 0 0 INFO [ScheduledTasks:1] 2016-07-12 22:48:45,819 StatusLogger.java:56 - GossipStage 0 0 5214 0 0 INFO [ScheduledTasks:1] 2016-07-12 22:48:45,820 StatusLogger.java:56 - SecondaryIndexManagement 0 0 3 0 0 INFO [ScheduledTasks:1] 2016-07-12 22:48:45,820 StatusLogger.java:56 - HintsDispatcher 1 2 36 0 0 INFO [ScheduledTasks:1] 2016-07-12 22:48:45,821 StatusLogger.java:56 - MigrationStage0 0 0 0 0 INFO [ScheduledTasks:1] 2016-07-12 22:48:45,822 StatusLogger.java:56 - MemtablePostFlush 1 3115 0 0 INFO [Service Thread] 2016-07-12 22:48:45,830 StatusLogger.java:56 - CounterMutationStage 0
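One plausible fix for the duplication shown above is to let StatusLogger refuse to print again within a short window, so that near-simultaneous triggers (a GC pause and a dropped-message report) produce a single pool dump. This is a hypothetical sketch of that guard, not the actual resolution of CASSANDRA-12182:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical throttle: only one caller within minIntervalMillis gets to
// emit the status dump; concurrent or rapid repeat triggers are suppressed.
public class ThrottledStatusLogger
{
    private final long minIntervalMillis;
    // epoch millis of the last dump; 0 means "never logged yet"
    private final AtomicLong lastLoggedAt = new AtomicLong(0);

    ThrottledStatusLogger(long minIntervalMillis)
    {
        this.minIntervalMillis = minIntervalMillis;
    }

    /** Returns true iff this caller won the right to print the dump now. */
    boolean tryLog(long nowMillis)
    {
        long last = lastLoggedAt.get();
        if (nowMillis - last < minIntervalMillis)
            return false;
        // CAS so that of two simultaneous triggers, exactly one succeeds
        return lastLoggedAt.compareAndSet(last, nowMillis);
    }
}
```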
[jira] [Commented] (CASSANDRA-12172) Fail to bootstrap new node.
[ https://issues.apache.org/jira/browse/CASSANDRA-12172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373890#comment-15373890 ] Dikang Gu commented on CASSANDRA-12172: --- [~jjirsa], no, only the joining node has the problem, I haven't changed the phi yet, it's default value. And We do not have this problem in 2.1 clusters, I'm not sure if this is because of some changes in 2.2. > Fail to bootstrap new node. > --- > > Key: CASSANDRA-12172 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12172 > Project: Cassandra > Issue Type: Bug >Reporter: Dikang Gu > > When I try to bootstrap new node in the cluster, sometimes it failed because > of following exceptions. > {code} > 2016-07-12_05:14:55.58509 INFO 05:14:55 [main]: JOINING: Starting to > bootstrap... > 2016-07-12_05:14:56.07491 INFO 05:14:56 [GossipTasks:1]: InetAddress > /2401:db00:2011:50c7:face:0:9:0 is now DOWN > 2016-07-12_05:14:56.32219 Exception (java.lang.RuntimeException) encountered > during startup: A node required to move the data consistently is down > (/2401:db00:2011:50c7:face:0:9:0). If you wish to move the data from a > potentially inconsis > tent replica, restart the node with -Dcassandra.consistent.rangemovement=false > 2016-07-12_05:14:56.32582 ERROR 05:14:56 [main]: Exception encountered during > startup > 2016-07-12_05:14:56.32583 java.lang.RuntimeException: A node required to move > the data consistently is down (/2401:db00:2011:50c7:face:0:9:0). 
If you wish > to move the data from a potentially inconsistent replica, restart the node > with -Dc > assandra.consistent.rangemovement=false > 2016-07-12_05:14:56.32584 at > org.apache.cassandra.dht.RangeStreamer.getAllRangesWithStrictSourcesFor(RangeStreamer.java:264) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32584 at > org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:147) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32584 at > org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:82) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32584 at > org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1230) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32584 at > org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:924) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32585 at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:709) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32585 at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:585) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32585 at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) > [apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32586 at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:516) > [apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32586 at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:625) > 
[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32730 WARN 05:14:56 [StorageServiceShutdownHook]: No > local state or state is in silent shutdown, not announcing shutdown > {code} > Here are more logs: > https://gist.github.com/DikangGu/c6a83eafdbc091250eade4a3bddcc40b > I'm pretty sure there are no DOWN nodes or restarted nodes in the cluster, > but I still see a lot of nodes UP and DOWN in the gossip log, which failed > the bootstrap at the end, is this a known bug? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12172) Fail to bootstrap new node.
[ https://issues.apache.org/jira/browse/CASSANDRA-12172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373842#comment-15373842 ] Jeff Jirsa commented on CASSANDRA-12172: Ultimately the problem is that your joining/bootstrapping node thinks a bunch of other nodes are down. Do OTHER nodes also see nodes bouncing around? Have you tuned phi / failure detector at all? > Fail to bootstrap new node. > --- > > Key: CASSANDRA-12172 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12172 > Project: Cassandra > Issue Type: Bug >Reporter: Dikang Gu > > When I try to bootstrap new node in the cluster, sometimes it failed because > of following exceptions. > {code} > 2016-07-12_05:14:55.58509 INFO 05:14:55 [main]: JOINING: Starting to > bootstrap... > 2016-07-12_05:14:56.07491 INFO 05:14:56 [GossipTasks:1]: InetAddress > /2401:db00:2011:50c7:face:0:9:0 is now DOWN > 2016-07-12_05:14:56.32219 Exception (java.lang.RuntimeException) encountered > during startup: A node required to move the data consistently is down > (/2401:db00:2011:50c7:face:0:9:0). If you wish to move the data from a > potentially inconsis > tent replica, restart the node with -Dcassandra.consistent.rangemovement=false > 2016-07-12_05:14:56.32582 ERROR 05:14:56 [main]: Exception encountered during > startup > 2016-07-12_05:14:56.32583 java.lang.RuntimeException: A node required to move > the data consistently is down (/2401:db00:2011:50c7:face:0:9:0). 
If you wish > to move the data from a potentially inconsistent replica, restart the node > with -Dc > assandra.consistent.rangemovement=false > 2016-07-12_05:14:56.32584 at > org.apache.cassandra.dht.RangeStreamer.getAllRangesWithStrictSourcesFor(RangeStreamer.java:264) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32584 at > org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:147) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32584 at > org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:82) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32584 at > org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1230) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32584 at > org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:924) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32585 at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:709) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32585 at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:585) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32585 at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) > [apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32586 at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:516) > [apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32586 at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:625) > 
[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32730 WARN 05:14:56 [StorageServiceShutdownHook]: No > local state or state is in silent shutdown, not announcing shutdown > {code} > Here are more logs: > https://gist.github.com/DikangGu/c6a83eafdbc091250eade4a3bddcc40b > I'm pretty sure there are no DOWN nodes or restarted nodes in the cluster, > but I still see a lot of nodes UP and DOWN in the gossip log, which failed > the bootstrap at the end, is this a known bug? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
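Background for the phi question above: Cassandra's failure detector marks a peer DOWN when its "phi", a suspicion level derived from gossip inter-arrival times, exceeds phi_convict_threshold (default 8). A minimal sketch of phi under an exponential inter-arrival model follows; the real FailureDetector maintains a sliding window of arrival intervals and differs in details, so treat this as an approximation only.

```java
// Approximate phi accrual computation, assuming exponentially distributed
// heartbeat inter-arrival times. Not Cassandra's actual FailureDetector.
public class PhiSketch
{
    // phi(t) = -log10( P(next heartbeat arrives later than t) )
    //        = (t / meanInterval) * log10(e)   for exponential arrivals
    static double phi(double millisSinceLastHeartbeat, double meanIntervalMillis)
    {
        return (millisSinceLastHeartbeat / meanIntervalMillis) * Math.log10(Math.E);
    }
}
```

Under this model, with a 1-second mean gossip interval, roughly 18.4 seconds of silence pushes phi past the default threshold of 8, which is why a node under heavy load (long GC pauses, saturated network) can briefly "convict" otherwise healthy peers.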
[jira] [Commented] (CASSANDRA-12172) Fail to bootstrap new node.
[ https://issues.apache.org/jira/browse/CASSANDRA-12172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373789#comment-15373789 ] Dikang Gu commented on CASSANDRA-12172: --- And this is the replication strategy for the system_distributed keyspace {code} CREATE KEYSPACE system_distributed WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true; {code} > Fail to bootstrap new node. > --- > > Key: CASSANDRA-12172 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12172 > Project: Cassandra > Issue Type: Bug >Reporter: Dikang Gu > > When I try to bootstrap new node in the cluster, sometimes it failed because > of following exceptions. > {code} > 2016-07-12_05:14:55.58509 INFO 05:14:55 [main]: JOINING: Starting to > bootstrap... > 2016-07-12_05:14:56.07491 INFO 05:14:56 [GossipTasks:1]: InetAddress > /2401:db00:2011:50c7:face:0:9:0 is now DOWN > 2016-07-12_05:14:56.32219 Exception (java.lang.RuntimeException) encountered > during startup: A node required to move the data consistently is down > (/2401:db00:2011:50c7:face:0:9:0). If you wish to move the data from a > potentially inconsis > tent replica, restart the node with -Dcassandra.consistent.rangemovement=false > 2016-07-12_05:14:56.32582 ERROR 05:14:56 [main]: Exception encountered during > startup > 2016-07-12_05:14:56.32583 java.lang.RuntimeException: A node required to move > the data consistently is down (/2401:db00:2011:50c7:face:0:9:0). 
If you wish > to move the data from a potentially inconsistent replica, restart the node > with -Dc > assandra.consistent.rangemovement=false > 2016-07-12_05:14:56.32584 at > org.apache.cassandra.dht.RangeStreamer.getAllRangesWithStrictSourcesFor(RangeStreamer.java:264) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32584 at > org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:147) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32584 at > org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:82) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32584 at > org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1230) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32584 at > org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:924) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32585 at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:709) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32585 at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:585) > ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32585 at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) > [apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32586 at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:516) > [apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32586 at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:625) > 
[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] > 2016-07-12_05:14:56.32730 WARN 05:14:56 [StorageServiceShutdownHook]: No > local state or state is in silent shutdown, not announcing shutdown > {code} > Here are more logs: > https://gist.github.com/DikangGu/c6a83eafdbc091250eade4a3bddcc40b > I'm pretty sure there are no DOWN nodes or restarted nodes in the cluster, > but I still see a lot of nodes UP and DOWN in the gossip log, which failed > the bootstrap at the end, is this a known bug? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
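[Editor's note] The failure quoted above comes from the strict source-selection check that runs when consistent range movement is enabled (the default): every replica required to stream a range to the joining node must be alive, so a single DOWN endpoint aborts the bootstrap. The sketch below is illustrative only — function and error names are hypothetical, not Cassandra's RangeStreamer code — but it shows both failure modes seen in this ticket (strict mode aborting on a down replica, and non-strict mode failing only when no live source remains):

```python
# Illustrative sketch of strict vs non-strict stream-source selection
# during bootstrap. Names and structure are hypothetical, not the
# actual org.apache.cassandra.dht.RangeStreamer implementation.

def pick_stream_sources(range_to_replicas, alive, consistent_rangemovement=True):
    """Return a {range: source} map, or raise if sources are unavailable."""
    sources = {}
    for rng, replicas in range_to_replicas.items():
        if consistent_rangemovement:
            # Strict mode: every replica must be up, or bootstrap aborts.
            down = [r for r in replicas if r not in alive]
            if down:
                raise RuntimeError(
                    "A node required to move the data consistently is down (%s)" % down[0])
            sources[rng] = replicas[0]
        else:
            # Relaxed mode: any live replica will do; fail only if none are up.
            up = [r for r in replicas if r in alive]
            if not up:
                raise RuntimeError(
                    "unable to find sufficient sources for streaming range %s" % (rng,))
            sources[rng] = up[0]
    return sources
```

With an RF=3 SimpleStrategy keyspace like system_distributed, relaxed mode still fails when all three replicas of some range are unreachable, which matches the second error reported later in this thread.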
[jira] [Updated] (CASSANDRA-7384) Collect metrics on queries by consistency level
[ https://issues.apache.org/jira/browse/CASSANDRA-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sankalp kohli updated CASSANDRA-7384: - Status: Patch Available (was: Reopened) > Collect metrics on queries by consistency level > --- > > Key: CASSANDRA-7384 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7384 > Project: Cassandra > Issue Type: Improvement >Reporter: Vishy Kasar >Assignee: sankalp kohli >Priority: Minor > Fix For: 3.x > > Attachments: CASSANDRA-7384_3.0.txt > > > We had cases where cassandra client users thought that they were doing > queries at one consistency level but turned out to be not correct. It will be > good to collect metrics on number of queries done at various consistency > level on the server. See the equivalent JIRA on java driver: > https://datastax-oss.atlassian.net/browse/JAVA-354 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
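[Editor's note] The feature proposed in CASSANDRA-7384 above — server-side counts of queries per consistency level, so operators can verify what clients actually request — can be sketched as a pair of per-level counters. The class and method names below are illustrative, not the patch's actual API:

```python
# Minimal sketch of per-consistency-level query metrics.
# Names are hypothetical; the real patch would hook into Cassandra's
# server-side metrics registry rather than a plain Counter.
from collections import Counter

class ConsistencyLevelMetrics:
    def __init__(self):
        self.reads = Counter()   # consistency level -> read count
        self.writes = Counter()  # consistency level -> write count

    def record_read(self, cl):
        self.reads[cl] += 1

    def record_write(self, cl):
        self.writes[cl] += 1
```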
[jira] [Commented] (CASSANDRA-12172) Fail to bootstrap new node.
[ https://issues.apache.org/jira/browse/CASSANDRA-12172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373779#comment-15373779 ] Dikang Gu commented on CASSANDRA-12172: --- [~jkni], thanks for looking at this: I tried several things: 1) set larger ring_delay_ms, from 30s to 60s, which does not seem to help a lot. 2) sleep 2 mins between each bootstrap, which does not help either. 3) set the range movement to be false, which introduces a new type of error: {code} 2016-07-12_21:52:09.40788 INFO 21:52:09 [SharedPool-Worker-1]: InetAddress /2401:db00:2011:50c7:face:0:2d:0 is now UP 2016-07-12_21:52:09.52132 Exception (java.lang.IllegalStateException) encountered during startup: unable to find sufficient sources for streaming range (12928845086740495435201607154872516048,12932880296782147283630181058291836395] in keyspace system_distributed 2016-07-12_21:52:09.52496 ERROR 21:52:09 [main]: Exception encountered during startup 2016-07-12_21:52:09.52497 java.lang.IllegalStateException: unable to find sufficient sources for streaming range (12928845086740495435201607154872516048,12932880296782147283630181058291836395] in keyspace system_distributed 2016-07-12_21:52:09.52498 at org.apache.cassandra.dht.RangeStreamer.getRangeFetchMap(RangeStreamer.java:308) ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] 2016-07-12_21:52:09.52498 at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:155) ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] 2016-07-12_21:52:09.52498 at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:82) ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] 2016-07-12_21:52:09.52498 at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1230) ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] 2016-07-12_21:52:09.52498 at 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:924) ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] 2016-07-12_21:52:09.52499 at org.apache.cassandra.service.StorageService.initServer(StorageService.java:709) ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] 2016-07-12_21:52:09.52499 at org.apache.cassandra.service.StorageService.initServer(StorageService.java:585) ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] 2016-07-12_21:52:09.52499 at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) [apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] 2016-07-12_21:52:09.52500 at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:516) [apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] 2016-07-12_21:52:09.52500 at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:625) [apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b] 2016-07-12_21:52:09.52646 WARN 21:52:09 [StorageServiceShutdownHook]: No local state or state is in silent shutdown, not announcing shutdown 2016-07-12_21:52:09.52659 INFO 21:52:09 [StorageServiceShutdownHook]: Waiting for messaging service to quiesce {code} I also sent you an email about this, let me know if you need more information. Thanks Dikang > Fail to bootstrap new node. > --- > > Key: CASSANDRA-12172 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12172 > Project: Cassandra > Issue Type: Bug >Reporter: Dikang Gu > > When I try to bootstrap new node in the cluster, sometimes it failed because > of following exceptions. > {code} > 2016-07-12_05:14:55.58509 INFO 05:14:55 [main]: JOINING: Starting to > bootstrap... 
> 2016-07-12_05:14:56.07491 INFO 05:14:56 [GossipTasks:1]: InetAddress > /2401:db00:2011:50c7:face:0:9:0 is now DOWN > 2016-07-12_05:14:56.32219 Exception (java.lang.RuntimeException) encountered > during startup: A node required to move the data consistently is down > (/2401:db00:2011:50c7:face:0:9:0). If you wish to move the data from a > potentially inconsis > tent replica, restart the node with -Dcassandra.consistent.rangemovement=false > 2016-07-12_05:14:56.32582 ERROR 05:14:56 [main]: Exception encountered during > startup > 2016-07-12_05:14:56.32583 java.lang.RuntimeException: A node required to move > the data consistently is down (/2401:db00:2011:50c7:face:0:9:0). If you wish > to move the data from a potentially inconsistent replica, restart the node > with -Dc > assandra.consistent.rangemovement=false > 2016-07-12_05:14:56.32584 at > org.apache.cassandra.dht.RangeStreamer.getAllRangesWithStrictSourcesFor(RangeStreamer.java:264) >
[jira] [Updated] (CASSANDRA-7384) Collect metrics on queries by consistency level
[ https://issues.apache.org/jira/browse/CASSANDRA-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sankalp kohli updated CASSANDRA-7384: - Attachment: CASSANDRA-7384_3.0.txt > Collect metrics on queries by consistency level > --- > > Key: CASSANDRA-7384 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7384 > Project: Cassandra > Issue Type: Improvement >Reporter: Vishy Kasar >Assignee: sankalp kohli >Priority: Minor > Fix For: 3.x > > Attachments: CASSANDRA-7384_3.0.txt > > > We had cases where cassandra client users thought that they were doing > queries at one consistency level but turned out to be not correct. It will be > good to collect metrics on number of queries done at various consistency > level on the server. See the equivalent JIRA on java driver: > https://datastax-oss.atlassian.net/browse/JAVA-354 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator
[ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373750#comment-15373750 ] Sergio Bossa commented on CASSANDRA-9318: - [~jbellis], bq. I think you just explained why that's not a very good reason to add more complexity here, at least not without a demonstration that it's actually still a problem. Hints are not a solution for chronically overloaded clusters where clients ingest faster than replicas can consume: even if hints get delivered timely and reliably, replicas will always play catchup if there's no back-pressure. I find this pretty straightforward, but as a practical example, try injecting the attached byteman rule into a cluster and see it fall over at a rate of hundred of thousand dropped mutations per minute. bq. But CL is per request. How do you disentangle that client side? Not sure I follow your objection, can you elaborate? bq. And we're still not solving what I think is (post file-based hints) the real problem, my scenario 3. I think we do solve that, actually in a better way, which takes into consideration all replicas, not just the coordinator capacity of acting as a buffer, unless I'm missing a specific case you're referring to? > Bound the number of in-flight requests at the coordinator > - > > Key: CASSANDRA-9318 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9318 > Project: Cassandra > Issue Type: Improvement > Components: Local Write-Read Paths, Streaming and Messaging >Reporter: Ariel Weisberg >Assignee: Sergio Bossa > Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, > limit.btm, no_backpressure.png > > > It's possible to somewhat bound the amount of load accepted into the cluster > by bounding the number of in-flight requests and request bytes. 
> An implementation might do something like track the number of outstanding > bytes and requests and if it reaches a high watermark disable read on client > connections until it goes back below some low watermark. > Need to make sure that disabling read on the client connection won't > introduce other issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator
[ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373696#comment-15373696 ] Jonathan Ellis commented on CASSANDRA-9318: --- bq. I have interacted with many people using Cassandra that would actually like to see some rate limiting applied for cases 1 and 2 such that things don't fall over (shouldn't happen with new hints hopefully) I think you just explained why that's not a very good reason to add more complexity here, at least not without a demonstration that it's actually still a problem. bq. What if we tailored the algorithm to only: Rate limit if CL replicas are below the high threshold / Throw exception if CL replicas are below the low threshold. But CL is per request. How do you disentangle that client side? And we're still not solving what I think is (post file-based hints) the real problem, my scenario 3. At this point instead of adding more complexity to an approach that fundamentally doesn't solve that, why not back up and use an approach that does the right thing in all 3 cases instead? > Bound the number of in-flight requests at the coordinator > - > > Key: CASSANDRA-9318 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9318 > Project: Cassandra > Issue Type: Improvement > Components: Local Write-Read Paths, Streaming and Messaging >Reporter: Ariel Weisberg >Assignee: Sergio Bossa > Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, > limit.btm, no_backpressure.png > > > It's possible to somewhat bound the amount of load accepted into the cluster > by bounding the number of in-flight requests and request bytes. > An implementation might do something like track the number of outstanding > bytes and requests and if it reaches a high watermark disable read on client > connections until it goes back below some low watermark. > Need to make sure that disabling read on the client connection won't > introduce other issues. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
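[Editor's note] The high/low watermark scheme described in the CASSANDRA-9318 summary above — track outstanding bytes, stop reading client connections past a high watermark, resume only once the backlog drains below a low watermark — is a standard hysteresis pattern. A minimal sketch, with illustrative names:

```python
# Sketch of in-flight request bounding with watermark hysteresis.
# Illustrative only; a real coordinator would toggle channel reads
# (e.g. Netty autoRead) rather than flip a boolean.

class InflightLimiter:
    def __init__(self, high, low):
        assert low < high
        self.high = high
        self.low = low
        self.outstanding = 0       # outstanding request bytes
        self.reads_enabled = True  # whether client reads are accepted

    def on_request(self, nbytes):
        self.outstanding += nbytes
        if self.outstanding >= self.high:
            self.reads_enabled = False  # stop accepting new work

    def on_response(self, nbytes):
        self.outstanding -= nbytes
        if self.outstanding <= self.low:
            self.reads_enabled = True   # backlog drained; resume
```

The gap between the two watermarks avoids rapid flapping when load hovers near a single threshold.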
[jira] [Commented] (CASSANDRA-11730) [windows] dtest failure in jmx_auth_test.TestJMXAuth.basic_auth_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373649#comment-15373649 ] Joshua McKenzie commented on CASSANDRA-11730: - Getting the following on Windows when run w/that branch on dtest: {noformat} == FAIL: basic_auth_test (jmx_auth_test.TestJMXAuth) -- Traceback (most recent call last): File "d:\src\cassandra-dtest\jmx_auth_test.py", line 36, in basic_auth_test node.nodetool('-u baduser -pw abc123 gossipinfo') AssertionError: "Provided username baduser and/or password are incorrect" does not match "Nodetool command 'd:\src\cassandra\bin\nodetool.bat -h localhost -p 7100 -u baduser -pw abc123 gossipinfo' failed; exit status: 1; stderr: nodetool: Failed to connect to 'localhost:7100' - FailedLoginException: 'Invalid username or password'. {noformat} With: {noformat} cassandra -v Cassandra Version: 3.10-SNAPSHOT {noformat} This is with {{cassandra cassandra}} in the local jmxremote.password file and no other users. I'm assuming that's the correct config and we're just getting slightly different output on Win? > [windows] dtest failure in jmx_auth_test.TestJMXAuth.basic_auth_test > > > Key: CASSANDRA-11730 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11730 > Project: Cassandra > Issue Type: Bug >Reporter: Russ Hatch >Assignee: Sam Tunnicliffe > Labels: dtest, windows > Fix For: 3.x > > > looks to be failing on each run so far: > http://cassci.datastax.com/job/trunk_dtest_win32/406/testReport/jmx_auth_test/TestJMXAuth/basic_auth_test > Failed on CassCI build trunk_dtest_win32 #406 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11730) [windows] dtest failure in jmx_auth_test.TestJMXAuth.basic_auth_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-11730: Status: Awaiting Feedback (was: Open) > [windows] dtest failure in jmx_auth_test.TestJMXAuth.basic_auth_test > > > Key: CASSANDRA-11730 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11730 > Project: Cassandra > Issue Type: Bug >Reporter: Russ Hatch >Assignee: Sam Tunnicliffe > Labels: dtest, windows > Fix For: 3.x > > > looks to be failing on each run so far: > http://cassci.datastax.com/job/trunk_dtest_win32/406/testReport/jmx_auth_test/TestJMXAuth/basic_auth_test > Failed on CassCI build trunk_dtest_win32 #406 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11730) [windows] dtest failure in jmx_auth_test.TestJMXAuth.basic_auth_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-11730: Status: Open (was: Patch Available) > [windows] dtest failure in jmx_auth_test.TestJMXAuth.basic_auth_test > > > Key: CASSANDRA-11730 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11730 > Project: Cassandra > Issue Type: Bug >Reporter: Russ Hatch >Assignee: Sam Tunnicliffe > Labels: dtest, windows > Fix For: 3.x > > > looks to be failing on each run so far: > http://cassci.datastax.com/job/trunk_dtest_win32/406/testReport/jmx_auth_test/TestJMXAuth/basic_auth_test > Failed on CassCI build trunk_dtest_win32 #406 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12158) dtest failure in thrift_tests.TestMutations.test_describe_keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-12158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-12158: Assignee: Philip Thompson (was: DS Test Eng) Reviewer: Joel Knighton > dtest failure in thrift_tests.TestMutations.test_describe_keyspace > -- > > Key: CASSANDRA-12158 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12158 > Project: Cassandra > Issue Type: Test >Reporter: Sean McCarthy >Assignee: Philip Thompson > Labels: dtest > Attachments: node1.log > > > example failure: > http://cassci.datastax.com/job/cassandra-2.1_dtest/492/testReport/thrift_tests/TestMutations/test_describe_keyspace > Failed on CassCI build cassandra-2.1_dtest #492 > {code} > Stacktrace > Traceback (most recent call last): > File "/usr/lib/python2.7/unittest/case.py", line 329, in run > testMethod() > File "/home/automaton/cassandra-dtest/thrift_tests.py", line 1507, in > test_describe_keyspace > assert len(kspaces) == 4, [x.name for x in kspaces] # ['Keyspace2', > 'Keyspace1', 'system', 'system_traces'] > AssertionError: ['Keyspace2', 'system', 'Keyspace1', 'ValidKsForUpdate', > 'system_traces'] > {code} > Related failures: > http://cassci.datastax.com/job/cassandra-2.2_novnode_dtest/304/testReport/thrift_tests/TestMutations/test_describe_keyspace/ > http://cassci.datastax.com/job/cassandra-3.0_dtest/767/testReport/thrift_tests/TestMutations/test_describe_keyspace/ > http://cassci.datastax.com/job/cassandra-3.0_novnode_dtest/264/testReport/thrift_tests/TestMutations/test_describe_keyspace/ > http://cassci.datastax.com/job/trunk_dtest/1301/testReport/thrift_tests/TestMutations/test_describe_keyspace/ > http://cassci.datastax.com/job/trunk_novnode_dtest/421/testReport/thrift_tests/TestMutations/test_describe_keyspace/ > http://cassci.datastax.com/job/cassandra-3.9_dtest/6/testReport/thrift_tests/TestMutations/test_describe_keyspace/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator
[ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373624#comment-15373624 ] Sergio Bossa edited comment on CASSANDRA-9318 at 7/12/16 8:23 PM: -- [~Stefania], bq. if a message expires before it is sent, we consider this negatively for that replica, since we increment the outgoing rate but not the incoming rate when the callback expires, and still it may have nothing to do with the replica if the message was not sent, it may be due to the coordinator dealing with too many messages. Right, but there isn't much we can do without way more invasive changes. Anyway, I don't think that's actually a problem, as if the coordinator is overloaded we'll end up generating too many hints and fail with {{OverloadedException}} (this time with its original meaning), so we should be covered. bq. I also observe that if a replica has a low rate, then we may block when acquiring the limiter, and this will indirectly throttle for all following replicas, even if they were ready to receive mutations sooner. See my answer at the end. bq. AbstractWriteResponseHandler sets the start time in the constructor, so the time spent acquiring a rate limiter for slow replicas counts towards the total time before the coordinator throws a write timeout exception. See my answer at the end. bq. SP.sendToHintedEndpoints(), we should apply backpressure only if the destination is alive. I know, I'm holding on these changes until we settle on a plan for the whole write path (in terms of what to do with CL, the exception to be thrown etc.). bq. Let's use UnavailableException since WriteFailureException indicates a non-timeout failure when processing a mutation, and so it is not appropriate for this case. For protocol V4 we cannot change UnavailableException, but for V5 we should add a new parameter to it. At the moment it contains , we should add the number of overloaded replicas, so that drivers can treat the two cases differently. 
Does it mean we should advance the protocol version in this issue, or delegate to a new issue? bq. Marking messages as throttled would let the replica know if backpressure was enabled, that's true, but it also makes the existing mechanism even more complex. How so? In implementation terms, it should be literally as easy as: 1) Add a byte parameter to {{MessageOut}}. 2) Read such byte parameter from {{MessageIn}} and eventually skip dropping it replica-side. 3) If possible (didn't check it), when a "late" response is received on the coordinator, try to cancel the related hint. Do you see any complexity I'm missing there? bq. dropping mutations that have been in the queue for longer that the RPC write timeout is done not only to shed load on the replica, but also to avoid wasting resources to perform a mutation when the coordinator has already returned a timeout exception to the client. This is very true and that's why I said it's a bit of a wild idea. Obviously, that is true outside of back-pressure, as even now it is possible to return a write timeout to clients and still have some or all mutations applied. In the end, it might be good to optionally enable such behaviour, as the advantage would be increased consistency at the expense of more resource consumption, which is a tradeoff some users might want to make, but to be clear, I'm not strictly lobbying to implement it, just trying to reason about pros and cons. bq. I still have concerns regarding additional write timeout exceptions and whether an overloaded or slow replica can slow everything down. These are valid concerns of course, and given similar concerns from [~jbellis], I'm working on some changes to avoid write timeouts due to healthy replicas unnaturally throttled by unhealthy ones, and depending on [~jbellis] answer to my last comment above, maybe only actually back-pressure if the CL is not met. Stay tuned. was (Author: sbtourist): [~Stefania], bq. 
if a message expires before it is sent, we consider this negatively for that replica, since we increment the outgoing rate but not the incoming rate when the callback expires, and still it may have nothing to do with the replica if the message was not sent, it may be due to the coordinator dealing with too many messages. Right, but there isn't much we can do without way more invasive changes. Anyway, I don't think that's actually a problem, as if the coordinator is overloaded we'll end up generating too many hints and fail with {{OverloadedException}} (this time with its original meaning), so we should be covered. bq. I also observe that if a replica has a low rate, then we may block when acquiring the limiter, and this will indirectly throttle for all following replicas, even if they were ready to receive mutations sooner. See my answer at the end. bq. AbstractWriteResponseHandler sets the
[jira] [Updated] (CASSANDRA-11730) [windows] dtest failure in jmx_auth_test.TestJMXAuth.basic_auth_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-11730: Reviewer: Joshua McKenzie > [windows] dtest failure in jmx_auth_test.TestJMXAuth.basic_auth_test > > > Key: CASSANDRA-11730 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11730 > Project: Cassandra > Issue Type: Bug >Reporter: Russ Hatch >Assignee: Sam Tunnicliffe > Labels: dtest, windows > Fix For: 3.x > > > looks to be failing on each run so far: > http://cassci.datastax.com/job/trunk_dtest_win32/406/testReport/jmx_auth_test/TestJMXAuth/basic_auth_test > Failed on CassCI build trunk_dtest_win32 #406 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator
[ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373624#comment-15373624 ] Sergio Bossa commented on CASSANDRA-9318: - [~Stefania], bq. if a message expires before it is sent, we consider this negatively for that replica, since we increment the outgoing rate but not the incoming rate when the callback expires, and still it may have nothing to do with the replica if the message was not sent, it may be due to the coordinator dealing with too many messages. Right, but there isn't much we can do without way more invasive changes. Anyway, I don't think that's actually a problem, as if the coordinator is overloaded we'll end up generating too many hints and fail with {{OverloadedException}} (this time with its original meaning), so we should be covered. bq. I also observe that if a replica has a low rate, then we may block when acquiring the limiter, and this will indirectly throttle for all following replicas, even if they were ready to receive mutations sooner. See my answer at the end. bq. AbstractWriteResponseHandler sets the start time in the constructor, so the time spent acquiring a rate limiter for slow replicas counts towards the total time before the coordinator throws a write timeout exception. See my answer at the end. bq. SP.sendToHintedEndpoints(), we should apply backpressure only if the destination is alive. I know, I'm holding on these changes until we settle on a plan for the whole write path (in terms of what to do with CL, the exception to be thrown etc.). bq. Let's use UnavailableException since WriteFailureException indicates a non-timeout failure when processing a mutation, and so it is not appropriate for this case. For protocol V4 we cannot change UnavailableException, but for V5 we should add a new parameter to it. At the moment it contains , we should add the number of overloaded replicas, so that drivers can treat the two cases differently. 
Does it mean we should advance the protocol version in this issue, or delegate to a new issue? bq. Marking messages as throttled would let the replica know if backpressure was enabled, that's true, but it also makes the existing mechanism even more complex. How so? In implementation terms, it should be literally as easy as: 1) Add a byte parameter to {{MessageOut}}. 2) Read such byte parameter from {{MessageIn}} and eventually skip dropping it replica-side. 3) If possible (didn't check it), when a "late" response is received on the coordinator, try to cancel the related hint. Do you see any complexity I'm missing there? bq. dropping mutations that have been in the queue for longer that the RPC write timeout is done not only to shed load on the replica, but also to avoid wasting resources to perform a mutation when the coordinator has already returned a timeout exception to the client. This is very true and that's why I said it's a bit of a wild idea. Obviously, that is true outside of back-pressure, as even now it is possible to return a write timeout to clients and still have some or all mutations applied. In the end, it might be good to optionally enable such behaviour, as the advantage would be increased consistency at the expense of more resource consumption, which is a tradeoff some users might want to make, but to be clear, I'm not strictly lobbying to implement it, just trying to reason about pros and cons. bq. I still have concerns regarding additional write timeout exceptions and whether an overloaded or slow replica can slow everything down. These are valid concerns of course, and given similar concerns from [~jbellis], I'm working on some changes to avoid write timeouts due to healthy replicas unnaturally throttled by unhealthy ones, and depending on [~jbellis] answer to my last comment above, maybe only actually back-pressure if the CL is not met. Stay tuned. 
> Bound the number of in-flight requests at the coordinator > - > > Key: CASSANDRA-9318 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9318 > Project: Cassandra > Issue Type: Improvement > Components: Local Write-Read Paths, Streaming and Messaging >Reporter: Ariel Weisberg >Assignee: Sergio Bossa > Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, > limit.btm, no_backpressure.png > > > It's possible to somewhat bound the amount of load accepted into the cluster > by bounding the number of in-flight requests and request bytes. > An implementation might do something like track the number of outstanding > bytes and requests and if it reaches a high watermark disable read on client > connections until it goes back below some low watermark. > Need to make sure that disabling read on the client connection won't > introduce
[jira] [Resolved] (CASSANDRA-12163) dtest failure in cqlsh_tests.cqlsh_tests.CqlLoginTest.test_login_rejects_bad_pass
[ https://issues.apache.org/jira/browse/CASSANDRA-12163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Knighton resolved CASSANDRA-12163. --- Resolution: Fixed The PR updating the tests has been merged. > dtest failure in > cqlsh_tests.cqlsh_tests.CqlLoginTest.test_login_rejects_bad_pass > - > > Key: CASSANDRA-12163 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12163 > Project: Cassandra > Issue Type: Test >Reporter: Sean McCarthy >Assignee: Joel Knighton > Labels: dtest > > example failure: > http://cassci.datastax.com/job/trunk_dtest/1300/testReport/cqlsh_tests.cqlsh_tests/CqlLoginTest/test_login_rejects_bad_pass > Failed on CassCI build trunk_dtest #1300 > {code} > Standard Output > (EE) :2:('Unable to connect to any servers', {'127.0.0.1': > AuthenticationFailed(u'Failed to authenticate to 127.0.0.1: code=0100 [Bad > credentials] message="Provided username user1 and/or password are > incorrect"',)})(EE) > {code} > Related failure: > http://cassci.datastax.com/job/trunk_dtest/1300/testReport/cqlsh_tests.cqlsh_tests/CqlLoginTest/test_login_allows_bad_pass_and_continued_use/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator
[ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373547#comment-15373547 ] Sergio Bossa commented on CASSANDRA-9318: - [~jbellis], bq. Put another way: we do NOT want to limit performance to the slowest node in a set of replicas. That is kind of the opposite of the redundancy we want to provide. What if we tailored the algorithm to only: * Rate limit if CL replicas are below the high threshold. * Throw exception if CL replicas are below the low threshold. By doing so, C* would behave the same provided at least CL replicas behave normally. > Bound the number of in-flight requests at the coordinator > - > > Key: CASSANDRA-9318 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9318 > Project: Cassandra > Issue Type: Improvement > Components: Local Write-Read Paths, Streaming and Messaging >Reporter: Ariel Weisberg >Assignee: Sergio Bossa > Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, > limit.btm, no_backpressure.png > > > It's possible to somewhat bound the amount of load accepted into the cluster > by bounding the number of in-flight requests and request bytes. > An implementation might do something like track the number of outstanding > bytes and requests and if it reaches a high watermark disable read on client > connections until it goes back below some low watermark. > Need to make sure that disabling read on the client connection won't > introduce other issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
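[Editor's note] The two-threshold policy proposed in the comment above — rate limit when fewer than CL replicas are above a high health threshold, throw only when fewer than CL are above a low one — can be sketched as below. Replica "scores" and threshold values are illustrative assumptions, not the patch's actual back-pressure state:

```python
# Sketch of CL-aware back-pressure: degrade only when fewer than CL
# replicas look healthy, so one slow replica does not throttle a
# request that CL healthy replicas could still satisfy.

class OverloadedException(Exception):
    pass

def backpressure_action(replica_scores, cl, low=0.25, high=0.75):
    """replica_scores: per-replica health in [0, 1] (e.g. response-rate ratio)."""
    above_high = sum(1 for s in replica_scores if s >= high)
    above_low = sum(1 for s in replica_scores if s >= low)
    if above_low < cl:
        # Not even CL marginally-healthy replicas: fail fast.
        raise OverloadedException("fewer than CL replicas above low threshold")
    if above_high < cl:
        # CL is reachable but strained: slow the client down.
        return "rate-limit"
    return "proceed"
```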
[jira] [Commented] (CASSANDRA-12040) If a level compaction fails due to no space it should schedule the next one
[ https://issues.apache.org/jira/browse/CASSANDRA-12040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373542#comment-15373542 ] sankalp kohli commented on CASSANDRA-12040: --- Since the patch is very small, can we also add it to 2.1.16? > If a level compaction fails due to no space it should schedule the next one > - > > Key: CASSANDRA-12040 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12040 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: sankalp kohli >Priority: Minor > Attachments: CASSANDRA-12040_3.0.diff, CASSANDRA-12040_trunk.txt > > > If a level compaction fails the space check, it aborts but next time the > compactions are scheduled it will attempt the same one. It should skip it and > go to the next so it can find smaller compactions to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
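The "skip it and go to the next" idea from CASSANDRA-12040 can be illustrated with a toy selector; everything here is invented for illustration and is not the actual compaction strategy code:

```java
import java.util.List;

// Illustrative only: pick the first candidate compaction whose estimated write
// size fits in the available space, instead of retrying the same oversized one.
public class CompactionPicker {
    // Returns the index of the first candidate that fits, or -1 if none do.
    public static int firstThatFits(List<Long> estimatedWriteSizes, long availableBytes) {
        for (int i = 0; i < estimatedWriteSizes.size(); i++)
            if (estimatedWriteSizes.get(i) <= availableBytes)
                return i;   // skip larger candidates rather than aborting outright
        return -1;
    }
}
```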
[jira] [Comment Edited] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator
[ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373537#comment-15373537 ] Jeremiah Jordan edited comment on CASSANDRA-9318 at 7/12/16 7:31 PM: - [~jbellis] then we can default this to off. I have interacted with *many* people using Cassandra that would actually like to see some rate limiting applied for cases 1 and 2 such that things don't fall over (shouldn't happen with new hints hopefully) or even hint like crazy. Turning this on for them would allow that to happen. was (Author: jjordan): [~jbellis] then we can default this to off. I have interacted with *many* people using Cassandra that would actually like to see some rate limiting applied for cases 1 and 2 such that things don't fall over (shouldn't happen with new hints hopefully) or even hint like crazy. Turning this on would allow that to happen. > Bound the number of in-flight requests at the coordinator > - > > Key: CASSANDRA-9318 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9318 > Project: Cassandra > Issue Type: Improvement > Components: Local Write-Read Paths, Streaming and Messaging >Reporter: Ariel Weisberg >Assignee: Sergio Bossa > Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, > limit.btm, no_backpressure.png > > > It's possible to somewhat bound the amount of load accepted into the cluster > by bounding the number of in-flight requests and request bytes. > An implementation might do something like track the number of outstanding > bytes and requests and if it reaches a high watermark disable read on client > connections until it goes back below some low watermark. > Need to make sure that disabling read on the client connection won't > introduce other issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator
[ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373537#comment-15373537 ] Jeremiah Jordan commented on CASSANDRA-9318: [~jbellis] then we can default this to off. I have interacted with *many* people using Cassandra that would actually like to see some rate limiting applied for cases 1 and 2 such that things don't fall over (shouldn't happen with new hints hopefully) or even hint like crazy. Turning this on would allow that to happen. > Bound the number of in-flight requests at the coordinator > - > > Key: CASSANDRA-9318 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9318 > Project: Cassandra > Issue Type: Improvement > Components: Local Write-Read Paths, Streaming and Messaging >Reporter: Ariel Weisberg >Assignee: Sergio Bossa > Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, > limit.btm, no_backpressure.png > > > It's possible to somewhat bound the amount of load accepted into the cluster > by bounding the number of in-flight requests and request bytes. > An implementation might do something like track the number of outstanding > bytes and requests and if it reaches a high watermark disable read on client > connections until it goes back below some low watermark. > Need to make sure that disabling read on the client connection won't > introduce other issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator
[ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373512#comment-15373512 ] Jonathan Ellis commented on CASSANDRA-9318: --- The more I think about it the more I think the entire approach may be a bad fit for Cassandra. Consider: # If a node has a "hiccup" of slow performance, e.g. due to a GC pause, we want to hint those writes and return success to the client. No need to rate limit. # If a node has a sustained period of slow performance, we want to hint those writes and return success to the client. No need to rate limit, unless we are overwhelmed with hints. (Not sure if hint overload is actually a problem with the new file based hints.) # Where we DO want to rate limit is when the client is throwing more updates at the coordinator than the system can handle, whether that is for a single node or globally across many nodes. So I see this approach as doing the wrong thing for 1 and 2 and only partially helping with 3. Put another way: we do NOT want to limit performance to the slowest node in a set of replicas. That is kind of the opposite of the redundancy we want to provide. > Bound the number of in-flight requests at the coordinator > - > > Key: CASSANDRA-9318 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9318 > Project: Cassandra > Issue Type: Improvement > Components: Local Write-Read Paths, Streaming and Messaging >Reporter: Ariel Weisberg >Assignee: Sergio Bossa > Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, > limit.btm, no_backpressure.png > > > It's possible to somewhat bound the amount of load accepted into the cluster > by bounding the number of in-flight requests and request bytes. > An implementation might do something like track the number of outstanding > bytes and requests and if it reaches a high watermark disable read on client > connections until it goes back below some low watermark. 
> Need to make sure that disabling read on the client connection won't > introduce other issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
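The high/low watermark mechanism from the issue description — stop reading from client connections past a high watermark, resume once in-flight bytes drain below a low watermark — can be sketched as a small counter with hysteresis. All names here are invented; in a real Netty pipeline the pause would be something like `channel.config().setAutoRead(false)`.

```java
// Sketch of the watermark idea: pause client reads above the high watermark,
// resume only after draining below the low watermark (hysteresis).
public class InflightLimiter {
    private final long low, high;
    private long inflightBytes;
    private boolean readsPaused;

    public InflightLimiter(long low, long high) { this.low = low; this.high = high; }

    public void onRequest(long bytes) {
        inflightBytes += bytes;
        if (inflightBytes >= high)
            readsPaused = true;   // stop reading from the client connection
    }

    public void onResponse(long bytes) {
        inflightBytes -= bytes;
        if (inflightBytes <= low)
            readsPaused = false;  // resume reading from the client
    }

    public boolean readsPaused() { return readsPaused; }
}
```

The gap between the two watermarks avoids flapping on and off at a single threshold.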
[jira] [Comment Edited] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator
[ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373512#comment-15373512 ] Jonathan Ellis edited comment on CASSANDRA-9318 at 7/12/16 7:23 PM: The more I think about it the more I think the entire approach may be a bad fit for Cassandra. Consider: # If a node has a "hiccup" of slow performance, e.g. due to a GC pause, we want to hint those writes and return success to the client. No need to rate limit. # If a node has a sustained period of slow performance, we want to hint those writes and return success to the client. No need to rate limit, unless we are overwhelmed with hints. (Not sure if hint overload is actually a problem with the new file based hints.) # Where we DO want to rate limit is when the client is throwing more updates at the coordinator than the system can handle, whether that is for a single token range or globally across all nodes. So I see this approach as doing the wrong thing for 1 and 2 and only partially helping with 3. Put another way: we do NOT want to limit performance to the slowest node in a set of replicas. That is kind of the opposite of the redundancy we want to provide. was (Author: jbellis): The more I think about it the more I think the entire approach may be a bad fit for Cassandra. Consider: # If a node has a "hiccup" of slow performance, e.g. due to a GC pause, we want to hint those writes and return success to the client. No need to rate limit. # If a node has a sustained period of slow performance, we want to hint those writes and return success to the client. No need to rate limit, unless we are overwhelmed with hints. (Not sure if hint overload is actually a problem with the new file based hints.) # Where we DO want to rate limit is when the client is throwing more updates at the coordinator than the system can handle, whether that is for a single token range or globally across many nodes. 
So I see this approach as doing the wrong thing for 1 and 2 and only partially helping with 3. Put another way: we do NOT want to limit performance to the slowest node in a set of replicas. That is kind of the opposite of the redundancy we want to provide. > Bound the number of in-flight requests at the coordinator > - > > Key: CASSANDRA-9318 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9318 > Project: Cassandra > Issue Type: Improvement > Components: Local Write-Read Paths, Streaming and Messaging >Reporter: Ariel Weisberg >Assignee: Sergio Bossa > Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, > limit.btm, no_backpressure.png > > > It's possible to somewhat bound the amount of load accepted into the cluster > by bounding the number of in-flight requests and request bytes. > An implementation might do something like track the number of outstanding > bytes and requests and if it reaches a high watermark disable read on client > connections until it goes back below some low watermark. > Need to make sure that disabling read on the client connection won't > introduce other issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator
[ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373512#comment-15373512 ] Jonathan Ellis edited comment on CASSANDRA-9318 at 7/12/16 7:22 PM: The more I think about it the more I think the entire approach may be a bad fit for Cassandra. Consider: # If a node has a "hiccup" of slow performance, e.g. due to a GC pause, we want to hint those writes and return success to the client. No need to rate limit. # If a node has a sustained period of slow performance, we want to hint those writes and return success to the client. No need to rate limit, unless we are overwhelmed with hints. (Not sure if hint overload is actually a problem with the new file based hints.) # Where we DO want to rate limit is when the client is throwing more updates at the coordinator than the system can handle, whether that is for a single token range or globally across many nodes. So I see this approach as doing the wrong thing for 1 and 2 and only partially helping with 3. Put another way: we do NOT want to limit performance to the slowest node in a set of replicas. That is kind of the opposite of the redundancy we want to provide. was (Author: jbellis): The more I think about it the more I think the entire approach may be a bad fit for Cassandra. Consider: # If a node has a "hiccup" of slow performance, e.g. due to a GC pause, we want to hint those writes and return success to the client. No need to rate limit. # If a node has a sustained period of slow performance, we want to hint those writes and return success to the client. No need to rate limit, unless we are overwhelmed with hints. (Not sure if hint overload is actually a problem with the new file based hints.) # Where we DO want to rate limit is when the client is throwing more updates at the coordinator than the system can handle, whether that is for a single node or globally across many nodes. 
So I see this approach as doing the wrong thing for 1 and 2 and only partially helping with 3. Put another way: we do NOT want to limit performance to the slowest node in a set of replicas. That is kind of the opposite of the redundancy we want to provide. > Bound the number of in-flight requests at the coordinator > - > > Key: CASSANDRA-9318 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9318 > Project: Cassandra > Issue Type: Improvement > Components: Local Write-Read Paths, Streaming and Messaging >Reporter: Ariel Weisberg >Assignee: Sergio Bossa > Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, > limit.btm, no_backpressure.png > > > It's possible to somewhat bound the amount of load accepted into the cluster > by bounding the number of in-flight requests and request bytes. > An implementation might do something like track the number of outstanding > bytes and requests and if it reaches a high watermark disable read on client > connections until it goes back below some low watermark. > Need to make sure that disabling read on the client connection won't > introduce other issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12181) Include table name in "Cannot get comparator" exception
[ https://issues.apache.org/jira/browse/CASSANDRA-12181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sankalp kohli updated CASSANDRA-12181: -- Status: Patch Available (was: Open) > Include table name in "Cannot get comparator" exception > --- > > Key: CASSANDRA-12181 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12181 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: sankalp kohli >Priority: Trivial > Attachments: CASSANDRA-12181_3.0.txt > > > Having table name will help in debugging the following exception. > ERROR [MutationStage:xx] CassandraDaemon.java (line 199) Exception in thread > Thread[MutationStage:3788,5,main] > clusterName=itms8shared20 > java.lang.RuntimeException: Cannot get comparator 2 in > org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type). > > This might be due to a mismatch between the schema and the data read -- This message was sent by Atlassian JIRA (v6.3.4#6332)
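The improvement asked for in CASSANDRA-12181 amounts to threading the table identity into the exception message. A minimal, hypothetical sketch (the helper and its signature are invented, not the patch itself):

```java
// Illustrative sketch: include keyspace/table in the "Cannot get comparator"
// message so the failing table is obvious from the log line alone.
public class ComparatorError {
    public static RuntimeException cannotGetComparator(int i, String comparatorType,
                                                       String keyspace, String table) {
        return new RuntimeException(String.format(
            "Cannot get comparator %d in %s for table %s.%s. " +
            "This might be due to a mismatch between the schema and the data read",
            i, comparatorType, keyspace, table));
    }
}
```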
[jira] [Updated] (CASSANDRA-12181) Include table name in "Cannot get comparator" exception
[ https://issues.apache.org/jira/browse/CASSANDRA-12181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sankalp kohli updated CASSANDRA-12181: -- Attachment: CASSANDRA-12181_3.0.txt > Include table name in "Cannot get comparator" exception > --- > > Key: CASSANDRA-12181 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12181 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: sankalp kohli >Priority: Trivial > Attachments: CASSANDRA-12181_3.0.txt > > > Having table name will help in debugging the following exception. > ERROR [MutationStage:xx] CassandraDaemon.java (line 199) Exception in thread > Thread[MutationStage:3788,5,main] > clusterName=itms8shared20 > java.lang.RuntimeException: Cannot get comparator 2 in > org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type). > > This might be due to a mismatch between the schema and the data read -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-12181) Include table name in "Cannot get comparator" exception
[ https://issues.apache.org/jira/browse/CASSANDRA-12181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sankalp kohli reassigned CASSANDRA-12181: - Assignee: sankalp kohli > Include table name in "Cannot get comparator" exception > --- > > Key: CASSANDRA-12181 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12181 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: sankalp kohli >Priority: Trivial > > Having table name will help in debugging the following exception. > ERROR [MutationStage:xx] CassandraDaemon.java (line 199) Exception in thread > Thread[MutationStage:3788,5,main] > clusterName=itms8shared20 > java.lang.RuntimeException: Cannot get comparator 2 in > org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type). > > This might be due to a mismatch between the schema and the data read -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12181) Include table name in "Cannot get comparator" exception
sankalp kohli created CASSANDRA-12181: - Summary: Include table name in "Cannot get comparator" exception Key: CASSANDRA-12181 URL: https://issues.apache.org/jira/browse/CASSANDRA-12181 Project: Cassandra Issue Type: Improvement Reporter: sankalp kohli Priority: Trivial Having table name will help in debugging the following exception. ERROR [MutationStage:xx] CassandraDaemon.java (line 199) Exception in thread Thread[MutationStage:3788,5,main] clusterName=itms8shared20 java.lang.RuntimeException: Cannot get comparator 2 in org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type). This might be due to a mismatch between the schema and the data read -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10993) Make read and write requests paths fully non-blocking, eliminate related stages
[ https://issues.apache.org/jira/browse/CASSANDRA-10993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373497#comment-15373497 ] Tyler Hobbs commented on CASSANDRA-10993: - I've pushed another commit to get reads from memtables working, so next I'll be working on benchmarking those against trunk and the CASSANDRA-10528 version. > Make read and write requests paths fully non-blocking, eliminate related > stages > --- > > Key: CASSANDRA-10993 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10993 > Project: Cassandra > Issue Type: Sub-task > Components: Coordination, Local Write-Read Paths >Reporter: Aleksey Yeschenko >Assignee: Tyler Hobbs > Fix For: 3.x > > > Building on work done by [~tjake] (CASSANDRA-10528), [~slebresne] > (CASSANDRA-5239), and others, convert read and write request paths to be > fully non-blocking, to enable the eventual transition from SEDA to TPC > (CASSANDRA-10989) > Eliminate {{MUTATION}}, {{COUNTER_MUTATION}}, {{VIEW_MUTATION}}, {{READ}}, > and {{READ_REPAIR}} stages, move read and write execution directly to Netty > context. > For lack of decent async I/O options on Linux, we’ll still have to retain an > extra thread pool for serving read requests for data not residing in our page > cache (CASSANDRA-5863), however. > Implementation-wise, we only have two options available to us: explicit FSMs > and chained futures. Fibers would be the third, and easiest option, but > aren’t feasible in Java without resorting to direct bytecode manipulation > (ourselves or using [quasar|https://github.com/puniverse/quasar]). > I have seen 4 implementations based on chained futures/promises now - three > in Java and one in C++ - and I’m not convinced that it’s the optimal (or > sane) choice for representing our complex logic - think 2i quorum read > requests with timeouts at all levels, read repair (blocking and > non-blocking), and speculative retries in the mix, {{SERIAL}} reads and > writes. 
> I’m currently leaning towards an implementation based on explicit FSMs, and > intend to provide a prototype - soonish - for comparison with > {{CompletableFuture}}-like variants. > Either way the transition is a relatively boring straightforward refactoring. > There are, however, some extension points on both write and read paths that > we do not control: > - authorisation implementations will have to be non-blocking. We have control > over built-in ones, but for any custom implementation we will have to execute > them in a separate thread pool > - 2i hooks on the write path will need to be non-blocking > - any trigger implementations will not be allowed to block > - UDFs and UDAs > We are further limited by API compatibility restrictions in the 3.x line, > forbidding us to alter, or add any non-{{default}} interface methods to those > extension points, so these pose a problem. > Depending on logistics, expecting to get this done in time for 3.4 or 3.6 > feature release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
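To make the "chained futures vs. explicit FSMs" trade-off above concrete, here is a toy `CompletableFuture` chain of the kind the description is skeptical of (authorize, then read). All stage names are invented; the real paths add timeouts, read repair, and speculative retries at every level, which is where such chains become hard to follow.

```java
import java.util.concurrent.CompletableFuture;

// Toy chained-futures read path, purely to illustrate the style under discussion.
public class FutureChainSketch {
    // Java 8-compatible helper for a pre-failed future.
    static <T> CompletableFuture<T> failed(Throwable t) {
        CompletableFuture<T> f = new CompletableFuture<>();
        f.completeExceptionally(t);
        return f;
    }

    static CompletableFuture<Boolean> authorize(String user) {
        return CompletableFuture.completedFuture(!user.isEmpty());
    }

    static CompletableFuture<String> readRow(String key) {
        return CompletableFuture.completedFuture("row:" + key);
    }

    public static String read(String user, String key) {
        return authorize(user)
            .thenCompose(ok -> ok ? readRow(key)
                                  : FutureChainSketch.<String>failed(
                                        new SecurityException("unauthorized")))
            .join();
    }
}
```

An explicit FSM would instead model each of these steps as a named state with explicit transitions, which is easier to instrument but more verbose.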
[jira] [Commented] (CASSANDRA-10993) Make read and write requests paths fully non-blocking, eliminate related stages
[ https://issues.apache.org/jira/browse/CASSANDRA-10993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373490#comment-15373490 ] Tyler Hobbs commented on CASSANDRA-10993: - Yes, it should be fine for the moment. Will do. > Make read and write requests paths fully non-blocking, eliminate related > stages > --- > > Key: CASSANDRA-10993 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10993 > Project: Cassandra > Issue Type: Sub-task > Components: Coordination, Local Write-Read Paths >Reporter: Aleksey Yeschenko >Assignee: Tyler Hobbs > Fix For: 3.x > > > Building on work done by [~tjake] (CASSANDRA-10528), [~slebresne] > (CASSANDRA-5239), and others, convert read and write request paths to be > fully non-blocking, to enable the eventual transition from SEDA to TPC > (CASSANDRA-10989) > Eliminate {{MUTATION}}, {{COUNTER_MUTATION}}, {{VIEW_MUTATION}}, {{READ}}, > and {{READ_REPAIR}} stages, move read and write execution directly to Netty > context. > For lack of decent async I/O options on Linux, we’ll still have to retain an > extra thread pool for serving read requests for data not residing in our page > cache (CASSANDRA-5863), however. > Implementation-wise, we only have two options available to us: explicit FSMs > and chained futures. Fibers would be the third, and easiest option, but > aren’t feasible in Java without resorting to direct bytecode manipulation > (ourselves or using [quasar|https://github.com/puniverse/quasar]). > I have seen 4 implementations based on chained futures/promises now - three > in Java and one in C++ - and I’m not convinced that it’s the optimal (or > sane) choice for representing our complex logic - think 2i quorum read > requests with timeouts at all levels, read repair (blocking and > non-blocking), and speculative retries in the mix, {{SERIAL}} reads and > writes. 
> I’m currently leaning towards an implementation based on explicit FSMs, and > intend to provide a prototype - soonish - for comparison with > {{CompletableFuture}}-like variants. > Either way the transition is a relatively boring straightforward refactoring. > There are, however, some extension points on both write and read paths that > we do not control: > - authorisation implementations will have to be non-blocking. We have control > over built-in ones, but for any custom implementation we will have to execute > them in a separate thread pool > - 2i hooks on the write path will need to be non-blocking > - any trigger implementations will not be allowed to block > - UDFs and UDAs > We are further limited by API compatibility restrictions in the 3.x line, > forbidding us to alter, or add any non-{{default}} interface methods to those > extension points, so these pose a problem. > Depending on logistics, expecting to get this done in time for 3.4 or 3.6 > feature release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12180) Should be able to override compaction space check
[ https://issues.apache.org/jira/browse/CASSANDRA-12180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sankalp kohli updated CASSANDRA-12180: -- Attachment: CASSANDRA-12180_3.0.txt > Should be able to override compaction space check > - > > Key: CASSANDRA-12180 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12180 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: sankalp kohli >Priority: Trivial > Attachments: CASSANDRA-12180_3.0.txt > > > If there's not enough space for a compaction it won't do it and print the > exception below. Sometimes we know compaction will free up lot of space since > an ETL job could have inserted a lot of deletes. This override helps in this > case. > ERROR [CompactionExecutor:17] CassandraDaemon.java (line 258) Exception in > thread Thread > [CompactionExecutor:17,1,main] > java.lang.RuntimeException: Not enough space for compaction, estimated > sstables = 1552, expected > write size = 260540558535 > at org.apache.cassandra.db.compaction.CompactionTask.checkAvailableDiskSpace > (CompactionTask.java:306) > at > org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask. > java:106) > at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > at > org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask. > java:60) > at > org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask. 
> java:59) > at > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run > (CompactionManager.java:198) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
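The override proposed in CASSANDRA-12180 boils down to letting the operator bypass the free-space estimate when they know the compaction will purge deleted data. A hedged sketch (the flag and method names are hypothetical, not the attached patch):

```java
// Sketch of the proposed operator override of the compaction space check.
public class SpaceCheckSketch {
    // Returns true when the compaction should be aborted for lack of space.
    public static boolean shouldAbort(long expectedWriteSize, long availableBytes,
                                      boolean overrideSpaceCheck) {
        if (overrideSpaceCheck)
            return false;                        // operator opted out of the check
        return expectedWriteSize > availableBytes;
    }
}
```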
[jira] [Updated] (CASSANDRA-12180) Should be able to override compaction space check
[ https://issues.apache.org/jira/browse/CASSANDRA-12180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sankalp kohli updated CASSANDRA-12180: -- Status: Patch Available (was: Open) > Should be able to override compaction space check > - > > Key: CASSANDRA-12180 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12180 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: sankalp kohli >Priority: Trivial > Attachments: CASSANDRA-12180_3.0.txt > > > If there's not enough space for a compaction it won't do it and print the > exception below. Sometimes we know compaction will free up lot of space since > an ETL job could have inserted a lot of deletes. This override helps in this > case. > ERROR [CompactionExecutor:17] CassandraDaemon.java (line 258) Exception in > thread Thread > [CompactionExecutor:17,1,main] > java.lang.RuntimeException: Not enough space for compaction, estimated > sstables = 1552, expected > write size = 260540558535 > at org.apache.cassandra.db.compaction.CompactionTask.checkAvailableDiskSpace > (CompactionTask.java:306) > at > org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask. > java:106) > at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > at > org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask. > java:60) > at > org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask. 
> java:59) > at > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run > (CompactionManager.java:198) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-12180) Should be able to override compaction space check
[ https://issues.apache.org/jira/browse/CASSANDRA-12180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sankalp kohli reassigned CASSANDRA-12180: - Assignee: sankalp kohli > Should be able to override compaction space check > - > > Key: CASSANDRA-12180 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12180 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: sankalp kohli >Priority: Trivial > > If there's not enough space for a compaction it won't do it and print the > exception below. Sometimes we know compaction will free up lot of space since > an ETL job could have inserted a lot of deletes. This override helps in this > case. > ERROR [CompactionExecutor:17] CassandraDaemon.java (line 258) Exception in > thread Thread > [CompactionExecutor:17,1,main] > java.lang.RuntimeException: Not enough space for compaction, estimated > sstables = 1552, expected > write size = 260540558535 > at org.apache.cassandra.db.compaction.CompactionTask.checkAvailableDiskSpace > (CompactionTask.java:306) > at > org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask. > java:106) > at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > at > org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask. > java:60) > at > org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask. > java:59) > at > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run > (CompactionManager.java:198) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12179) Make DynamicEndpointSnitch dynamic_snitch_update_interval_in_ms a JMX Prop
[ https://issues.apache.org/jira/browse/CASSANDRA-12179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sankalp kohli updated CASSANDRA-12179: -- Issue Type: Improvement (was: Bug) > Make DynamicEndpointSnitch dynamic_snitch_update_interval_in_ms a JMX Prop > --- > > Key: CASSANDRA-12179 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12179 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: sankalp kohli >Priority: Trivial > Attachments: CASSANDRA-12179_3.0.txt > > > Need to expose dynamic_snitch_update_interval_in_ms so that it does not > require a bounce. This is useful for large clusters where we can change this > value and see the impact. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
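Making `dynamic_snitch_update_interval_in_ms` changeable without a bounce means exposing it as a mutable JMX attribute. A minimal sketch, assuming an invented MBean interface (this is not Cassandra's actual `DynamicEndpointSnitchMBean`):

```java
// Sketch of exposing the snitch update interval as a runtime-mutable property.
public class SnitchIntervalSketch {
    // Hypothetical MBean interface: JMX exposes the getter/setter pair.
    public interface DynamicSnitchMBean {
        long getUpdateIntervalMillis();
        void setUpdateIntervalMillis(long millis);
    }

    public static class DynamicSnitch implements DynamicSnitchMBean {
        private volatile long updateIntervalMillis = 100; // illustrative yaml default

        public long getUpdateIntervalMillis() { return updateIntervalMillis; }

        public void setUpdateIntervalMillis(long millis) {
            if (millis <= 0)
                throw new IllegalArgumentException("interval must be positive");
            updateIntervalMillis = millis; // picked up on the next reschedule
        }
    }
}
```

The `volatile` field lets the periodic score-update task observe the new interval without restarting the node.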
[jira] [Created] (CASSANDRA-12180) Should be able to override compaction space check
sankalp kohli created CASSANDRA-12180: - Summary: Should be able to override compaction space check Key: CASSANDRA-12180 URL: https://issues.apache.org/jira/browse/CASSANDRA-12180 Project: Cassandra Issue Type: Improvement Reporter: sankalp kohli Priority: Trivial If there's not enough space for a compaction, it won't run it and will print the exception below. Sometimes we know compaction will free up a lot of space, since an ETL job could have inserted a lot of deletes. This override helps in that case. ERROR [CompactionExecutor:17] CassandraDaemon.java (line 258) Exception in thread Thread [CompactionExecutor:17,1,main] java.lang.RuntimeException: Not enough space for compaction, estimated sstables = 1552, expected write size = 260540558535 at org.apache.cassandra.db.compaction.CompactionTask.checkAvailableDiskSpace (CompactionTask.java:306) at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask. java:106) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask. java:60) at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask. java:59) at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run (CompactionManager.java:198) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12132) Add metric to expose how many token ranges have one or more replicas down
[ https://issues.apache.org/jira/browse/CASSANDRA-12132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-12132: Summary: Add metric to expose how many token ranges have one or more replicas down (was: two cassandra nodes have common records,we need to calculate it) > Add metric to expose how many token ranges have one or more replicas down > - > > Key: CASSANDRA-12132 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12132 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: stone > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
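The metric proposed above reduces to counting token ranges whose replica set intersects the set of down endpoints. A back-of-envelope sketch with simplified types (ranges as strings, endpoints as strings); a real implementation would consult TokenMetadata and the failure detector.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class UnavailableRangeMetric
{
    // Count ranges with at least one replica in the down set.
    static long rangesWithDownReplica(Map<String, List<String>> replicasByRange, Set<String> downEndpoints)
    {
        long count = 0;
        for (List<String> replicas : replicasByRange.values())
        {
            for (String replica : replicas)
            {
                if (downEndpoints.contains(replica))
                {
                    count++;
                    break; // one down replica is enough to count this range
                }
            }
        }
        return count;
    }

    public static void main(String[] args)
    {
        Map<String, List<String>> replicasByRange = new HashMap<>();
        replicasByRange.put("(0, 100]", Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3"));
        replicasByRange.put("(100, 200]", Arrays.asList("10.0.0.2", "10.0.0.3", "10.0.0.4"));
        System.out.println(rangesWithDownReplica(replicasByRange, Collections.singleton("10.0.0.1")));
    }
}
```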
[jira] [Updated] (CASSANDRA-12179) Make DynamicEndpointSnitch dynamic_snitch_update_interval_in_ms a JMX Prop
[ https://issues.apache.org/jira/browse/CASSANDRA-12179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sankalp kohli updated CASSANDRA-12179: -- Attachment: CASSANDRA-12179_3.0.txt > Make DynamicEndpointSnitch dynamic_snitch_update_interval_in_ms a JMX Prop > --- > > Key: CASSANDRA-12179 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12179 > Project: Cassandra > Issue Type: Bug >Reporter: sankalp kohli >Assignee: sankalp kohli >Priority: Trivial > Attachments: CASSANDRA-12179_3.0.txt > > > Need to expose dynamic_snitch_update_interval_in_ms so that it does not > require a bounce. This is useful for large clusters where we can change this > value and see the impact. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12179) Make DynamicEndpointSnitch dynamic_snitch_update_interval_in_ms a JMX Prop
[ https://issues.apache.org/jira/browse/CASSANDRA-12179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sankalp kohli updated CASSANDRA-12179: -- Status: Patch Available (was: Open) > Make DynamicEndpointSnitch dynamic_snitch_update_interval_in_ms a JMX Prop > --- > > Key: CASSANDRA-12179 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12179 > Project: Cassandra > Issue Type: Bug >Reporter: sankalp kohli >Assignee: sankalp kohli >Priority: Trivial > Attachments: CASSANDRA-12179_3.0.txt > > > Need to expose dynamic_snitch_update_interval_in_ms so that it does not > require a bounce. This is useful for large clusters where we can change this > value and see the impact. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12179) Make DynamicEndpointSnitch dynamic_snitch_update_interval_in_ms a JMX Prop
sankalp kohli created CASSANDRA-12179: - Summary: Make DynamicEndpointSnitch dynamic_snitch_update_interval_in_ms a JMX Prop Key: CASSANDRA-12179 URL: https://issues.apache.org/jira/browse/CASSANDRA-12179 Project: Cassandra Issue Type: Bug Reporter: sankalp kohli Priority: Trivial Need to expose dynamic_snitch_update_interval_in_ms so that it does not require a bounce. This is useful for large clusters where we can change this value and see the impact. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
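Exposing a config value so it can be changed without a bounce is the standard writable-JMX-attribute pattern the ticket asks for. A hedged sketch of that pattern; the MBean object name, interface, and attribute below are illustrative and do not match the actual DynamicEndpointSnitch MBean surface.

```java
import java.lang.management.ManagementFactory;
import javax.management.Attribute;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Standard MBean contract: interface name = implementation name + "MBean".
interface SnitchTuningMBean
{
    int getUpdateIntervalInMs();
    void setUpdateIntervalInMs(int intervalInMs);
}

class SnitchTuning implements SnitchTuningMBean
{
    // 100 ms is dynamic_snitch_update_interval_in_ms's yaml default
    private volatile int updateIntervalInMs = 100;

    public int getUpdateIntervalInMs() { return updateIntervalInMs; }
    public void setUpdateIntervalInMs(int intervalInMs) { updateIntervalInMs = intervalInMs; }

    void register() throws Exception
    {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        mbs.registerMBean(this, new ObjectName("org.example:type=SnitchTuning"));
    }

    public static void main(String[] args) throws Exception
    {
        SnitchTuning tuning = new SnitchTuning();
        tuning.register();
        // An operator (e.g. via jconsole) can now flip the value at runtime:
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("org.example:type=SnitchTuning");
        mbs.setAttribute(name, new Attribute("UpdateIntervalInMs", 500));
        System.out.println(tuning.getUpdateIntervalInMs());
    }
}
```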
[jira] [Assigned] (CASSANDRA-12179) Make DynamicEndpointSnitch dynamic_snitch_update_interval_in_ms a JMX Prop
[ https://issues.apache.org/jira/browse/CASSANDRA-12179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sankalp kohli reassigned CASSANDRA-12179: - Assignee: sankalp kohli > Make DynamicEndpointSnitch dynamic_snitch_update_interval_in_ms a JMX Prop > --- > > Key: CASSANDRA-12179 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12179 > Project: Cassandra > Issue Type: Bug >Reporter: sankalp kohli >Assignee: sankalp kohli >Priority: Trivial > > Need to expose dynamic_snitch_update_interval_in_ms so that it does not > require a bounce. This is useful for large clusters where we can change this > value and see the impact. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12040) If a level compaction fails due to no space it should schedule the next one
[ https://issues.apache.org/jira/browse/CASSANDRA-12040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373470#comment-15373470 ] sankalp kohli commented on CASSANDRA-12040: --- You are right...reduced my patch to a single line :). +1 > If a level compaction fails due to no space it should schedule the next one > - > > Key: CASSANDRA-12040 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12040 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: sankalp kohli >Priority: Minor > Attachments: CASSANDRA-12040_3.0.diff, CASSANDRA-12040_trunk.txt > > > If a level compaction fails the space check, it aborts but next time the > compactions are scheduled it will attempt the same one. It should skip it and > go to the next so it can find smaller compactions to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12178) Add prefixes to the name of snapshots created before a truncate or drop
[ https://issues.apache.org/jira/browse/CASSANDRA-12178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-12178: Attachment: 12178-trunk.txt > Add prefixes to the name of snapshots created before a truncate or drop > --- > > Key: CASSANDRA-12178 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12178 > Project: Cassandra > Issue Type: Improvement >Reporter: Geoffrey Yu >Assignee: Geoffrey Yu >Priority: Minor > Attachments: 12178-trunk.txt > > > It would be useful to be able to identify snapshots that are taken because a > table was truncated or dropped. We can do this by prepending a prefix to > snapshot names for snapshots that are created before a truncate/drop. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12178) Add prefixes to the name of snapshots created before a truncate or drop
[ https://issues.apache.org/jira/browse/CASSANDRA-12178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-12178: Status: Patch Available (was: Open) > Add prefixes to the name of snapshots created before a truncate or drop > --- > > Key: CASSANDRA-12178 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12178 > Project: Cassandra > Issue Type: Improvement >Reporter: Geoffrey Yu >Assignee: Geoffrey Yu >Priority: Minor > Attachments: 12178-trunk.txt > > > It would be useful to be able to identify snapshots that are taken because a > table was truncated or dropped. We can do this by prepending a prefix to > snapshot names for snapshots that are created before a truncate/drop. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12178) Add prefixes to the name of snapshots created before a truncate or drop
Geoffrey Yu created CASSANDRA-12178: --- Summary: Add prefixes to the name of snapshots created before a truncate or drop Key: CASSANDRA-12178 URL: https://issues.apache.org/jira/browse/CASSANDRA-12178 Project: Cassandra Issue Type: Improvement Reporter: Geoffrey Yu Assignee: Geoffrey Yu Priority: Minor It would be useful to be able to identify snapshots that are taken because a table was truncated or dropped. We can do this by prepending a prefix to snapshot names for snapshots that are created before a truncate/drop. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
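The proposal above amounts to prepending a marker to the snapshot name. A sketch of such a scheme; the exact prefixes and the timestamp placement are assumptions for illustration, not necessarily what the attached patch uses.

```java
// Hypothetical naming scheme: <prefix><timestamp>-<table>.
class SnapshotNames
{
    static final String TRUNCATE_PREFIX = "truncated-";
    static final String DROP_PREFIX = "dropped-";

    static String truncateSnapshotName(long timestampMillis, String table)
    {
        return TRUNCATE_PREFIX + timestampMillis + "-" + table;
    }

    static String dropSnapshotName(long timestampMillis, String table)
    {
        return DROP_PREFIX + timestampMillis + "-" + table;
    }

    public static void main(String[] args)
    {
        // -> truncated-1468350000000-sport
        System.out.println(truncateSnapshotName(1468350000000L, "sport"));
    }
}
```

With a fixed prefix, tooling can identify (and, say, age out) truncate/drop snapshots with a simple `startsWith` check.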
[jira] [Updated] (CASSANDRA-12016) Create MessagingService mocking classes
[ https://issues.apache.org/jira/browse/CASSANDRA-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-12016: Status: Ready to Commit (was: Patch Available) > Create MessagingService mocking classes > --- > > Key: CASSANDRA-12016 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12016 > Project: Cassandra > Issue Type: New Feature > Components: Testing >Reporter: Stefan Podkowinski >Assignee: Stefan Podkowinski > Fix For: 3.10 > > Attachments: 12016-trunk.patch > > > Interactions between clients and nodes in the cluster are taking place by > exchanging messages through the {{MessagingService}}. Black box testing for > message based systems is usually pretty easy, as we're just dealing with > messages in/out. My suggestion would be to add tests that make use of this > fact by mocking message exchanges via MessagingService. Given the right use > case, this would turn out to be a much simpler and more efficient alternative > for dtests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12016) Create MessagingService mocking classes
[ https://issues.apache.org/jira/browse/CASSANDRA-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-12016: Resolution: Fixed Fix Version/s: 3.10 Status: Resolved (was: Ready to Commit) The tests look good, +1 Committed as {{7751588f7715386db0a92bfc4b5db9f151e15133}} to trunk. Thanks! > Create MessagingService mocking classes > --- > > Key: CASSANDRA-12016 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12016 > Project: Cassandra > Issue Type: New Feature > Components: Testing >Reporter: Stefan Podkowinski >Assignee: Stefan Podkowinski > Fix For: 3.10 > > Attachments: 12016-trunk.patch > > > Interactions between clients and nodes in the cluster are taking place by > exchanging messages through the {{MessagingService}}. Black box testing for > message based systems is usually pretty easy, as we're just dealing with > messages in/out. My suggestion would be to add tests that make use of this > fact by mocking message exchanges via MessagingService. Given the right use > case, this would turn out to be a much simpler and more efficient alternative > for dtests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
cassandra git commit: Create MessagingService mocking classes
Repository: cassandra Updated Branches: refs/heads/trunk 11b93152d -> 7751588f7 Create MessagingService mocking classes Patch by Stefan Podkowinski; reviewed by Tyler Hobbs for CASSANDRA-12016 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7751588f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7751588f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7751588f Branch: refs/heads/trunk Commit: 7751588f7715386db0a92bfc4b5db9f151e15133 Parents: 11b9315 Author: Stefan Podkowinski Authored: Tue Jul 12 12:04:23 2016 -0500 Committer: Tyler Hobbs Committed: Tue Jul 12 12:04:23 2016 -0500 -- .../apache/cassandra/gms/FailureDetector.java | 9 +- .../apache/cassandra/net/MessagingService.java | 5 + .../cassandra/service/ActiveRepairService.java | 3 +- src/java/org/apache/cassandra/utils/Clock.java | 80 +++ .../org/apache/cassandra/utils/ExpiringMap.java | 4 +- test/unit/org/apache/cassandra/net/Matcher.java | 32 +++ .../apache/cassandra/net/MatcherResponse.java | 208 + .../cassandra/net/MockMessagingService.java | 144 .../cassandra/net/MockMessagingServiceTest.java | 97 .../apache/cassandra/net/MockMessagingSpy.java | 234 +++ .../cassandra/utils/FreeRunningClock.java | 46 11 files changed, 855 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/7751588f/src/java/org/apache/cassandra/gms/FailureDetector.java -- diff --git a/src/java/org/apache/cassandra/gms/FailureDetector.java b/src/java/org/apache/cassandra/gms/FailureDetector.java index 964b4ad..7d8b88b 100644 --- a/src/java/org/apache/cassandra/gms/FailureDetector.java +++ b/src/java/org/apache/cassandra/gms/FailureDetector.java @@ -36,6 +36,7 @@ import org.slf4j.LoggerFactory; import org.apache.cassandra.config.DatabaseDescriptor; import org.apache.cassandra.io.FSWriteError; import org.apache.cassandra.io.util.FileUtils; +import org.apache.cassandra.utils.Clock; import 
org.apache.cassandra.utils.FBUtilities; /** @@ -52,7 +53,7 @@ public class FailureDetector implements IFailureDetector, FailureDetectorMBean private static final int DEBUG_PERCENTAGE = 80; // if the phi is larger than this percentage of the max, log a debug message private static final long DEFAULT_MAX_PAUSE = 5000L * 100L; // 5 seconds private static final long MAX_LOCAL_PAUSE_IN_NANOS = getMaxLocalPause(); -private long lastInterpret = System.nanoTime(); +private long lastInterpret = Clock.instance.nanoTime(); private long lastPause = 0L; private static long getMaxLocalPause() @@ -252,7 +253,7 @@ public class FailureDetector implements IFailureDetector, FailureDetectorMBean public void report(InetAddress ep) { -long now = System.nanoTime(); +long now = Clock.instance.nanoTime(); ArrivalWindow heartbeatWindow = arrivalSamples.get(ep); if (heartbeatWindow == null) { @@ -279,7 +280,7 @@ public class FailureDetector implements IFailureDetector, FailureDetectorMBean { return; } -long now = System.nanoTime(); +long now = Clock.instance.nanoTime(); long diff = now - lastInterpret; lastInterpret = now; if (diff > MAX_LOCAL_PAUSE_IN_NANOS) @@ -288,7 +289,7 @@ public class FailureDetector implements IFailureDetector, FailureDetectorMBean lastPause = now; return; } -if (System.nanoTime() - lastPause < MAX_LOCAL_PAUSE_IN_NANOS) +if (Clock.instance.nanoTime() - lastPause < MAX_LOCAL_PAUSE_IN_NANOS) { logger.debug("Still not marking nodes down due to local pause"); return; http://git-wip-us.apache.org/repos/asf/cassandra/blob/7751588f/src/java/org/apache/cassandra/net/MessagingService.java -- diff --git a/src/java/org/apache/cassandra/net/MessagingService.java b/src/java/org/apache/cassandra/net/MessagingService.java index 954bd9d..54d1183 100644 --- a/src/java/org/apache/cassandra/net/MessagingService.java +++ b/src/java/org/apache/cassandra/net/MessagingService.java @@ -353,6 +353,11 @@ public final class MessagingService implements MessagingServiceMBean 
messageSinks.add(sink); } +public void removeMessageSink(IMessageSink sink) +{ +messageSinks.remove(sink); +} + public void clearMessageSinks() { messageSinks.clear();
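The `Clock` changes in the diff above follow a pluggable-clock pattern: call sites use `Clock.instance.nanoTime()` instead of `System.nanoTime()`, so a test can substitute a manually advanced clock such as the `FreeRunningClock` added under test/. A minimal sketch of that pattern (Cassandra's version selects the implementation differently; here the field is simply assigned, and the class bodies are illustrative, not copies of the committed code):

```java
// Production code asks Clock.instance for the time; the default delegates
// to System.nanoTime().
class Clock
{
    static Clock instance = new Clock();

    long nanoTime() { return System.nanoTime(); }
}

// Test clock that only moves when told to, making time-dependent logic
// (e.g. FailureDetector pause detection) deterministic.
class FreeRunningClock extends Clock
{
    private long nanos = 0;

    @Override
    long nanoTime() { return nanos; }

    void advance(long millis) { nanos += millis * 1_000_000L; }
}

class ClockDemo
{
    public static void main(String[] args)
    {
        FreeRunningClock clock = new FreeRunningClock();
        Clock.instance = clock; // the test swaps the clock in
        long before = Clock.instance.nanoTime();
        clock.advance(5);       // simulate a deterministic 5 ms pause
        System.out.println(Clock.instance.nanoTime() - before);
    }
}
```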
[jira] [Commented] (CASSANDRA-12177) sstabledump fails if sstable path includes dot
[ https://issues.apache.org/jira/browse/CASSANDRA-12177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373231#comment-15373231 ] Chris Lohfink commented on CASSANDRA-12177: --- What version? This came up in [here|https://issues.apache.org/jira/browse/CASSANDRA-11330?focusedCommentId=15226927=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15226927] and I thought this was fixed in CASSANDRA-12002 > sstabledump fails if sstable path includes dot > -- > > Key: CASSANDRA-12177 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12177 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Keith Wansbrough > > If there is a dot in the file path passed to sstabledump, it fails with an > error {{partitioner org.apache.cassandra.dht.Murmur3Partitioner does not > match system partitioner org.apache.cassandra.dht.LocalPartitioner.}} > I can work around this by renaming the directory containing the file, but it > seems like a bug. I expected the directory name to be irrelevant. > Example (assumes you have a keyspace test containing a table called sport, > but should repro with any keyspace/table): > {code} > $ cp -a /var/lib/cassandra/data/test/sport-ebe76350474e11e6879fc5e30fbb0e96 > testdir > $ sstabledump testdir/mb-1-big-Data.db > [ > { > "partition" : { > "key" : [ "2" ], > "position" : 0 > }, > "rows" : [ > { > "type" : "row", > "position" : 18, > "liveness_info" : { "tstamp" : "2016-07-11T10:15:22.766107Z" }, > "cells" : [ > { "name" : "score", "value" : "Golf" }, > { "name" : "sport_type", "value" : "5" } > ] > } > ] > } > ] > $ cp -a /var/lib/cassandra/data/test/sport-ebe76350474e11e6879fc5e30fbb0e96 > test.dir > $ sstabledump test.dir/mb-1-big-Data.db > ERROR 15:02:52 Cannot open /home/centos/test.dir/mb-1-big; partitioner > org.apache.cassandra.dht.Murmur3Partitioner does not match system partitioner > org.apache.cassandra.dht.LocalPartitioner. 
Note that the default partitioner > starting with Cassandra 1.2 is Murmur3Partitioner, so you will need to edit > that to match your old partitioner if upgrading. > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
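The reporter's expectation is that only the trailing path components matter. A simplified model of that behaviour, not Cassandra's actual Descriptor parsing: read keyspace and table from the last two directory names of the conventional `<data_dir>/<keyspace>/<table>-<id>/<file>` layout, which makes dots elsewhere in the path harmless.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class SSTablePathInfo
{
    // Derive (keyspace, table) from path components only; never string-search
    // the full path, so dotted ancestor directories cannot confuse the result.
    static String[] keyspaceAndTable(String sstablePath)
    {
        Path file = Paths.get(sstablePath);
        String tableDir = file.getParent().getFileName().toString();
        String keyspace = file.getParent().getParent().getFileName().toString();
        int dash = tableDir.lastIndexOf('-'); // strip the table id suffix
        String table = dash > 0 ? tableDir.substring(0, dash) : tableDir;
        return new String[] { keyspace, table };
    }

    public static void main(String[] args)
    {
        // a dot in an ancestor directory does not change the result
        String[] kt = keyspaceAndTable("/home/centos/my.backups/test/sport-ebe76350474e11e6879fc5e30fbb0e96/mb-1-big-Data.db");
        System.out.println(kt[0] + "." + kt[1]);
    }
}
```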
[jira] [Commented] (CASSANDRA-12045) Cassandra failure during write query at consistency LOCAL_QUORUM
[ https://issues.apache.org/jira/browse/CASSANDRA-12045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373175#comment-15373175 ] Tyler Hobbs commented on CASSANDRA-12045: - 2GB is the theoretical max value size for a column. There are other limits (like the {{commitlog_segment_size_in_mb}}) that may prevent you from hitting that limit. In any case, it's not a good idea to put huge values into a single column or row. I recommend chunking values over 10mb into multiple rows. Plus, as Sylvain says, compression would probably save you a great deal of space. > Cassandra failure during write query at consistency LOCAL_QUORUM > -- > > Key: CASSANDRA-12045 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12045 > Project: Cassandra > Issue Type: Bug > Components: CQL, Local Write-Read Paths > Environment: Eclipse java environment >Reporter: Raghavendra Pinninti > Fix For: 3.x > > Original Estimate: 12h > Remaining Estimate: 12h > > While I am writing xml file into Cassandra table column I am facing following > exception.Its a 3 node cluster and All nodes are up. 
> {noformat} > com.datastax.driver.core.exceptions.WriteFailureException: Cassandra failure > during write query at consistency LOCAL_QUORUM (2 responses were required but > only 0 replica responded, 1 failed) at > com.datastax.driver.core.exceptions.WriteFailureException.copy(WriteFailureException.java:80) > at > com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37) > at > com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245) > at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:55) > at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:39) > at DBConnection.oracle2Cassandra(DBConnection.java:267) at > DBConnection.main(DBConnection.java:292) Caused by: > com.datastax.driver.core.exceptions.WriteFailureException: Cassandra failure > during write query at consistency LOCAL_QUORUM (2 responses were required but > only 0 replica responded, 1 failed) at > com.datastax.driver.core.exceptions.WriteFailureException.copy(WriteFailureException.java:91) > at com.datastax.driver.core.Responses$Error.asException(Responses.java:119) > at > com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:180) > at > com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:186) > at > com.datastax.driver.core.RequestHandler.access$2300(RequestHandler.java:44) > at > com.datastax.driver.core.RequestHandler$SpeculativeExecution.setFinalResult(RequestHandler.java:754) > at > com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:576) > {noformat} > It would be great if someone helps me out from this situation. Thanks > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10635) Add metrics for authentication failures
[ https://issues.apache.org/jira/browse/CASSANDRA-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373176#comment-15373176 ] Soumava Ghosh commented on CASSANDRA-10635: --- [~beobal] your changes look good to me. I agree, they should be meters, the rate of these failures would be good metric to have. Being able to disambiguate would have been nice, but as you say its a different feature, which is true. > Add metrics for authentication failures > --- > > Key: CASSANDRA-10635 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10635 > Project: Cassandra > Issue Type: Improvement >Reporter: Soumava Ghosh >Assignee: Soumava Ghosh >Priority: Minor > Fix For: 3.x > > Attachments: 10635-2.1.txt, 10635-2.2.txt, 10635-3.0.txt, > 10635-dtest.patch, 10635-trunk.patch > > > There should be no auth failures on a cluster in general. > Having metrics around the authentication code would help detect clients > that are connecting to the wrong cluster or have auth incorrectly configured. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12045) Cassandra failure during write query at consistency LOCAL_QUORUM
[ https://issues.apache.org/jira/browse/CASSANDRA-12045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-12045: Description: While I am writing xml file into Cassandra table column I am facing following exception.Its a 3 node cluster and All nodes are up. {noformat} com.datastax.driver.core.exceptions.WriteFailureException: Cassandra failure during write query at consistency LOCAL_QUORUM (2 responses were required but only 0 replica responded, 1 failed) at com.datastax.driver.core.exceptions.WriteFailureException.copy(WriteFailureException.java:80) at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37) at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245) at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:55) at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:39) at DBConnection.oracle2Cassandra(DBConnection.java:267) at DBConnection.main(DBConnection.java:292) Caused by: com.datastax.driver.core.exceptions.WriteFailureException: Cassandra failure during write query at consistency LOCAL_QUORUM (2 responses were required but only 0 replica responded, 1 failed) at com.datastax.driver.core.exceptions.WriteFailureException.copy(WriteFailureException.java:91) at com.datastax.driver.core.Responses$Error.asException(Responses.java:119) at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:180) at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:186) at com.datastax.driver.core.RequestHandler.access$2300(RequestHandler.java:44) at com.datastax.driver.core.RequestHandler$SpeculativeExecution.setFinalResult(RequestHandler.java:754) at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:576) {noformat} It would be great if someone helps me out from this situation. 
Thanks was: While I am writing xml file into Cassandra table column I am facing following exception.Its a 3 node cluster and All nodes are up. com.datastax.driver.core.exceptions.WriteFailureException: Cassandra failure during write query at consistency LOCAL_QUORUM (2 responses were required but only 0 replica responded, 1 failed) at com.datastax.driver.core.exceptions.WriteFailureException.copy(WriteFailureException.java:80) at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37) at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245) at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:55) at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:39) at DBConnection.oracle2Cassandra(DBConnection.java:267) at DBConnection.main(DBConnection.java:292) Caused by: com.datastax.driver.core.exceptions.WriteFailureException: Cassandra failure during write query at consistency LOCAL_QUORUM (2 responses were required but only 0 replica responded, 1 failed) at com.datastax.driver.core.exceptions.WriteFailureException.copy(WriteFailureException.java:91) at com.datastax.driver.core.Responses$Error.asException(Responses.java:119) at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:180) at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:186) at com.datastax.driver.core.RequestHandler.access$2300(RequestHandler.java:44) at com.datastax.driver.core.RequestHandler$SpeculativeExecution.setFinalResult(RequestHandler.java:754) at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:576) It would be great if someone helps me out from this situation. 
Thanks > Cassandra failure during write query at consistency LOCAL_QUORUM > -- > > Key: CASSANDRA-12045 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12045 > Project: Cassandra > Issue Type: Bug > Components: CQL, Local Write-Read Paths > Environment: Eclipse java environment >Reporter: Raghavendra Pinninti > Fix For: 3.x > > Original Estimate: 12h > Remaining Estimate: 12h > > While I am writing xml file into Cassandra table column I am facing following > exception.Its a 3 node cluster and All nodes are up. > {noformat} > com.datastax.driver.core.exceptions.WriteFailureException: Cassandra failure > during write query at consistency LOCAL_QUORUM (2 responses were required but > only 0 replica responded, 1 failed) at > com.datastax.driver.core.exceptions.WriteFailureException.copy(WriteFailureException.java:80) > at > com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37) > at >
[jira] [Commented] (CASSANDRA-12153) RestrictionSet.hasIN() is slow
[ https://issues.apache.org/jira/browse/CASSANDRA-12153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373170#comment-15373170 ] Tyler Hobbs commented on CASSANDRA-12153: - Nice! I'm +1 on your code changes. Can you start a CI test run? bq. PS: To be honest, I was not expecting that the use of new LinkedHashSet, stream() and lambdas was so bad from a performance point of view. I was surprised by how much more expensive it was as well. > RestrictionSet.hasIN() is slow > -- > > Key: CASSANDRA-12153 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12153 > Project: Cassandra > Issue Type: Improvement > Components: Coordination >Reporter: Tyler Hobbs >Assignee: Tyler Hobbs >Priority: Minor > Fix For: 3.x > > > While profiling local in-memory reads for CASSANDRA-10993, I noticed that > {{RestrictionSet.hasIN()}} was responsible for about 1% of the time. It > looks like it's mostly slow because it creates a new LinkedHashSet (which is > expensive to init) and uses streams. This can be replaced with a simple for > loop. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
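The change discussed can be illustrated by contrasting the two styles. "Restriction" is reduced to a plain string kind here; the real RestrictionSet API differs, but the allocation pattern is the point: the stream version builds a fresh LinkedHashSet plus stream machinery on every call, while the loop allocates nothing.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class Restrictions
{
    final List<String> kinds; // e.g. "EQ", "IN", "SLICE"

    Restrictions(List<String> kinds) { this.kinds = kinds; }

    // Original shape: new collection + stream + lambda on every invocation.
    boolean hasINWithStreams()
    {
        return new LinkedHashSet<>(kinds).stream().anyMatch(k -> k.equals("IN"));
    }

    // Replacement shape: a plain loop with zero allocation.
    boolean hasINWithLoop()
    {
        for (String kind : kinds)
            if (kind.equals("IN"))
                return true;
        return false;
    }

    public static void main(String[] args)
    {
        Restrictions r = new Restrictions(Arrays.asList("EQ", "IN"));
        System.out.println(r.hasINWithStreams() + " " + r.hasINWithLoop());
    }
}
```

Both are O(n), but on a hot read path the constant-factor difference (allocation plus stream setup per call) is what showed up in the profile.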
[jira] [Commented] (CASSANDRA-9613) Omit (de)serialization of state variable in UDAs
[ https://issues.apache.org/jira/browse/CASSANDRA-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373138#comment-15373138 ] Tyler Hobbs commented on CASSANDRA-9613: I think it should be fine to commit the patch as-is. The tests look okay (all failures are known failures). +1 > Omit (de)serialization of state variable in UDAs > > > Key: CASSANDRA-9613 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9613 > Project: Cassandra > Issue Type: Improvement >Reporter: Robert Stupp >Assignee: Robert Stupp >Priority: Minor > Fix For: 3.x > > > Currently the result of each UDA's state function call is serialized and then > deserialized for the next state-function invocation and optionally final > function invocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12150) cqlsh does not automatically downgrade CQL version
[ https://issues.apache.org/jira/browse/CASSANDRA-12150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373151#comment-15373151 ] Tyler Hobbs commented on CASSANDRA-12150: - bq. I would need a code review of this commit, Tyler Hobbs would you be able to review it? +1 on that commit. bq. We should probably mention somewhere that, even though cqlsh now connects to older server versions, there may be problems and it may not work in all cases since we currently do not test cqlsh against older server versions Yes, I think we should at least put something in {{NEWS.txt}} for now. Perhaps if this causes frequent problems, we can have cqlsh print a warning when the C* version it connects to is different than the C* version it shipped with. > cqlsh does not automatically downgrade CQL version > -- > > Key: CASSANDRA-12150 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12150 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Yusuke Takata >Priority: Minor > Labels: cqlsh > Attachments: patch.txt > > > Cassandra drivers such as the Python driver can automatically connect a > supported version, > but I found that cqlsh does not automatically downgrade CQL version as the > following. > {code} > $ cqlsh > Connection error: ('Unable to connect to any servers', {'127.0.0.1': > ProtocolError("cql_version '3.4.2' is not supported by remote (w/ native > protocol). Supported versions: [u'3.4.0']",)}) > {code} > I think that the function is useful for cqlsh too. > Could someone review the attached patch? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
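cqlsh itself is Python, but the downgrade behaviour under discussion is language-agnostic: try versions from newest to oldest until the server accepts one, instead of failing on the first rejection. A hedged sketch, with the version list and the `serverSupports` predicate as stand-ins for the driver's real negotiation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class CqlVersionNegotiator
{
    // Newest first; the concrete list is illustrative.
    static final List<String> KNOWN_VERSIONS = Arrays.asList("3.4.2", "3.4.0", "3.3.1");

    static String negotiate(Predicate<String> serverSupports)
    {
        for (String version : KNOWN_VERSIONS)
            if (serverSupports.test(version))
                return version; // first (i.e. newest) version the server accepts
        throw new IllegalStateException("no mutually supported CQL version");
    }

    public static void main(String[] args)
    {
        // server only speaks 3.4.0, as in the error message quoted above
        System.out.println(negotiate(v -> v.equals("3.4.0")));
    }
}
```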
[jira] [Updated] (CASSANDRA-9613) Omit (de)serialization of state variable in UDAs
[ https://issues.apache.org/jira/browse/CASSANDRA-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-9613: --- Status: Ready to Commit (was: Patch Available) > Omit (de)serialization of state variable in UDAs > > > Key: CASSANDRA-9613 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9613 > Project: Cassandra > Issue Type: Improvement >Reporter: Robert Stupp >Assignee: Robert Stupp >Priority: Minor > Fix For: 3.x > > > Currently the result of each UDA's state function call is serialized and then > deserialized for the next state-function invocation and optionally final > function invocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
cassandra git commit: Fix spammy logging in paxos
Repository: cassandra Updated Branches: refs/heads/trunk 7abae2b3f -> 11b93152d Fix spammy logging in paxos Patch by wdeng; reviewed by jmckenzie for CASSANDRA-12155 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/11b93152 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/11b93152 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/11b93152 Branch: refs/heads/trunk Commit: 11b93152df5dace2a3a303b589b1d2546cb1b144 Parents: 7abae2b Author: Wei Deng Authored: Tue Jul 12 11:52:06 2016 -0400 Committer: Josh McKenzie Committed: Tue Jul 12 11:52:06 2016 -0400 -- src/java/org/apache/cassandra/service/paxos/ProposeCallback.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/11b93152/src/java/org/apache/cassandra/service/paxos/ProposeCallback.java -- diff --git a/src/java/org/apache/cassandra/service/paxos/ProposeCallback.java b/src/java/org/apache/cassandra/service/paxos/ProposeCallback.java index 018dab9..b0bd163 100644 --- a/src/java/org/apache/cassandra/service/paxos/ProposeCallback.java +++ b/src/java/org/apache/cassandra/service/paxos/ProposeCallback.java @@ -59,7 +59,7 @@ public class ProposeCallback extends AbstractPaxosCallback public void response(MessageIn msg) { -logger.debug("Propose response {} from {}", msg.payload, msg.from); +logger.trace("Propose response {} from {}", msg.payload, msg.from); if (msg.payload) accepts.incrementAndGet();
[jira] [Updated] (CASSANDRA-12155) proposeCallback.java is too spammy for debug.log
[ https://issues.apache.org/jira/browse/CASSANDRA-12155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-12155: Resolution: Fixed Fix Version/s: 3.9 Status: Resolved (was: Ready to Commit) [Committed|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=11b93152df5dace2a3a303b589b1d2546cb1b144]. Thanks. > proposeCallback.java is too spammy for debug.log > > > Key: CASSANDRA-12155 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12155 > Project: Cassandra > Issue Type: Bug > Components: Observability >Reporter: Wei Deng >Assignee: Wei Deng >Priority: Minor > Fix For: 3.9 > > > As stated in [this wiki > page|https://wiki.apache.org/cassandra/LoggingGuidelines] derived from the > work on CASSANDRA-10241, the DEBUG level logging in debug.log is intended for > "+low frequency state changes or message passing. Non-critical path logs on > operation details, performance measurements or general troubleshooting > information.+" > However, it appears that in a production deployment of C* 3.x, the LWT > message passing from ProposeCallback.java gets printed every 1-2 seconds, > which overwhelms debug.log from presenting the other important DEBUG level > logging messages, like the following: > {noformat} > DEBUG [SharedPool-Worker-2] 2016-07-09 05:23:57,800 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:00,803 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:00,804 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:03,807 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:03,807 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:06,811 ProposeCallback.java:62 > - Propose response true from 
/10.240.0.2 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:06,811 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:09,815 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:09,815 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:12,819 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:12,819 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:15,823 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:15,823 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:18,827 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:18,827 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:21,831 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:21,831 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:24,835 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:24,835 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:27,839 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:27,839 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:30,843 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-1] 2016-07-09 
05:24:30,843 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:33,847 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:33,847 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:36,851 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:36,852 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:39,855 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG
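A typical remedy for this kind of spam is to demote the hot-path message from DEBUG to TRACE so that a default debug.log no longer shows it. Cassandra logs through SLF4J/logback; the java.util.logging sketch below, with FINE standing in for DEBUG and FINEST for TRACE, is only an assumed illustration of level gating, not the committed patch:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class ProposeLogDemo
{
    // Returns whether a message at messageLevel would be emitted by a
    // logger configured at 'configured'.
    public static boolean wouldLog(Level configured, Level messageLevel)
    {
        Logger log = Logger.getLogger("paxos.demo");
        log.setLevel(configured);
        return log.isLoggable(messageLevel);
    }

    public static void main(String[] args)
    {
        // With the logger at FINE (~DEBUG), a FINEST (~TRACE) per-response
        // message is suppressed instead of flooding the log every few seconds.
        System.out.println(wouldLog(Level.FINE, Level.FINE));   // true: still emitted
        System.out.println(wouldLog(Level.FINE, Level.FINEST)); // false: suppressed
    }
}
```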
[jira] [Updated] (CASSANDRA-12155) proposeCallback.java is too spammy for debug.log
[ https://issues.apache.org/jira/browse/CASSANDRA-12155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-12155: Status: Ready to Commit (was: Patch Available) > proposeCallback.java is too spammy for debug.log > > > Key: CASSANDRA-12155 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12155 > Project: Cassandra > Issue Type: Bug > Components: Observability >Reporter: Wei Deng >Assignee: Wei Deng >Priority: Minor > Fix For: 3.9 > > > As stated in [this wiki > page|https://wiki.apache.org/cassandra/LoggingGuidelines] derived from the > work on CASSANDRA-10241, the DEBUG level logging in debug.log is intended for > "+low frequency state changes or message passing. Non-critical path logs on > operation details, performance measurements or general troubleshooting > information.+" > However, it appears that in a production deployment of C* 3.x, the LWT > message passing from ProposeCallback.java gets printed every 1-2 seconds, > which overwhelms debug.log from presenting the other important DEBUG level > logging messages, like the following: > {noformat} > DEBUG [SharedPool-Worker-2] 2016-07-09 05:23:57,800 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:00,803 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:00,804 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:03,807 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:03,807 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:06,811 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:06,811 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG 
[SharedPool-Worker-1] 2016-07-09 05:24:09,815 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:09,815 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:12,819 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:12,819 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:15,823 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:15,823 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:18,827 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:18,827 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:21,831 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:21,831 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:24,835 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:24,835 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:27,839 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:27,839 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:30,843 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:30,843 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:33,847 
ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:33,847 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:36,851 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:36,852 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:39,855 ProposeCallback.java:62 > - Propose response true from /10.240.0.2 > DEBUG [SharedPool-Worker-2] 2016-07-09 05:24:39,855 ProposeCallback.java:62 > - Propose response true from /10.240.0.3 > DEBUG [SharedPool-Worker-1] 2016-07-09 05:24:42,859 ProposeCallback.java:62 > -
[jira] [Commented] (CASSANDRA-11031) MultiTenant : support "ALLOW FILTERING" for Partition Key
[ https://issues.apache.org/jira/browse/CASSANDRA-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373089#comment-15373089 ] Alex Petrov commented on CASSANDRA-11031: - Thank you for your patch [~jasonstack]. Sorry it took so long to review it. I'll do my best to keep all the future iterations as short as possible. At first glance, there are many cases that are missing. For example, filtering by anything but {{=}} wouldn't work. {code} createTable("CREATE TABLE %s (a text, b int, v int, PRIMARY KEY ((a, b))) "); execute("INSERT INTO %s (a, b, v) VALUES ('a', 2, 1)"); execute("INSERT INTO %s (a, b, v) VALUES ('a', 3, 1)"); assertRows(execute("SELECT * FROM %s WHERE a = 'a' AND b > 0 ALLOW FILTERING"), row("a", 2, 1), row("a", 3, 1)); {code} Would throw {code} org.apache.cassandra.exceptions.InvalidRequestException: Only EQ and IN relation are supported on the partition key (unless you use the token() function) {code} I would start with the unit test, to be honest. Although dtests are also important. Paging tests might also be good. Could you please check out: * combinations of partition and clustering key filtering * compound partition keys and their filtering * non-EQ relations ({{LT}}, {{GT}} etc) * {{COMPACT STORAGE}} You can take some inspiration for tests from [CASSANDRA-11310]. > MultiTenant : support "ALLOW FILTERING" for Partition Key > - > > Key: CASSANDRA-11031 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11031 > Project: Cassandra > Issue Type: New Feature > Components: CQL >Reporter: ZhaoYang >Assignee: ZhaoYang >Priority: Minor > Fix For: 3.x > > Attachments: CASSANDRA-11031-3.7.patch > > > Currently, Allow Filtering only works for secondary Index column or > clustering columns. And it's slow, because Cassandra will read all data from > SSTABLE from hard-disk to memory to filter.
> But we can support allow filtering on Partition Key, as far as I know, > Partition Key is in memory, so we can easily filter them, and then read > required data from SSTable. > This will be similar to "Select * from table", which scans through the entire cluster. > CREATE TABLE multi_tenant_table ( > tenant_id text, > pk2 text, > c1 text, > c2 text, > v1 text, > v2 text, > PRIMARY KEY ((tenant_id,pk2),c1,c2) > ) ; > Select * from multi_tenant_table where tenant_id = 'datastax' allow filtering; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
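To make the reporter's idea concrete — apply a predicate to the partition-key components in memory, and only then fetch the surviving partitions from the SSTables — here is a toy sketch. The {{Key}} record and the filter helper are hypothetical illustrations, not Cassandra's actual read path:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class PartitionKeyFilterDemo
{
    // Hypothetical compound partition key ((tenant_id, pk2)).
    public record Key(String tenantId, String pk2) {}

    // Keep only the keys matching the predicate; the real read path would
    // then read just the surviving partitions from disk.
    public static List<Key> filter(List<Key> keys, Predicate<Key> p)
    {
        List<Key> out = new ArrayList<>();
        for (Key k : keys)
            if (p.test(k))
                out.add(k);
        return out;
    }

    public static void main(String[] args)
    {
        List<Key> keys = List.of(new Key("datastax", "a"), new Key("acme", "b"));
        // Equivalent in spirit to: ... WHERE tenant_id = 'datastax' ALLOW FILTERING
        System.out.println(filter(keys, k -> k.tenantId().equals("datastax")));
    }
}
```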
[jira] [Updated] (CASSANDRA-10635) Add metrics for authentication failures
[ https://issues.apache.org/jira/browse/CASSANDRA-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-10635: Status: Ready to Commit (was: Awaiting Feedback) > Add metrics for authentication failures > --- > > Key: CASSANDRA-10635 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10635 > Project: Cassandra > Issue Type: Improvement >Reporter: Soumava Ghosh >Assignee: Soumava Ghosh >Priority: Minor > Fix For: 3.x > > Attachments: 10635-2.1.txt, 10635-2.2.txt, 10635-3.0.txt, > 10635-dtest.patch, 10635-trunk.patch > > > There should be no auth failures on a cluster in general. > Having metrics around the authentication code would help detect clients > that are connecting to the wrong cluster or have auth incorrectly configured. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10635) Add metrics for authentication failures
[ https://issues.apache.org/jira/browse/CASSANDRA-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373042#comment-15373042 ] Sam Tunnicliffe commented on CASSANDRA-10635: - [~soumava] I've opened a [dtest PR|https://github.com/riptano/cassandra-dtest/pull/1090] with your new test. Once that's approved, unless you object I'll commit the version with the meters from my branch. > Add metrics for authentication failures > --- > > Key: CASSANDRA-10635 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10635 > Project: Cassandra > Issue Type: Improvement >Reporter: Soumava Ghosh >Assignee: Soumava Ghosh >Priority: Minor > Fix For: 3.x > > Attachments: 10635-2.1.txt, 10635-2.2.txt, 10635-3.0.txt, > 10635-dtest.patch, 10635-trunk.patch > > > There should be no auth failures on a cluster in general. > Having metrics around the authentication code would help detect clients > that are connecting to the wrong cluster or have auth incorrectly configured. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12035) Structure for tpstats output (JSON, YAML)
[ https://issues.apache.org/jira/browse/CASSANDRA-12035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15373033#comment-15373033 ] Alex Petrov commented on CASSANDRA-12035: - [~hnishi] could you please take another look at the whole patch? I've also refactored {{TableStats}} to extract all the logic into the corresponding holder, and made several renames of variables in the {{TpStatsHolder}}. Also, in {{TpStatsPrinter}} I've made the first row 30 characters wide instead of 25, because the long name PerDiskMemtableFlushWriter_0 was shifting the table. > Structure for tpstats output (JSON, YAML) > - > > Key: CASSANDRA-12035 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12035 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Hiroyuki Nishi >Assignee: Hiroyuki Nishi >Priority: Minor > Attachments: CASSANDRA-12035-trunk.patch, tablestats_result.json, > tablestats_result.txt, tablestats_result.yaml, tpstats_output.yaml, > tpstats_result.json, tpstats_result.txt, tpstats_result.yaml > > > In CASSANDRA-5977, some extra output formats such as JSON and YAML were added > for nodetool tablestats. > Similarly, I would like to add the output formats in nodetool tpstats. > Also, I tried to refactor the tablestats's code about the output formats to > integrate the existing code with my code. > Please review the attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
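The refactoring discussed above follows a holder/printer split: the command gathers its stats into a plain map, and format-specific printers (plain text, JSON, YAML) all render from that one structure. A minimal sketch of the pattern — class and method names here are illustrative, not nodetool's exact API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TpStatsDemo
{
    // Holder side: collect thread-pool stats into a structure that any
    // printer can consume.
    public static Map<String, Object> toMap()
    {
        Map<String, Object> pool = new LinkedHashMap<>();
        pool.put("ActiveTasks", 0);
        pool.put("PendingTasks", 2);
        Map<String, Object> all = new LinkedHashMap<>();
        all.put("PerDiskMemtableFlushWriter_0", pool);
        return all;
    }

    // Printer side: a trivial YAML-like renderer; the real code would hand
    // the map to a JSON/YAML library instead.
    public static String render(Map<String, Object> stats)
    {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Object> e : stats.entrySet())
        {
            sb.append(e.getKey()).append(":\n");
            if (e.getValue() instanceof Map<?, ?> inner)
                for (Map.Entry<?, ?> i : inner.entrySet())
                    sb.append("  ").append(i.getKey()).append(": ").append(i.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args)
    {
        System.out.println(render(toMap()));
    }
}
```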
[jira] [Created] (CASSANDRA-12177) sstabledump fails if sstable path includes dot
Keith Wansbrough created CASSANDRA-12177: Summary: sstabledump fails if sstable path includes dot Key: CASSANDRA-12177 URL: https://issues.apache.org/jira/browse/CASSANDRA-12177 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Keith Wansbrough If there is a dot in the file path passed to sstabledump, it fails with an error {{partitioner org.apache.cassandra.dht.Murmur3Partitioner does not match system partitioner org.apache.cassandra.dht.LocalPartitioner.}} I can work around this by renaming the directory containing the file, but it seems like a bug. I expected the directory name to be irrelevant. Example (assumes you have a keyspace test containing a table called sport, but should repro with any keyspace/table): {code} $ cp -a /var/lib/cassandra/data/test/sport-ebe76350474e11e6879fc5e30fbb0e96 testdir $ sstabledump testdir/mb-1-big-Data.db [ { "partition" : { "key" : [ "2" ], "position" : 0 }, "rows" : [ { "type" : "row", "position" : 18, "liveness_info" : { "tstamp" : "2016-07-11T10:15:22.766107Z" }, "cells" : [ { "name" : "score", "value" : "Golf" }, { "name" : "sport_type", "value" : "5" } ] } ] } ] $ cp -a /var/lib/cassandra/data/test/sport-ebe76350474e11e6879fc5e30fbb0e96 test.dir $ sstabledump test.dir/mb-1-big-Data.db ERROR 15:02:52 Cannot open /home/centos/test.dir/mb-1-big; partitioner org.apache.cassandra.dht.Murmur3Partitioner does not match system partitioner org.apache.cassandra.dht.LocalPartitioner. Note that the default partitioner starting with Cassandra 1.2 is Murmur3Partitioner, so you will need to edit that to match your old partitioner if upgrading. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
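One plausible explanation (an assumption — the ticket does not identify the offending code) is that the tool derives table metadata from dots anywhere in the full path string rather than from the file name alone, so a dotted directory like {{test.dir}} changes the result. A toy contrast between the fragile and the robust approach:

```java
import java.nio.file.Paths;

public class SstablePathDemo
{
    // Fragile (hypothetical): splitting the whole path on its first dot,
    // so a dot in a directory name ("test.dir") changes the answer.
    public static String naiveName(String path)
    {
        return path.substring(path.indexOf('.') + 1);
    }

    // Robust: examine only the file name component, making the enclosing
    // directory names irrelevant, as the reporter expected.
    public static String robustName(String path)
    {
        return Paths.get(path).getFileName().toString();
    }

    public static void main(String[] args)
    {
        System.out.println(robustName("testdir/mb-1-big-Data.db"));  // mb-1-big-Data.db
        System.out.println(robustName("test.dir/mb-1-big-Data.db")); // mb-1-big-Data.db
    }
}
```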
[jira] [Updated] (CASSANDRA-12035) Structure for tpstats output (JSON, YAML)
[ https://issues.apache.org/jira/browse/CASSANDRA-12035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-12035: Attachment: tpstats_output.yaml > Structure for tpstats output (JSON, YAML) > - > > Key: CASSANDRA-12035 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12035 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Hiroyuki Nishi >Assignee: Hiroyuki Nishi >Priority: Minor > Attachments: CASSANDRA-12035-trunk.patch, tablestats_result.json, > tablestats_result.txt, tablestats_result.yaml, tpstats_output.yaml, > tpstats_result.json, tpstats_result.txt, tpstats_result.yaml > > > In CASSANDRA-5977, some extra output formats such as JSON and YAML were added > for nodetool tablestats. > Similarly, I would like to add the output formats in nodetool tpstats. > Also, I tried to refactor the tablestats's code about the output formats to > integrate the existing code with my code. > Please review the attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12035) Structure for tpstats output (JSON, YAML)
[ https://issues.apache.org/jira/browse/CASSANDRA-12035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15372934#comment-15372934 ] Alex Petrov commented on CASSANDRA-12035: - I've made one more change: switched from flow style output back to block to make the output look more yaml-y. Right now because of the flow-style it looks more like json in some parts. I've uploaded the new version for comparison. > Structure for tpstats output (JSON, YAML) > - > > Key: CASSANDRA-12035 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12035 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Hiroyuki Nishi >Assignee: Hiroyuki Nishi >Priority: Minor > Attachments: CASSANDRA-12035-trunk.patch, tablestats_result.json, > tablestats_result.txt, tablestats_result.yaml, tpstats_output.yaml, > tpstats_result.json, tpstats_result.txt, tpstats_result.yaml > > > In CASSANDRA-5977, some extra output formats such as JSON and YAML were added > for nodetool tablestats. > Similarly, I would like to add the output formats in nodetool tpstats. > Also, I tried to refactor the tablestats's code about the output formats to > integrate the existing code with my code. > Please review the attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12176) dtest failure in materialized_views_test.TestMaterializedViews.complex_repair_test
Sean McCarthy created CASSANDRA-12176: - Summary: dtest failure in materialized_views_test.TestMaterializedViews.complex_repair_test Key: CASSANDRA-12176 URL: https://issues.apache.org/jira/browse/CASSANDRA-12176 Project: Cassandra Issue Type: Test Reporter: Sean McCarthy Assignee: DS Test Eng Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log, node2_debug.log, node2_gc.log, node3.log, node3_debug.log, node3_gc.log, node4.log, node4_debug.log, node4_gc.log, node5.log, node5_debug.log, node5_gc.log example failure: http://cassci.datastax.com/job/cassandra-3.9_novnode_dtest/8/testReport/materialized_views_test/TestMaterializedViews/complex_repair_test Failed on CassCI build cassandra-3.9_novnode_dtest #8 {code} Stacktrace File "/usr/lib/python2.7/unittest/case.py", line 329, in run testMethod() File "/home/automaton/cassandra-dtest/materialized_views_test.py", line 956, in complex_repair_test session.execute("CREATE TABLE ks.t (id int PRIMARY KEY, v int, v2 text, v3 decimal)" File "cassandra/cluster.py", line 1941, in cassandra.cluster.Session.execute (cassandra/cluster.c:33642) return self.execute_async(query, parameters, trace, custom_payload, timeout, execution_profile).result() File "cassandra/cluster.py", line 3629, in cassandra.cluster.ResponseFuture.result (cassandra/cluster.c:69369) raise self._final_exception ' {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12175) Raise error on using NetworkTopologyStrategy w/o any DCs
Stefan Podkowinski created CASSANDRA-12175: -- Summary: Raise error on using NetworkTopologyStrategy w/o any DCs Key: CASSANDRA-12175 URL: https://issues.apache.org/jira/browse/CASSANDRA-12175 Project: Cassandra Issue Type: Improvement Reporter: Stefan Podkowinski Priority: Minor Sometimes it happens that users will create a keyspace using NetworkTopologyStrategy but at the same time forget to specify the corresponding data-centers. The only point where you'll notice your mistake will be after the first insert or select statement. Even then, the error message can be confusing, especially for beginners. {noformat} CREATE KEYSPACE test WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy' }; USE test; CREATE TABLE airplanes ( name text PRIMARY KEY, manufacturer ascii, year int, mach float ); INSERT INTO airplanes (name, manufacturer, year, mach) VALUES ('P38-Lightning', 'Lockheed', 1937, 0.7); Unavailable: code=1000 [Unavailable exception] message="Cannot achieve consistency level ONE" info={'required_replicas': 1, 'alive_replicas': 0, 'consistency': 'ONE'} {noformat} I don't see any point why you should be able to use NetworkTopologyStrategy without any DCs, so I'd suggest to raise an error in this situation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
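The suggested check is simple: if the replication options contain nothing beyond 'class', NetworkTopologyStrategy has no datacenter to place replicas in and keyspace creation should fail fast. A hedged sketch of such a validation — the method and exception naming is illustrative, not Cassandra's actual validateOptions:

```java
import java.util.Map;

public class NtsOptionsDemo
{
    // True when at least one datacenter replication factor is present
    // besides the mandatory 'class' entry.
    public static boolean hasDatacenters(Map<String, String> replication)
    {
        for (String key : replication.keySet())
            if (!key.equalsIgnoreCase("class"))
                return true;
        return false;
    }

    // Fail at CREATE KEYSPACE time instead of on the first read/write.
    public static void validate(Map<String, String> replication)
    {
        if (!hasDatacenters(replication))
            throw new IllegalArgumentException(
                "NetworkTopologyStrategy requires at least one datacenter, " +
                "e.g. {'class': 'NetworkTopologyStrategy', 'datacenter1': 1}");
    }

    public static void main(String[] args)
    {
        System.out.println(hasDatacenters(Map.of("class", "NetworkTopologyStrategy")));                     // false
        System.out.println(hasDatacenters(Map.of("class", "NetworkTopologyStrategy", "datacenter1", "1"))); // true
    }
}
```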
[jira] [Commented] (CASSANDRA-12035) Structure for tpstats output (JSON, YAML)
[ https://issues.apache.org/jira/browse/CASSANDRA-12035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15372871#comment-15372871 ] Alex Petrov commented on CASSANDRA-12035: - Yes, that was something I've meant. I'll squash commits and do another round of testing + run CI. > Structure for tpstats output (JSON, YAML) > - > > Key: CASSANDRA-12035 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12035 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Hiroyuki Nishi >Assignee: Hiroyuki Nishi >Priority: Minor > Attachments: CASSANDRA-12035-trunk.patch, tablestats_result.json, > tablestats_result.txt, tablestats_result.yaml, tpstats_result.json, > tpstats_result.txt, tpstats_result.yaml > > > In CASSANDRA-5977, some extra output formats such as JSON and YAML were added > for nodetool tablestats. > Similarly, I would like to add the output formats in nodetool tpstats. > Also, I tried to refactor the tablestats's code about the output formats to > integrate the existing code with my code. > Please review the attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12174) COPY FROM should raise error for non-existing input files
[ https://issues.apache.org/jira/browse/CASSANDRA-12174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-12174: --- Component/s: Tools > COPY FROM should raise error for non-existing input files > - > > Key: CASSANDRA-12174 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12174 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Stefan Podkowinski >Priority: Minor > Labels: lhf > > Currently the CSV COPY FROM command will not raise any error for non-existing > paths. Instead only "0 rows imported" will be shown as result. > As the COPY FROM command is often used for tutorials and getting started > guides, I'd suggest to give a clear error message in case of a missing input > file. Without such error it can be confusing for the user to see the command > actually finish, without any clues why no rows have been imported. > {noformat} > CREATE KEYSPACE test > WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 1 > }; > USE test; > CREATE TABLE airplanes ( > name text PRIMARY KEY, > manufacturer ascii, > year int, > mach float > ); > COPY airplanes (name, manufacturer, year, mach) FROM '/tmp/1234-doesnotexist'; > Using 3 child processes > Starting copy of test.airplanes with columns [name, manufacturer, year, mach]. > Processed: 0 rows; Rate: 0 rows/s; Avg. rate: 0 rows/s > 0 rows imported from 0 files in 0.216 seconds (0 skipped). > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12174) COPY FROM should raise error for non-existing input files
Stefan Podkowinski created CASSANDRA-12174: -- Summary: COPY FROM should raise error for non-existing input files Key: CASSANDRA-12174 URL: https://issues.apache.org/jira/browse/CASSANDRA-12174 Project: Cassandra Issue Type: Improvement Reporter: Stefan Podkowinski Priority: Minor Currently the CSV COPY FROM command will not raise any error for non-existing paths. Instead only "0 rows imported" will be shown as result. As the COPY FROM command is often used for tutorials and getting started guides, I'd suggest to give a clear error message in case of a missing input file. Without such error it can be confusing for the user to see the command actually finish, without any clues why no rows have been imported. {noformat} CREATE KEYSPACE test WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 1 }; USE test; CREATE TABLE airplanes ( name text PRIMARY KEY, manufacturer ascii, year int, mach float ); COPY airplanes (name, manufacturer, year, mach) FROM '/tmp/1234-doesnotexist'; Using 3 child processes Starting copy of test.airplanes with columns [name, manufacturer, year, mach]. Processed: 0 rows; Rate: 0 rows/s; Avg. rate: 0 rows/s 0 rows imported from 0 files in 0.216 seconds (0 skipped). {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
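The requested behaviour amounts to validating the input path before any worker processes start. cqlsh's COPY machinery is Python; the Java sketch below just illustrates the check itself, and the error wording is invented:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CopyFromDemo
{
    // Validate the source before launching the import; returns an error
    // message, or null when the file is usable.
    public static String checkSource(String csvPath)
    {
        Path p = Paths.get(csvPath);
        if (!Files.exists(p))
            return "Cannot import: file does not exist: " + csvPath;
        if (!Files.isReadable(p))
            return "Cannot import: file is not readable: " + csvPath;
        return null;
    }

    public static void main(String[] args)
    {
        // The ticket's example path: report a clear error instead of
        // silently finishing with "0 rows imported".
        System.out.println(checkSource("/tmp/1234-doesnotexist"));
    }
}
```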
[jira] [Updated] (CASSANDRA-12174) COPY FROM should raise error for non-existing input files
[ https://issues.apache.org/jira/browse/CASSANDRA-12174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-12174: --- Labels: lhf (was: lhf tooling) > COPY FROM should raise error for non-existing input files > - > > Key: CASSANDRA-12174 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12174 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Stefan Podkowinski >Priority: Minor > Labels: lhf > > Currently the CSV COPY FROM command will not raise any error for non-existing > paths. Instead only "0 rows imported" will be shown as result. > As the COPY FROM command is often used for tutorials and getting started > guides, I'd suggest to give a clear error message in case of a missing input > file. Without such error it can be confusing for the user to see the command > actually finish, without any clues why no rows have been imported. > {noformat} > CREATE KEYSPACE test > WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 1 > }; > USE test; > CREATE TABLE airplanes ( > name text PRIMARY KEY, > manufacturer ascii, > year int, > mach float > ); > COPY airplanes (name, manufacturer, year, mach) FROM '/tmp/1234-doesnotexist'; > Using 3 child processes > Starting copy of test.airplanes with columns [name, manufacturer, year, mach]. > Processed: 0 rows; Rate: 0 rows/s; Avg. rate: 0 rows/s > 0 rows imported from 0 files in 0.216 seconds (0 skipped). > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9608) Support Java 9
[ https://issues.apache.org/jira/browse/CASSANDRA-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15372821#comment-15372821 ] Robert Stupp commented on CASSANDRA-9608: - [~carlosabad], thanks for your effort! I've already created [a branch|https://github.com/snazy/cassandra/tree/9608-java9-trunk] and thought I left a comment about it in this ticket - apologies for that. I basically found the same issues. But I'm a bit opposed to the use of {{ReentrantLock}} since there are usually a huge amount of {{AtomicBTreePartition}} instances and each {{ReentrantLock}} can potentially create a lot of dependent objects - tl;dr it potentially introduces a lot of GC pressure. So, what we would need here is an exclusive lock, which doesn't need to be fair, that is just good enough - not sure whether [my approach|https://github.com/snazy/cassandra/commit/c67c532fec9b2b073ecdf70b50b80440fb972f31#diff-7246e27576858f45f3f2678b9be03bfeR105] is good enough, though. Would be glad to have you on board and tackle this ticket! Many of the tests fail because we evaluate the system property {{java.vm.version}} in our agent library [jamm at this line|https://github.com/jbellis/jamm/blob/17fe5661d3706ac8bdcc2cc1a1d747efa00157fa/src/org/github/jamm/MemoryLayoutSpecification.java#L190]. The meaning of the version has changed completely. Before Java 9, it looked like {{25.92-b14}} but with Java 9 it looks like {{9-ea+121}} - i.e. the agent's code throws a {{StringIndexOutOfBoundsException}}. A viable solution (viable workaround is probably a better term, though) could be to let jamm produce Java 8 byte code and remove the {{java.vm.version}} check and/or change the code in jamm. We use jamm to calculate on-heap usage of objects. 
We might have to support both Java 8 and Java 9 (as soon as Java 9 is GA) for some period of time - similar to the transition from Java 7 to Java 8, where we built C* on Java 7 but kind-of supported Java 8, and only C* 3.0 actually required Java 8. I think a major version (maybe 5.x?) would be required for such a switch. Just want to say that both the code and the build.xml file need to work against Java 8 _and_ 9. (Requiring ant 1.9.7 is not an issue imo.) > Support Java 9 > -- > > Key: CASSANDRA-9608 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9608 > Project: Cassandra > Issue Type: Task >Reporter: Robert Stupp >Priority: Minor > > This ticket is intended to group all issues found to support Java 9 in the > future. > From what I've found out so far: > * Maven dependency {{com.sun:tools:jar:0}} via cobertura cannot be resolved. > It can be easily solved using this patch: > {code} > - artifactId="cobertura"/> > + artifactId="cobertura"> > + > + > {code} > * Another issue is that {{sun.misc.Unsafe}} no longer contains the methods > {{monitorEnter}} + {{monitorExit}}. These methods are used by > {{o.a.c.utils.concurrent.Locks}} which is only used by > {{o.a.c.db.AtomicBTreeColumns}}. > I don't mind to start working on this yet since Java 9 is in a too early > development phase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
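The jamm breakage described above comes down to parsing: pre-9 HotSpot reports {{java.vm.version}} strings like {{25.92-b14}}, while Java 9 early-access builds report {{9-ea+121}}, so substring arithmetic that assumed the old shape throws {{StringIndexOutOfBoundsException}}. A defensive sketch that extracts the leading numeric component from either shape (illustrative only, not jamm's actual fix):

```java
public class VmVersionDemo
{
    // Returns the leading integer of a java.vm.version string, e.g.
    // "25.92-b14" (Java 8's HotSpot) -> 25 and "9-ea+121" -> 9,
    // instead of assuming the pre-9 "NN.NN-bNN" layout.
    public static int leadingNumber(String vmVersion)
    {
        int end = 0;
        while (end < vmVersion.length() && Character.isDigit(vmVersion.charAt(end)))
            end++;
        return Integer.parseInt(vmVersion.substring(0, end));
    }

    public static void main(String[] args)
    {
        System.out.println(leadingNumber("25.92-b14")); // 25
        System.out.println(leadingNumber("9-ea+121"));  // 9
    }
}
```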
[jira] [Updated] (CASSANDRA-12144) Undeletable rows after upgrading from 2.2.4 to 3.0.7
[ https://issues.apache.org/jira/browse/CASSANDRA-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-12144: Status: Patch Available (was: Open) > Undeletable rows after upgrading from 2.2.4 to 3.0.7 > > > Key: CASSANDRA-12144 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12144 > Project: Cassandra > Issue Type: Bug >Reporter: Stanislav Vishnevskiy >Assignee: Alex Petrov > > We upgraded our cluster today and now have some rows that refuse to delete. > Here are some example traces. > https://gist.github.com/vishnevskiy/36aa18c468344ea22d14f9fb9b99171d > Even weirder. > Updating the row and querying it back results in 2 rows even though the id is > the clustering key. > {noformat} > user_id| id | since| type > ---++--+-- > 116138050710536192 | 153047019424972800 | null |0 > 116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+ |2 > {noformat} > And then deleting it again only removes the new one. > {noformat} > cqlsh:discord_relationships> DELETE FROM relationships WHERE user_id = > 116138050710536192 AND id = 153047019424972800; > cqlsh:discord_relationships> SELECT * FROM relationships WHERE user_id = > 116138050710536192 AND id = 153047019424972800; > user_id| id | since| type > ++--+-- > 116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+ |2 > {noformat} > We tried repairing, compacting, scrubbing. No luck. > Not sure what to do. Is anyone aware of this? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12144) Undeletable rows after upgrading from 2.2.4 to 3.0.7
[ https://issues.apache.org/jira/browse/CASSANDRA-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15372780#comment-15372780 ] Alex Petrov commented on CASSANDRA-12144: - Created broken sstables for both 2.0 and 3.0 storage formats and wrote some unit tests to reproduce the failures and make the review possibly easier. |[3.0|https://github.com/ifesdjeen/cassandra/tree/12144-3.0] |[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12144-3.0-testall/] |[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12144-3.0-dtest/] |[upgrade tests|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/upgrade_tests-all-12144-3.0/]| |[trunk|https://github.com/ifesdjeen/cassandra/tree/12144-trunk] |[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12144-trunk-testall/] |[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12144-trunk-dtest/] |[upgrade tests|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/upgrade_tests-all-12144-trunk/]| I'm re-running the tests again (although the only changes were my added sstables and tests). Unfortunately, upgrade tests are not very representative as they're throwing similar failures (in similar amounts on [trunk|https://cassci.datastax.com/view/Upgrades/job/upgrade_tests-all/lastCompletedBuild/testReport/]), and the failures look more like assertion mismatches. > Undeletable rows after upgrading from 2.2.4 to 3.0.7 > > > Key: CASSANDRA-12144 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12144 > Project: Cassandra > Issue Type: Bug >Reporter: Stanislav Vishnevskiy >Assignee: Alex Petrov > > We upgraded our cluster today and now have some rows that refuse to delete. > Here are some example traces. > https://gist.github.com/vishnevskiy/36aa18c468344ea22d14f9fb9b99171d > Even weirder. > Updating the row and querying it back results in 2 rows even though the id is > the clustering key. 
> {noformat} > user_id| id | since| type > ---++--+-- > 116138050710536192 | 153047019424972800 | null |0 > 116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+ |2 > {noformat} > And then deleting it again only removes the new one. > {noformat} > cqlsh:discord_relationships> DELETE FROM relationships WHERE user_id = > 116138050710536192 AND id = 153047019424972800; > cqlsh:discord_relationships> SELECT * FROM relationships WHERE user_id = > 116138050710536192 AND id = 153047019424972800; > user_id| id | since| type > ++--+-- > 116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+ |2 > {noformat} > We tried repairing, compacting, scrubbing. No Luck. > Not sure what to do. Is anyone aware of this? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11730) [windows] dtest failure in jmx_auth_test.TestJMXAuth.basic_auth_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-11730: Status: Patch Available (was: Open) Although unrelated to the original failures (which occur only on Windows), the [recent failures|http://cassci.datastax.com/job/trunk_dtest/1301/testReport/jmx_auth_test/TestJMXAuth/basic_auth_test/] on the main dtest jobs are because I omitted to update the error messages following CASSANDRA-12076, so I've [fixed that now|https://github.com/riptano/cassandra-dtest/commit/a10aba768d7a6c590e4a97baeb5e4c2740389f2a]. The actual problem on Windows is most likely caused by {{jmxutils.apply_jmx_authentication}} not taking into account the differences between {{cassandra-env.sh}} and {{cassandra-env.ps1}}. I've pushed a dtest branch [here|https://github.com/beobal/cassandra-dtest/tree/11730] with a potential fix, but I've no way of running the actual test to verify it. > [windows] dtest failure in jmx_auth_test.TestJMXAuth.basic_auth_test > > > Key: CASSANDRA-11730 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11730 > Project: Cassandra > Issue Type: Bug >Reporter: Russ Hatch >Assignee: Sam Tunnicliffe > Labels: dtest, windows > Fix For: 3.x > > > looks to be failing on each run so far: > http://cassci.datastax.com/job/trunk_dtest_win32/406/testReport/jmx_auth_test/TestJMXAuth/basic_auth_test > Failed on CassCI build trunk_dtest_win32 #406 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-11730) [windows] dtest failure in jmx_auth_test.TestJMXAuth.basic_auth_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe reassigned CASSANDRA-11730: --- Assignee: Sam Tunnicliffe (was: Joshua McKenzie) > [windows] dtest failure in jmx_auth_test.TestJMXAuth.basic_auth_test > > > Key: CASSANDRA-11730 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11730 > Project: Cassandra > Issue Type: Bug >Reporter: Russ Hatch >Assignee: Sam Tunnicliffe > Labels: dtest, windows > Fix For: 3.x > > > looks to be failing on each run so far: > http://cassci.datastax.com/job/trunk_dtest_win32/406/testReport/jmx_auth_test/TestJMXAuth/basic_auth_test > Failed on CassCI build trunk_dtest_win32 #406 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
cassandra git commit: switch the metadata components map to an EnumMap
Repository: cassandra Updated Branches: refs/heads/trunk 91392edbe -> 7abae2b3f switch the metadata components map to an EnumMap Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7abae2b3 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7abae2b3 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7abae2b3 Branch: refs/heads/trunk Commit: 7abae2b3fd0f8464b8f1bb0efef59009348c2360 Parents: 91392ed Author: Dave Brosius Authored: Tue Jul 12 06:43:34 2016 -0400 Committer: Dave Brosius Committed: Tue Jul 12 06:43:34 2016 -0400 -- .../cassandra/io/sstable/metadata/LegacyMetadataSerializer.java | 2 +- .../apache/cassandra/io/sstable/metadata/MetadataCollector.java | 3 ++- .../apache/cassandra/io/sstable/metadata/MetadataSerializer.java | 4 ++-- 3 files changed, 5 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/7abae2b3/src/java/org/apache/cassandra/io/sstable/metadata/LegacyMetadataSerializer.java -- diff --git a/src/java/org/apache/cassandra/io/sstable/metadata/LegacyMetadataSerializer.java b/src/java/org/apache/cassandra/io/sstable/metadata/LegacyMetadataSerializer.java index 505de49..253b4f6 100644 --- a/src/java/org/apache/cassandra/io/sstable/metadata/LegacyMetadataSerializer.java +++ b/src/java/org/apache/cassandra/io/sstable/metadata/LegacyMetadataSerializer.java @@ -81,7 +81,7 @@ public class LegacyMetadataSerializer extends MetadataSerializer @Override public Map<MetadataType, MetadataComponent> deserialize(Descriptor descriptor, EnumSet<MetadataType> types) throws IOException { -Map<MetadataType, MetadataComponent> components = Maps.newHashMap(); +Map<MetadataType, MetadataComponent> components = new EnumMap<>(MetadataType.class); File statsFile = new File(descriptor.filenameFor(Component.STATS)); if (!statsFile.exists() && types.contains(MetadataType.STATS)) http://git-wip-us.apache.org/repos/asf/cassandra/blob/7abae2b3/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java -- diff --git 
a/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java b/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java index 299bc87..be064f1 100644 --- a/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java +++ b/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java @@ -20,6 +20,7 @@ package org.apache.cassandra.io.sstable.metadata; import java.nio.ByteBuffer; import java.util.ArrayList; import java.util.Collections; +import java.util.EnumMap; import java.util.List; import java.util.Map; @@ -295,7 +296,7 @@ public class MetadataCollector implements PartitionStatisticsCollector public Map<MetadataType, MetadataComponent> finalizeMetadata(String partitioner, double bloomFilterFPChance, long repairedAt, SerializationHeader header) { -Map<MetadataType, MetadataComponent> components = Maps.newHashMap(); +Map<MetadataType, MetadataComponent> components = new EnumMap<>(MetadataType.class); components.put(MetadataType.VALIDATION, new ValidationMetadata(partitioner, bloomFilterFPChance)); components.put(MetadataType.STATS, new StatsMetadata(estimatedPartitionSize, estimatedCellPerPartitionCount, http://git-wip-us.apache.org/repos/asf/cassandra/blob/7abae2b3/src/java/org/apache/cassandra/io/sstable/metadata/MetadataSerializer.java -- diff --git a/src/java/org/apache/cassandra/io/sstable/metadata/MetadataSerializer.java b/src/java/org/apache/cassandra/io/sstable/metadata/MetadataSerializer.java index ae1787a..85a71ed 100644 --- a/src/java/org/apache/cassandra/io/sstable/metadata/MetadataSerializer.java +++ b/src/java/org/apache/cassandra/io/sstable/metadata/MetadataSerializer.java @@ -84,7 +84,7 @@ public class MetadataSerializer implements IMetadataSerializer if (!statsFile.exists()) { logger.trace("No sstable stats for {}", descriptor); -components = Maps.newHashMap(); +components = new EnumMap<>(MetadataType.class); components.put(MetadataType.STATS, MetadataCollector.defaultStatsMetadata()); } else @@ -104,7 +104,7 @@ public class MetadataSerializer implements IMetadataSerializer public Map<MetadataType, MetadataComponent> 
deserialize(Descriptor descriptor, FileDataInput in, EnumSet<MetadataType> types) throws IOException { -
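The patch above replaces Guava's {{Maps.newHashMap()}} with {{java.util.EnumMap}} at the sites that build a {{MetadataType}}-keyed map. An {{EnumMap}} stores its entries in a flat array indexed by enum ordinal, so it avoids hashing entirely and iterates in declaration order. A minimal self-contained sketch of the pattern (the enum constants here are stand-ins for Cassandra's real {{MetadataType}}, and plain strings stand in for {{MetadataComponent}} instances):

```java
import java.util.EnumMap;
import java.util.Map;

public class EnumMapSketch
{
    // Stand-in for org.apache.cassandra.io.sstable.metadata.MetadataType;
    // the real enum's constants may differ.
    enum MetadataType { VALIDATION, COMPACTION, STATS, HEADER }

    // Mirrors the shape of the patched code: declare the map with its
    // generic types and back it with an EnumMap keyed on the enum class.
    static Map<MetadataType, String> finalizeComponents()
    {
        Map<MetadataType, String> components = new EnumMap<>(MetadataType.class);
        components.put(MetadataType.VALIDATION, "validation-component");
        components.put(MetadataType.STATS, "stats-component");
        return components;
    }

    public static void main(String[] args)
    {
        // Iteration follows enum declaration order (VALIDATION before STATS),
        // unlike a HashMap, whose order depends on hash codes.
        for (Map.Entry<MetadataType, String> e : finalizeComponents().entrySet())
            System.out.println(e.getKey() + " -> " + e.getValue());
    }
}
```

For a small map whose keys are always enum constants, as here, {{EnumMap}} is both faster and lighter than a {{HashMap}}, and the change also drops the Guava helper at these call sites.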
[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies
[ https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15372670#comment-15372670 ] Stefan Podkowinski commented on CASSANDRA-12126: Another take on how to test coordination aspects for this ticket would be to make use of the MessagingService mocking classes implemented in CASSANDRA-12016. I've created a couple of tests [here|https://github.com/spodkowinski/cassandra/tree/WIP-12126/test/unit/org/apache/cassandra/service/paxos] to get a better idea of how this would look. Although limited to observing the behavior of a single node/state machine, it's probably more lightweight and easier to implement than doing the same using dtests or Jepsen. As for the described edge case, I'd agree with [~kohlisankalp]'s suggestion (if I understood correctly) to do an additional proposal round. However, it would be nice to optimize this a bit so we don't trigger new proposals for each and every SERIAL read. I did a first implementation for this [here|https://github.com/spodkowinski/cassandra/commit/96ec151992f49c773e5af5d85ce69ec87d8b7bc5] (with [CASReadTriggerEmptyProposal|https://github.com/spodkowinski/cassandra/blob/WIP-12126/test/unit/org/apache/cassandra/service/paxos/CASReadTriggerEmptyProposal.java] as corresponding test) for the sake of discussion. > CAS Reads Inconsistencies > -- > > Key: CASSANDRA-12126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12126 > Project: Cassandra > Issue Type: Bug >Reporter: sankalp kohli > > While looking at the CAS code in Cassandra, I found a potential issue with > CAS Reads. Here is how it can happen with RF=3: > 1) You issue a CAS Write and it fails in the propose phase. Machine A replies > true to a propose and saves the commit in its accepted field. The other two > machines B and C do not get to the accept phase. > Current state is that machine A has this commit in its paxos table as accepted > but not committed, and B and C do not. 
> 2) Issue a CAS Read and it goes to only B and C. You won't be able to read the > value written in step 1. This step behaves as if nothing is in flight. > 3) Issue another CAS Read and it goes to A and B. Now we will discover that > there is something in flight from A and will propose and commit it with the > current ballot. Now we can read the value written in step 1 as part of this > CAS read. > If we skip step 3 and instead run step 4, we will never learn about the value > written in step 1. > 4) Issue a CAS Write and it involves only B and C. This will succeed and > commit a different value than step 1. The step 1 value will never be seen again > and was never seen before. > If you read Lamport's "Paxos Made Simple" paper, section 2.3 > talks about this issue: how learners can find out whether a majority of the > acceptors have accepted a proposal. > In step 3, it is correct that we propose the value again, since we don't know > if it was accepted by a majority of acceptors. When we ask a majority of > acceptors, and more than one acceptor (but not a majority) has something in > flight, we have no way of knowing whether it was accepted by a majority of acceptors. > So this behavior is correct. > However we need to fix step 2, since it causes reads to not be linearizable > with respect to writes and other reads. In this case, we know that a majority > of acceptors have no in-flight commit, which means a majority confirms that > nothing was accepted by a majority. I think we should run a propose step here > with an empty commit, and that will cause the write from step 1 to never be > visible afterwards. > With this fix, we will either see the data written in step 1 on the next serial read > or will never see it, which is what we want. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
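The coordinator-side decision suggested in the description and comments — re-propose an in-flight value when a quorum reports one, otherwise seal the quorum's "nothing accepted" observation with an empty proposal — can be sketched as a standalone function. This is purely an illustrative model: the names ({{PrepareResponse}}, {{valueToPropose}}) are hypothetical, and the real coordinator logic in Cassandra's paxos path is considerably more involved (ballot management, TTLs, commit replay, etc.).

```java
import java.util.List;

public class CasReadSketch
{
    // Hypothetical model of one acceptor's reply to a prepare: either empty,
    // or carrying an accepted-but-uncommitted proposal and its ballot.
    static class PrepareResponse
    {
        final long acceptedBallot;   // -1 when nothing is in flight
        final String acceptedValue;  // null when nothing is in flight

        PrepareResponse(long acceptedBallot, String acceptedValue)
        {
            this.acceptedBallot = acceptedBallot;
            this.acceptedValue = acceptedValue;
        }

        static PrepareResponse empty() { return new PrepareResponse(-1, null); }
    }

    // Given prepare replies from a quorum, decide what a SERIAL read must
    // propose before answering. Returning null means "propose an empty
    // commit", so a value accepted by only a minority (step 1 above) can
    // never resurface after this read (fixing step 2 without relying on
    // step 3 happening first).
    static String valueToPropose(List<PrepareResponse> quorum)
    {
        PrepareResponse best = null;
        for (PrepareResponse r : quorum)
            if (r.acceptedValue != null
                && (best == null || r.acceptedBallot > best.acceptedBallot))
                best = r;
        return best == null ? null : best.acceptedValue;
    }
}
```

In the RF=3 scenario from the description, a read reaching B and C (both empty) proposes the empty commit and permanently excludes A's orphaned value, while a read reaching A and B re-proposes A's in-flight value; either way, all subsequent serial reads agree, which is the linearizability the ticket asks for.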