[jira] [Commented] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503664#comment-14503664 ] Jonathan Ellis commented on CASSANDRA-6477: --- bq. While I agree that a {{VIEW}} in the SQL world has more utility, so does {{SELECT}} Correct, but a view should still be able to represent whatever {{SELECT}} can! It's not reasonable to expect it to do more, but it's absolutely reasonable to expect it to match, because that's the definition. bq. My personal preference would be to have a cardinality option clause with option values like low, medium, high, and unique. I don't think we're at the point where we can afford that high a level of abstraction. Global indexes -- Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows are returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503870#comment-14503870 ] Jonathan Ellis commented on CASSANDRA-6477: --- One advantage to MV is that people are somewhat more used to the MV lagging the underlying data. PG goes so far as to require you to manually issue refresh commands. I think that makes them unusable, so I only mention it to illustrate an extreme -- with that precedent, having us say MV are eventually consistent sounds quite reasonable! (Oracle has had self-updating MV forever. I'm not actually sure what kind of transactionality guarantees you get there.)
[jira] [Commented] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504086#comment-14504086 ] Jack Krupansky commented on CASSANDRA-6477: --- It would be helpful if someone were to update the description and primary use case(s) for this feature. My understanding of the original use case was to avoid the fan-out from the coordinator node on an indexed query - the global index would contain the partition keys for matched rows, so that only the node(s) containing those partition key(s) would be needed. So, my question at this stage is whether the intention is that the initial cut of MV would include a focus on that performance-optimization use case, or merely focus on the increased general flexibility of MV instead. Would the initial implementation of MV even necessarily use a GI? Would local vs. global index be an option to be specified? Also, whether it is GI or MV, what guidance will the spec, doc, and training give users as to its performance and scalability? My concern with GI was that it works well for small to medium-sized clusters, but not with very large clusters. So, what would the largest cluster that a user could use a GI for? And how many GIs make sense? For example, with 1 billion rows per node, 50 nodes, and a GI on 10 columns, that would be... 1B * 50 * 10 = 500 billion index entries, right? Seems like a bit much for a JVM heap or even off-heap memory. Maybe 500M * 20 * 4 = 40 billion index entries would be a wiser upper limit, and even that may be a bit extreme.
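The back-of-envelope arithmetic in the comment above can be checked mechanically. A small sketch follows; all figures are the comment's assumed inputs, not measurements, and note that a global index would be distributed across the cluster like any other table, so the totals are cluster-wide with a per-node share under an assumed even spread:

```python
# Back-of-envelope check of the index-entry estimates in the comment.
# Inputs are the comment's assumed figures, not measurements.

def index_entries(rows_per_node: int, nodes: int, indexed_columns: int) -> int:
    """Total index entries across the cluster, assuming one entry per
    indexed column per row."""
    return rows_per_node * nodes * indexed_columns

total = index_entries(1_000_000_000, 50, 10)
print(total)            # 500_000_000_000 -> the "500 billion" figure
print(total // 50)      # ~10 billion per node if spread evenly

smaller = index_entries(500_000_000, 20, 4)
print(smaller)          # 40_000_000_000 -> the "40 billion" figure
```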
[jira] [Commented] (CASSANDRA-8502) Static columns returning null for pages after first
[ https://issues.apache.org/jira/browse/CASSANDRA-8502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503786#comment-14503786 ] Sebastian Estevez commented on CASSANDRA-8502: -- Hello, I see this status has been Patch Available since Feb. Will it be closed out with Tyler's fix? Static columns returning null for pages after first --- Key: CASSANDRA-8502 URL: https://issues.apache.org/jira/browse/CASSANDRA-8502 Project: Cassandra Issue Type: Bug Components: Core Reporter: Flavien Charlon Assignee: Tyler Hobbs Fix For: 2.0.15 Attachments: 8502-2.0.txt, null-static-column.txt When paging is used for a query containing a static column, the first page contains the right value for the static column, but subsequent pages have null for the static column instead of the expected value. Repro steps: - Create a table with a static column - Create a partition with 500 cells - Using cqlsh, query that partition Actual result: - You will see that at first the static column appears as expected, but if you press a key after ---MORE---, the static column will appear as null. See the attached file for a repro of the output. I am using a single-node cluster.
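For context, the expected semantics are that the static value belongs to the whole partition and should accompany every page. A toy pager illustrating the correct behavior (a Python sketch, not Cassandra's implementation):

```python
# Toy model of paging a partition that has a static column. Correct
# behavior: the static value is returned with every page, not only the
# first one (the bug above returns null on pages after the first).

def paginate(static_value, clustering_rows, page_size):
    """Yield pages of (static_value, row) pairs."""
    for i in range(0, len(clustering_rows), page_size):
        chunk = clustering_rows[i:i + page_size]
        yield [(static_value, row) for row in chunk]

pages = list(paginate("s1", list(range(500)), page_size=100))
print(len(pages))  # 5
print(all(s == "s1" for page in pages for (s, _) in page))  # True
```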
[jira] [Updated] (CASSANDRA-2848) Make the Client API support passing down timeouts
[ https://issues.apache.org/jira/browse/CASSANDRA-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-2848: -- Fix Version/s: 3.1 Assignee: Sylvain Lebresne Assigning to Sylvain so he can delegate it. Make the Client API support passing down timeouts - Key: CASSANDRA-2848 URL: https://issues.apache.org/jira/browse/CASSANDRA-2848 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Goffinet Assignee: Sylvain Lebresne Priority: Minor Fix For: 3.1 Having a max server RPC timeout is good for the worst case, but many applications that have middleware in front of Cassandra might have stricter timeout requirements. In a fail-fast environment, if my application, starting at say the front end, only has 20ms to process a request, and it must connect to X services down the stack, by the time it hits Cassandra we might only have 10ms. I propose we provide the ability to optionally specify the timeout on each call we do.
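The budget scenario in the ticket (20ms at the front end, ~10ms left by the time the request reaches Cassandra) is a deadline-propagation pattern: each layer passes the remaining budget down rather than a fixed per-call timeout. A minimal sketch, with illustrative numbers and no Cassandra API involved:

```python
import time

# Deadline propagation sketch: the caller fixes an absolute deadline,
# and each layer computes the budget that remains before calling down.
# Numbers and function names are illustrative, not from Cassandra.

def remaining_ms(deadline: float) -> float:
    """Milliseconds left before the absolute deadline (never negative)."""
    return max(0.0, (deadline - time.monotonic()) * 1000)

def handle_front_end_request(total_budget_ms: float = 20.0) -> float:
    deadline = time.monotonic() + total_budget_ms / 1000
    time.sleep(0.010)  # pretend ~10ms of middleware work
    # This remaining value is what CASSANDRA-2848 proposes passing
    # down with each storage call:
    return remaining_ms(deadline)

budget_left = handle_front_end_request()
print(f"~{budget_left:.0f}ms left for the storage call")
```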
[jira] [Commented] (CASSANDRA-7523) add date and time types
[ https://issues.apache.org/jira/browse/CASSANDRA-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503890#comment-14503890 ] Joshua McKenzie commented on CASSANDRA-7523: Planning on rejecting empty BB and protecting sub-4.0 clients as well for 3.0 - going to open other tickets to track that after the revert. add date and time types --- Key: CASSANDRA-7523 URL: https://issues.apache.org/jira/browse/CASSANDRA-7523 Project: Cassandra Issue Type: New Feature Components: API Reporter: Jonathan Ellis Assignee: Joshua McKenzie Priority: Minor Labels: client-impacting, docs Fix For: 2.1.5 http://www.postgresql.org/docs/9.1/static/datatype-datetime.html (we already have timestamp; interval is out of scope for now, and see CASSANDRA-6350 for discussion on timestamp-with-time-zone. but date/time should be pretty easy to add.)
[jira] [Commented] (CASSANDRA-7814) enable describe on indices
[ https://issues.apache.org/jira/browse/CASSANDRA-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504027#comment-14504027 ] Stefania commented on CASSANDRA-7814: - It turns out the special --tag-build command line option is not required; here is the command to add the git tag to the zip file name and root directory: {code} python setup.py egg_info -b-`git rev-parse --short HEAD` sdist --formats=zip {code} enable describe on indices -- Key: CASSANDRA-7814 URL: https://issues.apache.org/jira/browse/CASSANDRA-7814 Project: Cassandra Issue Type: Improvement Components: Core Reporter: radha Assignee: Stefania Priority: Minor Fix For: 2.1.5 Describe index should be supported; right now, the only way is to export the schema and find what the index really is before updating/dropping it. Verified in [cqlsh 3.1.8 | Cassandra 1.2.18.1 | CQL spec 3.0.0 | Thrift protocol 19.36.2]
[jira] [Commented] (CASSANDRA-9205) Allow per statement time outs or request cancel method
[ https://issues.apache.org/jira/browse/CASSANDRA-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503633#comment-14503633 ] sankalp kohli commented on CASSANDRA-9205: -- We should do this the same way as CASSANDRA-2848. Allow per statement time outs or request cancel method -- Key: CASSANDRA-9205 URL: https://issues.apache.org/jira/browse/CASSANDRA-9205 Project: Cassandra Issue Type: Improvement Reporter: Vishy Kasar Fix For: 3.1 Cassandra lets the user specify timeouts for various operations globally in the yaml. It would be great if we could set different timeouts for CQL statements in different contexts, rather than just having a global timeout in the yaml. We have client requests that need to time out in a short duration vs. some maintenance requests that we know take a long time. The only choice we have now is to set the server timeout to the highest needed. The user can certainly do session.executeAsync on the client side and wait a certain time on the returned future. However, when the user cancels the future on timeout, nothing is done on the server side. We have seen cases where Cassandra replicas were going over thousands of tombstones and causing OOMs well after the client timed out. This can be implemented either by passing the timeout along with the query to the server or by providing a cancel method similar to http://docs.oracle.com/javase/6/docs/api/java/sql/Statement.html It is understood that the server may not be able to timeout/cancel the requests in all cases. So this is a request for the server to make its best effort to timeout/cancel.
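The limitation described above - that a client-side timeout abandons the wait without stopping the work - can be demonstrated without Cassandra at all, using a plain thread pool as a stand-in for the server:

```python
import concurrent.futures
import threading
import time

# Stand-in demo for the limitation described above: timing out on a
# future stops the client from waiting, but the "server-side" work
# (here, a slow function on a thread pool) still runs to completion.

work_completed = threading.Event()

def slow_query():
    time.sleep(0.2)        # stand-in for scanning thousands of tombstones
    work_completed.set()   # the work finishes regardless of the client
    return "rows"

pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
future = pool.submit(slow_query)
try:
    future.result(timeout=0.05)  # client-side timeout, as with executeAsync
except concurrent.futures.TimeoutError:
    print("client timed out")
work_completed.wait(timeout=2)   # ...yet the query still completed
print("server still did the work:", work_completed.is_set())
pool.shutdown()
```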
[jira] [Created] (CASSANDRA-9217) Problems with cqlsh copy command
Brian Cantoni created CASSANDRA-9217: Summary: Problems with cqlsh copy command Key: CASSANDRA-9217 URL: https://issues.apache.org/jira/browse/CASSANDRA-9217 Project: Cassandra Issue Type: Bug Reporter: Brian Cantoni On the current 2.1 branch I notice a few (possibly related) problems with cqlsh copy commands. I'm writing them here together but we can separate them if there are different causes. *1. Cannot import from CSV if column name is 'date'* Test file monthly.csv contents: {noformat} stationid,metric,date LAE,barometricpressure,2014-01-01 00:00:00+ LAE,barometricpressure,2014-02-01 00:00:00+ LAE,barometricpressure,2014-03-01 00:00:00+ {noformat} Steps: {noformat} CREATE KEYSPACE IF NOT EXISTS weathercql WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': '1' }; CREATE TABLE IF NOT EXISTS weathercql.monthly ( stationid text, metric text, date timestamp, primary key (stationid, metric, date) ); COPY weathercql.monthly (stationid, metric, date) FROM 'monthly.csv' WITH HEADER='true'; {noformat} Result: the copy command fails unless date is enclosed in double quotes: {noformat} cqlsh> COPY weathercql.monthly (stationid, metric, date) FROM 'monthly.csv' WITH HEADER='true'; Improper COPY command. cqlsh> COPY weathercql.monthly (stationid, metric, "date") FROM 'monthly.csv' WITH HEADER='true'; 3 rows imported in 0.096 seconds. {noformat} If I instead name the 'date' column as 'datex', it works without quotes. The same steps work on Cassandra 2.1.4 (release build). *2.
Cannot copy to CSV* Sample data: {noformat} create keyspace if not exists test with replication = {'class':'SimpleStrategy', 'replication_factor':1}; create table if not exists test.kv (key int primary key, value text); insert into test.kv (key,value) values (1,'alpha'); insert into test.kv (key,value) values (2,'beta'); insert into test.kv (key,value) values (3,'charlie'); {noformat} When you try to export to CSV, it throws what appears to be a Python error, and the file is not created correctly: {noformat} cqlsh> copy test.kv (key,value) to 'test.csv'; global name 'meter' is not defined {noformat} The same steps work on Cassandra 2.1.4 (release build). *3. Copy from CSV inside CQL command file doesn't work* File kv.csv: {noformat} key,value 1,'a' 2,'b' 3,'c' {noformat} File kv.cql: {noformat} create keyspace if not exists test with replication = {'class': 'SimpleStrategy', 'replication_factor':1}; create table if not exists test.kv (key int primary key, value text); truncate test.kv; copy test.kv (key, value) from 'kv.csv' with header='true'; select * from test.kv; {noformat} When the command file is passed to cqlsh, an error is reported on the `copy` command and it doesn't work: {noformat} $ bin/cqlsh -f kv.cql kv.cql:5:descriptor 'lower' requires a 'str' object but received a 'unicode' key | value -+--- (0 rows) {noformat} The same commands work correctly when run directly inside cqlsh or when executed with the -e option like: {{bin/cqlsh -e "copy test.kv (key, value) from 'kv.csv' with header='true';"}}. This third issue appears to also be broken in the 2.1.4 and 2.1.3 release builds, but works in 2.1.2.
[jira] [Commented] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503711#comment-14503711 ] Aleksey Yeschenko commented on CASSANDRA-6477: -- Aggregation is probably unfeasible. Having the rest of it would be amazing. +1 to that.
[jira] [Updated] (CASSANDRA-9217) Problems with cqlsh copy command
[ https://issues.apache.org/jira/browse/CASSANDRA-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-9217: - Assignee: Tyler Hobbs
[jira] [Commented] (CASSANDRA-9217) Problems with cqlsh copy command
[ https://issues.apache.org/jira/browse/CASSANDRA-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504031#comment-14504031 ] Brian Cantoni commented on CASSANDRA-9217: -- I did some git bisecting and each of the 3 issues points to a logical commit which changed the behaviors: 1: commit 107545 for CASSANDRA-7523 (add date and time types) 2: commit 711090 for CASSANDRA-8225 (improve cqlsh copy from perf) 3: commit c49f66 for CASSANDRA-8638 (handle unicode BOM at file start) Of these issues, #2 is probably the most important one. For #1 I imagine the remedy is to not use a field called date, and the scenario for #3 is probably not common. Problems with cqlsh copy command Key: CASSANDRA-9217 URL: https://issues.apache.org/jira/browse/CASSANDRA-9217 Project: Cassandra Issue Type: Bug Reporter: Brian Cantoni Assignee: Tyler Hobbs Labels: cqlsh Fix For: 2.1.5
[jira] [Commented] (CASSANDRA-8358) Bundled tools shouldn't be using Thrift API
[ https://issues.apache.org/jira/browse/CASSANDRA-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504083#comment-14504083 ] Aleksey Yeschenko commented on CASSANDRA-8358: -- Force-pushed another updated (and squashed) version to the same branch - https://github.com/iamaleksey/cassandra/commits/8358. It adds some more cleanup on top of Philip's, in particular some around the SSTableLoader.Client implementations, but it's still far from clean - because of the original code's dirtiness. Things that need fixing: - NativeSSTableLoaderClient must support connecting over SSL. This is a regression - the original code did support this. - The NSSTLC TableMetadata-to-CFMetaData code is broken. I think we should, for now, do the ugly thing and reimplement what sstableloader was doing: SELECT stuff from the schema tables manually, then do the equivalent of the now-unused {{ThriftConversion.fromThriftCqlRow()}} call. Bundled tools shouldn't be using Thrift API --- Key: CASSANDRA-8358 URL: https://issues.apache.org/jira/browse/CASSANDRA-8358 Project: Cassandra Issue Type: Improvement Reporter: Aleksey Yeschenko Assignee: Philip Thompson Fix For: 3.0 In 2.1, we switched cqlsh to the python-driver. In 3.0, we got rid of cassandra-cli. Yet there is still code that's using the legacy Thrift API. We want to convert it all to use the java-driver instead. 1. BulkLoader uses Thrift to query the schema tables. It should be using the java-driver metadata APIs directly instead. 2. o.a.c.hadoop.cql3.CqlRecordWriter is using Thrift 3. o.a.c.hadoop.ColumnFamilyRecordReader is using Thrift 4. o.a.c.hadoop.AbstractCassandraStorage is using Thrift 5. o.a.c.hadoop.pig.CqlStorage is using Thrift Some of the things listed above use Thrift to get the list of partition key columns or clustering columns. Those should be converted to use the Metadata API of the java-driver.
Somewhat related to that, we also have badly ported code from Thrift in o.a.c.hadoop.cql3.CqlRecordReader (see fetchKeys()) that manually fetches columns from schema tables instead of properly using the driver's Metadata API. We need all of it fixed. One exception, for now, is o.a.c.hadoop.AbstractColumnFamilyInputFormat - it's using Thrift for its describe_splits_ex() call, which cannot currently be replaced by any java-driver call (?). Once this is done, we can stop starting the Thrift RPC port by default in cassandra.yaml.
[jira] [Commented] (CASSANDRA-8789) OutboundTcpConnectionPool should route messages to sockets by size not type
[ https://issues.apache.org/jira/browse/CASSANDRA-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503945#comment-14503945 ] Michael Kjellman commented on CASSANDRA-8789: - My testing has shown that relying on message size as a heuristic to determine the channel/socket to write to has adverse effects under load. The problem is this mixes high-priority Command verbs (e.g. GOSSIP_DIGEST_SYN/GOSSIP_DIGEST_ACK) - which cannot be delayed in any way due to the current implementation of FailureDetector - with lower-priority Response/Data verbs (e.g. MUTATION/READ/REQUEST_RESPONSE). The effect of this is that nodes will flap and be incorrectly considered DOWN due to failure in sending Gossip verbs, which are now queued behind lower-priority messages. The implementation of MessagingService is fire-and-forget; however, we do expect some form of ACK for most messages. For instance, each MUTATION expects a REQUEST_RESPONSE within a given timeout; otherwise a hint is generated. Here lies the problem: the REQUEST_RESPONSE verb is 6 bytes (with no payload -- so now considered small). We also have INTERNAL_RESPONSE (also 6 bytes). By using size instead of priority, or instead of the old hard-coded Command/Data implementation (sending high-priority messages like GOSSIP over one channel and normal/low-priority messages over another), the REQUEST_RESPONSE for each MUTATION after this change will now be sent over the same channel that used to be reserved for GOSSIP (or other high-priority Command) verbs. If the kernel buffers back up sufficiently (although we have the NO_DELAY option on the socket, it isn't very difficult under moderate/high load to still saturate the NIC), we've now moved an ACK message for every MUTATION onto the same socket that is sending GOSSIP messages.
Eventually, if we back up with enough small messages, we likely will end up unable to send *important* messages (e.g. GOSSIP_DIGEST_SYN/GOSSIP_DIGEST_ACK), FD will falsely be triggered, and nodes will be marked DOWN incorrectly. Additionally, once we hit this condition, we end up flapping as GOSSIP messages eventually get through, which compounds the problem. h4. How to reproduce: I'm unable to figure out the new stress so I ran the stress from 2.0 against trunk (commit sha 1fab7b785dc5e440a773828ff17e927a1f3c2e5f from 4/20/15) with all defaults except for changing the replication factor from its default of 1 to 3. I'm pretty sure the reason I can't easily reproduce with the new stress is that I seem to be failing to figure out the command line parsing to change it from the default of 8 threads back to the 30-thread default that was in the old stress. While it's crazy to run with 30 threads, this simulates enough traffic on my 2014 MacBook Pro to actually back up the kernel buffers on loopback, which will trigger this. 1) Set up a 3 node ccm cluster locally with all defaults (ccm create tcptest --install-dir=/Users/username/pathto/cassandra-apache/; ccm populate -n 3; ccm start) 2) Run stress from 2.0 using all defaults aside from specifying a RF=3 (tools/bin/cassandra-stress -l 3) 3) Monitor FailureDetector messages in the logs, overall load written, etc. h4. Expected Results: # Without these changes, stress will not time out while inserting data. With this change, I've now observed timeouts starting 50% of the way through the 1 million records. {noformat} Operation [303198] retried 10 times - error inserting key 0303198 ((TTransportException): java.net.SocketException: Broken pipe) {noformat} # Although MUTATION messages are expected to be dropped under high load etc., GOSSIP messages should not fail to be written to the socket in a timely manner, to keep FD (FailureDetector) from incorrectly marking nodes DOWN.
# The amount of inserted load reported in nodetool ring should be ~250MB using the 2.0 stress tool. On my machine I saw a final load of 1.44MB on node(1), and only ~65MB on node(2,3). This is due to FD marking the nodes down and dropping mutations and creating hints. (Additionally, once in this state, memory overhead gets even worse as we generate unnecessary hints, whereas in the prior design we were able to actually write to the socket.) h4. Alternative Proposal I'm 100% on board with using a more priority-based system to better utilize the two channels/sockets we have. For instance: MUTATION(2), READ_REPAIR(3), REQUEST_RESPONSE(2), REPLICATION_FINISHED(1), INTERNAL_RESPONSE(1), COUNTER_MUTATION(2), GOSSIP_DIGEST_SYN(1), GOSSIP_DIGEST_ACK(1), GOSSIP_DIGEST_ACK2(1). That way we can use the priorities to route small messages like SNAPSHOT, TRUNCATE, GOSSIP_DIGEST_SYN over the high-priority channel and the normal-priority messages over the other channel/socket. OutboundTcpConnectionPool should route messages to sockets by size not type
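The priority table in the alternative proposal above can be sketched as a simple lookup. The per-verb numbers are the ones listed in the comment; mapping priority 1 to the high-priority channel is an assumed interpretation of how the two sockets would be split, not the ticket's final design:

```python
# Sketch of the proposed priority-based routing. Verb priorities are
# taken from the comment above; routing priority-1 verbs to the "high"
# channel is an assumption about how the two sockets would be used.

VERB_PRIORITY = {
    "MUTATION": 2,
    "READ_REPAIR": 3,
    "REQUEST_RESPONSE": 2,
    "REPLICATION_FINISHED": 1,
    "INTERNAL_RESPONSE": 1,
    "COUNTER_MUTATION": 2,
    "GOSSIP_DIGEST_SYN": 1,
    "GOSSIP_DIGEST_ACK": 1,
    "GOSSIP_DIGEST_ACK2": 1,
}

def channel_for(verb: str) -> str:
    """Gossip and other priority-1 verbs get the dedicated socket, so
    they are never queued behind mutations or their small acks."""
    return "high" if VERB_PRIORITY.get(verb, 2) == 1 else "normal"

print(channel_for("GOSSIP_DIGEST_SYN"))  # high
print(channel_for("REQUEST_RESPONSE"))   # normal
```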
[jira] [Commented] (CASSANDRA-8789) OutboundTcpConnectionPool should route messages to sockets by size not type
[ https://issues.apache.org/jira/browse/CASSANDRA-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504050#comment-14504050 ] Ariel Weisberg commented on CASSANDRA-8789: --- I can reproduce this using the 2.0 version of stress, which is interesting. It didn't reproduce with a write-only workload of stress on trunk. The why of that is probably interesting as well. I will look into it more tomorrow. OutboundTcpConnectionPool should route messages to sockets by size not type --- Key: CASSANDRA-8789 URL: https://issues.apache.org/jira/browse/CASSANDRA-8789 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ariel Weisberg Assignee: Ariel Weisberg Fix For: 3.0 Attachments: 8789.diff I was looking at this trying to understand which messages flow over which connection. For reads, the request goes out over the command connection and the response comes back over the ack connection. For writes, the request goes out over the command connection and the response comes back over the command connection. Reads get a dedicated socket for responses. Mutation commands and responses both travel over the same socket along with read requests. Sockets are used uni-directionally, so there are actually four sockets in play and four threads at each node (2 inbound, 2 outbound). CASSANDRA-488 doesn't leave a record of what the impact of this change was. If someone remembers what situations were made better, it would be good to know. I am not clear on when/how this is helpful. The consumer side shouldn't be blocking, so the only head-of-line blocking issue is the time it takes to transfer data over the wire. If message size is the cause of blocking issues, then the current design mixes small messages and large messages on the same connection, retaining the head-of-line blocking.
Read requests share the same connection as write requests (which are large), and write acknowledgments (which are small) share the same connections as write requests. The only winner is read acknowledgements.
cassandra git commit: remove no-op casts
Repository: cassandra Updated Branches: refs/heads/trunk 48f644686 - 1fab7b785 remove no-op casts Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1fab7b78 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1fab7b78 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1fab7b78 Branch: refs/heads/trunk Commit: 1fab7b785dc5e440a773828ff17e927a1f3c2e5f Parents: 48f6446 Author: Dave Brosius dbros...@mebigfatguy.com Authored: Mon Apr 20 12:12:39 2015 -0400 Committer: Dave Brosius dbros...@mebigfatguy.com Committed: Mon Apr 20 12:12:39 2015 -0400 -- src/java/org/apache/cassandra/cql3/Tuples.java| 2 +- src/java/org/apache/cassandra/db/Memtable.java| 4 ++-- .../org/apache/cassandra/db/marshal/UserType.java | 2 +- .../org/apache/cassandra/dht/Murmur3Partitioner.java | 4 ++-- src/java/org/apache/cassandra/dht/Range.java | 4 ++-- src/java/org/apache/cassandra/gms/Gossiper.java | 10 +- .../cassandra/hadoop/ColumnFamilyRecordReader.java| 2 +- .../apache/cassandra/hadoop/cql3/CqlRecordReader.java | 2 +- .../apache/cassandra/hadoop/pig/CassandraStorage.java | 14 +++--- .../org/apache/cassandra/metrics/RestorableMeter.java | 2 +- .../org/apache/cassandra/streaming/StreamManager.java | 4 ++-- .../org/apache/cassandra/tools/SSTableImport.java | 2 +- .../apache/cassandra/utils/FastByteOperations.java| 4 ++-- .../org/apache/cassandra/utils/HistogramBuilder.java | 2 +- src/java/org/apache/cassandra/utils/IntervalTree.java | 2 +- 15 files changed, 30 insertions(+), 30 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/1fab7b78/src/java/org/apache/cassandra/cql3/Tuples.java -- diff --git a/src/java/org/apache/cassandra/cql3/Tuples.java b/src/java/org/apache/cassandra/cql3/Tuples.java index 92ccbce..89fecd0 100644 --- a/src/java/org/apache/cassandra/cql3/Tuples.java +++ b/src/java/org/apache/cassandra/cql3/Tuples.java @@ -259,7 +259,7 @@ public class Tuples { // Collections 
have this small hack that validate cannot be called on a serialized object, // but the deserialization does the validation (so we're fine). -List<?> l = (List<?>) type.getSerializer().deserializeForNativeProtocol(value, options.getProtocolVersion()); +List<?> l = type.getSerializer().deserializeForNativeProtocol(value, options.getProtocolVersion()); assert type.getElementsType() instanceof TupleType; TupleType tupleType = (TupleType) type.getElementsType(); http://git-wip-us.apache.org/repos/asf/cassandra/blob/1fab7b78/src/java/org/apache/cassandra/db/Memtable.java -- diff --git a/src/java/org/apache/cassandra/db/Memtable.java b/src/java/org/apache/cassandra/db/Memtable.java index 2381f26..aa5fb1b 100644 --- a/src/java/org/apache/cassandra/db/Memtable.java +++ b/src/java/org/apache/cassandra/db/Memtable.java @@ -414,10 +414,10 @@ public class Memtable ConcurrentNavigableMap<RowPosition, Object> rows = new ConcurrentSkipListMap<>(); final Object val = new Object(); for (int i = 0 ; i < count ; i++) -rows.put(allocator.clone(new BufferDecoratedKey(new LongToken((long) i), ByteBufferUtil.EMPTY_BYTE_BUFFER), group), val); +rows.put(allocator.clone(new BufferDecoratedKey(new LongToken(i), ByteBufferUtil.EMPTY_BYTE_BUFFER), group), val); double avgSize = ObjectSizes.measureDeep(rows) / (double) count; rowOverhead = (int) ((avgSize - Math.floor(avgSize)) < 0.05 ?
Math.floor(avgSize) : Math.ceil(avgSize)); -rowOverhead -= ObjectSizes.measureDeep(new LongToken((long) 0)); +rowOverhead -= ObjectSizes.measureDeep(new LongToken(0)); rowOverhead += AtomicBTreeColumns.EMPTY_SIZE; allocator.setDiscarding(); allocator.setDiscarded(); http://git-wip-us.apache.org/repos/asf/cassandra/blob/1fab7b78/src/java/org/apache/cassandra/db/marshal/UserType.java -- diff --git a/src/java/org/apache/cassandra/db/marshal/UserType.java b/src/java/org/apache/cassandra/db/marshal/UserType.java index ec97a7f..45c5f0e 100644 --- a/src/java/org/apache/cassandra/db/marshal/UserType.java +++ b/src/java/org/apache/cassandra/db/marshal/UserType.java @@ -178,7 +178,7 @@ public class UserType extends TupleType { for (Object fieldName : keys) { -if (!stringFieldNames.contains((String) fieldName)) +if (!stringFieldNames.contains(fieldName)) throw
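For readers skimming the diff above: a "no-op cast" is one the compiler already performs implicitly (a widening conversion) or one that has no effect on overload resolution, as with `contains((String) fieldName)`. A minimal standalone illustration (not code from the patch itself):

```java
import java.util.ArrayList;
import java.util.List;

public class NoOpCasts {
    public static void main(String[] args) {
        // Redundant: assigning an int to a long widens automatically.
        long beforeFix = (long) 42;
        long afterFix  = 42;

        // Redundant: List.contains(Object) accepts any reference type,
        // so casting the argument to String changes nothing.
        List<String> names = new ArrayList<>();
        names.add("age");
        Object fieldName = "age";
        boolean beforeContains = names.contains((String) fieldName);
        boolean afterContains  = names.contains(fieldName);

        System.out.println(beforeFix == afterFix);           // true
        System.out.println(beforeContains == afterContains); // true
    }
}
```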
[jira] [Commented] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503698#comment-14503698 ] Jonathan Ellis commented on CASSANDRA-6477: --- I'm warming up to the idea of calling it MV but only IF we're committed to fleshing it out to match SELECT. To start that can match our current envisioned functionality: CREATE MATERIALIZED VIEW users_by_age AS SELECT age, user_id, x, y, z FROM users PRIMARY KEY (age, user_id) but next we need to add support for WHERE and UDF: CREATE MATERIALIZED VIEW users_by_age AS SELECT age, user_id, substring(phone_number, 3) AS area_code, y, z FROM users WHERE area_code in (512, 513, 514) PRIMARY KEY (age, user_id) Building the view should take advantage of local indexes where applicable. Ideally we would support aggregation in MV as well. Not sure if that is feasible. Global indexes -- Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7523) add date and time types
[ https://issues.apache.org/jira/browse/CASSANDRA-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503813#comment-14503813 ] Aleksey Yeschenko commented on CASSANDRA-7523: -- No concerns. 3.0 only WFM. Would really like us to reject empty BB as a valid value there though. add date and time types --- Key: CASSANDRA-7523 URL: https://issues.apache.org/jira/browse/CASSANDRA-7523 Project: Cassandra Issue Type: New Feature Components: API Reporter: Jonathan Ellis Assignee: Joshua McKenzie Priority: Minor Labels: client-impacting, docs Fix For: 2.1.5 http://www.postgresql.org/docs/9.1/static/datatype-datetime.html (we already have timestamp; interval is out of scope for now, and see CASSANDRA-6350 for discussion on timestamp-with-time-zone. but date/time should be pretty easy to add.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8899) cqlsh - not able to get row count with select(*) for large table
[ https://issues.apache.org/jira/browse/CASSANDRA-8899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503876#comment-14503876 ] Jeff Liu commented on CASSANDRA-8899: - Yes, [~blerer]. The problem is worked around by increasing timeout value. cqlsh - not able to get row count with select(*) for large table Key: CASSANDRA-8899 URL: https://issues.apache.org/jira/browse/CASSANDRA-8899 Project: Cassandra Issue Type: Bug Environment: Cassandra 2.1.2 ubuntu12.04 Reporter: Jeff Liu Assignee: Benjamin Lerer I'm getting errors when running a query that looks at a large number of rows. {noformat} cqlsh:events select count(*) from catalog; count --- 1 (1 rows) cqlsh:events select count(*) from catalog limit 11000; count --- 11000 (1 rows) cqlsh:events select count(*) from catalog limit 5; errors={}, last_host=127.0.0.1 cqlsh:events {noformat} We are not able to make the select * query to get row count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503936#comment-14503936 ] Jack Krupansky commented on CASSANDRA-6477: --- Oracle has lots of options for the REFRESH clause of the CREATE MATERIALIZED VIEW statement: http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_6002.htm Notes on that syntax: http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_6002.htm#i2064161 Full MV syntax: http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_6002.htm You can request that a materialized view be automatically refreshed when the base tables are updated using the REFRESH ON COMMIT option. The update transaction pauses while the views are updated - Specify ON COMMIT to indicate that a fast refresh is to occur whenever the database commits a transaction that operates on a master table of the materialized view. This clause may increase the time taken to complete the commit, because the database performs the refresh operation as part of the commit process. You can also refresh on time intervals, on demand, or no refresh ever. Originally MV was known as SNAPSHOT - a one-time snapshot of a view of the base tables/query. Oracle has a FAST refresh, which depends on a MATERIALIZED VIEW LOG, which must be created for the base table(s). Otherwise a COMPLETE refresh is required. Global indexes -- Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7523) add date and time types
[ https://issues.apache.org/jira/browse/CASSANDRA-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503617#comment-14503617 ] Joshua McKenzie commented on CASSANDRA-7523: I vote we target 3.0. add date and time types --- Key: CASSANDRA-7523 URL: https://issues.apache.org/jira/browse/CASSANDRA-7523 Project: Cassandra Issue Type: New Feature Components: API Reporter: Jonathan Ellis Assignee: Joshua McKenzie Priority: Minor Labels: client-impacting, docs Fix For: 2.1.5 http://www.postgresql.org/docs/9.1/static/datatype-datetime.html (we already have timestamp; interval is out of scope for now, and see CASSANDRA-6350 for discussion on timestamp-with-time-zone. but date/time should be pretty easy to add.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-2848) Make the Client API support passing down timeouts
[ https://issues.apache.org/jira/browse/CASSANDRA-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503867#comment-14503867 ] Aleksey Yeschenko commented on CASSANDRA-2848: -- Would it make sense to use CASSANDRA-8553? Make the Client API support passing down timeouts - Key: CASSANDRA-2848 URL: https://issues.apache.org/jira/browse/CASSANDRA-2848 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Goffinet Assignee: Sylvain Lebresne Priority: Minor Fix For: 3.1 Having a max server RPC timeout is good for worst case, but many applications that have middleware in front of Cassandra, might have higher timeout requirements. In a fail fast environment, if my application starting at say the front-end, only has 20ms to process a request, and it must connect to X services down the stack, by the time it hits Cassandra, we might only have 10ms. I propose we provide the ability to specify the timeout on each call we do optionally. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9205) Allow per statement time outs or request cancel method
[ https://issues.apache.org/jira/browse/CASSANDRA-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-9205: Fix Version/s: (was: 2.1.5) 3.1 Allow per statement time outs or request cancel method -- Key: CASSANDRA-9205 URL: https://issues.apache.org/jira/browse/CASSANDRA-9205 Project: Cassandra Issue Type: Improvement Reporter: Vishy Kasar Fix For: 3.1 Cassandra lets users specify timeouts for various operations globally in yaml. It would be great if we could set different timeouts for CQL statements in different contexts, rather than just having a global timeout in yaml. We have client requests that need to time out in a short duration vs some maintenance requests that we know take long. The only choice we have now is to set the server timeout to the highest needed. The user can certainly do session.executeAsync on the client side and wait a certain time on the returned future. However, when the user cancels the future on timeout, nothing is done on the server side. We have seen cases where cassandra replicas were going over thousands of tombstones and causing OOMs way after the client timed out. This can be implemented either by passing the timeout along with the query to the server, or by providing a cancel method similar to http://docs.oracle.com/javase/6/docs/api/java/sql/Statement.html It is understood that the server may not be able to timeout/cancel the requests in all cases. So this is a request for the server to do its best effort to timeout/cancel. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
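The client-side workaround the ticket describes (executeAsync plus a bounded wait on the returned future) can be sketched with plain java.util.concurrent. The generic Future here is a stand-in for the driver's result future; the method name and timings are assumptions for the sketch:

```java
import java.util.concurrent.*;

public class ClientSideTimeout {
    // Wait up to `millis` for a query future; on timeout, cancel locally.
    public static boolean completedWithin(Future<?> query, long millis) {
        try {
            query.get(millis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // The client gives up, but cancel() only interrupts the local
            // task; the coordinator/replicas keep working, which is exactly
            // the gap this ticket asks the server to close.
            query.cancel(true);
            return false;
        } catch (InterruptedException | ExecutionException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> slowQuery = pool.submit(() -> {
            Thread.sleep(5_000); // simulates scanning thousands of tombstones
            return "rows";
        });
        System.out.println(completedWithin(slowQuery, 100)); // false: timed out
        pool.shutdownNow();
    }
}
```

A server-side per-statement timeout would instead stop the work at the replicas, avoiding the OOM scenario described above.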
[jira] [Updated] (CASSANDRA-9205) Allow per statement time outs or request cancel method
[ https://issues.apache.org/jira/browse/CASSANDRA-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-9205: Fix Version/s: 2.1.5 Allow per statement time outs or request cancel method -- Key: CASSANDRA-9205 URL: https://issues.apache.org/jira/browse/CASSANDRA-9205 Project: Cassandra Issue Type: Improvement Reporter: Vishy Kasar Fix For: 3.1 Cassandra lets users specify timeouts for various operations globally in yaml. It would be great if we could set different timeouts for CQL statements in different contexts, rather than just having a global timeout in yaml. We have client requests that need to time out in a short duration vs some maintenance requests that we know take long. The only choice we have now is to set the server timeout to the highest needed. The user can certainly do session.executeAsync on the client side and wait a certain time on the returned future. However, when the user cancels the future on timeout, nothing is done on the server side. We have seen cases where cassandra replicas were going over thousands of tombstones and causing OOMs way after the client timed out. This can be implemented either by passing the timeout along with the query to the server, or by providing a cancel method similar to http://docs.oracle.com/javase/6/docs/api/java/sql/Statement.html It is understood that the server may not be able to timeout/cancel the requests in all cases. So this is a request for the server to do its best effort to timeout/cancel. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-7523) add date and time types
[ https://issues.apache.org/jira/browse/CASSANDRA-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503617#comment-14503617 ] Joshua McKenzie edited comment on CASSANDRA-7523 at 4/20/15 9:12 PM: - I vote we target 3.0. edit: [~slebresne] / [~iamaleksey]: Do either of you have any concerns with that? It's a pretty clean revert since it's mostly new code. was (Author: joshuamckenzie): I vote we target 3.0. add date and time types --- Key: CASSANDRA-7523 URL: https://issues.apache.org/jira/browse/CASSANDRA-7523 Project: Cassandra Issue Type: New Feature Components: API Reporter: Jonathan Ellis Assignee: Joshua McKenzie Priority: Minor Labels: client-impacting, docs Fix For: 2.1.5 http://www.postgresql.org/docs/9.1/static/datatype-datetime.html (we already have timestamp; interval is out of scope for now, and see CASSANDRA-6350 for discussion on timestamp-with-time-zone. but date/time should be pretty easy to add.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8789) OutboundTcpConnectionPool should route messages to sockets by size not type
[ https://issues.apache.org/jira/browse/CASSANDRA-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503977#comment-14503977 ] Brandon Williams commented on CASSANDRA-8789: - I agree on priority-based messaging. Gossip is fairly low throughput, but also very important to get delivered and should take priority. OutboundTcpConnectionPool should route messages to sockets by size not type --- Key: CASSANDRA-8789 URL: https://issues.apache.org/jira/browse/CASSANDRA-8789 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ariel Weisberg Assignee: Ariel Weisberg Fix For: 3.0 Attachments: 8789.diff I was looking at this trying to understand what messages flow over which connection. For reads the request goes out over the command connection and the response comes back over the ack connection. For writes the request goes out over the command connection and the response comes back over the command connection. Reads get a dedicated socket for responses. Mutation commands and responses both travel over the same socket along with read requests. Sockets are used uni-directional so there are actually four sockets in play and four threads at each node (2 inbounded, 2 outbound). CASSANDRA-488 doesn't leave a record of what the impact of this change was. If someone remembers what situations were made better it would be good to know. I am not clear on when/how this is helpful. The consumer side shouldn't be blocking so the only head of line blocking issue is the time it takes to transfer data over the wire. If message size is the cause of blocking issues then the current design mixes small messages and large messages on the same connection retaining the head of line blocking. Read requests share the same connection as write requests (which are large), and write acknowledgments (which are small) share the same connections as write requests. The only winner is read acknowledgements. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9217) Problems with cqlsh copy command
[ https://issues.apache.org/jira/browse/CASSANDRA-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9217: --- Fix Version/s: 2.1.5 Labels: cqlsh (was: ) Problems with cqlsh copy command Key: CASSANDRA-9217 URL: https://issues.apache.org/jira/browse/CASSANDRA-9217 Project: Cassandra Issue Type: Bug Reporter: Brian Cantoni Assignee: Tyler Hobbs Labels: cqlsh Fix For: 2.1.5 On the current 2.1 branch I notice a few (possibly related) problems with cqlsh copy commands. I'm writing them here together but we can separate if there are different causes. *1. Cannot import from CSV if column name is 'date'* Test file monthly.csv contents: {noformat} stationid,metric,date LAE,barometricpressure,2014-01-01 00:00:00+ LAE,barometricpressure,2014-02-01 00:00:00+ LAE,barometricpressure,2014-03-01 00:00:00+ {noformat} Steps: {noformat} CREATE KEYSPACE IF NOT EXISTS weathercql WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': '1' }; CREATE TABLE IF NOT EXISTS weathercql.monthly ( stationid text, metric text, date timestamp, primary key (stationid, metric, date) ); COPY weathercql.monthly (stationid, metric, date) FROM 'monthly.csv' WITH HEADER='true'; {noformat} Result: the copy command fails unless date is enclosed in double quotes: {noformat} cqlsh COPY weathercql.monthly (stationid, metric, date) FROM 'monthly.csv' WITH HEADER='true'; Improper COPY command. cqlsh COPY weathercql.monthly (stationid, metric, date) FROM 'monthly.csv' WITH HEADER='true'; 3 rows imported in 0.096 seconds. {noformat} If I instead name the 'date' column as 'datex', it works without quotes. The same steps work on Cassandra 2.1.4 (release build). *2. 
Cannot copy to CSV* Sample data: {noformat} create keyspace if not exists test with replication = {'class':'SimpleStrategy', 'replication_factor':1}; create table if not exists test.kv (key int primary key, value text); insert into test.kv (key,value) values (1,'alpha'); insert into test.kv (key,value) values (2,'beta'); insert into test.kv (key,value) values (3,'charlie'); {noformat} When you try to export to CSV, it throws what appears to be a Python error, and the file is not created correctly: {noformat} cqlsh copy test.kv (key,value) to 'test.csv'; global name 'meter' is not defined {noformat} The same steps work on Cassandra 2.1.4 (release build). *3. Copy from CSV inside CQL command file doesn't work* File kv.csv: {noformat} key,value 1,'a' 2,'b' 3,'c' {noformat} File kv.cql: {noformat} create keyspace if not exists test with replication = {'class': 'SimpleStrategy', 'replication_factor':1}; create table if not exists test.kv (key int primary key, value text); truncate test.kv; copy test.kv (key, value) from 'kv.csv' with header='true'; select * from test.kv; {noformat} When command file passed to cqlsh, an error is reported on the `copy` command and it doesn't work: {noformat} $ bin/cqlsh -f kv.cql kv.cql:5:descriptor 'lower' requires a 'str' object but received a 'unicode' key | value -+--- (0 rows) {noformat} The same commands work correctly when run directly inside cqlsh or when executed with -e option like: {{bin/cqlsh -e copy test.kv (key, value) from 'kv.csv' with header='true';}}. This third issue appears to also be broken in 2.1.4 and 2.1.3 release builds, but works in 2.1.2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8789) OutboundTcpConnectionPool should route messages to sockets by size not type
[ https://issues.apache.org/jira/browse/CASSANDRA-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504005#comment-14504005 ] Benedict commented on CASSANDRA-8789: - I don't doubt there are problems with this, but I'm not sure they're significantly worse under the new scheme than the old... Currently messages are split along the following boundaries: REQUEST_RESPONSE, INTERNAL_RESPONSE, GOSSIP, READ, MUTATION, COUNTER_MUTATION, ANTI_ENTROPY, MIGRATION, MISC, TRACING, READ_REPAIR; READ_RESPONSE is half of the problem messages you highlighted, and in many workloads likely significantly more of a problem than mutations (since with clustering data they have the potential to deliver much larger payloads), and they currently operate on the same channel as gossip. The main difference is that you won't see them on a pure stress write workload; a mixed workload you would. So if this is a potentially serious problem, it is likely already being exhibited. I should make clear that I'm not disputing there's a problem - this seems very clearly something we want to avoid. But I don't think we have made matters _worse_ with this ticket (though the profile has perhaps changed). Introducing extra channels that are managed via NIO for whom we have no throughput requirements, only latency, seems like a potential solution to this. Or a priority queue and a capped send buffer size (capped low for slow WAN connections, for instance). I would quite like to see us abstract MessagingService so that not only the transport can be pluggable, but it can be different per end-point (e.g. cross-dc), and per message type. I think all of these endeavours are orthogonal to this ticket, though, and deserve their own. 
OutboundTcpConnectionPool should route messages to sockets by size not type --- Key: CASSANDRA-8789 URL: https://issues.apache.org/jira/browse/CASSANDRA-8789 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ariel Weisberg Assignee: Ariel Weisberg Fix For: 3.0 Attachments: 8789.diff I was looking at this trying to understand what messages flow over which connection. For reads the request goes out over the command connection and the response comes back over the ack connection. For writes the request goes out over the command connection and the response comes back over the command connection. Reads get a dedicated socket for responses. Mutation commands and responses both travel over the same socket along with read requests. Sockets are used uni-directional so there are actually four sockets in play and four threads at each node (2 inbounded, 2 outbound). CASSANDRA-488 doesn't leave a record of what the impact of this change was. If someone remembers what situations were made better it would be good to know. I am not clear on when/how this is helpful. The consumer side shouldn't be blocking so the only head of line blocking issue is the time it takes to transfer data over the wire. If message size is the cause of blocking issues then the current design mixes small messages and large messages on the same connection retaining the head of line blocking. Read requests share the same connection as write requests (which are large), and write acknowledgments (which are small) share the same connections as write requests. The only winner is read acknowledgements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
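One of the alternatives floated in the comment above, a priority queue in front of the send buffer, can be sketched with plain java.util.concurrent. The verb names and priority values are illustrative, not Cassandra's:

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

public class PriorityOutbound {
    static final class Message {
        final String verb; final int priority; final int sizeBytes;
        Message(String verb, int priority, int sizeBytes) {
            this.verb = verb; this.priority = priority; this.sizeBytes = sizeBytes;
        }
    }

    public static void main(String[] args) {
        // Lower number = more urgent; the writer thread drains in priority
        // order, so latency-critical verbs jump ahead of bulk payloads.
        PriorityBlockingQueue<Message> outbound = new PriorityBlockingQueue<>(
                16, Comparator.comparingInt((Message m) -> m.priority));

        outbound.add(new Message("READ_RESPONSE", 5, 1_000_000)); // bulk payload
        outbound.add(new Message("GOSSIP", 0, 200));              // latency-critical
        outbound.add(new Message("MUTATION", 3, 4_096));

        System.out.println(outbound.poll().verb); // GOSSIP drains first
    }
}
```

A capped send buffer, as suggested for slow WAN links, would bound how much of a large message is in flight before a newly arrived urgent message gets its turn.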
[jira] [Commented] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503881#comment-14503881 ] Jeff Jirsa commented on CASSANDRA-6477: --- As an end user, MV + matching SELECT described above looks appealing, and would match what I would expect from hearing the name and being familiar with VIEWs in other databases. Global indexes -- Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9217) Problems with cqlsh copy command
[ https://issues.apache.org/jira/browse/CASSANDRA-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503987#comment-14503987 ] Sebastian Estevez commented on CASSANDRA-9217: -- If someone reaches this Jira and is in dire need of loading some bulk data. Give this a try: https://github.com/brianmhess/cassandra-loader Problems with cqlsh copy command Key: CASSANDRA-9217 URL: https://issues.apache.org/jira/browse/CASSANDRA-9217 Project: Cassandra Issue Type: Bug Reporter: Brian Cantoni Assignee: Tyler Hobbs Labels: cqlsh Fix For: 2.1.5 On the current 2.1 branch I notice a few (possibly related) problems with cqlsh copy commands. I'm writing them here together but we can separate if there are different causes. *1. Cannot import from CSV if column name is 'date'* Test file monthly.csv contents: {noformat} stationid,metric,date LAE,barometricpressure,2014-01-01 00:00:00+ LAE,barometricpressure,2014-02-01 00:00:00+ LAE,barometricpressure,2014-03-01 00:00:00+ {noformat} Steps: {noformat} CREATE KEYSPACE IF NOT EXISTS weathercql WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': '1' }; CREATE TABLE IF NOT EXISTS weathercql.monthly ( stationid text, metric text, date timestamp, primary key (stationid, metric, date) ); COPY weathercql.monthly (stationid, metric, date) FROM 'monthly.csv' WITH HEADER='true'; {noformat} Result: the copy command fails unless date is enclosed in double quotes: {noformat} cqlsh COPY weathercql.monthly (stationid, metric, date) FROM 'monthly.csv' WITH HEADER='true'; Improper COPY command. cqlsh COPY weathercql.monthly (stationid, metric, date) FROM 'monthly.csv' WITH HEADER='true'; 3 rows imported in 0.096 seconds. {noformat} If I instead name the 'date' column as 'datex', it works without quotes. The same steps work on Cassandra 2.1.4 (release build). *2. 
Cannot copy to CSV* Sample data: {noformat} create keyspace if not exists test with replication = {'class':'SimpleStrategy', 'replication_factor':1}; create table if not exists test.kv (key int primary key, value text); insert into test.kv (key,value) values (1,'alpha'); insert into test.kv (key,value) values (2,'beta'); insert into test.kv (key,value) values (3,'charlie'); {noformat} When you try to export to CSV, it throws what appears to be a Python error, and the file is not created correctly: {noformat} cqlsh copy test.kv (key,value) to 'test.csv'; global name 'meter' is not defined {noformat} The same steps work on Cassandra 2.1.4 (release build). *3. Copy from CSV inside CQL command file doesn't work* File kv.csv: {noformat} key,value 1,'a' 2,'b' 3,'c' {noformat} File kv.cql: {noformat} create keyspace if not exists test with replication = {'class': 'SimpleStrategy', 'replication_factor':1}; create table if not exists test.kv (key int primary key, value text); truncate test.kv; copy test.kv (key, value) from 'kv.csv' with header='true'; select * from test.kv; {noformat} When command file passed to cqlsh, an error is reported on the `copy` command and it doesn't work: {noformat} $ bin/cqlsh -f kv.cql kv.cql:5:descriptor 'lower' requires a 'str' object but received a 'unicode' key | value -+--- (0 rows) {noformat} The same commands work correctly when run directly inside cqlsh or when executed with -e option like: {{bin/cqlsh -e copy test.kv (key, value) from 'kv.csv' with header='true';}}. This third issue appears to also be broken in 2.1.4 and 2.1.3 release builds, but works in 2.1.2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9194) Delete-only workloads crash Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503090#comment-14503090 ] Jim Witschey edited comment on CASSANDRA-9194 at 4/21/15 4:48 AM: -- I'm afraid I don't follow. bq. This is already behaving properly in 2.1 [The results|https://gist.github.com/mambocab/08577841f4f74a722910] of [the dtest|https://github.com/riptano/cassandra-dtest/blob/master/deletion_test.py#L44] show that the tracked memtable size is 0 in 2.1 after performing 100 deletions -- {{MemtableLiveDataSize}} is reported as 0 over JMX even when {{MemtableColumnsCount}} is 100. Is that behavior correct? I may not have been clear, but that test fails on all released 2.0 and 2.1 versions. Also, I don't understand why the amount of memory to track for tombstones is arbitrary in 2.0. was (Author: mambocab): I'm afraid I don't follow. bq. This is already behaving properly in 2.1 [The results of the dtest|https://github.com/riptano/cassandra-dtest/blob/master/deletion_test.py#L44] show that the tracked memtable size is 0 in 2.1 after performing 100 deletions -- {{MemtableLiveDataSize}} is reported as 0 over JMX even when {{MemtableColumnsCount}} is 100. Is that behavior correct? I may not have been clear, but that test fails on all released 2.0 and 2.1 versions. Also, I don't understand why the amount of memory to track for tombstones is arbitrary in 2.0. Delete-only workloads crash Cassandra - Key: CASSANDRA-9194 URL: https://issues.apache.org/jira/browse/CASSANDRA-9194 Project: Cassandra Issue Type: Bug Environment: 2.0.14 Reporter: Robert Wille Assignee: Benedict Fix For: 2.0.15 Attachments: 9194.txt The size of a tombstone is not properly accounted for in the memtable. A memtable which has only tombstones will never get flushed. It will grow until the JVM runs out of memory. The following program easily demonstrates the problem. 
{code} Cluster.Builder builder = Cluster.builder(); Cluster c = builder.addContactPoints("cas121.devf3.com").build(); Session s = c.connect(); s.execute("CREATE KEYSPACE IF NOT EXISTS test WITH replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }"); s.execute("CREATE TABLE IF NOT EXISTS test.test(id INT PRIMARY KEY)"); PreparedStatement stmt = s.prepare("DELETE FROM test.test WHERE id = :id"); int id = 0; while (true) { s.execute(stmt.bind(id)); id++; }{code} This program should run forever, but eventually Cassandra runs out of heap and craps out. You needn't wait for Cassandra to crash. If you run nodetool cfstats test.test while it is running, you'll see Memtable cell count grow, but Memtable data size will remain 0. This issue was fixed once before. I received a patch for version 2.0.5 (I believe), which contained the fix, but the fix has apparently been lost, because it is clearly broken, and I don't see the fix in the change logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8789) OutboundTcpConnectionPool should route messages to sockets by size not type
[ https://issues.apache.org/jira/browse/CASSANDRA-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504158#comment-14504158 ] Benedict commented on CASSANDRA-8789: - 2.0 stress, AFAICR, does not load balance. By default 2.1 does (smart thrift routing round-robins the owning nodes for any token). So all of the writes to the cluster are likely being piped through a single node in the 2.0 experiment (so over just two tcp connections), instead of evenly spread over six. OutboundTcpConnectionPool should route messages to sockets by size not type --- Key: CASSANDRA-8789 URL: https://issues.apache.org/jira/browse/CASSANDRA-8789 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ariel Weisberg Assignee: Ariel Weisberg Fix For: 3.0 Attachments: 8789.diff I was looking at this trying to understand what messages flow over which connection. For reads the request goes out over the command connection and the response comes back over the ack connection. For writes the request goes out over the command connection and the response comes back over the command connection. Reads get a dedicated socket for responses. Mutation commands and responses both travel over the same socket along with read requests. Sockets are used uni-directional so there are actually four sockets in play and four threads at each node (2 inbounded, 2 outbound). CASSANDRA-488 doesn't leave a record of what the impact of this change was. If someone remembers what situations were made better it would be good to know. I am not clear on when/how this is helpful. The consumer side shouldn't be blocking so the only head of line blocking issue is the time it takes to transfer data over the wire. If message size is the cause of blocking issues then the current design mixes small messages and large messages on the same connection retaining the head of line blocking. 
Read requests share the same connection as write requests (which are large), and write acknowledgements (which are small) share the same connection as write requests. The only winner is read acknowledgements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8789) OutboundTcpConnectionPool should route messages to sockets by size not type
[ https://issues.apache.org/jira/browse/CASSANDRA-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504158#comment-14504158 ] Benedict edited comment on CASSANDRA-8789 at 4/21/15 2:05 AM: -- 2.0 stress, AFAICR, does not load balance. By default 2.1 does (smart thrift routing round-robins the owning nodes for any token). So all of the writes to the cluster are likely being piped through a single node in the 2.0 experiment (so over just two tcp connections), instead of evenly spread all three (i.e. six tcp connections). was (Author: benedict): 2.0 stress, AFAICR, does not load balance. By default 2.1 does (smart thrift routing round-robins the owning nodes for any token). So all of the writes to the cluster are likely being piped through a single node in the 2.0 experiment (so over just two tcp connections), instead of evenly spread over six. OutboundTcpConnectionPool should route messages to sockets by size not type --- Key: CASSANDRA-8789 URL: https://issues.apache.org/jira/browse/CASSANDRA-8789 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ariel Weisberg Assignee: Ariel Weisberg Fix For: 3.0 Attachments: 8789.diff I was looking at this trying to understand what messages flow over which connection. For reads the request goes out over the command connection and the response comes back over the ack connection. For writes the request goes out over the command connection and the response comes back over the command connection. Reads get a dedicated socket for responses. Mutation commands and responses both travel over the same socket along with read requests. Sockets are used uni-directional so there are actually four sockets in play and four threads at each node (2 inbounded, 2 outbound). CASSANDRA-488 doesn't leave a record of what the impact of this change was. If someone remembers what situations were made better it would be good to know. I am not clear on when/how this is helpful. 
The consumer side shouldn't be blocking so the only head of line blocking issue is the time it takes to transfer data over the wire. If message size is the cause of blocking issues then the current design mixes small messages and large messages on the same connection retaining the head of line blocking. Read requests share the same connection as write requests (which are large), and write acknowledgments (which are small) share the same connections as write requests. The only winner is read acknowledgements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
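The size-based routing this ticket proposes could be sketched roughly as below. The threshold value and all names here are illustrative assumptions, not Cassandra's actual API — the point is only that the socket is chosen by serialized message size, so small acks never queue behind large mutations:

```java
// Minimal sketch of size-based (rather than type-based) connection
// routing. LARGE_MESSAGE_THRESHOLD and all names are assumptions.
public class SizeBasedRouting {
    static final int LARGE_MESSAGE_THRESHOLD = 64 * 1024; // assumed 64 KiB cutoff

    enum Socket { SMALL_MESSAGES, LARGE_MESSAGES }

    static Socket route(int serializedSize) {
        return serializedSize > LARGE_MESSAGE_THRESHOLD
             ? Socket.LARGE_MESSAGES   // bulky mutations / large read responses
             : Socket.SMALL_MESSAGES;  // acks and small requests avoid HOL blocking
    }
}
```

With this scheme a write acknowledgement and a multi-megabyte mutation can never share a socket, which is the head-of-line-blocking case the current type-based split leaves open.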
[jira] [Commented] (CASSANDRA-9215) Test with degraded or unreliable networks
[ https://issues.apache.org/jira/browse/CASSANDRA-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503536#comment-14503536 ] Russ Hatch commented on CASSANDRA-9215: --- There are some container networking projects that offer interesting possibilities here, particularly the test tool [Blockade|https://blockade.readthedocs.org/en/latest/]. It hasn't been tried in practice, AFAIK. There is also [Weave|https://zettio.github.io/weave/features.html], which basically adds an additional container that acts like an ethernet switch. From there it's possible to use stuff like 'tc' to slow packets and the like. Weave also supports containers spanning multiple hosts. I have experimented with Weave and there are some notes captured [here|https://github.com/knifewine/cstar_test_docker#bonus-points-mucking-with-the-container-network]. From what I understand, there aren't any plans to add container support to CCM, so container tech might not be useful for test infra. Test with degraded or unreliable networks - Key: CASSANDRA-9215 URL: https://issues.apache.org/jira/browse/CASSANDRA-9215 Project: Cassandra Issue Type: Test Reporter: Ariel Weisberg I have tried to test WAN replication using routing nodes with various configurations to simulate a bad network, and not had good results with realistically reproducing WAN performance. The fake WAN performed better than the real one. I think we need to do at least some of our testing over a link between data centers that are at least as distant as US East - US West. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9211) Keep history of SSTable metadata
Björn Hegerfors created CASSANDRA-9211: -- Summary: Keep history of SSTable metadata Key: CASSANDRA-9211 URL: https://issues.apache.org/jira/browse/CASSANDRA-9211 Project: Cassandra Issue Type: Wish Reporter: Björn Hegerfors Similar to the request in CASSANDRA-8078, I'm interested in SSTables' lineage. Specifically, I want to visualize the behaviors of compaction strategies based on real data. For example, for STCS I might want to generate something like this image: http://www.datastax.com/wp-content/uploads/2011/10/size-tiered-1.png. For LCS and DTCS, other properties than size are interesting. As Marcus responded in CASSANDRA-8078, there is already tracking of ancestors in the SSTable metadata. But as far as I know, the metadata gets garbage collected along with the SSTable itself. So what I propose is to persist metadata in a system table. Maybe some, maybe all metadata. Like the compaction_history table, this should have a default TTL of something like one week or just one day. But users can freely modify/remove the TTL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9211) Keep history of SSTable metadata
[ https://issues.apache.org/jira/browse/CASSANDRA-9211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Björn Hegerfors updated CASSANDRA-9211: --- Priority: Minor (was: Major) Keep history of SSTable metadata Key: CASSANDRA-9211 URL: https://issues.apache.org/jira/browse/CASSANDRA-9211 Project: Cassandra Issue Type: Wish Reporter: Björn Hegerfors Priority: Minor Similar to the request in CASSANDRA-8078, I'm interested in SSTables' lineage. Specifically, I want to visualize the behaviors of compaction strategies based on real data. For example, for STCS I might want to generate something like this image: http://www.datastax.com/wp-content/uploads/2011/10/size-tiered-1.png. For LCS and DTCS, other properties than size are interesting. As Marcus responded in CASSANDRA-8078, there is already tracking of ancestors in the SSTable metadata. But as far as I know, the metadata gets garbage collected along with the SSTable itself. So what I propose is to persist metadata in a system table. Maybe some, maybe all metadata. Like the compaction_history table, this should have a default TTL of something like one week or just one day. But users can freely modify/remove the TTL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[2/2] cassandra git commit: Distinguish between null and unset in the native protocol (v4)
Distinguish between null and unset in the native protocol (v4) patch by odpeer; reviewed by blerer for CASSANDRA-7304 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/48f64468 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/48f64468 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/48f64468 Branch: refs/heads/trunk Commit: 48f644686b48357354f16c74b02b6d2c450a8c2d Parents: bf51f24 Author: Oded Peer peer.o...@gmail.com Authored: Mon Apr 20 15:28:09 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Mon Apr 20 15:30:19 2015 +0200 -- CHANGES.txt | 1 + NEWS.txt| 12 ++ doc/native_protocol_v4.spec | 9 +- .../org/apache/cassandra/cql3/Attributes.java | 7 ++ .../apache/cassandra/cql3/ColumnCondition.java | 12 +- .../org/apache/cassandra/cql3/Constants.java| 19 +++- src/java/org/apache/cassandra/cql3/Lists.java | 43 --- src/java/org/apache/cassandra/cql3/Maps.java| 43 +-- .../org/apache/cassandra/cql3/QueryOptions.java | 4 +- .../apache/cassandra/cql3/QueryProcessor.java | 3 + src/java/org/apache/cassandra/cql3/Sets.java| 38 --- src/java/org/apache/cassandra/cql3/Tuples.java | 8 ++ .../org/apache/cassandra/cql3/UserTypes.java| 4 + .../cassandra/cql3/functions/FunctionCall.java | 7 +- .../cql3/restrictions/AbstractRestriction.java | 2 + .../restrictions/SingleColumnRestriction.java | 20 +++- .../cql3/statements/ModificationStatement.java | 8 +- .../cql3/statements/RequestValidations.java | 17 +++ .../cql3/statements/SelectStatement.java| 5 +- .../db/composites/CompositesBuilder.java| 23 +++- .../apache/cassandra/db/marshal/UserType.java | 5 + .../org/apache/cassandra/transport/CBUtil.java | 27 - .../transport/messages/BatchMessage.java| 2 +- .../transport/messages/ExecuteMessage.java | 2 +- .../apache/cassandra/utils/ByteBufferUtil.java | 3 + .../org/apache/cassandra/cql3/CQLTester.java| 11 ++ .../apache/cassandra/cql3/CollectionsTest.java | 100 + 
.../cassandra/cql3/ColumnConditionTest.java | 7 ++ .../cassandra/cql3/ContainsRelationTest.java| 12 ++ .../apache/cassandra/cql3/ModificationTest.java | 112 +++ .../cassandra/cql3/MultiColumnRelationTest.java | 22 .../cql3/SingleColumnRelationTest.java | 47 +++- .../apache/cassandra/cql3/TupleTypeTest.java| 13 +++ .../apache/cassandra/cql3/UserTypesTest.java| 15 +++ 34 files changed, 594 insertions(+), 69 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/48f64468/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 3dbd42f..d1d6dea 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.0 + * Distinguish between null and unset in protocol v4 (CASSANDRA-7304) * Add user/role permissions for user-defined functions (CASSANDRA-7557) * Allow cassandra config to be updated to restart daemon without unloading classes (CASSANDRA-9046) * Don't initialize compaction writer before checking if iter is empty (CASSANDRA-9117) http://git-wip-us.apache.org/repos/asf/cassandra/blob/48f64468/NEWS.txt -- diff --git a/NEWS.txt b/NEWS.txt index 7db07f0..03008de 100644 --- a/NEWS.txt +++ b/NEWS.txt @@ -58,6 +58,18 @@ New features - The node now keeps up when streaming is failed during bootstrapping. You can use new `nodetool bootstrap resume` command to continue streaming after resolving an issue. + - Protocol version 4 specifies that bind variables do not require having a + value when executing a statement. Bind variables without a value are + called 'unset'. The 'unset' bind variable is serialized as the int + value '-2' without following bytes. + In an EXECUTE or BATCH request an unset bind value does not modify the value and + does not create a tombstone, an unset bind ttl is treated as 'unlimited', + an unset bind timestamp is treated as 'now', an unset bind counter operation + does not change the counter value. + Unset tuple field, UDT field and map key are not allowed. + In a QUERY request an unset limit is treated as 'unlimited'. 
+ Unset WHERE clauses with unset partition column, clustering column + or index column are not allowed. Upgrading http://git-wip-us.apache.org/repos/asf/cassandra/blob/48f64468/doc/native_protocol_v4.spec -- diff --git a/doc/native_protocol_v4.spec
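The NEWS.txt entry above states that an 'unset' bind variable is serialized as the int value -2 with no following bytes (null being -1). A minimal sketch of that [value] wire encoding — names here are illustrative, not Cassandra's transport classes:

```java
import java.nio.ByteBuffer;

// Sketch of the protocol-v4 [value] encoding described above:
// a normal value is length-prefixed, null is length -1 with no body,
// and the new 'unset' marker is length -2 with no body.
public class ValueEncoding {
    static final int NULL_LENGTH  = -1;
    static final int UNSET_LENGTH = -2;

    static ByteBuffer encode(byte[] value, boolean unset) {
        if (unset)
            return (ByteBuffer) ByteBuffer.allocate(4).putInt(UNSET_LENGTH).flip();
        if (value == null)
            return (ByteBuffer) ByteBuffer.allocate(4).putInt(NULL_LENGTH).flip();
        ByteBuffer out = ByteBuffer.allocate(4 + value.length);
        out.putInt(value.length).put(value);
        return (ByteBuffer) out.flip();
    }
}
```

The distinction matters on the server side: a null value writes a tombstone, while an unset value leaves the column untouched.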
[jira] [Commented] (CASSANDRA-6542) nodetool removenode hangs
[ https://issues.apache.org/jira/browse/CASSANDRA-6542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502781#comment-14502781 ] Phil Yang commented on CASSANDRA-6542: -- Streaming hangs are a familiar problem in my cluster (2.0/2.1). In my experience, a node that has been restarted recently won't hang the streaming, so each time I want to add or remove a node, I first restart all nodes one by one. I hope this workaround may solve your troubles temporarily. nodetool removenode hangs - Key: CASSANDRA-6542 URL: https://issues.apache.org/jira/browse/CASSANDRA-6542 Project: Cassandra Issue Type: Bug Components: Core Environment: Ubuntu 12, 1.2.11 DSE Reporter: Eric Lubow Assignee: Yuki Morishita Running *nodetool removenode $host-id* doesn't actually remove the node from the ring. I've let it run anywhere from 5 minutes to 3 days and there are no messages in the log about it hanging or failing; the command just sits there running. So the regular response has been to run *nodetool removenode $host-id*, give it about 10-15 minutes and then run *nodetool removenode force*. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
cassandra git commit: fix nodetool names that reference column families patch by dbrosius reviewed by stefania for cassandra-8872
Repository: cassandra Updated Branches: refs/heads/trunk 57b557839 - bf51f24a6 fix nodetool names that reference column families patcb by dbrosius reviewed by stefania for cassandra-8872 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bf51f24a Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bf51f24a Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bf51f24a Branch: refs/heads/trunk Commit: bf51f24a6ffd01f330a6251e3a62ad32a1f31b9a Parents: 57b5578 Author: Dave Brosius dbros...@mebigfatguy.com Authored: Mon Apr 20 09:02:29 2015 -0400 Committer: Dave Brosius dbros...@mebigfatguy.com Committed: Mon Apr 20 09:02:29 2015 -0400 -- CHANGES.txt | 1 + .../org/apache/cassandra/tools/NodeTool.java| 2 + .../cassandra/tools/nodetool/CfHistograms.java | 112 +-- .../cassandra/tools/nodetool/CfStats.java | 295 +- .../tools/nodetool/TableHistograms.java | 129 .../cassandra/tools/nodetool/TableStats.java| 312 +++ 6 files changed, 456 insertions(+), 395 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/bf51f24a/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 99a9e7d..3dbd42f 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -93,6 +93,7 @@ * Resumable bootstrap streaming (CASSANDRA-8838, CASSANDRA-8942) * Allow scrub for secondary index (CASSANDRA-5174) * Save repair data to system table (CASSANDRA-5839) + * fix nodetool names that reference column families (CASSANDRA-8872) 2.1.5 * Make anticompaction visible in compactionstats (CASSANDRA-9098) http://git-wip-us.apache.org/repos/asf/cassandra/blob/bf51f24a/src/java/org/apache/cassandra/tools/NodeTool.java -- diff --git a/src/java/org/apache/cassandra/tools/NodeTool.java b/src/java/org/apache/cassandra/tools/NodeTool.java index 3cc307b..90657ba 100644 --- a/src/java/org/apache/cassandra/tools/NodeTool.java +++ b/src/java/org/apache/cassandra/tools/NodeTool.java @@ -54,7 +54,9 @@ public class NodeTool 
Ring.class, NetStats.class, CfStats.class, +TableStats.class, CfHistograms.class, +TableHistograms.class, Cleanup.class, ClearSnapshot.class, Compact.class, http://git-wip-us.apache.org/repos/asf/cassandra/blob/bf51f24a/src/java/org/apache/cassandra/tools/nodetool/CfHistograms.java -- diff --git a/src/java/org/apache/cassandra/tools/nodetool/CfHistograms.java b/src/java/org/apache/cassandra/tools/nodetool/CfHistograms.java index a9e43fd..69d3b4a 100644 --- a/src/java/org/apache/cassandra/tools/nodetool/CfHistograms.java +++ b/src/java/org/apache/cassandra/tools/nodetool/CfHistograms.java @@ -17,113 +17,13 @@ */ package org.apache.cassandra.tools.nodetool; -import static com.google.common.base.Preconditions.checkArgument; -import static java.lang.String.format; -import io.airlift.command.Arguments; import io.airlift.command.Command; -import java.util.ArrayList; -import java.util.List; - -import org.apache.cassandra.metrics.CassandraMetricsRegistry; -import org.apache.cassandra.tools.NodeProbe; -import org.apache.cassandra.tools.NodeTool.NodeToolCmd; -import org.apache.cassandra.utils.EstimatedHistogram; -import org.apache.commons.lang3.ArrayUtils; - -@Command(name = cfhistograms, description = Print statistic histograms for a given column family) -public class CfHistograms extends NodeToolCmd +/** + * @deprecated use TableHistograms + */ +@Command(name = cfhistograms, hidden = true, description = Print statistic histograms for a given column family) +@Deprecated +public class CfHistograms extends TableHistograms { -@Arguments(usage = keyspace table, description = The keyspace and table name) -private ListString args = new ArrayList(); - -@Override -public void execute(NodeProbe probe) -{ -checkArgument(args.size() == 2, cfhistograms requires ks and cf args); - -String keyspace = args.get(0); -String cfname = args.get(1); - -// calculate percentile of row size and column count -long[] estimatedRowSize = (long[]) probe.getColumnFamilyMetric(keyspace, cfname, 
EstimatedRowSizeHistogram); -long[] estimatedColumnCount = (long[]) probe.getColumnFamilyMetric(keyspace, cfname, EstimatedColumnCountHistogram); - -// build arrays to store percentile values -double[] estimatedRowSizePercentiles = new double[7]; -double[]
[jira] [Commented] (CASSANDRA-7523) add date and time types
[ https://issues.apache.org/jira/browse/CASSANDRA-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502751#comment-14502751 ] Sylvain Lebresne commented on CASSANDRA-7523: - bq. Snuck the v4 protocol change into the merge to trunk I'm slightly confused here, I don't see any code that seems to prevent the new codes to be used for older versions of the protocol (v1, v2 and v3). add date and time types --- Key: CASSANDRA-7523 URL: https://issues.apache.org/jira/browse/CASSANDRA-7523 Project: Cassandra Issue Type: New Feature Components: API Reporter: Jonathan Ellis Assignee: Joshua McKenzie Priority: Minor Labels: client-impacting, docs Fix For: 2.1.5 http://www.postgresql.org/docs/9.1/static/datatype-datetime.html (we already have timestamp; interval is out of scope for now, and see CASSANDRA-6350 for discussion on timestamp-with-time-zone. but date/time should be pretty easy to add.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/2] cassandra git commit: Distinguish between null and unset in the native protocol (v4)
Repository: cassandra Updated Branches: refs/heads/trunk bf51f24a6 - 48f644686 http://git-wip-us.apache.org/repos/asf/cassandra/blob/48f64468/test/unit/org/apache/cassandra/cql3/TupleTypeTest.java -- diff --git a/test/unit/org/apache/cassandra/cql3/TupleTypeTest.java b/test/unit/org/apache/cassandra/cql3/TupleTypeTest.java index ce935e3..48f0caf 100644 --- a/test/unit/org/apache/cassandra/cql3/TupleTypeTest.java +++ b/test/unit/org/apache/cassandra/cql3/TupleTypeTest.java @@ -98,4 +98,17 @@ public class TupleTypeTest extends CQLTester assertInvalidSyntax(INSERT INTO %s (k, t) VALUES (0, ())); assertInvalid(INSERT INTO %s (k, t) VALUES (0, (2, 'foo', 3.1, 'bar'))); } + +@Test +public void testTupleWithUnsetValues() throws Throwable +{ +createTable(CREATE TABLE %s (k int PRIMARY KEY, t tupleint, text, double)); +// invalid positional field substitution +assertInvalidMessage(Invalid unset value for tuple field number 1, +INSERT INTO %s (k, t) VALUES(0, (3, ?, 2.1)), unset()); + +createIndex(CREATE INDEX tuple_index ON %s (t)); +// select using unset +assertInvalidMessage(Invalid unset value for tuple field number 0, SELECT * FROM %s WHERE k = ? 
and t = (?,?,?), unset(), unset(), unset(), unset()); +} } http://git-wip-us.apache.org/repos/asf/cassandra/blob/48f64468/test/unit/org/apache/cassandra/cql3/UserTypesTest.java -- diff --git a/test/unit/org/apache/cassandra/cql3/UserTypesTest.java b/test/unit/org/apache/cassandra/cql3/UserTypesTest.java index 76315cf..3b381fc 100644 --- a/test/unit/org/apache/cassandra/cql3/UserTypesTest.java +++ b/test/unit/org/apache/cassandra/cql3/UserTypesTest.java @@ -111,4 +111,19 @@ public class UserTypesTest extends CQLTester row(1, null), row(2, 2)); } + +@Test +public void testUDTWithUnsetValues() throws Throwable +{ +// set up +String myType = createType(CREATE TYPE %s (x int, y int)); +String myOtherType = createType(CREATE TYPE %s (a frozen + myType + )); +createTable(CREATE TABLE %s (k int PRIMARY KEY, v frozen + myType + , z frozen + myOtherType + )); + +assertInvalidMessage(Invalid unset value for field 'y' of user defined type + myType, +INSERT INTO %s (k, v) VALUES (10, {x:?, y:?}), 1, unset()); + +assertInvalidMessage(Invalid unset value for field 'y' of user defined type + myType, +INSERT INTO %s (k, v, z) VALUES (10, {x:?, y:?}, {a:{x: ?, y: ?}}), 1, 1, 1, unset()); +} }
[jira] [Commented] (CASSANDRA-8267) Only stream from unrepaired sstables during incremental repair
[ https://issues.apache.org/jira/browse/CASSANDRA-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502786#comment-14502786 ] Marcus Eriksson commented on CASSANDRA-8267: [~yukim] I pushed a trunk-rebased version here: https://github.com/krummas/cassandra/commits/marcuse/8267-trunk Only stream from unrepaired sstables during incremental repair -- Key: CASSANDRA-8267 URL: https://issues.apache.org/jira/browse/CASSANDRA-8267 Project: Cassandra Issue Type: Bug Reporter: Marcus Eriksson Assignee: Marcus Eriksson Fix For: 3.0 Attachments: 0001-Only-stream-from-unrepaired-sstables-during-incremen.patch, 8267-trunk.patch Seems we stream from all sstables even if we do incremental repair, we should limit this to only stream from the unrepaired sstables if we do incremental repair -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8609) Remove dependency of hadoop on internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502788#comment-14502788 ] Philip Thompson commented on CASSANDRA-8609: This is not a duplicate; we will need something additional on top of CASSANDRA-8358. Remove dependency of hadoop on internals (Cell/CellName) - Key: CASSANDRA-8609 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609 Project: Cassandra Issue Type: Bug Reporter: Sylvain Lebresne Assignee: Philip Thompson Fix For: 3.0 Attachments: CASSANDRA-8609-3.0-branch.txt For some reason most of the Hadoop code (ColumnFamilyRecordReader, CqlStorage, ...) uses the {{Cell}} and {{CellName}} classes. That dependency is entirely artificial: all this code is really client code that communicates with Cassandra over thrift/native protocol, and there is thus no reason for it to use internal classes. And in fact, those classes are used in a very crude way, as a {{Pair<ByteBuffer, ByteBuffer>}} really. But this dependency is really painful when we make changes to the internals. Further, every time we do so, I believe we break some of those APIs due to the change. This has been painful for CASSANDRA-5417 and is now painful for CASSANDRA-8099. But while I somewhat hacked around it in CASSANDRA-5417, this was a mistake and we should have removed the dependency back then. So let's do that now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-9166) Prepared statements using functions in collection literals aren't invalidated when functions are dropped
[ https://issues.apache.org/jira/browse/CASSANDRA-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer reassigned CASSANDRA-9166: - Assignee: Benjamin Lerer Prepared statements using functions in collection literals aren't invalidated when functions are dropped Key: CASSANDRA-9166 URL: https://issues.apache.org/jira/browse/CASSANDRA-9166 Project: Cassandra Issue Type: Bug Reporter: Sam Tunnicliffe Assignee: Benjamin Lerer Labels: cql, functions Fix For: 3.0 When a function is dropped, any prepared statements which reference it need to be removed from the prepared statement cache. The default implementation of {{Term#usesFunction}} in {{Term.NonTerminal}} is not overridden in all the places it should be. The {{DelayedValue}} classes in {{Lists}}, {{Sets}}, {{Maps}} and {{Tuples}} may all make use of function calls. {code} CREATE KEYSPACE ks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; CREATE TABLE ks.t1 (k int PRIMARY KEY, v list<int>); CREATE FUNCTION ks.echo_int(input int) RETURNS int LANGUAGE javascript AS 'input'; {code} a prepared statement of the form: {code} INSERT INTO ks.t1 (k, v) VALUES (?, [ks.echo_int(?)]); {code} should be dropped when {{ks.echo_int(int)}} is, but currently that isn't the case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8984) Introduce Transactional API for behaviours that can corrupt system state
[ https://issues.apache.org/jira/browse/CASSANDRA-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502800#comment-14502800 ] Benedict commented on CASSANDRA-8984: - bq. I think all of us could benefit from some more of that Agreed. I've pushed a rebased version, which I hope isn't too painful to give a final +1 to. In retrospect, I realise that post-review merges would be more helpful, so that we can easily see what has been done to address the review comments. I'll try to stick to that policy in future. Introduce Transactional API for behaviours that can corrupt system state Key: CASSANDRA-8984 URL: https://issues.apache.org/jira/browse/CASSANDRA-8984 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Fix For: 2.1.5 Attachments: 8984_windows_timeout.txt As a penultimate (and probably final for 2.1, if we agree to introduce it there) round of changes to the internals managing sstable writing, I've introduced a new API called Transactional that I hope will make it much easier to write correct behaviour. As things stand we conflate a lot of behaviours into methods like close - the recent changes unpicked some of these, but didn't go far enough. My proposal here introduces an interface designed to support four actions (on top of their normal function): * prepareToCommit * commit * abort * cleanup In normal operation, once we have finished constructing a state change we call prepareToCommit; once all such state changes are prepared, we call commit. If at any point anything fails, abort is called. In _either_ case, cleanup is called at the very end. These transactional objects are all AutoCloseable, with the behaviour being to roll back any changes unless commit has completed successfully. The changes are actually less invasive than it might sound, since we did recently introduce abort in some places, as well as have commit-like methods.
This simply formalises the behaviour, and makes it consistent between all objects that interact in this way. Much of the code change is boilerplate, such as moving an object into a try-declaration, although the change is still non-trivial. What it _does_ do is eliminate a _lot_ of special casing that we have had since 2.1 was released. The data tracker API changes and compaction leftover cleanups should finish the job with making this much easier to reason about, but this change I think is worthwhile considering for 2.1, since we've just overhauled this entire area (and not released these changes), and this change is essentially just the finishing touches, so the risk is minimal and the potential gains reasonably significant. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
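The four-phase lifecycle described above (prepareToCommit / commit / abort / cleanup, with close() rolling back unless commit completed) can be sketched as follows. This is a simplified illustration of the contract, not Cassandra's actual implementation:

```java
// Sketch of the Transactional contract: AutoCloseable objects whose
// close() aborts any uncommitted state change and always runs cleanup.
public class TransactionalDemo {
    interface Transactional extends AutoCloseable {
        void prepareToCommit();
        void commit();
        void abort();
        void cleanup();
        void close(); // narrowed: no checked exception
    }

    // Records lifecycle calls so the commit and abort paths can be compared.
    static class RecordingTransactional implements Transactional {
        final StringBuilder log = new StringBuilder();
        private boolean committed;

        public void prepareToCommit() { log.append("prepare;"); }
        public void commit()          { committed = true; log.append("commit;"); }
        public void abort()           { log.append("abort;"); }
        public void cleanup()         { log.append("cleanup;"); }
        // roll back unless commit completed successfully; cleanup runs last either way
        public void close() {
            if (!committed)
                abort();
            cleanup();
        }
    }

    static String commitPath() {
        RecordingTransactional txn = new RecordingTransactional();
        try (RecordingTransactional t = txn) {
            t.prepareToCommit();
            t.commit();
        }
        return txn.log.toString(); // "prepare;commit;cleanup;"
    }

    static String abortPath() {
        RecordingTransactional txn = new RecordingTransactional();
        try (RecordingTransactional t = txn) {
            t.prepareToCommit();
            // failure before commit: close() aborts, then cleans up
        }
        return txn.log.toString(); // "prepare;abort;cleanup;"
    }
}
```

The try-with-resources declaration is what makes the rollback automatic: any exception between prepareToCommit and commit leaves the object uncommitted, so close() calls abort before cleanup.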
[jira] [Commented] (CASSANDRA-9211) Keep history of SSTable metadata
[ https://issues.apache.org/jira/browse/CASSANDRA-9211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502687#comment-14502687 ] Björn Hegerfors commented on CASSANDRA-9211: I have started with a first pass at this system table in this branch: https://github.com/Bj0rnen/cassandra/tree/Bj0rnen/SSTableMetadataHistory. I haven't tried yet, but figuring out when to do insertions to this table could be difficult. I think it should be when the SSTable first gets loaded by Cassandra, but if you restart Cassandra, I guess all SSTables will be loaded again, and then it seems unnecessary to insert them another time. But it wouldn't be too bad. If anyone can point me to a good place for writing to this table, that would be helpful! I'm not only interested in SSTables that come from compaction, since I want to know metadata about the ones that are the input to a compaction operation, not just the output. Keep history of SSTable metadata Key: CASSANDRA-9211 URL: https://issues.apache.org/jira/browse/CASSANDRA-9211 Project: Cassandra Issue Type: Wish Reporter: Björn Hegerfors Similar to the request in CASSANDRA-8078, I'm interested in SSTables' lineage. Specifically, I want to visualize the behaviors of compaction strategies based on real data. For example, for STCS I might want to generate something like this image: http://www.datastax.com/wp-content/uploads/2011/10/size-tiered-1.png. For LCS and DTCS, other properties than size are interesting. As Marcus responded in CASSANDRA-8078, there is already tracking of ancestors in the SSTable metadata. But as far as I know, the metadata gets garbage collected along with the SSTable itself. So what I propose is to persist metadata in a system table. Maybe some, maybe all metadata. Like the compaction_history table, this should have a default TTL of something like one week or just one day. But users can freely modify/remove the TTL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
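A system table along the lines Björn proposes, modeled on {{compaction_history}}, might look roughly like the sketch below. Every column name here is an illustrative guess, not the schema in the linked branch:

```sql
-- Hypothetical sketch only; modeled on system.compaction_history.
CREATE TABLE system.sstable_metadata_history (
    keyspace_name text,
    columnfamily_name text,
    generation int,
    recorded_at timestamp,
    bytes_on_disk bigint,
    min_timestamp bigint,
    max_timestamp bigint,
    ancestors set<int>,
    PRIMARY KEY ((keyspace_name, columnfamily_name), recorded_at, generation)
) WITH default_time_to_live = 604800;  -- one week; users can ALTER or remove it
```

Using {{default_time_to_live}} gives the proposed behaviour for free: rows expire after a week unless the user changes the table option.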
[jira] [Commented] (CASSANDRA-8872) fix nodetool names that reference column families
[ https://issues.apache.org/jira/browse/CASSANDRA-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502464#comment-14502464 ] Stefania commented on CASSANDRA-8872: - There is a "column family" that escaped renaming in the description of tablehistograms, at line 34 of TableHistograms.java. The rest is +1, ready for commit! fix nodetool names that reference column families - Key: CASSANDRA-8872 URL: https://issues.apache.org/jira/browse/CASSANDRA-8872 Project: Cassandra Issue Type: Improvement Reporter: Jon Haddad Assignee: Dave Brosius Priority: Trivial Fix For: 3.0 Attachments: 8872.txt Let's be consistent with our naming. We should rename cfhistograms and cfstats to tablehistograms and tablestats. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9140) Scrub should handle corrupted compressed chunks
[ https://issues.apache.org/jira/browse/CASSANDRA-9140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502604#comment-14502604 ] Stefania commented on CASSANDRA-9140: - I prepared a patch for trunk and one for 2.0, since there is an issue in {{CompressedRandomAccessReader}} that exists only on trunk, and since the loading of test compression parameters differs on trunk. Then I noticed that the merge from 2.0 to 2.1 was not straightforward either, so I saved the 2.1 patch as well; all are attached as links. The main change, other than {{testScrubCorruptedCounterRow()}} itself, is to make the scrub algorithm keep seeking to partition positions read from the index rather than giving up after the second attempt, since when a compression chunk is corrupted many partitions are lost, not just one. Also, I don't think it's right to assume that the key read from the data file is correct when it differs from the key read from the index, because in the case of corrupted data we would most likely read junk. In fact the existing test, {{testScrubCorruptedCounterRow()}}, was passing only because we would try to read beyond the file end and therefore end up with a null key due to the EOF exception. When I increased the number of partitions in the test (without compression), it started to report one empty row and one bad row, rather than just one bad row. [~thobbs], let me know if you can take this review or if we need to find someone else. Scrub should handle corrupted compressed chunks --- Key: CASSANDRA-9140 URL: https://issues.apache.org/jira/browse/CASSANDRA-9140 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Tyler Hobbs Assignee: Stefania Fix For: 2.0.15, 2.1.5 Scrub can handle corruption within a row, but can't handle corruption of a compressed sstable that results in being unable to decompress a chunk. 
Since the majority of Cassandra users are probably running with compression enabled, it's important that scrub be able to handle this (likely more common) form of sstable corruption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
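The revised scrub behaviour described above — keep seeking to the partition positions recorded in the index instead of aborting after a second failed attempt, and distrust a data-file key that disagrees with the index — can be sketched roughly as follows (hypothetical structures, not the actual Scrubber code):

```python
# Rough sketch of a scrub loop that survives a corrupted compression chunk:
# every partition position comes from the index, so after a failed read we
# seek to the next indexed position instead of giving up.
# Hypothetical structures; not Cassandra's actual Scrubber.

def scrub(index_entries, read_partition):
    """index_entries: list of (key, position); read_partition(pos) returns
    (key, rows) or raises IOError when the chunk is corrupted."""
    recovered, skipped = [], []
    for index_key, pos in index_entries:
        try:
            data_key, rows = read_partition(pos)
        except IOError:
            skipped.append(index_key)   # corrupt chunk: lose this partition only
            continue
        # When the keys disagree, prefer the index key: corrupted data is junk.
        recovered.append((index_key if data_key != index_key else data_key, rows))
    return recovered, skipped

def reader(pos):
    if 10 <= pos < 20:                  # positions inside the corrupted chunk
        raise IOError("bad chunk")
    return ("k%d" % pos, ["row"])

recovered, skipped = scrub([("k0", 0), ("k12", 12), ("k15", 15), ("k25", 25)], reader)
assert [k for k, _ in recovered] == ["k0", "k25"]
assert skipped == ["k12", "k15"]        # many partitions lost, not just one
```

The key point is the `continue`: one bad chunk maps to many bad partitions, so the loop must not treat a second failure as the end of recoverable data.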
[jira] [Created] (CASSANDRA-9212) QueryHandler to return custom payloads for testing
Adam Holmberg created CASSANDRA-9212: Summary: QueryHandler to return custom payloads for testing Key: CASSANDRA-9212 URL: https://issues.apache.org/jira/browse/CASSANDRA-9212 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Adam Holmberg Priority: Minor Fix For: 3.0 While implementing custom payloads in client libraries, it is useful to have a QueryHandler that returns custom payloads. I'm wondering if the project would be amenable to including a QueryHandler that returns any payload sent with the request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8603) Cut tombstone memory footprint in half for cql deletes
[ https://issues.apache.org/jira/browse/CASSANDRA-8603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502865#comment-14502865 ] Sylvain Lebresne commented on CASSANDRA-8603: - bq. Sylvain Lebresne could you confirm my analysis? Yes, that sounds right. Cut tombstone memory footprint in half for cql deletes -- Key: CASSANDRA-8603 URL: https://issues.apache.org/jira/browse/CASSANDRA-8603 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Dominic Letz Assignee: Dominic Letz Labels: tombstone Attachments: cassandra-2.0.11-8603.txt, cassandra-2.1-8603.txt, cassandra-2.1-8603_v2.txt, system.log As CQL does not yet support range deletes every delete from CQL results in a Semi-RangeTombstone which actually has the same start and end values - but until today they are copies. Effectively doubling the required heap memory to store the RangeTombstone. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
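The fix the ticket describes amounts to not keeping two equal copies: a single-row CQL delete produces a range tombstone whose start and end bounds are equal, so one object can serve as both. A toy illustration in Python (not the actual RangeTombstone implementation):

```python
# Toy illustration: share the bound object when start == end instead of
# storing two equal copies -- roughly halving per-tombstone overhead for
# CQL single-row deletes. Not the actual RangeTombstone implementation.

class RangeTombstone:
    __slots__ = ("start", "end", "timestamp")

    def __init__(self, start, end, timestamp):
        self.start = start
        # Reuse the same object rather than a copy when the bounds are equal.
        self.end = start if end == start else end
        self.timestamp = timestamp

bound = ["key1", "col1"]
t = RangeTombstone(bound, list(bound), 12345)   # caller passed in a copy
assert t.start is t.end                          # only one bound is retained
wide = RangeTombstone(["a"], ["z"], 12345)
assert wide.start is not wide.end                # genuine ranges keep both
```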
[jira] [Comment Edited] (CASSANDRA-6542) nodetool removenode hangs
[ https://issues.apache.org/jira/browse/CASSANDRA-6542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502397#comment-14502397 ] Study Hsueh edited comment on CASSANDRA-6542 at 4/20/15 2:03 PM: - Also observed in 2.1.3 on CentOS 6.6 The nodes status Log. Host: 192.168.1.13 $ nodetool status Datacenter: datacenter1 === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens OwnsHost ID Rack UN 192.168.1.29 22.95 GB 256 ? 506910ae-a07b-4f74-8feb-a3f2b141dea5 rack1 UN 192.168.1.28 19.68 GB 256 ? ed79b6ee-cae0-48f9-a420-338058e1f2c5 rack1 UN 192.168.1.13 25.72 GB 256 ? 595ea5ef-cecf-44c7-aa7f-424648791751 rack1 DN 192.168.1.27 ? 256 ? 2ca22f3d-f8d8-4bde-8cdc-de649056cf9c rack1 UN 192.168.1.26 20.71 GB 256 ? 3c880801-8499-4b16-bce4-2bfbc79bed43 rack1 $ nodetool removenode force 2ca22f3d-f8d8-4bde-8cdc-de649056cf9c # nodetool removenode hangs $ nodetool removenode status RemovalStatus: Removing token (-9132940871846770123). Waiting for replication confirmation from [/192.168.1.29,/192.168.1.28,/192.168.1.26]. Host: 192.168.1.28 $ nodetool status Datacenter: datacenter1 === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens OwnsHost ID Rack UN 192.168.1.29 22.96 GB 256 ? 506910ae-a07b-4f74-8feb-a3f2b141dea5 rack1 UN 192.168.1.28 19.69 GB 256 ? ed79b6ee-cae0-48f9-a420-338058e1f2c5 rack1 UN 192.168.1.13 30.43 GB 256 ? 595ea5ef-cecf-44c7-aa7f-424648791751 rack1 UN 192.168.1.26 20.72 GB 256 ? 3c880801-8499-4b16-bce4-2bfbc79bed43 rack1 $ nodetool removenode status RemovalStatus: No token removals in process. Host: 192.168.1.29 $ nodetool status Datacenter: datacenter1 === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens OwnsHost ID Rack UN 192.168.1.29 22.96 GB 256 ? 506910ae-a07b-4f74-8feb-a3f2b141dea5 rack1 UN 192.168.1.28 19.69 GB 256 ? ed79b6ee-cae0-48f9-a420-338058e1f2c5 rack1 UN 192.168.1.13 30.43 GB 256 ? 
595ea5ef-cecf-44c7-aa7f-424648791751 rack1 UN 192.168.1.26 20.72 GB 256 ? 3c880801-8499-4b16-bce4-2bfbc79bed43 rack1 $ nodetool removenode status RemovalStatus: No token removals in process. Host: 192.168.1.26 nodetool status Datacenter: datacenter1 === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens OwnsHost ID Rack UN 192.168.1.29 22.96 GB 256 ? 506910ae-a07b-4f74-8feb-a3f2b141dea5 rack1 UN 192.168.1.28 19.69 GB 256 ? ed79b6ee-cae0-48f9-a420-338058e1f2c5 rack1 UN 192.168.1.13 30.43 GB 256 ? 595ea5ef-cecf-44c7-aa7f-424648791751 rack1 UN 192.168.1.26 20.72 GB 256 ? 3c880801-8499-4b16-bce4-2bfbc79bed43 rack1 $ nodetool removenode status RemovalStatus: No token removals in process.
[jira] [Commented] (CASSANDRA-9136) Improve error handling when table is queried before the schema has fully propagated
[ https://issues.apache.org/jira/browse/CASSANDRA-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502902#comment-14502902 ] Sylvain Lebresne commented on CASSANDRA-9136: - It's not unreasonable per-se, but the fact that you have to manually pass how many bytes you've deserialized when throwing the exception makes this a bit error-prone in general imo, even though it's arguably easy enough to proof-check in this particular case (it would also make it slightly more annoying to add support for {{EncodedDataInputStream}} if we wanted to, for instance, though that's a minor point). The initial idea I had was to use something like {{BytesReadTracker}} to make the counting automatic, but I'm not married to that idea either, since it adds a small overhead in general which I don't like. Overall, I respect wanting to improve this, but I'm of the opinion that simply making the error message a lot clearer should be good enough and that it's not worth trying to be too smart in recovering. Not a strong opinion though, just a data point. Improve error handling when table is queried before the schema has fully propagated --- Key: CASSANDRA-9136 URL: https://issues.apache.org/jira/browse/CASSANDRA-9136 Project: Cassandra Issue Type: Bug Components: Core Environment: 3 Nodes GCE, N1-Standard-2, Ubuntu 12, 1 Node on 2.1.4, 2 on 2.0.14 Reporter: Russell Alexander Spitzer Assignee: Tyler Hobbs Fix For: 2.1.5 This error occurs during a rolling upgrade between 2.0.14 and 2.1.4. h3. Repro With all the nodes on 2.0.14, create the following tables {code} CREATE KEYSPACE test WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': '2' }; USE test; CREATE TABLE compact ( k int, c int, d int, PRIMARY KEY ((k), c) ) WITH COMPACT STORAGE; CREATE TABLE norm ( k int, c int, d int, PRIMARY KEY ((k), c) ) ; {code} Then load some data into these tables. 
I used the python driver {code} from cassandra.cluster import Cluster s = Cluster().connect() for x in range(1000): for y in range(1000): s.execute_async("INSERT INTO test.compact (k,c,d) VALUES (%d,%d,%d)" % (x,y,y)) s.execute_async("INSERT INTO test.norm (k,c,d) VALUES (%d,%d,%d)" % (x,y,y)) {code} Upgrade one node from 2.0.14 to 2.1.4. From the 2.1.4 node, create a new table, then query that table. On the 2.0.14 nodes you get these exceptions because the schema didn't propagate there. This exception kills the TCP connection between the nodes. {code} ERROR [Thread-19] 2015-04-08 18:48:45,337 CassandraDaemon.java (line 258) Exception in thread Thread[Thread-19,5,main] java.lang.NullPointerException at org.apache.cassandra.db.RangeSliceCommandSerializer.deserialize(RangeSliceCommand.java:247) at org.apache.cassandra.db.RangeSliceCommandSerializer.deserialize(RangeSliceCommand.java:156) at org.apache.cassandra.net.MessageIn.read(MessageIn.java:99) at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:149) at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:131) at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:74) {code} Run cqlsh on the upgraded node and queries will fail until the TCP connection is established again; easiest to repro with CL = ALL {code} cqlsh SELECT count(*) FROM test.norm where k = 22 ; ReadTimeout: code=1200 [Coordinator node timed out waiting for replica nodes' responses] message=Operation timed out - received only 1 responses. info={'received_responses': 1, 'required_responses': 2, 'consistency': 'ALL'} cqlsh SELECT count(*) FROM test.norm where k = 21 ; ReadTimeout: code=1200 [Coordinator node timed out waiting for replica nodes' responses] message=Operation timed out - received only 1 responses. 
info={'received_responses': 1, 'required_responses': 2, 'consistency': 'ALL'} {code} So connection made: {code} DEBUG [Thread-227] 2015-04-09 05:09:02,718 IncomingTcpConnection.java (line 107) Set version for /10.240.14.115 to 8 (will use 7) {code} Connection broken by query of table before schema propagated: {code} ERROR [Thread-227] 2015-04-09 05:10:24,015 CassandraDaemon.java (line 258) Exception in thread Thread[Thread-227,5,main] java.lang.NullPointerException at org.apache.cassandra.db.RangeSliceCommandSerializer.deserialize(RangeSliceCommand.java:247) at org.apache.cassandra.db.RangeSliceCommandSerializer.deserialize(RangeSliceCommand.java:156) at org.apache.cassandra.net.MessageIn.read(MessageIn.java:99) at
[jira] [Commented] (CASSANDRA-8014) NPE in Message.java line 324
[ https://issues.apache.org/jira/browse/CASSANDRA-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502921#comment-14502921 ] Peter Haggerty commented on CASSANDRA-8014: --- We just saw this again on 2.0.11 in very similar circumstances (gently shutting down cassandra with disable commands before terminating it): {code} ERROR [RPC-Thread:50] 2015-04-20 14:14:23,165 CassandraDaemon.java (line 199) Exception in thread Thread[RPC-Thread:50,5,main] java.lang.RuntimeException: java.lang.NullPointerException at com.lmax.disruptor.FatalExceptionHandler.handleEventException(FatalExceptionHandler.java:45) at com.lmax.disruptor.WorkProcessor.run(WorkProcessor.java:126) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.NullPointerException at com.thinkaurelius.thrift.Message.getInputTransport(Message.java:338) at com.thinkaurelius.thrift.Message.invoke(Message.java:308) at com.thinkaurelius.thrift.Message$Invocation.execute(Message.java:90) at com.thinkaurelius.thrift.TDisruptorServer$InvocationHandler.onEvent(TDisruptorServer.java:695) at com.thinkaurelius.thrift.TDisruptorServer$InvocationHandler.onEvent(TDisruptorServer.java:689) at com.lmax.disruptor.WorkProcessor.run(WorkProcessor.java:112) ... 3 more {code} NPE in Message.java line 324 Key: CASSANDRA-8014 URL: https://issues.apache.org/jira/browse/CASSANDRA-8014 Project: Cassandra Issue Type: Bug Components: Core Environment: Cassandra 2.0.9 Reporter: Peter Haggerty Assignee: Pavel Yaskevich Attachments: NPE_Message.java_line-324.txt We received this when a server was rebooting and attempted to shut Cassandra down while it was still quite busy. While it's normal for us to have a handful of the RejectedExecution exceptions on a sudden shutdown like this these NPEs in Message.java are new. 
The attached file include the logs from StorageServiceShutdownHook to the Logging initialized after the server restarts and Cassandra comes back up. {code}ERROR [pool-10-thread-2] 2014-09-29 08:33:44,055 Message.java (line 324) Unexpected throwable while invoking! java.lang.NullPointerException at com.thinkaurelius.thrift.util.mem.Buffer.size(Buffer.java:83) at com.thinkaurelius.thrift.util.mem.FastMemoryOutputTransport.expand(FastMemoryOutputTransport.java:84) at com.thinkaurelius.thrift.util.mem.FastMemoryOutputTransport.write(FastMemoryOutputTransport.java:167) at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:156) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:55) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at com.thinkaurelius.thrift.Message.invoke(Message.java:314) at com.thinkaurelius.thrift.Message$Invocation.execute(Message.java:90) at com.thinkaurelius.thrift.TDisruptorServer$InvocationHandler.onEvent(TDisruptorServer.java:638) at com.thinkaurelius.thrift.TDisruptorServer$InvocationHandler.onEvent(TDisruptorServer.java:632) at com.lmax.disruptor.WorkProcessor.run(WorkProcessor.java:112) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9212) QueryHandler to return custom payloads for testing
[ https://issues.apache.org/jira/browse/CASSANDRA-9212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Holmberg updated CASSANDRA-9212: - Attachment: 9212.txt Attached patch adds a custom QueryHandler that can be switched in for testing purposes. Alternatively, we could make the default QueryHandler do this, but this approach seems less intrusive. QueryHandler to return custom payloads for testing -- Key: CASSANDRA-9212 URL: https://issues.apache.org/jira/browse/CASSANDRA-9212 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Adam Holmberg Priority: Minor Fix For: 3.0 Attachments: 9212.txt While implementing custom payloads in client libraries, it is useful to have a QueryHandler that returns custom payloads. I'm wondering if the project would be amenable to including a QueryHandler that returns any payload sent with the request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
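The behaviour being proposed — a QueryHandler that simply reflects the request's custom payload back in the response so driver authors can round-trip-test it — looks roughly like this (sketched in Python rather than the Java QueryHandler interface; all names hypothetical):

```python
# Sketch of an "echo" query handler for testing custom payloads: whatever
# payload map arrives with the request is attached unchanged to the result.
# Python stand-in for the Java QueryHandler interface; names hypothetical.

class Result:
    def __init__(self, rows, custom_payload=None):
        self.rows = rows
        self.custom_payload = custom_payload

class EchoPayloadQueryHandler:
    def __init__(self, inner):
        self.inner = inner   # the real handler that actually runs the query

    def process(self, query, state, options, custom_payload):
        result = self.inner(query, state, options)
        result.custom_payload = custom_payload   # echo it back verbatim
        return result

handler = EchoPayloadQueryHandler(lambda q, s, o: Result(rows=[]))
payload = {b"trace-id": b"\x01\x02"}
resp = handler.process("SELECT * FROM t", None, None, payload)
assert resp.custom_payload == payload
```

Wrapping the default handler, as opposed to changing it, matches the "less intrusive" approach the patch takes: the echo behaviour is only present when the test handler is switched in.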
[jira] [Commented] (CASSANDRA-7168) Add repair aware consistency levels
[ https://issues.apache.org/jira/browse/CASSANDRA-7168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502981#comment-14502981 ] Sylvain Lebresne commented on CASSANDRA-7168: - bq. I'd rather just apply this as an optimization to all CL ONE, replacing the data/digest split that is almost certainly less useful. I'll admit that I find this a bit scary. This means relying on the repaired time in a way that I personally don't yet feel very comfortable with. At the very least, I think we should preserve the option to do full data queries. And as much as I understand the willingness to simplify the code, I would personally be a lot more comfortable if this lived alongside the existing mechanism at first. I can agree however that having a specific CL is weird, since this really applies to pretty much all CLs. I'd be fine with adding a flag in the native protocol to allow or disallow that feature, for instance. Add repair aware consistency levels --- Key: CASSANDRA-7168 URL: https://issues.apache.org/jira/browse/CASSANDRA-7168 Project: Cassandra Issue Type: Improvement Components: Core Reporter: T Jake Luciani Labels: performance Fix For: 3.1 With CASSANDRA-5351 and CASSANDRA-2424 I think there is an opportunity to avoid a lot of extra disk I/O when running queries with higher consistency levels. Since repaired data is by definition consistent and we know which sstables are repaired, we can optimize the read path by having a REPAIRED_QUORUM which breaks reads into two phases: 1) Read from one replica the result from the repaired sstables. 2) Read from a quorum only the un-repaired data. For the node performing 1) we can pipeline the call so it's a single hop. In the long run (assuming data is repaired regularly) we will end up with performance much closer to CL.ONE while maintaining consistency. 
Some things to figure out: - If repairs fail on some nodes we can have a situation where we don't have a consistent repaired state across the replicas. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
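The two-phase scheme in the ticket description can be modeled as: one replica answers from repaired sstables only, a quorum answers from unrepaired data only, and the coordinator merges by timestamp. A rough Python model (assumed structures, not Cassandra's actual read path):

```python
# Rough model of a "repaired quorum" read: repaired data is consistent by
# definition, so read it from a single replica; only the unrepaired portion
# needs a quorum. Assumed structures, not Cassandra's actual read path.

def repaired_quorum_read(key, replicas, quorum):
    # Phase 1: any single replica serves the repaired sstables' view.
    repaired = replicas[0].read(key, repaired_only=True)
    # Phase 2: a quorum of replicas serve only the un-repaired data.
    unrepaired = [r.read(key, repaired_only=False) for r in replicas[:quorum]]
    # Merge: newest timestamp per column wins.
    merged = dict(repaired)
    for view in unrepaired:
        for col, (value, ts) in view.items():
            if col not in merged or ts > merged[col][1]:
                merged[col] = (value, ts)
    return merged

class Replica:
    def __init__(self, repaired, unrepaired):
        self._data = {True: repaired, False: unrepaired}
    def read(self, key, repaired_only):
        return self._data[repaired_only].get(key, {})

r1 = Replica({"k": {"a": ("old", 1)}}, {"k": {"a": ("new", 5)}})
r2 = Replica({"k": {"a": ("old", 1)}}, {"k": {}})
r3 = Replica({"k": {"a": ("old", 1)}}, {"k": {"a": ("new", 5)}})
assert repaired_quorum_read("k", [r1, r2, r3], quorum=2) == {"a": ("new", 5)}
```

The failure mode the ticket flags shows up directly in this model: if replicas disagree about what is "repaired" (a repair failed on some nodes), phase 1's single-replica answer is no longer safe.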
[jira] [Updated] (CASSANDRA-8014) NPE in Message.java line 324
[ https://issues.apache.org/jira/browse/CASSANDRA-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Haggerty updated CASSANDRA-8014: -- Environment: Cassandra 2.0.9, Cassandra 2.0.11 (was: Cassandra 2.0.9) NPE in Message.java line 324 Key: CASSANDRA-8014 URL: https://issues.apache.org/jira/browse/CASSANDRA-8014 Project: Cassandra Issue Type: Bug Components: Core Environment: Cassandra 2.0.9, Cassandra 2.0.11 Reporter: Peter Haggerty Assignee: Pavel Yaskevich Attachments: NPE_Message.java_line-324.txt We received this when a server was rebooting and attempted to shut Cassandra down while it was still quite busy. While it's normal for us to have a handful of the RejectedExecution exceptions on a sudden shutdown like this these NPEs in Message.java are new. The attached file include the logs from StorageServiceShutdownHook to the Logging initialized after the server restarts and Cassandra comes back up. {code}ERROR [pool-10-thread-2] 2014-09-29 08:33:44,055 Message.java (line 324) Unexpected throwable while invoking! 
java.lang.NullPointerException at com.thinkaurelius.thrift.util.mem.Buffer.size(Buffer.java:83) at com.thinkaurelius.thrift.util.mem.FastMemoryOutputTransport.expand(FastMemoryOutputTransport.java:84) at com.thinkaurelius.thrift.util.mem.FastMemoryOutputTransport.write(FastMemoryOutputTransport.java:167) at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:156) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:55) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at com.thinkaurelius.thrift.Message.invoke(Message.java:314) at com.thinkaurelius.thrift.Message$Invocation.execute(Message.java:90) at com.thinkaurelius.thrift.TDisruptorServer$InvocationHandler.onEvent(TDisruptorServer.java:638) at com.thinkaurelius.thrift.TDisruptorServer$InvocationHandler.onEvent(TDisruptorServer.java:632) at com.lmax.disruptor.WorkProcessor.run(WorkProcessor.java:112) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8718) nodetool cleanup causes segfault
[ https://issues.apache.org/jira/browse/CASSANDRA-8718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503263#comment-14503263 ] Justin Poole commented on CASSANDRA-8718: - Sorry for late comment; but I confirm that upgrading from 2.0.12 to 2.0.13 fixes the issue. Thank you! nodetool cleanup causes segfault Key: CASSANDRA-8718 URL: https://issues.apache.org/jira/browse/CASSANDRA-8718 Project: Cassandra Issue Type: Bug Reporter: Maxim Ivanov Assignee: Joshua McKenzie Priority: Minor Fix For: 2.0.15 Attachments: java_hs_err.log When doing cleanup on C* 2.0.12 following error crashes the java process: {code} INFO 17:59:02,800 Cleaning up SSTableReader(path='/data/sdd/cassandra_prod/vdna/analytics/vdna-analytics-jb-21670-Data.db') # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0x7f750890268e, pid=28039, tid=140130222446336 # # JRE version: Java(TM) SE Runtime Environment (7.0_71-b14) (build 1.7.0_71-b14) # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.71-b01 mixed mode linux-amd64 compressed oops) # Problematic frame: # J 2655 C2 org.apache.cassandra.io.sstable.IndexSummary.binarySearch(Lorg/apache/cassandra/db/RowPosition;)I (88 bytes) @ 0x7f750890268e [0x7f7508902580+0x10e] # # Failed to write core dump. Core dumps have been disabled. 
To enable core dumping, try ulimit -c unlimited before starting Java again # # An error report file with more information is saved as: # /var/lib/cassandra_prod/hs_err_pid28039.log Compiled method (c2) 913167265 4849 org.apache.cassandra.dht.Token::maxKeyBound (24 bytes) total in heap [0x7f7508572450,0x7f7508573318] = 3784 relocation [0x7f7508572570,0x7f7508572618] = 168 main code [0x7f7508572620,0x7f7508572cc0] = 1696 stub code [0x7f7508572cc0,0x7f7508572cf8] = 56 oops [0x7f7508572cf8,0x7f7508572d90] = 152 scopes data[0x7f7508572d90,0x7f7508573118] = 904 scopes pcs [0x7f7508573118,0x7f7508573268] = 336 dependencies [0x7f7508573268,0x7f7508573280] = 24 handler table [0x7f7508573280,0x7f75085732e0] = 96 nul chk table [0x7f75085732e0,0x7f7508573318] = 56 # # If you would like to submit a bug report, please visit: # http://bugreport.sun.com/bugreport/crash.jsp # {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-8718) nodetool cleanup causes segfault
[ https://issues.apache.org/jira/browse/CASSANDRA-8718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie resolved CASSANDRA-8718. Resolution: Fixed Fix Version/s: (was: 2.0.15) 2.0.13 nodetool cleanup causes segfault Key: CASSANDRA-8718 URL: https://issues.apache.org/jira/browse/CASSANDRA-8718 Project: Cassandra Issue Type: Bug Reporter: Maxim Ivanov Assignee: Joshua McKenzie Priority: Minor Fix For: 2.0.13 Attachments: java_hs_err.log When doing cleanup on C* 2.0.12 following error crashes the java process: {code} INFO 17:59:02,800 Cleaning up SSTableReader(path='/data/sdd/cassandra_prod/vdna/analytics/vdna-analytics-jb-21670-Data.db') # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0x7f750890268e, pid=28039, tid=140130222446336 # # JRE version: Java(TM) SE Runtime Environment (7.0_71-b14) (build 1.7.0_71-b14) # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.71-b01 mixed mode linux-amd64 compressed oops) # Problematic frame: # J 2655 C2 org.apache.cassandra.io.sstable.IndexSummary.binarySearch(Lorg/apache/cassandra/db/RowPosition;)I (88 bytes) @ 0x7f750890268e [0x7f7508902580+0x10e] # # Failed to write core dump. Core dumps have been disabled. 
To enable core dumping, try ulimit -c unlimited before starting Java again # # An error report file with more information is saved as: # /var/lib/cassandra_prod/hs_err_pid28039.log Compiled method (c2) 913167265 4849 org.apache.cassandra.dht.Token::maxKeyBound (24 bytes) total in heap [0x7f7508572450,0x7f7508573318] = 3784 relocation [0x7f7508572570,0x7f7508572618] = 168 main code [0x7f7508572620,0x7f7508572cc0] = 1696 stub code [0x7f7508572cc0,0x7f7508572cf8] = 56 oops [0x7f7508572cf8,0x7f7508572d90] = 152 scopes data[0x7f7508572d90,0x7f7508573118] = 904 scopes pcs [0x7f7508573118,0x7f7508573268] = 336 dependencies [0x7f7508573268,0x7f7508573280] = 24 handler table [0x7f7508573280,0x7f75085732e0] = 96 nul chk table [0x7f75085732e0,0x7f7508573318] = 56 # # If you would like to submit a bug report, please visit: # http://bugreport.sun.com/bugreport/crash.jsp # {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503291#comment-14503291 ] Aleksey Yeschenko commented on CASSANDRA-6477: -- bq. Why not call the feature high cardinality index since that's the use case it is focused on, right? As I see it, the focus is on making denormalization trivial, not on indexing - which is why I agree with Sylvain that we should drop the KEYS-only part, and why 'index' in the name is misleading. Global indexes -- Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9146) Ever Growing Secondary Index sstables after every Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503307#comment-14503307 ] Anuj commented on CASSANDRA-9146: - Yes, we use vnodes. We haven't changed cold_reads_to_omit. Ever Growing Secondary Index sstables after every Repair Key: CASSANDRA-9146 URL: https://issues.apache.org/jira/browse/CASSANDRA-9146 Project: Cassandra Issue Type: Bug Components: Core Reporter: Anuj Attachments: sstables.txt, system-modified.log The cluster has reached a state where every repair -pr operation on the CF results in numerous tiny sstables being flushed to disk. Most sstables are related to secondary indexes. Due to thousands of sstables, reads have started timing out. Even though compaction begins for one of the secondary indexes, the sstable count after repair remains very high (thousands). Every repair adds thousands of sstables. Problems: 1. Why are bursts of tiny secondary index sstables flushed during repair? What is triggering frequent/premature flushes of secondary index sstables (more than a hundred in every burst)? At most we see ParNew GC pauses of 200ms. 2. Why is auto-compaction not compacting all sstables? Is it related to the coldness issue (CASSANDRA-8885) where compaction doesn't work even when cold_reads_to_omit=0 by default? If coldness is the issue, we are stuck in an infinite loop: reads will trigger compaction, but reads time out as the sstable count is in the thousands. 3. What's the way out if we face this issue in Prod? Is this issue fixed in the latest production release, 2.0.13? The issue looks similar to CASSANDRA-8641, but that is fixed only in 2.1.3. I think it should be fixed in the 2.0 branch too. Configuration: Compaction Strategy: STCS memtable_flush_writers=4 memtable_flush_queue_size=4 in_memory_compaction_limit_in_mb=32 concurrent_compactors=12 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9100) Gossip is inadequately tested
[ https://issues.apache.org/jira/browse/CASSANDRA-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503268#comment-14503268 ] Brandon Williams commented on CASSANDRA-9100: - Unfortunately, this is where singletons begin to bite us. We can't just spin up a bunch of gossipers in a unit test, we have to spin up a bunch of JVMs in a dtest :( Gossip is inadequately tested - Key: CASSANDRA-9100 URL: https://issues.apache.org/jira/browse/CASSANDRA-9100 Project: Cassandra Issue Type: Test Components: Core Reporter: Ariel Weisberg We found a few unit tests, but nothing that exercises Gossip under challenging conditions. Maybe consider a long test that hooks up some gossipers over a fake network and then do fault injection on that fake network. Uni-directional and bi-directional partitions, delayed delivery, out of order delivery if that is something that they can see in practice. Connects/disconnects. Also play with bad clocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
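The "fake network with fault injection" idea from the ticket is easy to express once gossipers are not singletons; a minimal sketch of a uni-directional partition over an in-memory network (hypothetical API, not Cassandra's Gossiper):

```python
# Minimal sketch of gossip over a fake in-memory network with fault
# injection: messages between partitioned (src, dst) pairs are silently
# dropped, so a node's state stops propagating in that direction.
# Hypothetical API; not Cassandra's actual Gossiper.

class FakeNetwork:
    def __init__(self):
        self.nodes = {}
        self.partitions = set()          # (src, dst) pairs that drop messages

    def register(self, node):
        self.nodes[node.name] = node

    def send(self, src, dst, state):
        if (src, dst) in self.partitions:
            return                        # uni-directional partition: drop
        self.nodes[dst].receive(state)

class Gossiper:
    def __init__(self, name, net):
        self.name, self.net = name, net
        self.state = {name: 0}           # endpoint -> version
        net.register(self)

    def tick(self):
        self.state[self.name] += 1       # heartbeat
        for peer in self.net.nodes:
            if peer != self.name:
                self.net.send(self.name, peer, dict(self.state))

    def receive(self, remote):
        for node, version in remote.items():
            self.state[node] = max(self.state.get(node, 0), version)

net = FakeNetwork()
a, b = Gossiper("a", net), Gossiper("b", net)
a.tick()
assert b.state["a"] == 1                 # b heard a's heartbeat
net.partitions.add(("a", "b"))           # now drop a -> b traffic
a.tick()
assert b.state["a"] == 1                 # partitioned: b no longer hears a
assert a.state["a"] == 2
```

Delayed or out-of-order delivery, disconnects, and clock skew could be layered onto `send` the same way; the hard part in Cassandra itself is the singleton coupling the comment describes, not the harness.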
[jira] [Commented] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503273#comment-14503273 ] Jack Krupansky commented on CASSANDRA-6477: --- Why not call the feature high cardinality index since that's the use case it is focused on, right? My personal preference would be to have a cardinality option clause with option values like low, medium, high, and unique. The default being low. A global index would be implied for high and unique cardinality. Global indexes -- Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9146) Ever Growing Secondary Index sstables after every Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503070#comment-14503070 ] Marcus Eriksson commented on CASSANDRA-9146: vnodes? have you changed cold_reads_to_omit? Ever Growing Secondary Index sstables after every Repair Key: CASSANDRA-9146 URL: https://issues.apache.org/jira/browse/CASSANDRA-9146 Project: Cassandra Issue Type: Bug Components: Core Reporter: Anuj Attachments: sstables.txt, system-modified.log The cluster has reached a state where every repair -pr operation on the CF results in numerous tiny sstables being flushed to disk. Most sstables are related to secondary indexes. Due to thousands of sstables, reads have started timing out. Even though compaction begins for one of the secondary indexes, the sstable count after repair remains very high (thousands). Every repair adds thousands of sstables. Problems: 1. Why are bursts of tiny secondary index sstables flushed during repair? What is triggering frequent/premature flushes of secondary index sstables (more than a hundred in every burst)? At most we see ParNew GC pauses of 200ms. 2. Why is auto-compaction not compacting all sstables? Is it related to the coldness issue (CASSANDRA-8885) where compaction doesn't work even when cold_reads_to_omit=0 by default? If coldness is the issue, we are stuck in an infinite loop: reads will trigger compaction, but reads time out as the sstable count is in the thousands. 3. What's the way out if we face this issue in Prod? Is this issue fixed in the latest production release, 2.0.13? The issue looks similar to CASSANDRA-8641, but that is fixed only in 2.1.3. I think it should be fixed in the 2.0 branch too. Configuration: Compaction Strategy: STCS memtable_flush_writers=4 memtable_flush_queue_size=4 in_memory_compaction_limit_in_mb=32 concurrent_compactors=12 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9193) Facility to write dynamic code to selectively trigger trace or log for queries
[ https://issues.apache.org/jira/browse/CASSANDRA-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503069#comment-14503069 ] Sylvain Lebresne commented on CASSANDRA-9193: - Hard to have an opinion without having more context on how this is working, but I do feel like we shouldn't ask users to work on serialized stuffs in general. That is, people should have the ability to filter on CQL partition key column values and the handling of composite versus non-composite partition keys should be transparent. bq. StorageProxy doesn't have the context of a CQL3 query It kind of does, through the CFMetaData (which will tell you if the partition key is composite or not). Facility to write dynamic code to selectively trigger trace or log for queries -- Key: CASSANDRA-9193 URL: https://issues.apache.org/jira/browse/CASSANDRA-9193 Project: Cassandra Issue Type: New Feature Reporter: Matt Stump I want the equivalent of dtrace for Cassandra. I want the ability to intercept a query with a dynamic script (assume JS) and based on logic in that script trigger the statement for trace or logging. Examples - Trace only INSERT statements to a particular CF. - Trace statements for a particular partition or consistency level. - Log statements that fail to reach the desired consistency for read or write. - Log If the request size for read or write exceeds some threshold At some point in the future it would be helpful to also do things such as log partitions greater than X bytes or Z cells when performing compaction. Essentially be able to inject custom code dynamically without a reboot to the different stages of C*. The code should be executed synchronously as part of the monitored task, but we should provide the ability to log or execute CQL asynchronously from the provided API. Further down the line we could use this functionality to modify/rewrite requests or tasks dynamically. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
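The dtrace-style hook the ticket asks for can be sketched as a registry of dynamically installed predicates consulted on the request path. All names here ({{QueryEvent}}, {{TraceFilters}}) are invented for illustration and are not Cassandra APIs; in the ticket's vision the predicates would come from user-supplied scripts rather than Java lambdas.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Predicate;

// Hypothetical sketch of a dynamic trace-filter registry: filters are
// installed at runtime without a reboot, and each statement is traced
// iff at least one installed filter matches it.
public class TraceFilters
{
    // Minimal view of a query: the fields a filter would match on.
    public record QueryEvent(String statementType, String columnFamily, String consistencyLevel) {}

    // Copy-on-write so install() is safe while the hot path iterates.
    private final List<Predicate<QueryEvent>> filters = new CopyOnWriteArrayList<>();

    // Install a filter at runtime (the ticket imagines these coming from JS).
    public void install(Predicate<QueryEvent> filter)
    {
        filters.add(filter);
    }

    // Called synchronously on the request path: trace iff any filter matches.
    public boolean shouldTrace(QueryEvent event)
    {
        return filters.stream().anyMatch(f -> f.test(event));
    }
}
```

For example, "trace only INSERT statements to a particular CF" becomes `install(e -> e.statementType().equals("INSERT") && e.columnFamily().equals("users"))`.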
[jira] [Commented] (CASSANDRA-9213) Compaction errors observed during heavy write load: BAD RELEASE
[ https://issues.apache.org/jira/browse/CASSANDRA-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503079#comment-14503079 ] Philip Thompson commented on CASSANDRA-9213: /cc [~krummas] Compaction errors observed during heavy write load: BAD RELEASE --- Key: CASSANDRA-9213 URL: https://issues.apache.org/jira/browse/CASSANDRA-9213 Project: Cassandra Issue Type: Bug Environment: Cassandra 2.1.4.374 Ubuntu 14.04.2 java version 1.7.0_45 10-node cluster, RF = 3 Reporter: Rocco Varela Fix For: 2.1.5 Attachments: COMPACTION-ERR.log During heavy write load testing we're seeing occasional compaction errors with the following error message:
{code}
ERROR [CompactionExecutor:40] 2015-04-16 17:01:16,936 Ref.java:170 - BAD RELEASE: attempted to release a reference (org.apache.cassandra.utils.concurrent.Ref$State@31d969bd) that has already been released
...
ERROR [CompactionExecutor:40] 2015-04-16 17:01:22,190 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:40,1,main]
java.lang.AssertionError: null
    at org.apache.cassandra.io.sstable.SSTableReader.markObsolete(SSTableReader.java:1699) ~[cassandra-all-2.1.4.374.jar:2.1.4.374]
    at org.apache.cassandra.db.DataTracker.unmarkCompacting(DataTracker.java:240) ~[cassandra-all-2.1.4.374.jar:2.1.4.374]
    at org.apache.cassandra.io.sstable.SSTableRewriter.replaceWithFinishedReaders(SSTableRewriter.java:495) ~[cassandra-all-2.1.4.374.jar:2.1.4.374]
    at ...
{code}
I have turned on debugrefcount in bin/cassandra:launch_service() and I will repost another stack trace when it happens again.
{code}
cassandra_parms="$cassandra_parms -Dcassandra.debugrefcount=true"
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9193) Facility to write dynamic code to selectively trigger trace or log for queries
[ https://issues.apache.org/jira/browse/CASSANDRA-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503111#comment-14503111 ] Albert P Tobey commented on CASSANDRA-9193: --- Maybe just steal this? https://github.com/datastax/nodejs-driver/blob/master/lib/tokenizer.js Facility to write dynamic code to selectively trigger trace or log for queries -- Key: CASSANDRA-9193 URL: https://issues.apache.org/jira/browse/CASSANDRA-9193 Project: Cassandra Issue Type: New Feature Reporter: Matt Stump I want the equivalent of dtrace for Cassandra. I want the ability to intercept a query with a dynamic script (assume JS) and based on logic in that script trigger the statement for trace or logging. Examples - Trace only INSERT statements to a particular CF. - Trace statements for a particular partition or consistency level. - Log statements that fail to reach the desired consistency for read or write. - Log If the request size for read or write exceeds some threshold At some point in the future it would be helpful to also do things such as log partitions greater than X bytes or Z cells when performing compaction. Essentially be able to inject custom code dynamically without a reboot to the different stages of C*. The code should be executed synchronously as part of the monitored task, but we should provide the ability to log or execute CQL asynchronously from the provided API. Further down the line we could use this functionality to modify/rewrite requests or tasks dynamically. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503114#comment-14503114 ] Jonathan Ellis commented on CASSANDRA-6477: --- Materialized views is a much broader feature. I think users will be disappointed with the limitations if we try to call it that. (And DynamoDB has been using the GI term for a year and a half now, I'm not in favor of forking terminology without a clearly superior alternative.) Global indexes -- Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7523) add date and time types
[ https://issues.apache.org/jira/browse/CASSANDRA-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503118#comment-14503118 ] Joshua McKenzie commented on CASSANDRA-7523: That was perhaps poorly-phrased on my part. I was referring to documentation in the protocol spec doc in v4, and it turns out I missed adding the serialization format information in there (added option id's). Created CASSANDRA-9214 to track that. add date and time types --- Key: CASSANDRA-7523 URL: https://issues.apache.org/jira/browse/CASSANDRA-7523 Project: Cassandra Issue Type: New Feature Components: API Reporter: Jonathan Ellis Assignee: Joshua McKenzie Priority: Minor Labels: client-impacting, docs Fix For: 2.1.5 http://www.postgresql.org/docs/9.1/static/datatype-datetime.html (we already have timestamp; interval is out of scope for now, and see CASSANDRA-6350 for discussion on timestamp-with-time-zone. but date/time should be pretty easy to add.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9100) Gossip is inadequately tested
[ https://issues.apache.org/jira/browse/CASSANDRA-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ariel Weisberg updated CASSANDRA-9100: -- Description: We found a few unit tests, but nothing that exercises Gossip under challenging conditions. Maybe consider a long test that hooks up some gossipers over a fake network and then do fault injection on that fake network. Uni-directional and bi-directional partitions, delayed delivery, out of order delivery if that is something that they can see in practice. Connects/disconnects. Also play with bad clocks. was:We found a few unit tests, but nothing that exercises Gossip under challenging conditions. Maybe consider a long test that hooks up some gossipers over a fake network and then do fault injection on that fake network. Uni-directional and bi-directional partitions, delayed delivery, out of order delivery if that is something that they can see in practice. Connects/disconnects. Gossip is inadequately tested - Key: CASSANDRA-9100 URL: https://issues.apache.org/jira/browse/CASSANDRA-9100 Project: Cassandra Issue Type: Test Components: Core Reporter: Ariel Weisberg We found a few unit tests, but nothing that exercises Gossip under challenging conditions. Maybe consider a long test that hooks up some gossipers over a fake network and then do fault injection on that fake network. Uni-directional and bi-directional partitions, delayed delivery, out of order delivery if that is something that they can see in practice. Connects/disconnects. Also play with bad clocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9198) Deleting from an empty list produces an error
[ https://issues.apache.org/jira/browse/CASSANDRA-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-9198: -- Attachment: (was: 9198.txt) Deleting from an empty list produces an error - Key: CASSANDRA-9198 URL: https://issues.apache.org/jira/browse/CASSANDRA-9198 Project: Cassandra Issue Type: Bug Components: API Reporter: Olivier Michallat Assignee: Benjamin Lerer Priority: Minor Fix For: 3.0 While deleting an element from a list that does not contain it is a no-op, deleting it from an empty list causes an error. This edge case is a bit inconsistent, because it makes list deletion non-idempotent:
{code}
cqlsh:test> create table foo (k int primary key, v list<int>);
cqlsh:test> insert into foo(k,v) values (1, [1,2]);
cqlsh:test> update foo set v = v - [1] where k = 1;
cqlsh:test> update foo set v = v - [1] where k = 1;
cqlsh:test> update foo set v = v - [2] where k = 1;
cqlsh:test> update foo set v = v - [2] where k = 1;
InvalidRequest: code=2200 [Invalid query] message="Attempted to delete an element from a list which is null"
{code}
With speculative retries coming to the drivers, idempotency becomes more important because it determines which query we might retry or not. So it would be better if deleting from an empty list succeeded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503178#comment-14503178 ] Sylvain Lebresne commented on CASSANDRA-6477: - As a data point, my own preference would be to call them denormalized views. As my preference would go to never do the only keys, I can agree that calling them index is a bit misleading. But I can also agree that materialized views might confuse SQL users that might expect more from them than what we'll offer. So granted denormalized views doesn't reference an existing term but I see that as a feature. Plus denormalization is really what we're doing there so ... Global indexes -- Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503184#comment-14503184 ] Aleksey Yeschenko commented on CASSANDRA-6477: -- bq. As a data point, my own preference would be to call them denormalized views. As my preference would go to never do the only keys, I can agree that calling them index is a bit misleading. But I can also agree that materialized views might confuse SQL users that might expect more from them than what we'll offer. So granted denormalized views doesn't reference an existing term but I see that as a feature. Plus denormalization is really what we're doing there so ... Fair enough. I'd +1 that. Global indexes -- Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503161#comment-14503161 ] Aleksey Yeschenko commented on CASSANDRA-6477: -- I see it mostly used for the automated denormalization, tbh, and not for the KEYS part, so I'd argue that materialized views is a more fitting name. Plus, with the upcoming SASI changes, there would be a huge gap between what local indexes/materialized views offer, in terms of expressiveness, and calling both 'indexes' would bring on more confusion in the long term. Also, I personally couldn't care less about the names used by DDB. I'd rather stick closer to what SQL has, since people coming from SQL world, and not people coming from DynamoDB, are our target audience. Global indexes -- Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9212) QueryHandler to return custom payloads for testing
[ https://issues.apache.org/jira/browse/CASSANDRA-9212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9212: -- Reviewer: Sam Tunnicliffe [~beobal] to review QueryHandler to return custom payloads for testing -- Key: CASSANDRA-9212 URL: https://issues.apache.org/jira/browse/CASSANDRA-9212 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Adam Holmberg Priority: Minor Fix For: 3.0 Attachments: 9212.txt While implementing custom payloads in client libraries, it is useful to have a QueryHandler that returns custom payloads. I'm wondering if the project would be amenable to including a QueryHandler that returns any payload sent with the request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503171#comment-14503171 ] Jonathan Ellis commented on CASSANDRA-6477: --- bq. calling both 'indexes' would bring on more confusion in the long term Disagreed. RDBMS users have lived with hash indexes vs btree indexes vs bitmap indexes with different abilities for a long time. (Notably hash indexes have exactly the limitations of global indexes.) MV otoh has a completely different feature set that GI doesn't even start to offer. bq. Also, I personally couldn't care less about the names used by DDB. I'd rather stick closer to what SQL has, since people coming from SQL world, and not people coming from DynamoDB, are our target audience. My point is that whether you come from SQL or NoSQL, MV is the wrong choice. Global indexes -- Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503173#comment-14503173 ] Jonathan Ellis commented on CASSANDRA-6477: --- Here is the difference: the definition of MV is to materialize *an arbitrary query*. That is not what we offer here. Global indexes -- Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-9178) Test exposed JMX methods
[ https://issues.apache.org/jira/browse/CASSANDRA-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Kumar reassigned CASSANDRA-9178: -- Assignee: Shawn Kumar Test exposed JMX methods Key: CASSANDRA-9178 URL: https://issues.apache.org/jira/browse/CASSANDRA-9178 Project: Cassandra Issue Type: Test Reporter: Carl Yeksigian Assignee: Shawn Kumar [~thobbs] added support for JMX testing in dtests, and we have seen issues related to nodetool testing in various different stages of execution. Tests which exercise the different methods which nodetool calls should be added to catch those issues early. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7523) add date and time types
[ https://issues.apache.org/jira/browse/CASSANDRA-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503206#comment-14503206 ] Aleksey Yeschenko commented on CASSANDRA-7523: -- While this is not committed-committed to an existing version yet, can we make the new types non-emptiable by default? See the last two CASSANDRA-8951 comments. add date and time types --- Key: CASSANDRA-7523 URL: https://issues.apache.org/jira/browse/CASSANDRA-7523 Project: Cassandra Issue Type: New Feature Components: API Reporter: Jonathan Ellis Assignee: Joshua McKenzie Priority: Minor Labels: client-impacting, docs Fix For: 2.1.5 http://www.postgresql.org/docs/9.1/static/datatype-datetime.html (we already have timestamp; interval is out of scope for now, and see CASSANDRA-6350 for discussion on timestamp-with-time-zone. but date/time should be pretty easy to add.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9197) Startup slowdown due to preloading jemalloc
[ https://issues.apache.org/jira/browse/CASSANDRA-9197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503229#comment-14503229 ] Philip Thompson commented on CASSANDRA-9197: [~snazy], as discussed on IRC. I only notice this problem at the magnitude Brandon or Sylvain are mentioning on Linux, not OS X. On Linux, installing jemalloc and then hardcoding the path as you stated does improve the startup time. Startup slowdown due to preloading jemalloc --- Key: CASSANDRA-9197 URL: https://issues.apache.org/jira/browse/CASSANDRA-9197 Project: Cassandra Issue Type: Bug Reporter: Sylvain Lebresne Assignee: Philip Thompson Priority: Minor On my box, it seems that the jemalloc loading from CASSANDRA-8714 made the process take ~10 seconds to even start (I have no explication for it). I don't know if it's specific to my machine or not, so that ticket is mainly so someone else can check if it sees the same, in particular for jenkins. If it does sees the same slowness, we might want to at least disable jemalloc for dtests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503248#comment-14503248 ] Jeremiah Jordan commented on CASSANDRA-6477: My main issue is that Global Index means nothing to me, and I work in this space. So I think using the word materialized or denormalized in the name would do a better job of making clear what they are. I don't really care what other words are used in the name ;). That being said, I don't see a problem with VIEW. While I agree that a VIEW in the SQL world has more utility, so does SELECT. Seems to me we give you a subset of VIEW, just like we give you a subset of SELECT. I don't think it is confusing to not be able to materialize an arbitrary query that has joins and stuff in it, as we don't let you do that stuff in normal queries. And if we flesh this out over time with functions, composites, partial, etc, you get closer and closer to what you can do with a traditional VIEW. Global indexes -- Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (CASSANDRA-7410) Pig support for BulkOutputFormat as a parameter in url
[ https://issues.apache.org/jira/browse/CASSANDRA-7410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Kołaczkowski updated CASSANDRA-7410: -- Comment: was deleted (was: Patch doesn't apply to 2.0 branch: {noformat} pkolaczk@m4600 ~/Projekty/DataStax/cassandra $ git fetch pkolaczk@m4600 ~/Projekty/DataStax/cassandra $ git checkout cassandra-2.0 Already on 'cassandra-2.0' Your branch is up-to-date with 'origin/cassandra-2.0'. pkolaczk@m4600 ~/Projekty/DataStax/cassandra $ git apply 7410-v3-2.0-branch.txt 7410-v3-2.0-branch.txt:195: trailing whitespace. [columns=columns][where_clause=where_clause] + error: patch failed: src/java/org/apache/cassandra/io/sstable/CQLSSTableWriter.java:345 error: src/java/org/apache/cassandra/io/sstable/CQLSSTableWriter.java: patch does not apply {noformat} ) Pig support for BulkOutputFormat as a parameter in url -- Key: CASSANDRA-7410 URL: https://issues.apache.org/jira/browse/CASSANDRA-7410 Project: Cassandra Issue Type: Improvement Components: Hadoop Reporter: Alex Liu Assignee: Alex Liu Priority: Minor Fix For: 2.0.15 Attachments: 7410-2.0-branch.txt, 7410-2.1-branch.txt, 7410-v2-2.0-branch.txt, 7410-v3-2.0-branch.txt, CASSANDRA-7410-v2-2.1-branch.txt, CASSANDRA-7410-v3-2.1-branch.txt, CASSANDRA-7410-v4-2.0-branch.txt Add BulkOutputFormat support in Pig url -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9198) Deleting from an empty list produces an error
[ https://issues.apache.org/jira/browse/CASSANDRA-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-9198: -- Attachment: 9198.txt Updated patch, allows idempotent deletes and corrects unit tests. Deleting from an empty list produces an error - Key: CASSANDRA-9198 URL: https://issues.apache.org/jira/browse/CASSANDRA-9198 Project: Cassandra Issue Type: Bug Components: API Reporter: Olivier Michallat Assignee: Benjamin Lerer Priority: Minor Fix For: 3.0 While deleting an element from a list that does not contain it is a no-op, deleting it from an empty list causes an error. This edge case is a bit inconsistent, because it makes list deletion non-idempotent:
{code}
cqlsh:test> create table foo (k int primary key, v list<int>);
cqlsh:test> insert into foo(k,v) values (1, [1,2]);
cqlsh:test> update foo set v = v - [1] where k = 1;
cqlsh:test> update foo set v = v - [1] where k = 1;
cqlsh:test> update foo set v = v - [2] where k = 1;
cqlsh:test> update foo set v = v - [2] where k = 1;
InvalidRequest: code=2200 [Invalid query] message="Attempted to delete an element from a list which is null"
{code}
With speculative retries coming to the drivers, idempotency becomes more important because it determines which query we might retry or not. So it would be better if deleting from an empty list succeeded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
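The semantics the patch aims for can be sketched in plain Java (this is an illustration of the intended behavior, not Cassandra's actual list-cell code): removing an element from a null or empty list should simply be a no-op, so that {{v = v - [x]}} is safe to retry.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch (not Cassandra's implementation) of idempotent list-element delete:
// "v = v - [x]" on a null or empty list is a no-op instead of an error.
public class IdempotentListDelete
{
    // Returns the list with all occurrences of 'element' removed.
    // A null (never-written or fully-deleted) list stays null; no exception.
    public static List<Integer> subtract(List<Integer> list, int element)
    {
        if (list == null || list.isEmpty())
            return list; // no-op: nothing to delete
        List<Integer> result = new ArrayList<>(list);
        result.removeIf(e -> e == element);
        return result;
    }
}
```

With this behavior, issuing the same delete twice yields the same final state, which is exactly what speculative-retrying drivers need.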
[jira] [Commented] (CASSANDRA-7168) Add repair aware consistency levels
[ https://issues.apache.org/jira/browse/CASSANDRA-7168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503164#comment-14503164 ] Jonathan Ellis commented on CASSANDRA-7168: --- What makes you uncomfortable about relying on repair time? What would make you more comfortable? Add repair aware consistency levels --- Key: CASSANDRA-7168 URL: https://issues.apache.org/jira/browse/CASSANDRA-7168 Project: Cassandra Issue Type: Improvement Components: Core Reporter: T Jake Luciani Labels: performance Fix For: 3.1 With CASSANDRA-5351 and CASSANDRA-2424 I think there is an opportunity to avoid a lot of extra disk I/O when running queries with higher consistency levels. Since repaired data is by definition consistent and we know which sstables are repaired, we can optimize the read path by having a REPAIRED_QUORUM which breaks reads into two phases: 1) Read from one replica the result from the repaired sstables. 2) Read from a quorum only the un-repaired data. For the node performing 1) we can pipeline the call so it's a single hop. In the long run (assuming data is repaired regularly) we will end up with much closer to CL.ONE performance while maintaining consistency. Some things to figure out: - If repairs fail on some nodes we can have a situation where we don't have a consistent repaired state across the replicas. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
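The two-phase read the ticket proposes can be sketched as a merge step (invented names, not Cassandra code): repaired data is consistent by definition, so it is read from a single replica, while only the un-repaired data needs a quorum; the two partial results are then reconciled with newest-write-wins.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the REPAIRED_QUORUM reconciliation step:
// phase 1 reads repaired sstables from one replica, phase 2 reads
// un-repaired data from a quorum, and the results are merged by timestamp.
public class RepairedQuorumRead
{
    public record Cell(String value, long timestamp) {}

    // Merge repaired (one replica) and un-repaired (quorum) partial results.
    public static Map<String, Cell> merge(Map<String, Cell> repaired, Map<String, Cell> unrepaired)
    {
        Map<String, Cell> result = new HashMap<>(repaired);
        // An un-repaired cell overrides only when it carries a newer timestamp.
        unrepaired.forEach((key, cell) -> result.merge(key, cell,
                (existing, incoming) -> existing.timestamp() >= incoming.timestamp() ? existing : incoming));
        return result;
    }
}
```

Since un-repaired data is by construction at least as recent as repaired data, this merge usually reduces to "un-repaired wins where present", which is why phase 1 can be a single hop.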
[jira] [Resolved] (CASSANDRA-6335) Hints broken for nodes that change broadcast address
[ https://issues.apache.org/jira/browse/CASSANDRA-6335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Kumar resolved CASSANDRA-6335. Resolution: Cannot Reproduce Hints broken for nodes that change broadcast address Key: CASSANDRA-6335 URL: https://issues.apache.org/jira/browse/CASSANDRA-6335 Project: Cassandra Issue Type: Bug Components: Core Reporter: Rick Branson Assignee: Shawn Kumar When a node changes it's broadcast address, the transition process works properly, but hints that are destined for it can't be delivered because of the address change. It produces an exception: java.lang.AssertionError: Missing host ID for 10.1.60.22 at org.apache.cassandra.service.StorageProxy.writeHintForMutation(StorageProxy.java:598) at org.apache.cassandra.service.StorageProxy$5.runMayThrow(StorageProxy.java:567) at org.apache.cassandra.service.StorageProxy$HintRunnable.run(StorageProxy.java:1679) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503182#comment-14503182 ] Jonathan Ellis commented on CASSANDRA-6477: --- Denormalized vs materialized is just using an unusual synonym. The problematic term is the view part. (Call them materialized indexes or denormalized indexes if you like, but don't call them views.) Global indexes -- Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7410) Pig support for BulkOutputFormat as a parameter in url
[ https://issues.apache.org/jira/browse/CASSANDRA-7410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503183#comment-14503183 ] Piotr Kołaczkowski commented on CASSANDRA-7410: --- org/apache/cassandra/hadoop/pig/CqlNativeStorage.java:342
{noformat}
private boolean serverEntryped()
{
    if (!StringUtils.isEmpty(internodeEncrypt))
        return InternodeEncryption.none != InternodeEncryption.valueOf(internodeEncrypt.toLowerCase());
    return false;
}
{noformat}
Typo: serverEntryped → serverEncrypted. Also, the if can be slightly simplified:
{noformat}
private boolean serverEntryped()
{
    return !StringUtils.isEmpty(internodeEncrypt)
        && InternodeEncryption.none != InternodeEncryption.valueOf(internodeEncrypt.toLowerCase());
}
{noformat}
Other than this, it looks good. +1 Pig support for BulkOutputFormat as a parameter in url -- Key: CASSANDRA-7410 URL: https://issues.apache.org/jira/browse/CASSANDRA-7410 Project: Cassandra Issue Type: Improvement Components: Hadoop Reporter: Alex Liu Assignee: Alex Liu Priority: Minor Fix For: 2.0.15 Attachments: 7410-2.0-branch.txt, 7410-2.1-branch.txt, 7410-v2-2.0-branch.txt, 7410-v3-2.0-branch.txt, CASSANDRA-7410-v2-2.1-branch.txt, CASSANDRA-7410-v3-2.1-branch.txt, CASSANDRA-7410-v4-2.0-branch.txt Add BulkOutputFormat support in Pig url -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7523) add date and time types
[ https://issues.apache.org/jira/browse/CASSANDRA-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503202#comment-14503202 ] Sylvain Lebresne commented on CASSANDRA-7523: - The thing is, I don't care too much about the spec documentation itself (it's important of course, but it's easily fixed). The problem is that if someone uses one of those new data types with an existing client, Cassandra will currently happily return the new codes, which clients have no reason to know about and may therefore crash in unexpected ways. *That* is a problem, and that's what I meant by "Adding new codes to v3 would confuse drivers". So what I mean is that we should special case those 2 types in {{DataType}} (maybe in {{toType}}, maybe directly in the serialization) so that when the protocol version is <= 3, it writes the types as {{CUSTOM}} ones. As far as Cassandra 2.1 is concerned, we could even go as far as removing the types from {{DataType}} and directly using {{DataType.CUSTOM}} unconditionally, since there won't be support for protocol v4 anyway. add date and time types --- Key: CASSANDRA-7523 URL: https://issues.apache.org/jira/browse/CASSANDRA-7523 Project: Cassandra Issue Type: New Feature Components: API Reporter: Jonathan Ellis Assignee: Joshua McKenzie Priority: Minor Labels: client-impacting, docs Fix For: 2.1.5 http://www.postgresql.org/docs/9.1/static/datatype-datetime.html (we already have timestamp; interval is out of scope for now, and see CASSANDRA-6350 for discussion on timestamp-with-time-zone. but date/time should be pretty easy to add.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
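The version gating Sylvain describes amounts to choosing the wire-level option id based on the negotiated protocol version. A minimal sketch (invented class and constants for illustration, not the real {{DataType}} code; the concrete option-id values are assumptions):

```java
// Sketch of version-gated type serialization: clients on protocol v3 or
// earlier must never see the new option ids, so the new types fall back
// to CUSTOM (serialized with their class name) for those connections.
public class TypeCodec
{
    static final int CUSTOM = 0x0000;
    static final int DATE   = 0x0011; // assumed ids for the new v4 types
    static final int TIME   = 0x0012;

    // Pick the option id to put on the wire for a given native type.
    public static int optionIdFor(int typeId, int protocolVersion)
    {
        boolean newType = typeId == DATE || typeId == TIME;
        // Older clients have no reason to know the new codes and may crash,
        // so special-case them to CUSTOM when protocolVersion <= 3.
        return (newType && protocolVersion <= 3) ? CUSTOM : typeId;
    }
}
```

Pre-existing type ids pass through unchanged on every protocol version; only the new types are rewritten, and only for old clients.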
[jira] [Commented] (CASSANDRA-7523) add date and time types
[ https://issues.apache.org/jira/browse/CASSANDRA-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503207#comment-14503207 ] Jonathan Ellis commented on CASSANDRA-7523: --- Should we just keep it to 3.0 since adding it to 2.1 wasn't as harmless as it looked? add date and time types --- Key: CASSANDRA-7523 URL: https://issues.apache.org/jira/browse/CASSANDRA-7523 Project: Cassandra Issue Type: New Feature Components: API Reporter: Jonathan Ellis Assignee: Joshua McKenzie Priority: Minor Labels: client-impacting, docs Fix For: 2.1.5 http://www.postgresql.org/docs/9.1/static/datatype-datetime.html (we already have timestamp; interval is out of scope for now, and see CASSANDRA-6350 for discussion on timestamp-with-time-zone. but date/time should be pretty easy to add.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9194) Delete-only workloads crash Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503090#comment-14503090 ] Jim Witschey commented on CASSANDRA-9194: - I'm afraid I don't follow. bq. This is already behaving properly in 2.1 [The results of the dtest|https://github.com/riptano/cassandra-dtest/blob/master/deletion_test.py#L44] show that the tracked memtable size is 0 in 2.1 after performing 100 deletions -- {{MemtableLiveDataSize}} is reported as 0 over JMX even when {{MemtableColumnsCount}} is 100. Is that behavior correct? I may not have been clear, but that test fails on all released 2.0 and 2.1 versions. Also, I don't understand why the amount of memory to track for tombstones is arbitrary in 2.0. Delete-only workloads crash Cassandra - Key: CASSANDRA-9194 URL: https://issues.apache.org/jira/browse/CASSANDRA-9194 Project: Cassandra Issue Type: Bug Environment: 2.0.14 Reporter: Robert Wille Assignee: Benedict Fix For: 2.0.15 Attachments: 9194.txt The size of a tombstone is not properly accounted for in the memtable. A memtable which has only tombstones will never get flushed. It will grow until the JVM runs out of memory. The following program easily demonstrates the problem.
{code}
Cluster.Builder builder = Cluster.builder();
Cluster c = builder.addContactPoints("cas121.devf3.com").build();
Session s = c.connect();
s.execute("CREATE KEYSPACE IF NOT EXISTS test WITH replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }");
s.execute("CREATE TABLE IF NOT EXISTS test.test(id INT PRIMARY KEY)");
PreparedStatement stmt = s.prepare("DELETE FROM test.test WHERE id = :id");
int id = 0;
while (true)
{
    s.execute(stmt.bind(id));
    id++;
}
{code}
This program should run forever, but eventually Cassandra runs out of heap and craps out. You needn't wait for Cassandra to crash. If you run nodetool cfstats test.test while it is running, you'll see Memtable cell count grow, but Memtable data size will remain 0.
This issue was fixed once before. I received a patch for version 2.0.5 (I believe), which contained the fix, but the fix has apparently been lost, because it is clearly broken, and I don't see the fix in the change logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9198) Deleting from an empty list produces an error
[ https://issues.apache.org/jira/browse/CASSANDRA-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503146#comment-14503146 ] Jeff Jirsa edited comment on CASSANDRA-9198 at 4/20/15 4:37 PM: Attaching patch that was previously option #2 on CASSANDRA-9077. Addresses this without modification (though I'll also attach an update shortly that ALSO addresses the now-failing unit tests that expected the older behavior)
{noformat}
cqlsh> create keyspace IF NOT EXISTS test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> USE test;
cqlsh:test> create table foo (k int primary key, v list<int>);
cqlsh:test> insert into foo(k,v) values (1, [1,2]);
cqlsh:test> update foo set v = v - [1] where k = 1;
cqlsh:test> update foo set v = v - [1] where k = 1;
cqlsh:test> update foo set v = v - [2] where k = 1;
cqlsh:test> update foo set v = v - [2] where k = 1;
cqlsh:test>
cqlsh:test> select * from foo;

 k | v
---+------
 1 | null

(1 rows)
{noformat}
was (Author: jjirsa): Attaching patch that was previously option #2 on CASSANDRA-9077.
Addresses this without modification:
{noformat}
cqlsh> create keyspace IF NOT EXISTS test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> USE test;
cqlsh:test> create table foo (k int primary key, v list<int>);
cqlsh:test> insert into foo(k,v) values (1, [1,2]);
cqlsh:test> update foo set v = v - [1] where k = 1;
cqlsh:test> update foo set v = v - [1] where k = 1;
cqlsh:test> update foo set v = v - [2] where k = 1;
cqlsh:test> update foo set v = v - [2] where k = 1;
cqlsh:test>
cqlsh:test> select * from foo;

 k | v
---+------
 1 | null

(1 rows)
{noformat}
Deleting from an empty list produces an error - Key: CASSANDRA-9198 URL: https://issues.apache.org/jira/browse/CASSANDRA-9198 Project: Cassandra Issue Type: Bug Components: API Reporter: Olivier Michallat Assignee: Benjamin Lerer Priority: Minor Fix For: 3.0 Attachments: 9198.txt While deleting an element from a list that does not contain it is a no-op, deleting it from an empty list causes an error. This edge case is a bit inconsistent, because it makes list deletion non idempotent:
{code}
cqlsh:test> create table foo (k int primary key, v list<int>);
cqlsh:test> insert into foo(k,v) values (1, [1,2]);
cqlsh:test> update foo set v = v - [1] where k = 1;
cqlsh:test> update foo set v = v - [1] where k = 1;
cqlsh:test> update foo set v = v - [2] where k = 1;
cqlsh:test> update foo set v = v - [2] where k = 1;
InvalidRequest: code=2200 [Invalid query] message="Attempted to delete an element from a list which is null"
{code}
With speculative retries coming to the drivers, idempotency becomes more important because it determines which query we might retry or not. So it would be better if deleting from an empty list succeeded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9193) Facility to write dynamic code to selectively trigger trace or log for queries
[ https://issues.apache.org/jira/browse/CASSANDRA-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503087#comment-14503087 ] Matt Stump commented on CASSANDRA-9193: --- It's actually pretty simple; the Nashorn API makes this easy. For context, here is a little snippet that I had running in {{StorageProxy.read}}. I'm evaluating a script, then executing a named function, passing parameters to the JS function. The idea would be to pull a script from a CF for the injection point, eval it, and when we reach that point in the code, invoke a known function name adhering to an interface. We would pass the request parameters. The JS would have full visibility into the internals of C*. Each injection point would have a different context (scope) so that we don't have to worry about namespace collision between scripts.
{code}
String script = ""
    + "var printReadCommands = function(readCommands, consistency, state) {"
    + "    for each (var rc in readCommands) {"
    + "        print(\"rc class definition: \" + Object.prototype.toString.call(rc) + \"\\n\");"
    + "        print(\"KEYSPACE \" + rc.ksName);"
    + "        print(\"  CF \" + rc.cfName);"
    + "        print(\"  KEY \" + rc.key);"
    + "        print(\"\\n\");"
    + "    }"
    + "};";
jsEngine.eval(script);
Invocable invocable = (Invocable) jsEngine;
Object result = invocable.invokeFunction("printReadCommands", commands, consistencyLevel, state);
{code}
Facility to write dynamic code to selectively trigger trace or log for queries -- Key: CASSANDRA-9193 URL: https://issues.apache.org/jira/browse/CASSANDRA-9193 Project: Cassandra Issue Type: New Feature Reporter: Matt Stump I want the equivalent of dtrace for Cassandra. I want the ability to intercept a query with a dynamic script (assume JS) and based on logic in that script trigger the statement for trace or logging. Examples - Trace only INSERT statements to a particular CF. - Trace statements for a particular partition or consistency level. - Log statements that fail to reach the desired consistency for read or write.
- Log if the request size for read or write exceeds some threshold. At some point in the future it would be helpful to also do things such as log partitions greater than X bytes or Z cells when performing compaction. Essentially, be able to inject custom code dynamically, without a reboot, into the different stages of C*. The code should be executed synchronously as part of the monitored task, but we should provide the ability to log or execute CQL asynchronously from the provided API. Further down the line we could use this functionality to modify/rewrite requests or tasks dynamically. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8729) Commitlog causes read before write when overwriting
[ https://issues.apache.org/jira/browse/CASSANDRA-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503113#comment-14503113 ] Ariel Weisberg commented on CASSANDRA-8729: --- There is a workaround, I think. [~tjake] said he changed the size of the commit log to 0 and that caused it to not retain any segments. Commitlog causes read before write when overwriting --- Key: CASSANDRA-8729 URL: https://issues.apache.org/jira/browse/CASSANDRA-8729 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ariel Weisberg Labels: commitlog Fix For: 3.0 The memory mapped commit log implementation writes directly to the page cache. If a page is not in the cache the kernel will read it in even though we are going to overwrite it. The way to avoid this is to write to private memory, and then pad the write with 0s at the end so it is page (4k) aligned before writing to a file. The commit log would benefit from being refactored into something that looks more like a pipeline, with incoming requests receiving private memory to write in, completed buffers being submitted to a parallelized compression/checksum step, followed by submission to another thread for writing to a file in a way that preserves the order. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
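The padding trick works because a write that covers a whole page lets the kernel skip faulting the old page contents in before the write completes. A minimal sketch of the alignment arithmetic only — the 4k page size is assumed, and this is not the actual commit log code:

```java
public class PageAlign {
    static final int PAGE_SIZE = 4096; // typical page size; real code should query it

    // Round a write length up to the next page boundary. The caller zero-fills
    // the tail so every write covers whole pages, avoiding the kernel's
    // read-modify-write of a partially covered page.
    static int pageAligned(int length) {
        // Valid because PAGE_SIZE is a power of two.
        return (length + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
    }

    public static void main(String[] args) {
        System.out.println(pageAligned(1));    // 4096
        System.out.println(pageAligned(4096)); // 4096
        System.out.println(pageAligned(4097)); // 8192
    }
}
```

The mask form avoids a divide and, unlike `((length + PAGE_SIZE - 1) / PAGE_SIZE) * PAGE_SIZE`, makes the power-of-two assumption explicit at the call site.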
[jira] [Commented] (CASSANDRA-9146) Ever Growing Secondary Index sstables after every Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503028#comment-14503028 ] Anuj commented on CASSANDRA-9146: - Marcus, are you looking into this? Ever Growing Secondary Index sstables after every Repair Key: CASSANDRA-9146 URL: https://issues.apache.org/jira/browse/CASSANDRA-9146 Project: Cassandra Issue Type: Bug Components: Core Reporter: Anuj Attachments: sstables.txt, system-modified.log The cluster has reached a state where every repair -pr operation on the CF results in numerous tiny sstables being flushed to disk. Most sstables are related to secondary indexes. Due to thousands of sstables, reads have started timing out. Even though compaction begins for one of the secondary indexes, the sstable count after repair remains very high (thousands). Every repair adds thousands of sstables. Problems: 1. Why are bursts of tiny secondary index sstables flushed during repair? What is triggering the frequent/premature flush of secondary index sstables (more than a hundred in every burst)? At most we see ParNew GC pauses of 200ms. 2. Why is auto-compaction not compacting all sstables? Is it related to the coldness issue (CASSANDRA-8885), where compaction doesn't work even when cold_reads_to_omit=0 by default? If coldness is the issue, we are stuck in an infinite loop: reads would trigger compaction, but reads time out because the sstable count is in the thousands. 3. What's the way out if we face this issue in Prod? Is this issue fixed in the latest production release, 2.0.13? The issue looks similar to CASSANDRA-8641, but that is fixed only in 2.1.3. I think it should be fixed in the 2.0 branch too. Configuration: Compaction Strategy: STCS memtable_flush_writers=4 memtable_flush_queue_size=4 in_memory_compaction_limit_in_mb=32 concurrent_compactors=12 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9213) Compaction errors observed during heavy write load: BAD RELEASE
[ https://issues.apache.org/jira/browse/CASSANDRA-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9213: --- Reproduced In: 2.1.4 Compaction errors observed during heavy write load: BAD RELEASE --- Key: CASSANDRA-9213 URL: https://issues.apache.org/jira/browse/CASSANDRA-9213 Project: Cassandra Issue Type: Bug Environment: Cassandra 2.1.4.374 Ubuntu 14.04.2 java version 1.7.0_45 10-node cluster, RF = 3 Reporter: Rocco Varela Fix For: 2.1.5 Attachments: COMPACTION-ERR.log During heavy write load testing we're seeing occasional compaction errors with the following error message:
{code}
ERROR [CompactionExecutor:40] 2015-04-16 17:01:16,936 Ref.java:170 - BAD RELEASE: attempted to release a reference (org.apache.cassandra.utils.concurrent.Ref$State@31d969bd) that has already been released
...
ERROR [CompactionExecutor:40] 2015-04-16 17:01:22,190 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:40,1,main]
java.lang.AssertionError: null
    at org.apache.cassandra.io.sstable.SSTableReader.markObsolete(SSTableReader.java:1699) ~[cassandra-all-2.1.4.374.jar:2.1.4.374]
    at org.apache.cassandra.db.DataTracker.unmarkCompacting(DataTracker.java:240) ~[cassandra-all-2.1.4.374.jar:2.1.4.374]
    at org.apache.cassandra.io.sstable.SSTableRewriter.replaceWithFinishedReaders(SSTableRewriter.java:495) ~[cassandra-all-2.1.4.374.jar:2.1.4.374]
    at ...
{code}
I have turned on debugrefcount in bin/cassandra:launch_service() and I will repost another stack trace when it happens again.
{code}
cassandra_parms="$cassandra_parms -Dcassandra.debugrefcount=true"
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9213) Compaction errors observed during heavy write load: BAD RELEASE
[ https://issues.apache.org/jira/browse/CASSANDRA-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9213: --- Fix Version/s: (was: 2.1.4) 2.1.5 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9196) Do not rebuild indexes if no columns are actually indexed
[ https://issues.apache.org/jira/browse/CASSANDRA-9196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503102#comment-14503102 ] Sergio Bossa commented on CASSANDRA-9196: - Thanks [~beobal]. I've reviewed the 2.1 version, and even though I don't like having two different methods basically doing the same thing, as this could lead to inconsistent implementations, there doesn't seem to be any other way, so LGTM. Do not rebuild indexes if no columns are actually indexed - Key: CASSANDRA-9196 URL: https://issues.apache.org/jira/browse/CASSANDRA-9196 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Sergio Bossa Assignee: Sergio Bossa Fix For: 2.0.15 Attachments: 2.0-CASSANDRA-9196.txt, 2.1-CASSANDRA-9196.txt When rebuilding secondary indexes, the index task is executed regardless of whether the actual {{SecondaryIndex#indexes(ByteBuffer)}} implementation of any index returns true for any column, meaning that the expensive task of going through all sstables and related rows will be executed even if, in the end, no column/row will actually be indexed. This is a huge performance hit when e.g. bootstrapping with large datasets on tables having custom secondary index implementations whose {{indexes()}} implementation might return false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
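The guard the ticket asks for amounts to a cheap pre-check before the sstable walk. A hypothetical sketch — column names here stand in for the {{ByteBuffer}} arguments of {{SecondaryIndex#indexes}}, and the class is illustrative, not the patch itself:

```java
import java.util.List;
import java.util.Set;

public class IndexRebuildGuard {
    // Returns true only if at least one index claims at least one of the
    // columns; when false, the expensive walk over all sstables and their
    // rows can be skipped entirely.
    static boolean shouldRebuild(List<Set<String>> indexedColumnsPerIndex, List<String> columns) {
        return indexedColumnsPerIndex.stream()
                .anyMatch(indexed -> columns.stream().anyMatch(indexed::contains));
    }
}
```

The pre-check is O(indexes × columns) against in-memory metadata, which is negligible next to a full sstable scan, so it is worth doing even when it almost always answers true.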
[jira] [Commented] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503040#comment-14503040 ] Jeremiah Jordan commented on CASSANDRA-6477: +1 from me for calling it Materialized Views Global indexes -- Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9194) Delete-only workloads crash Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503090#comment-14503090 ] Jim Witschey edited comment on CASSANDRA-9194 at 4/20/15 4:07 PM: -- I'm afraid I don't follow. bq. This is already behaving properly in 2.1 [The results of the dtest|https://github.com/riptano/cassandra-dtest/blob/master/deletion_test.py#L44] show that the tracked memtable size is 0 in 2.1 after performing 100 deletions -- {{MemtableLiveDataSize}} is reported as 0 over JMX even when {{MemtableColumnsCount}} is 100. Is that behavior correct? I may not have been clear, but that test fails on all released 2.0 and 2.1 versions. Also, I don't understand why the amount of memory to track for tombstones is arbitrary in 2.0. was (Author: mambocab): I'm afraid I don't follow. .bq This is already behaving properly in 2.1 [The results of the dtest|https://github.com/riptano/cassandra-dtest/blob/master/deletion_test.py#L44] show that the tracked memtable size is 0 in 2.1 after performing 100 deletions -- {{MemtableLiveDataSize}} is reported as 0 over JMX even when {{MemtableColumnsCount}} is 100. Is that behavior correct? I may not have been clear, but that test fails on all released 2.0 and 2.1 versions. Also, I don't understand why the amount of memory to track for tombstones is arbitrary in 2.0. Delete-only workloads crash Cassandra - Key: CASSANDRA-9194 URL: https://issues.apache.org/jira/browse/CASSANDRA-9194 Project: Cassandra Issue Type: Bug Environment: 2.0.14 Reporter: Robert Wille Assignee: Benedict Fix For: 2.0.15 Attachments: 9194.txt The size of a tombstone is not properly accounted for in the memtable. A memtable which has only tombstones will never get flushed. It will grow until the JVM runs out of memory. The following program easily demonstrates the problem. 
{code}
Cluster.Builder builder = Cluster.builder();
Cluster c = builder.addContactPoints("cas121.devf3.com").build();
Session s = c.connect();
s.execute("CREATE KEYSPACE IF NOT EXISTS test WITH replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }");
s.execute("CREATE TABLE IF NOT EXISTS test.test(id INT PRIMARY KEY)");
PreparedStatement stmt = s.prepare("DELETE FROM test.test WHERE id = :id");
int id = 0;
while (true)
{
    s.execute(stmt.bind(id));
    id++;
}
{code}
This program should run forever, but eventually Cassandra runs out of heap and craps out. You needn't wait for Cassandra to crash. If you run nodetool cfstats test.test while it is running, you'll see Memtable cell count grow, but Memtable data size will remain 0. This issue was fixed once before. I received a patch for version 2.0.5 (I believe), which contained the fix, but the fix has apparently been lost, because it is clearly broken, and I don't see the fix in the change logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9198) Deleting from an empty list produces an error
[ https://issues.apache.org/jira/browse/CASSANDRA-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503088#comment-14503088 ] Sylvain Lebresne commented on CASSANDRA-9198: - bq. last time we discussed that subject, we agreed that as we complain on invalid index for lists, not complaining when trying to delete from an empty list was not really consistent. Yes, but I hadn't thought of the idempotency issue at the time, and I do think we should make things idempotent as much as possible, so +1 for not complaining in that case. And since this case was buggy until CASSANDRA-9077, I suggest we fix that before 2.1.5 is released. Deleting from an empty list produces an error - Key: CASSANDRA-9198 URL: https://issues.apache.org/jira/browse/CASSANDRA-9198 Project: Cassandra Issue Type: Bug Components: API Reporter: Olivier Michallat Assignee: Benjamin Lerer Priority: Minor Fix For: 3.0 While deleting an element from a list that does not contain it is a no-op, deleting it from an empty list causes an error. This edge case is a bit inconsistent, because it makes list deletion non idempotent:
{code}
cqlsh:test> create table foo (k int primary key, v list<int>);
cqlsh:test> insert into foo(k,v) values (1, [1,2]);
cqlsh:test> update foo set v = v - [1] where k = 1;
cqlsh:test> update foo set v = v - [1] where k = 1;
cqlsh:test> update foo set v = v - [2] where k = 1;
cqlsh:test> update foo set v = v - [2] where k = 1;
InvalidRequest: code=2200 [Invalid query] message="Attempted to delete an element from a list which is null"
{code}
With speculative retries coming to the drivers, idempotency becomes more important because it determines which query we might retry or not. So it would be better if deleting from an empty list succeeded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9211) Keep history of SSTable metadata
[ https://issues.apache.org/jira/browse/CASSANDRA-9211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9211: -- Reviewer: Marcus Eriksson Assignee: Björn Hegerfors Keep history of SSTable metadata Key: CASSANDRA-9211 URL: https://issues.apache.org/jira/browse/CASSANDRA-9211 Project: Cassandra Issue Type: Wish Reporter: Björn Hegerfors Assignee: Björn Hegerfors Priority: Minor Similar to the request in CASSANDRA-8078, I'm interested in SSTables' lineage. Specifically, I want to visualize the behaviors of compaction strategies based on real data. For example, for STCS I might want to generate something like this image: http://www.datastax.com/wp-content/uploads/2011/10/size-tiered-1.png. For LCS and DTCS, other properties than size are interesting. As Marcus responded in CASSANDRA-8078, there is already tracking of ancestors in the SSTable metadata. But as far as I know, the metadata gets garbage collected along with the SSTable itself. So what I propose is to persist metadata in a system table. Maybe some, maybe all metadata. Like the compaction_history table, this should have a default TTL of something like one week or just one day. But users can freely modify/remove the TTL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)