[jira] [Commented] (CASSANDRA-15082) SASI SPARSE mode 5 limit

2019-11-18 Thread Alex Petrov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977210#comment-16977210
 ] 

Alex Petrov commented on CASSANDRA-15082:
-

This seems to be related to, or possibly a duplicate of, [CASSANDRA-13478], 
depending on exactly which part of the issue is considered more important here. I 
agree that a general-purpose database should have no such limitation. 
Possibly, we could perform a similar optimisation in a way that wouldn't force 
the user to pick an arbitrary number that sets an upper limit on the cardinality 
of the data: falling back to non-sparse mode, or creating overflow pages, would be 
two potential options.
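
For reference, a rough sketch of the kind of SPARSE-mode cardinality check the reported 
error comes from (the enclosing class, names, and helper below are illustrative, not the 
actual OnDiskIndexBuilder source):

{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Set;

// Illustrative sketch only: the shape of the SPARSE-mode cardinality check that
// produces the reported error. Class, mode enum, and termToString() are simplified
// stand-ins for the real OnDiskIndexBuilder logic.
final class SparseTermCheck
{
    enum Mode { PREFIX, CONTAINS, SPARSE }

    static final int MAX_SPARSE_KEYS_PER_TERM = 5; // the hard-coded limit the ticket questions

    static void validate(Mode mode, ByteBuffer term, Set<ByteBuffer> keys) throws IOException
    {
        if (mode == Mode.SPARSE && keys.size() > MAX_SPARSE_KEYS_PER_TERM)
            throw new IOException(String.format(
                "Term '%s' belongs to more than %d keys in sparse mode, which is not allowed.",
                termToString(term), MAX_SPARSE_KEYS_PER_TERM));
    }

    private static String termToString(ByteBuffer term)
    {
        return StandardCharsets.UTF_8.decode(term.duplicate()).toString();
    }
}
{code}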

> SASI SPARSE mode 5 limit
> 
>
> Key: CASSANDRA-15082
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15082
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/SASI
>Reporter: Edward Capriolo
>Priority: Normal
>
> I do not know what the "improvement" should be here, but I ran into this:
> [https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java#L585]
> Term '55.3' belongs to more than 5 keys in sparse mode, which is not allowed.
> The only reference I can find to the limit is here:
>  [http://www.doanduyhai.com/blog/?p=2058]
> Why is it 5? Could it be a variable? Could it be an option when creating the 
> table? Why or why not?
> This seems awkward. A user can insert more than 5 rows into a table, and it 
> "works", i.e. you can write and you can query that table, getting more than 5 
> results, but the index will not flush to disk. It throws an IOException.
> Maybe I am misunderstanding, but this seems impossible to support: if a user 
> inserts the same value more than 5 times, the entire index will not flush to disk?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15077) Dropping column via thrift renders cf unreadable via CQL, leads to missing data

2019-11-18 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-15077:

 Bug Category: Parent values: Correctness(12982)Level 1 values: 
Unrecoverable Corruption / Loss(13161)
   Complexity: Normal
  Component/s: Legacy/Distributed Metadata
Discovered By: User Report
   Status: Open  (was: Triage Needed)

> Dropping column via thrift renders cf unreadable via CQL, leads to missing 
> data
> ---
>
> Key: CASSANDRA-15077
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15077
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Distributed Metadata
>Reporter: Muir Manders
>Priority: Normal
>
> Hello
> We have a lot of thrift/compact storage column families in production. We 
> upgraded to 3.11.4 last week. This week we ran a (thrift) schema change to 
> drop a column from a column family. Our CQL clients immediately started 
> getting a read error ("ReadFailure: Error from server: code=1300 ...") trying 
> to read the column family. Thrift clients were still able to read the column 
> family.
> We determined that restarting the nodes "fixed" CQL reads, so we did that, but 
> soon discovered that we were missing data because Cassandra was skipping 
> sstables it didn't like on startup. That exception looked like this:
> {noformat}
> INFO  [main] 2019-04-04 20:06:35,676 ColumnFamilyStore.java:430 - 
> Initializing test.test
> ERROR [SSTableBatchOpen:1] 2019-04-04 20:06:35,689 CassandraDaemon.java:228 - 
> Exception in thread Thread[SSTableBatchOpen:1,5,main]
> java.lang.RuntimeException: Unknown column foo during deserialization
> at 
> org.apache.cassandra.db.SerializationHeader$Component.toHeader(SerializationHeader.java:326)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:522)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:385)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader$3.run(SSTableReader.java:570)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_121]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_121]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_121]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_121]
> at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
>  [apache-cassandra-3.11.4.jar:3.11.4]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> {noformat}
>  
> Below is a list of steps to reproduce the issue. Note that in production our 
> column families were all created via thrift, but I thought it was simpler to 
> create them using CQL for the reproduction script.
> {code}
> ccm create test -v 3.11.4 -n 1
> ccm updateconf 'start_rpc: true'
> ccm start
> sleep 10
> ccm node1 cqlsh <<SCHEMA
> CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> CREATE COLUMNFAMILY test.test (
>   id text,
>   foo text,
>   bar text,
>   PRIMARY KEY (id)
> ) WITH COMPACT STORAGE;
> INSERT INTO test.test (id, foo, bar) values ('1', 'hi', 'there');
> SCHEMA
> pip install pycassa
> python <<DROP_COLUMN
> import pycassa
> sys = pycassa.system_manager.SystemManager('127.0.0.1:9160')
> cf = sys.get_keyspace_column_families('test')['test']
> sys.alter_column_family('test', 'test', column_metadata=filter(lambda c: 
> c.name != 'foo', cf.column_metadata))
> DROP_COLUMN
> # this produces the "ReadFailure: Error from server: code=1300" error
> ccm node1 cqlsh <<QUERY
> select * from test.test;
> QUERY
> ccm node1 stop
> ccm node1 start
> sleep 10
> # this returns 0 rows (i.e. demonstrates missing data)
> ccm node1 cqlsh <<QUERY
> select * from test.test;
> QUERY
> {code}
> We added the columns back via thrift and restarted cassandra to restore the 
> missing data. Later we realized a secondary index on the affected column 
> family had become out of sync with the data. We assume that was somehow a 
> side effect of running for a period with data missing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15433) Pending ranges are not recalculated on keyspace creation

2019-11-18 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-15433:

Since Version: 3.0.0

> Pending ranges are not recalculated on keyspace creation
> 
>
> Key: CASSANDRA-15433
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15433
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Membership
>Reporter: Josh Snyder
>Priority: Normal
>
> When a node begins bootstrapping, Cassandra recalculates pending tokens for 
> each keyspace that exists when the state change is observed (in 
> StorageService:handleState*). When new keyspaces are created, we do not 
> recalculate pending ranges (around Schema:merge). As a result, writes for new 
> keyspaces are not received by nodes in BOOT or BOOT_REPLACE modes. When 
> bootstrapping finishes, the node which just bootstrapped will not have data 
> for the newly created keyspace.
> Consider a ring with bootstrapped nodes A, B, and C. Node D is pending, and 
> when it finishes bootstrapping, C will cede ownership of some ranges to D. A 
> quorum write is acknowledged by C and A. B missed the write, and the 
> coordinator didn't send it to D at all. When D finishes bootstrapping, the 
> quorum B+D will not contain the mutation.
> Steps to reproduce:
> # Join a node in BOOT mode
> # Create a keyspace
> # Send writes to that keyspace
> # On the joining node, observe that {{nodetool cfstats}} records zero writes 
> to the new keyspace
> I have observed this directly in Cassandra 3.0 and, based on my reading of the 
> code, I believe it affects versions up through trunk.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15433) Pending ranges are not recalculated on keyspace creation

2019-11-18 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-15433:

Impacts:   (was: None)

> Pending ranges are not recalculated on keyspace creation
> 
>
> Key: CASSANDRA-15433
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15433
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Membership
>Reporter: Josh Snyder
>Priority: Normal
>
> When a node begins bootstrapping, Cassandra recalculates pending tokens for 
> each keyspace that exists when the state change is observed (in 
> StorageService:handleState*). When new keyspaces are created, we do not 
> recalculate pending ranges (around Schema:merge). As a result, writes for new 
> keyspaces are not received by nodes in BOOT or BOOT_REPLACE modes. When 
> bootstrapping finishes, the node which just bootstrapped will not have data 
> for the newly created keyspace.
> Consider a ring with bootstrapped nodes A, B, and C. Node D is pending, and 
> when it finishes bootstrapping, C will cede ownership of some ranges to D. A 
> quorum write is acknowledged by C and A. B missed the write, and the 
> coordinator didn't send it to D at all. When D finishes bootstrapping, the 
> quorum B+D will not contain the mutation.
> Steps to reproduce:
> # Join a node in BOOT mode
> # Create a keyspace
> # Send writes to that keyspace
> # On the joining node, observe that {{nodetool cfstats}} records zero writes 
> to the new keyspace
> I have observed this directly in Cassandra 3.0 and, based on my reading of the 
> code, I believe it affects versions up through trunk.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15433) Pending ranges are not recalculated on keyspace creation

2019-11-18 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-15433:

 Bug Category: Parent values: Correctness(12982)Level 1 values: Recoverable 
Corruption / Loss(12986)
   Complexity: Normal
Discovered By: User Report
 Severity: Normal
   Status: Open  (was: Triage Needed)

> Pending ranges are not recalculated on keyspace creation
> 
>
> Key: CASSANDRA-15433
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15433
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Membership
>Reporter: Josh Snyder
>Priority: Normal
>
> When a node begins bootstrapping, Cassandra recalculates pending tokens for 
> each keyspace that exists when the state change is observed (in 
> StorageService:handleState*). When new keyspaces are created, we do not 
> recalculate pending ranges (around Schema:merge). As a result, writes for new 
> keyspaces are not received by nodes in BOOT or BOOT_REPLACE modes. When 
> bootstrapping finishes, the node which just bootstrapped will not have data 
> for the newly created keyspace.
> Consider a ring with bootstrapped nodes A, B, and C. Node D is pending, and 
> when it finishes bootstrapping, C will cede ownership of some ranges to D. A 
> quorum write is acknowledged by C and A. B missed the write, and the 
> coordinator didn't send it to D at all. When D finishes bootstrapping, the 
> quorum B+D will not contain the mutation.
> Steps to reproduce:
> # Join a node in BOOT mode
> # Create a keyspace
> # Send writes to that keyspace
> # On the joining node, observe that {{nodetool cfstats}} records zero writes 
> to the new keyspace
> I have observed this directly in Cassandra 3.0 and, based on my reading of the 
> code, I believe it affects versions up through trunk.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15433) Pending ranges are not recalculated on keyspace creation

2019-11-18 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-15433:

Component/s: Cluster/Membership

> Pending ranges are not recalculated on keyspace creation
> 
>
> Key: CASSANDRA-15433
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15433
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Membership
>Reporter: Josh Snyder
>Priority: Normal
>
> When a node begins bootstrapping, Cassandra recalculates pending tokens for 
> each keyspace that exists when the state change is observed (in 
> StorageService:handleState*). When new keyspaces are created, we do not 
> recalculate pending ranges (around Schema:merge). As a result, writes for new 
> keyspaces are not received by nodes in BOOT or BOOT_REPLACE modes. When 
> bootstrapping finishes, the node which just bootstrapped will not have data 
> for the newly created keyspace.
> Consider a ring with bootstrapped nodes A, B, and C. Node D is pending, and 
> when it finishes bootstrapping, C will cede ownership of some ranges to D. A 
> quorum write is acknowledged by C and A. B missed the write, and the 
> coordinator didn't send it to D at all. When D finishes bootstrapping, the 
> quorum B+D will not contain the mutation.
> Steps to reproduce:
> # Join a node in BOOT mode
> # Create a keyspace
> # Send writes to that keyspace
> # On the joining node, observe that {{nodetool cfstats}} records zero writes 
> to the new keyspace
> I have observed this directly in Cassandra 3.0 and, based on my reading of the 
> code, I believe it affects versions up through trunk.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15052) Dtests: Add acceptable warnings to offline tool tests in order to pass them

2019-11-18 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-15052:

Change Category: Quality Assurance
 Complexity: Normal
 Status: Open  (was: Triage Needed)

> Dtests: Add acceptable warnings to offline tool tests in order to pass them
> ---
>
> Key: CASSANDRA-15052
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15052
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Labels: pull-request-available
> Attachments: SPICE-15052.txt
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> I ran the full dtest suite and the test 
> offline_tools_test.py::TestOfflineTools::test_sstablelevelreset failed 
> because of additional warning logs which were not added to the acceptable ones.
> After adding them, the test passed fine. I believe the added warning messages have 
> nothing to do with the test itself; it was reproduced on a c5.9xlarge as well as on a 
> "regular" notebook.
>  
> https://github.com/apache/cassandra-dtest/pull/47



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-15433) Pending ranges are not recalculated on keyspace creation

2019-11-18 Thread Josh Snyder (Jira)
Josh Snyder created CASSANDRA-15433:
---

 Summary: Pending ranges are not recalculated on keyspace creation
 Key: CASSANDRA-15433
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15433
 Project: Cassandra
  Issue Type: Bug
Reporter: Josh Snyder


When a node begins bootstrapping, Cassandra recalculates pending tokens for 
each keyspace that exists when the state change is observed (in 
StorageService:handleState*). When new keyspaces are created, we do not 
recalculate pending ranges (around Schema:merge). As a result, writes for new 
keyspaces are not received by nodes in BOOT or BOOT_REPLACE modes. When 
bootstrapping finishes, the node which just bootstrapped will not have data for 
the newly created keyspace.

Consider a ring with bootstrapped nodes A, B, and C. Node D is pending, and 
when it finishes bootstrapping, C will cede ownership of some ranges to D. A 
quorum write is acknowledged by C and A. B missed the write, and the 
coordinator didn't send it to D at all. When D finishes bootstrapping, the 
quorum B+D will not contain the mutation.

Steps to reproduce:
# Join a node in BOOT mode
# Create a keyspace
# Send writes to that keyspace
# On the joining node, observe that {{nodetool cfstats}} records zero writes to 
the new keyspace

I have observed this directly in Cassandra 3.0 and, based on my reading of the 
code, I believe it affects versions up through trunk.
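
A hypothetical sketch of the kind of hook the report implies is missing: recalculating 
pending ranges when a keyspace is created, in addition to when node state changes are 
observed. The listener and calculator types below are illustrative, not the actual 
Cassandra schema-notification API.

{code:java}
import java.util.List;

// Hypothetical sketch: recalculate pending ranges for newly created keyspaces,
// mirroring what StorageService.handleState* does for existing keyspaces when a
// node enters BOOT/BOOT_REPLACE. All names here are illustrative.
final class PendingRangeSchemaHook
{
    interface PendingRangeCalculator { void recalculate(String keyspace); }
    interface SchemaChangeListener { void onKeyspaceCreated(String keyspace); }

    static SchemaChangeListener listener(PendingRangeCalculator calculator, List<String> bootstrappingNodes)
    {
        return keyspace -> {
            // Without a recalculation here, writes to the new keyspace are never
            // forwarded to pending (bootstrapping) nodes, so they finish bootstrap
            // missing that keyspace's data.
            if (!bootstrappingNodes.isEmpty())
                calculator.recalculate(keyspace);
        };
    }
}
{code}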



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13019) Improve clearsnapshot to delete the snapshot files slowly

2019-11-18 Thread Jeff Jirsa (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13019:
---
Reviewers: Aleksey Yeschenko, Chris Lohfink, maxwellguo, Jeff Jirsa  (was: 
Aleksey Yeschenko, Chris Lohfink, Jeff Jirsa, maxwellguo)
   Status: Review In Progress  (was: Patch Available)

> Improve clearsnapshot to delete the snapshot files slowly 
> --
>
> Key: CASSANDRA-13019
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13019
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Core
>Reporter: Dikang Gu
>Assignee: Jeff Jirsa
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.x
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> In our environment, we create snapshots for backup; after we finish the 
> backup, we run {{clearsnapshot}} to delete the snapshot files. At 
> that time we may have thousands of files to delete, which causes a sudden 
> disk usage spike. As a result, we experience a spike of dropped messages 
> from Cassandra.
> I think we should implement something like {{slowrm}} to delete the snapshot 
> files slowly, avoiding the sudden disk usage spike.
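> A minimal sketch of the throttled-delete idea, assuming a fixed files-per-second budget 
> (Guava's RateLimiter; the class name and rate are illustrative):
> {code:java}
> import java.io.File;
> import java.util.List;
> import com.google.common.util.concurrent.RateLimiter;
> 
> // Sketch only: delete snapshot files at a bounded rate instead of all at once,
> // to avoid the sudden disk spike described above. The rate is illustrative.
> final class SlowSnapshotCleaner
> {
>     private final RateLimiter limiter;
> 
>     SlowSnapshotCleaner(double filesPerSecond)
>     {
>         this.limiter = RateLimiter.create(filesPerSecond);
>     }
> 
>     void delete(List<File> snapshotFiles)
>     {
>         for (File f : snapshotFiles)
>         {
>             limiter.acquire(); // block until we are allowed to delete another file
>             if (!f.delete())
>                 System.err.println("Failed to delete " + f);
>         }
>     }
> }
> {code}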



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13019) Improve clearsnapshot to delete the snapshot files slowly

2019-11-18 Thread Jeff Jirsa (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13019:
---
Status: Ready to Commit  (was: Review In Progress)

> Improve clearsnapshot to delete the snapshot files slowly 
> --
>
> Key: CASSANDRA-13019
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13019
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Core
>Reporter: Dikang Gu
>Assignee: Jeff Jirsa
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.x
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> In our environment, we create snapshots for backup; after we finish the 
> backup, we run {{clearsnapshot}} to delete the snapshot files. At 
> that time we may have thousands of files to delete, which causes a sudden 
> disk usage spike. As a result, we experience a spike of dropped messages 
> from Cassandra.
> I think we should implement something like {{slowrm}} to delete the snapshot 
> files slowly, avoiding the sudden disk usage spike.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13019) Improve clearsnapshot to delete the snapshot files slowly

2019-11-18 Thread Jeff Jirsa (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976959#comment-16976959
 ] 

Jeff Jirsa commented on CASSANDRA-13019:


Patch is approved by 3 people in GH PR (Aleksey, Chris, Maxwell) 

> Improve clearsnapshot to delete the snapshot files slowly 
> --
>
> Key: CASSANDRA-13019
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13019
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Core
>Reporter: Dikang Gu
>Assignee: Jeff Jirsa
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.x
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> In our environment, we create snapshots for backup; after we finish the 
> backup, we run {{clearsnapshot}} to delete the snapshot files. At 
> that time we may have thousands of files to delete, which causes a sudden 
> disk usage spike. As a result, we experience a spike of dropped messages 
> from Cassandra.
> I think we should implement something like {{slowrm}} to delete the snapshot 
> files slowly, avoiding the sudden disk usage spike.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13990) Remove OldNetworkTopologyStrategy

2019-11-18 Thread Anthony Grasso (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976941#comment-16976941
 ] 

Anthony Grasso commented on CASSANDRA-13990:


Started reviewing the patch.

> Remove OldNetworkTopologyStrategy
> -
>
> Key: CASSANDRA-13990
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13990
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Jeremy Hanna
>Assignee: Anthony Grasso
>Priority: Low
>  Labels: lhf
> Attachments: 13990-trunk.txt
>
>
> RackAwareStrategy was renamed OldNetworkTopologyStrategy back in 0.7 
> (CASSANDRA-1392) and it's still around.  Is there any reason to keep this 
> relatively dead code in the codebase at this point?  I'm not aware of its use 
> and it sometimes confuses users.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-11370) Display sstable count per level according to repair status on nodetool tablestats

2019-11-18 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova reassigned CASSANDRA-11370:
---

Assignee: (was: Ekaterina Dimitrova)

> Display sstable count per level according to repair status on nodetool 
> tablestats
> -
>
> Key: CASSANDRA-11370
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11370
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/nodetool
>Reporter: Paulo Motta
>Priority: Low
>  Labels: lhf
>
> After CASSANDRA-8004 we still display sstables in each level on nodetool 
> tablestats as if we had a single compaction strategy, while we have one 
> strategy for repaired and another for unrepaired data. 
> We should split display into repaired and unrepaired set, so this:
> SSTables in each level: [2, 20/10, 15, 0, 0, 0, 0, 0, 0]
> Would become:
> SSTables in each level (repaired): [1, 10, 0, 0, 0, 0, 0, 0, 0]
> SSTables in each level (unrepaired): [1, 10, 15, 0, 0, 0, 0, 0, 0]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-11370) Display sstable count per level according to repair status on nodetool tablestats

2019-11-18 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova reassigned CASSANDRA-11370:
---

Assignee: Ekaterina Dimitrova

> Display sstable count per level according to repair status on nodetool 
> tablestats
> -
>
> Key: CASSANDRA-11370
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11370
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/nodetool
>Reporter: Paulo Motta
>Assignee: Ekaterina Dimitrova
>Priority: Low
>  Labels: lhf
>
> After CASSANDRA-8004 we still display sstables in each level on nodetool 
> tablestats as if we had a single compaction strategy, while we have one 
> strategy for repaired and another for unrepaired data. 
> We should split display into repaired and unrepaired set, so this:
> SSTables in each level: [2, 20/10, 15, 0, 0, 0, 0, 0, 0]
> Would become:
> SSTables in each level (repaired): [1, 10, 0, 0, 0, 0, 0, 0, 0]
> SSTables in each level (unrepaired): [1, 10, 15, 0, 0, 0, 0, 0, 0]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-15318) sendMessagesToNonlocalDC() should shuffle targets

2019-11-18 Thread Dinesh Joshi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976809#comment-16976809
 ] 

Dinesh Joshi edited comment on CASSANDRA-15318 at 11/18/19 7:46 PM:


+1, but before merging let's ensure that the test failures are unrelated.


was (Author: djoshi3):
+1 but before merging lets insure that the test failures are unrelated.

> sendMessagesToNonlocalDC() should shuffle targets
> -
>
> Key: CASSANDRA-15318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15318
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
>
> To better spread load and reduce the impact of a node failure before 
> detection (or other issues, like host replacement), when forwarding 
> messages to other data centers the forwarding non-local dc nodes should be 
> selected at random rather than always selecting the first node in the list of 
> endpoints for a token.
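> A minimal sketch of the proposed change, assuming the forwarding candidates arrive as a 
> list of endpoints (types and names simplified; not the actual MessagingService code):
> {code:java}
> import java.net.InetAddress;
> import java.util.ArrayList;
> import java.util.Collections;
> import java.util.List;
> 
> // Sketch only: pick the forwarding node in the remote DC at random instead of
> // always using the first endpoint in the list.
> final class ForwardingTargetPicker
> {
>     static InetAddress pickForwarder(List<InetAddress> remoteDcEndpoints)
>     {
>         if (remoteDcEndpoints.isEmpty())
>             throw new IllegalArgumentException("no remote DC endpoints");
>         List<InetAddress> shuffled = new ArrayList<>(remoteDcEndpoints);
>         Collections.shuffle(shuffled); // spread forwarding load across replicas
>         return shuffled.get(0);        // remaining targets receive forwarded copies
>     }
> }
> {code}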
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15318) sendMessagesToNonlocalDC() should shuffle targets

2019-11-18 Thread Dinesh Joshi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-15318:
-
Reviewers: Dinesh Joshi, Dinesh Joshi  (was: Dinesh Joshi)
   Status: Review In Progress  (was: Patch Available)

> sendMessagesToNonlocalDC() should shuffle targets
> -
>
> Key: CASSANDRA-15318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15318
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
>
> To better spread load and reduce the impact of a node failure before 
> detection (or other issues, like host replacement), when forwarding 
> messages to other data centers the forwarding non-local dc nodes should be 
> selected at random rather than always selecting the first node in the list of 
> endpoints for a token.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15318) sendMessagesToNonlocalDC() should shuffle targets

2019-11-18 Thread Dinesh Joshi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976809#comment-16976809
 ] 

Dinesh Joshi commented on CASSANDRA-15318:
--

+1 but before merging lets insure that the test failures are unrelated.

> sendMessagesToNonlocalDC() should shuffle targets
> -
>
> Key: CASSANDRA-15318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15318
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
>
> To better spread load and reduce the impact of a node failure before 
> detection (or other issues, like host replacement), when forwarding 
> messages to other data centers the forwarding non-local dc nodes should be 
> selected at random rather than always selecting the first node in the list of 
> endpoints for a token.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15429) Support NodeTool for in-jvm dtest

2019-11-18 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976807#comment-16976807
 ] 

Yifan Cai commented on CASSANDRA-15429:
---

[~drohrer] and I co-worked on this.

The changes in the PRs:
 # Added a {{NodeProbeFactory}} field in {{NodeTool}}. The field can be set to the 
mock version, {{InternalNodeProbeFactory}}, when running dtests. 
 # Added {{InternalNodeProbe}}, which extends {{NodeProbe}}. It supports a subset 
of the nodetool functionality. The unsupported operations are basically 
'printing info onto the terminal' or similar display ops, which dtests have little 
interest in.
 # The changes to the production code, i.e. under the {{tools}} package, are 
kept minimal. All existing functionality of nodetool should work as is.

||PR||
|[trunk|https://github.com/apache/cassandra/pull/385]|
|[cassandra-3.11|https://github.com/apache/cassandra/pull/386]|
|[cassandra-3.0|https://github.com/apache/cassandra/pull/387]|
|[cassandra-2.2|https://github.com/apache/cassandra/pull/388]|
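
A rough sketch of the factory indirection described above (interface shapes and method 
names are assumed for illustration; the linked PRs are authoritative):

{code:java}
// Sketch of the described approach: NodeTool resolves NodeProbe instances through a
// factory, so in-jvm dtests can substitute an in-process implementation. The
// interfaces below are illustrative; see the linked PRs for the real signatures.
final class NodeToolFactorySketch
{
    interface NodeProbe extends AutoCloseable
    {
        void flush(String keyspace, String... tables);
        @Override void close();
    }

    interface NodeProbeFactory
    {
        NodeProbe create(String host, int port);
    }

    // Production path: a factory that connects over JMX (details elided).
    // Test path: an "internal" factory returning a probe that calls node-local
    // methods directly, skipping the terminal/printing operations dtests don't need.
    static void runFlush(NodeProbeFactory factory, String host, int port, String keyspace)
    {
        try (NodeProbe probe = factory.create(host, port))
        {
            probe.flush(keyspace);
        }
    }
}
{code}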

> Support NodeTool for in-jvm dtest
> -
>
> Key: CASSANDRA-15429
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15429
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Test/dtest
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In-JVM dtest framework does not support nodetool as of now. This 
> functionality is wanted in some tests, e.g. constructing an end-to-end test 
> scenario that uses nodetool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15413) Missing results on reading large frozen text map

2019-11-18 Thread Tyler Codispoti (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976805#comment-16976805
 ] 

Tyler Codispoti commented on CASSANDRA-15413:
-

As a temporary workaround, we made a change to compareNextTo() in 
AbstractCompoundCellNameType to force using the BytesType comparator for this 
column. We made further changes to ensure we don't mess with any other columns, but 
essentially the change boils down to:

 
{code:java}
ByteBuffer previous = null;
for (int i = 0; i < composite.size(); i++)
{
    if (!hasComponent(i))
        return nextEOC == Composite.EOC.END ? 1 : -1;

    AbstractType<?> comparator = type.subtype(i);
    ByteBuffer value1 = nextComponents[i];
    ByteBuffer value2 = composite.get(i);

    // For a frozen map, do not compare each key/value. Compare the whole
    // serialized binary, as was done when writing to sstables.
    if (comparator instanceof MapType)
    {
        comparator = BytesType.instance;
    }

    int cmp = comparator.compareCollectionMembers(value1, value2, previous);

    if (cmp != 0)
        return cmp;

    previous = value1;
}
{code}
 

This looks to resolve the issue. It seems what is happening is that, when reading 
the frozen map back, the values are compared using the map's own (lexicographic) 
comparator, but they were stored in raw binary order. When reading 
data back for pages after the 1st, it compares the last value of the previous page 
to the current page to see if any records can be skipped. Since the sort comparator 
is of the wrong type, you can easily end up in a state where it skips records 
incorrectly.

> Missing results on reading large frozen text map
> 
>
> Key: CASSANDRA-15413
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15413
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Tyler Codispoti
>Assignee: Alex Petrov
>Priority: Normal
>
> Cassandra version: 2.2.15
> I have been running into a case where, when fetching the results from a table 
> with a frozen<map<text, text>> column, if the number of results is greater than the 
> fetch size (default 5000), we can end up with missing data.
> Side note: The table schema comes from using KairosDB, but we've isolated 
> this issue to Cassandra itself. But it looks like this can cause problems for 
> users of KairosDB as well.
> Repro case. Tested against fresh install of Cassandra 2.2.15.
> 1. Create table (csqlsh)
> {code:sql}
> CREATE KEYSPACE test
>   WITH REPLICATION = { 
>'class' : 'SimpleStrategy', 
>'replication_factor' : 1 
>   };
>   CREATE TABLE test.test (
> name text,
> tags frozen<map<text, text>>,
> PRIMARY KEY (name, tags)
>   ) WITH CLUSTERING ORDER BY (tags ASC);
> {code}
> 2. Insert data (python3)
> {code:python}
> import time
> from cassandra.cluster import Cluster
> cluster = Cluster(['127.0.0.1'])
> session = cluster.connect('test')
> for i in range(0, 2):
> session.execute(
> """
> INSERT INTO test (name, tags)  
> VALUES (%s, %s)
> """,
> ("test_name", {'id':str(i)})
> )
> {code}
>  
> 3. Flush
>  
> {code}
> nodetool flush
> {code}
>  
>  
> 4. Fetch data (python3)
> {code:python}
> import time
> from cassandra.cluster import Cluster
> cluster = Cluster(['127.0.0.1'], control_connection_timeout=5000)
> session = cluster.connect('test')
> session.default_fetch_size = 5000
> session.default_timeout = 120
> count = 0
> rows = session.execute("select tags from test where name='test_name'")
> for row in rows:
> count += 1
> print(count)
> {code}
> Result: 10111 (expected 2)
>  
> Changing the page size changes the result count. Some quick samples:
>  
> ||default_fetch_size||count||
> |5000|10111|
> |1000|1830|
> |999|1840|
> |998|1850|
> |2|2|
> |10|2|
>  
>  
> In short, I cannot guarantee I'll get all the results back unless the page 
> size > number of rows.
> This seems to get worse with multiple SSTables (e.g. nodetool flush between 
> some of the insert batches). When using replication, the issue can get 
> disgustingly bad - potentially giving a different result on each query.
> Interestingly, if we pad the values in the tag map ("id" in this repro case) so 
> that the insertion is in lexicographical order, there is no issue. I believe 
> the issue also does not repro if I do not call "nodetool flush" before 
> querying.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-15429) Support NodeTool for in-jvm dtest

2019-11-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CASSANDRA-15429:
---
Labels: pull-request-available  (was: )

> Support NodeTool for in-jvm dtest
> -
>
> Key: CASSANDRA-15429
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15429
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Test/dtest
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>  Labels: pull-request-available
>
> In-JVM dtest framework does not support nodetool as of now. This 
> functionality is wanted in some tests, e.g. constructing an end-to-end test 
> scenario that uses nodetool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15318) sendMessagesToNonlocalDC() should shuffle targets

2019-11-18 Thread Jon Meredith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976802#comment-16976802
 ] 

Jon Meredith commented on CASSANDRA-15318:
--

Rebased and rerunning to double-check some (thought to be) unrelated unit test 
failures.

[CircleCI|https://circleci.com/workflow-run/c6247670-e965-4260-9632-5bd3deb9ad06]



> sendMessagesToNonlocalDC() should shuffle targets
> -
>
> Key: CASSANDRA-15318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15318
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
>
> To better spread load and reduce the impact of a node failure before 
> detection (or other issues, like host replacement), when forwarding 
> messages to other data centers the forwarding non-local dc nodes should be 
> selected at random rather than always selecting the first node in the list of 
> endpoints for a token.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15429) Support NodeTool for in-jvm dtest

2019-11-18 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-15429:
--
Authors: Doug Rohrer, Yifan Cai  (was: Yifan Cai)

> Support NodeTool for in-jvm dtest
> -
>
> Key: CASSANDRA-15429
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15429
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Test/dtest
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>
> In-JVM dtest framework does not support nodetool as of now. This 
> functionality is wanted in some tests, e.g. constructing an end-to-end test 
> scenario that uses nodetool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-2848) Make the Client API support passing down timeouts

2019-11-18 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai reassigned CASSANDRA-2848:


Assignee: Yifan Cai  (was: Dinesh Joshi)

> Make the Client API support passing down timeouts
> -
>
> Key: CASSANDRA-2848
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2848
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Chris Goffinet
>Assignee: Yifan Cai
>Priority: Low
> Fix For: 3.11.x
>
> Attachments: 2848-trunk-v2.txt, 2848-trunk.txt
>
>
> Having a max server RPC timeout is good for the worst case, but many applications 
> that have middleware in front of Cassandra might have higher timeout 
> requirements. In a fail-fast environment, if my application, starting at say 
> the front end, only has 20ms to process a request, and it must connect to X 
> services down the stack, by the time it hits Cassandra we might only have 
> 10ms. I propose we optionally provide the ability to specify the timeout on each 
> call we do.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15410) Avoid over-allocation of bytes for UTF8 string serialization

2019-11-18 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976776#comment-16976776
 ] 

Yifan Cai commented on CASSANDRA-15410:
---

Updated the PR with the {{sizeOfAsciiString()}} method. 

Since the input of the method is a US-ASCII string, the method simply returns 
{{2 (size) + str.length}}. It is therefore two orders of magnitude faster than 
{{sizeOfString()}}, which iterates through the string.

{code:java}
[java] Benchmark                               Mode  Cnt    Score    Error  Units
[java] StringsEncodeBench.sizeOfAsciiString    avgt    6    1.999 ±  0.153  ns/op
[java] StringsEncodeBench.sizeOfString         avgt    6  283.413 ± 24.614  ns/op
{code}
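
For context, the difference between the two size computations boils down to the following 
(a sketch only; the class name is made up, the 2-byte length prefix is taken from the 
comment above, and the real {{sizeOfString()}} computes the UTF-8 length without allocating):

{code:java}
import java.nio.charset.StandardCharsets;

// Sketch: why sizeOfAsciiString is O(1) while sizeOfString is O(n).
final class StringSizeSketch
{
    // US-ASCII: every character is exactly one byte, so only the assumed
    // 2-byte length prefix needs to be added.
    static int sizeOfAsciiString(String s)
    {
        return 2 + s.length();
    }

    // General UTF-8: the encoded length depends on each character (1-3 bytes for
    // BMP code points), so the whole string must be examined. Shown here with
    // getBytes() for brevity; the real method iterates without allocating.
    static int sizeOfString(String s)
    {
        return 2 + s.getBytes(StandardCharsets.UTF_8).length;
    }
}
{code}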


> Avoid over-allocation of bytes for UTF8 string serialization 
> -
>
> Key: CASSANDRA-15410
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15410
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Client
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 4.0
>
>
> In the current message encoding implementation, it first calculates the 
> `encodeSize` and allocates the bytebuffer with that size. 
> However, during encoding, it assumes the worst case when writing a UTF-8 string 
> to allocate bytes, i.e. it assumes each character takes 3 bytes. 
> The over-estimation further leads to resizing the underlying array and copying 
> the data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15410) Avoid over-allocation of bytes for UTF8 string serialization

2019-11-18 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-15410:
--
Reviewers: Aleksey Yeschenko, Dinesh Joshi  (was: Aleksey Yeschenko)

> Avoid over-allocation of bytes for UTF8 string serialization 
> -
>
> Key: CASSANDRA-15410
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15410
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Client
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 4.0
>
>
> In the current message encoding implementation, it first calculates the 
> `encodeSize` and allocates the bytebuffer with that size. 
> However, during encoding, it assumes the worst case when writing a UTF-8 string 
> to allocate bytes, i.e. it assumes each character takes 3 bytes. 
> The over-estimation further leads to resizing the underlying array and copying 
> the data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14888) Several mbeans are not unregistered when dropping a keyspace and table

2019-11-18 Thread Chris Lohfink (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-14888:
--
Reviewers: Chris Lohfink, Dinesh Joshi  (was: Dinesh Joshi)

> Several mbeans are not unregistered when dropping a keyspace and table
> --
>
> Key: CASSANDRA-14888
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14888
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Ariel Weisberg
>Assignee: Alex Deparvu
>Priority: Urgent
>  Labels: patch-available
> Fix For: 4.0, 4.0-rc
>
> Attachments: CASSANDRA-14888.patch
>
>
> CasCommit, CasPrepare, CasPropose, ReadRepairRequests, 
> ShortReadProtectionRequests, AntiCompactionTime, BytesValidated, 
> PartitionsValidated, RepairPrepareTime, RepairSyncTime, 
> RepairedDataInconsistencies, ViewLockAcquireTime, ViewReadTime, 
> WriteFailedIdealCL
> Basically for 3 years people haven't known what they are doing because the 
> entire thing is kind of obscure. Fix it and also add a dtest that detects if 
> any mbeans are left behind after dropping a table and keyspace.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14888) Several mbeans are not unregistered when dropping a keyspace and table

2019-11-18 Thread Chris Lohfink (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976643#comment-16976643
 ] 

Chris Lohfink commented on CASSANDRA-14888:
---

You should add a unit test to cover MVs as well, as they have some 
conditionally registered metrics.

There are utility methods to create the metrics and automatically deregister 
them on cleanup; all of the metrics with issues just skipped that and created the 
metrics manually. This patch really does fix the issue, but by doing more manual 
cleanup.

While this does fix the problem, I think we should change these metrics to 
register appropriately (which also may provide keyspace metrics) or clean that 
mechanism up a bit to be easier (maybe using annotations, reflection or 
something?). We should try to enforce the registering and automatic cleanup, or 
make it easier and more obvious, instead of setting a further precedent for doing 
it manually.

In the past, the "wall of removeMetric" calls in {{release}} was never kept in 
sync, and while the junit should work, this actually isn't the first time a unit 
test like this has been made (the 3rd to my knowledge). There are some "flakey" 
scenarios with dropping tables, and the test actually doesn't capture 
everything (although we can improve that).

This is the 4th time (that I remember, at least) this issue has come up, so I think we 
should look at this in a bigger-picture sense. That said, if you are not 
interested or don't have the bandwidth, we could just patch this up a little here, as 
it has definite value, and open a follow-up ticket to try to prevent the issue 
in future.
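
A hypothetical sketch of the "register through a helper that remembers what it created" 
pattern suggested above, so a release path can deregister everything without a 
hand-maintained wall of removeMetric calls (names are illustrative, not the existing 
TableMetrics API):

{code:java}
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;
import com.codahale.metrics.Gauge;
import com.codahale.metrics.MetricRegistry;

// Hypothetical sketch: every metric goes through registerGauge(), which records the
// name, so releaseAll() can deregister them all on table/keyspace drop.
final class TrackedMetrics
{
    private final MetricRegistry registry;
    private final Set<String> registered = ConcurrentHashMap.newKeySet();

    TrackedMetrics(MetricRegistry registry)
    {
        this.registry = registry;
    }

    <T> Gauge<T> registerGauge(String name, Supplier<T> supplier)
    {
        Gauge<T> gauge = supplier::get;
        registered.add(name);
        return registry.register(name, gauge);
    }

    void releaseAll()
    {
        registered.forEach(registry::remove);
        registered.clear();
    }
}
{code}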

> Several mbeans are not unregistered when dropping a keyspace and table
> --
>
> Key: CASSANDRA-14888
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14888
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Ariel Weisberg
>Assignee: Alex Deparvu
>Priority: Urgent
>  Labels: patch-available
> Fix For: 4.0, 4.0-rc
>
> Attachments: CASSANDRA-14888.patch
>
>
> CasCommit, CasPrepare, CasPropose, ReadRepairRequests, 
> ShortReadProtectionRequests, AntiCompactionTime, BytesValidated, 
> PartitionsValidated, RepairPrepareTime, RepairSyncTime, 
> RepairedDataInconsistencies, ViewLockAcquireTime, ViewReadTime, 
> WriteFailedIdealCL
> Basically for 3 years people haven't known what they are doing because the 
> entire thing is kind of obscure. Fix it and also add a dtest that detects if 
> any mbeans are left behind after dropping a table and keyspace.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15410) Avoid over-allocation of bytes for UTF8 string serialization

2019-11-18 Thread Aleksey Yeschenko (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976583#comment-16976583
 ] 

Aleksey Yeschenko commented on CASSANDRA-15410:
---

While you are at it, maybe update {{encodedSize()}} implementation as well to 
use the faster {{sizeOfAsciiString()}} - if not for performance then for 
symmetry?

> Avoid over-allocation of bytes for UTF8 string serialization 
> -
>
> Key: CASSANDRA-15410
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15410
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Client
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 4.0
>
>
> In the current message encoding implementation, it first calculates the 
> `encodeSize` and allocates the bytebuffer with that size. 
> However, during encoding, it assumes the worst case when writing a UTF-8 string 
> to allocate bytes, i.e. it assumes each character takes 3 bytes. 
> The over-estimation further leads to resizing the underlying array and copying 
> the data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-15432) The "read defragmentation" optimization does not work

2019-11-18 Thread Sylvain Lebresne (Jira)
Sylvain Lebresne created CASSANDRA-15432:


 Summary: The "read defragmentation" optimization does not work
 Key: CASSANDRA-15432
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15432
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne


The so-called "read defragmentation" that has been added way back with 
CASSANDRA-2503 actually does not work, and never has. That is, the 
defragmentation writes do happen, but they only additional load on the nodes 
without helping anything, and are thus a clear negative.

The "read defragmentation" (which only impact so-called "names queries") kicks 
in when a read hits "too many" sstables (> 4 by default), and when it does, it 
writes down the result of that read. The assumption being that the next read 
for that data would only read the newly written data, which if not still in 
memtable would at least be in a single sstable, thus speeding that next read.

Unfortunately, this is not how it works. When we defrag and write the result 
of our original read, we do so with the timestamp of the data read (as we 
should; changing the timestamp would be plain wrong). As a result, 
following reads will read that data first, but will have no way to tell that no 
more sstables should be read. Technically, the 
[{{reduceFilter}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java#L830]
 call will not return {{null}} because {{currentMaxTs}} will be higher than 
at least some of the data in the result, and this holds until we've read from as many 
sstables as in the original read.
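
Illustrative only: the shape of the early-exit condition that the defragmented copy 
cannot satisfy because it keeps the original timestamps (helper and type names below are 
made up; this is not the actual SinglePartitionReadCommand logic):

{code:java}
import java.util.List;

// Sketch of why the defragmented copy doesn't help a names query. An sstable can
// only be skipped once every requested cell already collected is newer than that
// sstable's max timestamp. The defragmented write keeps the original (old)
// timestamps, so the condition still fails and the read touches just as many
// sstables as before.
final class DefragEarlyExitSketch
{
    interface SSTableView
    {
        long maxTimestamp();
        void collectInto(CollectedCells cells);
    }

    interface CollectedCells
    {
        boolean hasAllRequestedCells();
        long minTimestampOfCollected();
    }

    static void read(List<SSTableView> sstablesNewestFirst, CollectedCells cells)
    {
        for (SSTableView sstable : sstablesNewestFirst)
        {
            if (cells.hasAllRequestedCells() && cells.minTimestampOfCollected() >= sstable.maxTimestamp())
                break; // early exit: never triggers when the newest data carries old timestamps
            sstable.collectInto(cells);
        }
    }
}
{code}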

I see no easy way to fix this. It might be possible to make it work with 
additional per-sstable metadata, but nothing sufficiently simple and cheap to 
be worth it comes to mind. And I thus suggest simply removing that code.

For the record, I'll note that there is actually a 2nd problem with that code: 
currently, we "defrag" a read even if we didn't get data for everything that 
the query requests. This also is "wrong" even if we ignore the first issue: a 
following read that reads the defragmented data would also have no way to 
know not to read more sstables to try to get the missing parts. This problem 
would be fixable, but is obviously overshadowed by the previous one anyway.

Anyway, as mentioned, I suggest just removing the "optimization" (which, again, 
never optimized anything) altogether, and I am happy to provide the simple patch.

The only question might be: which versions? This impacts all versions, but 
this isn't a correctness bug either, "just" a performance one. So do we want 4.0 
only, or is there appetite for earlier?




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org