[jira] [Updated] (CASSANDRA-13082) Suspect OnDiskIndex.IteratorOrder.startAt code for ASC

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-13082:
-
Component/s: (was: Core)
 Local Write-Read Paths

> Suspect OnDiskIndex.IteratorOrder.startAt code for ASC
> --
>
> Key: CASSANDRA-13082
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13082
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
>Reporter: Dave Brosius
>Priority: Trivial
> Fix For: 4.x
>
>
> startAt for ASC does
> {code}
> case ASC:
>     if (found.cmp < 0) // search term was bigger then whole data set
>         return found.index;
>     return inclusive && (found.cmp == 0 || found.cmp < 0) ? found.index : found.index - 1;
> {code}
> which is equivalent to
> {code}
> case ASC:
>     if (found.cmp < 0) // search term was bigger then whole data set
>         return found.index;
>     return inclusive ? found.index : found.index - 1;
> {code}
> which seems wrong. Is the parenthesis wrong here?
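For illustration only, a tiny stand-alone harness (not part of the ticket) that enumerates the possible {{found.cmp}} / {{inclusive}} combinations and prints what each form of the ASC branch returns, so the claimed equivalence can be checked directly; {{index}} is an arbitrary placeholder value.

{code}
// Hypothetical check harness; cmp and inclusive are the only inputs that
// influence the branch choice, index is a fixed arbitrary value.
public class StartAtAscCheck
{
    static int original(int cmp, boolean inclusive, int index)
    {
        if (cmp < 0)
            return index;
        return inclusive && (cmp == 0 || cmp < 0) ? index : index - 1;
    }

    static int simplified(int cmp, boolean inclusive, int index)
    {
        if (cmp < 0)
            return index;
        return inclusive ? index : index - 1;
    }

    public static void main(String[] args)
    {
        int index = 10;
        for (int cmp : new int[] { -1, 0, 1 })
            for (boolean inclusive : new boolean[] { false, true })
                System.out.printf("cmp=%2d inclusive=%-5b original=%d simplified=%d%n",
                                  cmp, inclusive,
                                  original(cmp, inclusive, index),
                                  simplified(cmp, inclusive, index));
    }
}
{code}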



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10309) Avoid always looking up column type

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10309:
-
Component/s: Local Write-Read Paths

> Avoid always looking up column type
> ---
>
> Key: CASSANDRA-10309
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10309
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: T Jake Luciani
>Assignee: Carl Yeksigian
>Priority: Minor
>  Labels: perfomance
> Fix For: 4.x
>
>
> Doing some read profiling I noticed we always seem to look up the type of a 
> column from the schema metadata when we have the type already in the column 
> class.
> This one simple change to SerializationHeader improves read performance 
> non-trivially.
> https://github.com/tjake/cassandra/commit/69b94c389b3f36aa035ac4619fd22d1f62ea80b2
> http://cstar.datastax.com/graph?stats=3fb1ced4-58c7-11e5-9faf-42010af0688f=op_rate=2_read=1_aggregates=true=0=357.94=0=157416.6
> I assume we are looking this up to deal with schema changes. But I'm sure 
> there is a more performant way of doing this.
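The general pattern being suggested, as an illustrative sketch with hypothetical stand-in types (not the actual SerializationHeader change in the linked commit): prefer the type the column object already carries, and fall back to the schema-metadata lookup only when it is missing.

{code}
// Illustrative sketch only; ColumnHandle / SchemaLookup are hypothetical stand-ins,
// not the real Cassandra classes touched by the linked commit.
import java.util.Map;

public class TypeLookupSketch
{
    interface ColumnHandle { String name(); String type(); }     // type may be null
    interface SchemaLookup { String typeOf(String columnName); } // the "slow" metadata path

    // Prefer the type already carried by the column; only consult schema metadata
    // when it is missing (e.g. immediately after a schema change).
    static String resolveType(ColumnHandle column, SchemaLookup schema)
    {
        String cached = column.type();
        return cached != null ? cached : schema.typeOf(column.name());
    }

    public static void main(String[] args)
    {
        Map<String, String> metadata = Map.of("v", "text");
        ColumnHandle withType = new ColumnHandle() {
            public String name() { return "v"; }
            public String type() { return "text"; }
        };
        ColumnHandle withoutType = new ColumnHandle() {
            public String name() { return "v"; }
            public String type() { return null; }
        };
        SchemaLookup schema = metadata::get;
        System.out.println(resolveType(withType, schema));    // fast path, no metadata lookup
        System.out.println(resolveType(withoutType, schema)); // metadata fallback
    }
}
{code}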



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10201) Row cache compression

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10201:
-
Component/s: Core

> Row cache compression
> -
>
> Key: CASSANDRA-10201
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10201
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Robert Stupp
>Priority: Minor
> Fix For: 4.x
>
>
> Compressing the contents of the row cache _may_ buy some performance benefit 
> by increasing the hit ratio, since more data fits in the row cache.
> This would obviously only work if the data actually compresses well.
> (This is not a high priority ticket but feels useful enough to have it in the 
> backlog.)
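A rough sketch of what compressing a cached row's serialized bytes could look like, using the lz4-java library Cassandra already ships; the byte-array representation and the choice of LZ4 are assumptions for illustration, not a proposed implementation.

{code}
import net.jpountz.lz4.LZ4Compressor;
import net.jpountz.lz4.LZ4Factory;
import net.jpountz.lz4.LZ4FastDecompressor;

public class RowCacheCompressionSketch
{
    private static final LZ4Factory LZ4 = LZ4Factory.fastestInstance();

    // Compress the serialized row before caching it; the caller must remember the
    // original length so the bytes can be decompressed on a cache hit.
    static byte[] compressForCache(byte[] serializedRow)
    {
        LZ4Compressor compressor = LZ4.fastCompressor();
        return compressor.compress(serializedRow);
    }

    static byte[] decompressFromCache(byte[] compressed, int originalLength)
    {
        LZ4FastDecompressor decompressor = LZ4.fastDecompressor();
        return decompressor.decompress(compressed, originalLength);
    }

    public static void main(String[] args)
    {
        byte[] row = "some serialized partition payload, repeated repeated repeated".getBytes();
        byte[] compressed = compressForCache(row);
        byte[] restored = decompressFromCache(compressed, row.length);
        System.out.printf("original=%d bytes, compressed=%d bytes, roundtrip ok=%b%n",
                          row.length, compressed.length, java.util.Arrays.equals(row, restored));
    }
}
{code}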



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10158) Rationalize implementations of DataInputPlus and DataOutputPlus

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10158:
-
Component/s: Local Write-Read Paths

> Rationalize implementations of DataInputPlus and DataOutputPlus
> ---
>
> Key: CASSANDRA-10158
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10158
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Stefania
>Priority: Major
> Fix For: 4.x
>
>
> Following CASSANDRA-8630, we can further improve implementations of 
> {{DataInputPlus}} and {{DataOutputPlus}} as follows:
> * In {{MmappedRegions}}, compact the mmap ranges, at least on the final 
> opening of the file
> * Use the mmap extension logic for compressed files
> * Consider renaming classes to more appropriate names, moving them into their 
> own package, and defining their relationships more simply.
> * Consider unifying {{ChecksummedDataInput}} and 
> {{ChecksummedRandomAccessReader}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10179) Duplicate index should throw AlreadyExistsException

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10179:
-
Component/s: Secondary Indexes

> Duplicate index should throw AlreadyExistsException
> ---
>
> Key: CASSANDRA-10179
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10179
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Secondary Indexes
>Reporter: T Jake Luciani
>Priority: Minor
> Fix For: 4.x
>
>
> If a 2i already exists we currently throw an InvalidQueryException.  This 
> should be an AlreadyExistsException, for consistency with trying to create the 
> same CQL table twice.
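For illustration, how the two behaviours would look from a client's point of view, assuming the DataStax Java driver 3.x; the keyspace, table and index names are made up.

{code}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.exceptions.AlreadyExistsException;
import com.datastax.driver.core.exceptions.InvalidQueryException;

public class DuplicateIndexExample
{
    public static void main(String[] args)
    {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("ks"))
        {
            session.execute("CREATE INDEX user_email_idx ON users (email)");
            try
            {
                // Creating the same index a second time: today this surfaces as an
                // InvalidQueryException; the ticket proposes AlreadyExistsException,
                // matching what a duplicate CREATE TABLE already throws.
                session.execute("CREATE INDEX user_email_idx ON users (email)");
            }
            catch (InvalidQueryException current)
            {
                System.out.println("current behaviour: " + current.getMessage());
            }
            catch (AlreadyExistsException proposed)
            {
                System.out.println("proposed behaviour: " + proposed.getMessage());
            }
        }
    }
}
{code}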



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-11968) More metrics on native protocol requests & responses

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-11968:
-
Component/s: Metrics

> More metrics on native protocol requests & responses
> 
>
> Key: CASSANDRA-11968
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11968
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Metrics
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 4.x
>
>
> Proposal to add more metrics to the native protocol:
> - number of requests per request-type
> - number of responses by response-type
> - size of request messages in bytes
> - size of response messages in bytes
> - number of in-flight requests (from request arrival to response)
> (Will provide a patch soon)
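A rough sketch of how such counters could be tracked with the Dropwizard Metrics library Cassandra already uses elsewhere; the metric names and the request/response hook points are assumptions, not the upcoming patch.

{code}
import com.codahale.metrics.Counter;
import com.codahale.metrics.MetricRegistry;

public class NativeProtocolMetricsSketch
{
    private final MetricRegistry registry = new MetricRegistry();
    private final Counter inFlight = registry.counter("client.requests.in-flight");

    // Called when a request frame is decoded (hook point assumed for illustration).
    void onRequest(String requestType, int requestSizeBytes)
    {
        registry.meter("client.requests." + requestType).mark();
        registry.histogram("client.request-size." + requestType).update(requestSizeBytes);
        inFlight.inc();
    }

    // Called when the corresponding response frame is flushed.
    void onResponse(String responseType, int responseSizeBytes)
    {
        registry.meter("client.responses." + responseType).mark();
        registry.histogram("client.response-size." + responseType).update(responseSizeBytes);
        inFlight.dec();
    }

    public static void main(String[] args)
    {
        NativeProtocolMetricsSketch metrics = new NativeProtocolMetricsSketch();
        metrics.onRequest("QUERY", 220);
        metrics.onResponse("RESULT", 4096);
        System.out.println("in-flight now: " + metrics.inFlight.getCount());
    }
}
{code}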



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-11868) unused imports and generic types

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-11868:
-
Component/s: Core

> unused imports and generic types
> 
>
> Key: CASSANDRA-11868
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11868
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Major
> Fix For: 4.x
>
>
> I was going through the Cassandra source and, for busy work, started looking at 
> all the .java files Eclipse flags with warnings. They break down roughly into a 
> few cases: 
> 1) unused imports 
> 2) raw types missing <> 
> 3) case statements without defaults 
> 4) @resource annotation 
> My IDE claims item 4 is not needed (it looks like we have done this to 
> signify methods that return objects that need to be closed). I can guess 4 was 
> done intentionally, and short of making our own annotation I will ignore these 
> for now. 
> I would like to tackle this busy work before I get started. I have some 
> questions: 
> 1) Should this be done only on trunk, or on multiple branches? 
> 2) Should I tackle 1, 2 and 3 in separate branches/patches?
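For concreteness, a tiny made-up example of what warning categories 2 and 3 look like and the mechanical fixes they imply (the class is not from the Cassandra source):

{code}
import java.util.ArrayList;
import java.util.List;

public class WarningExamples
{
    // Category 2: raw type. "List names = new ArrayList();" compiles but loses type
    // safety; adding the type arguments (and the diamond) removes the warning.
    List<String> names = new ArrayList<>();

    // Category 3: switch without a default. An explicit default documents the
    // "impossible" branch instead of silently falling through.
    static String describe(int state)
    {
        switch (state)
        {
            case 0:  return "stopped";
            case 1:  return "running";
            default: throw new IllegalStateException("unknown state " + state);
        }
    }

    public static void main(String[] args)
    {
        System.out.println(describe(1));
    }
}
{code}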



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10283) Create central class that represents node's local data store

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10283:
-
Component/s: Core

> Create central class that represents node's local data store
> 
>
> Key: CASSANDRA-10283
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10283
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core, Local Write-Read Paths
>Reporter: Yuki Morishita
>Priority: Minor
>  Labels: refactoring
> Fix For: 4.x
>
>
> This is related to CASSANDRA-7837, which aims to take down static 
> initializations and singletons. Instead of doing it all at once, we can / 
> should divide and conquer by splitting the internals into several parts, as discussed in that 
> ticket. I'm just throwing in my thoughts for further discussion.
> The node-local data store (commitlog, memtable, SSTable etc.) is at the core 
> of Cassandra, and I think it is a good place to start.
> The central class for the local data store (I refer to it as 
> {{CassandraDataStore}} from here) manages the following:
> * CommitLog
> * CacheService
> * IndexSummaryManager
> * Keyspace / ColumnFamilyStore
> * CompactionManager
> * Schema / SchemaKeyspace
> * MemtablePool
> and possibly others, basically those that don't use {{MessagingService}}. 
> These classes won't have static initializers/accessors or singletons, as 
> {{CassandraDataStore}} initializes and wires them as necessary. We can also 
> take access to {{DatabaseDescriptor}} away from these classes, since 
> {{CassandraDataStore}} does the initialization (but this can be done separately).
> Of course, even after this, we still need one singleton instance, 
> {{CassandraDataStore}} itself, to be accessed from other modules, but it 
> will eventually go away as we take down the other singletons.
> Benefits of doing this include:
> * A more explicit startup and cleanup procedure. It is hard to tell what is 
> initialized when right now, and sometimes that creates unpredictable results.
> * Simpler unit test setup. We don't need to bootstrap messaging or gossip to 
> test local-only functionality.
> Even though the scope of the change is limited compared to CASSANDRA-7837, it 
> still needs a lot of effort and the change to the code is still huge. But I 
> believe this is worth a shot, and I appreciate any feedback regarding 
> feasibility.
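A shape-only sketch of the wiring idea, with hypothetical interfaces standing in for the real classes listed above; constructor-injected dependencies replace static singletons, so startup order is explicit and tests can substitute fakes.

{code}
// Illustration only: CommitLogService and friends are invented interfaces,
// not the real org.apache.cassandra classes, and the wiring order is made up.
public class CassandraDataStoreSketch
{
    interface CommitLogService {}
    interface CacheService {}
    interface SchemaService {}

    static class CassandraDataStore
    {
        private final CommitLogService commitLog;
        private final CacheService caches;
        private final SchemaService schema;

        // Everything is passed in explicitly instead of reached through static
        // singletons, so what is initialized when is visible at the call site.
        CassandraDataStore(CommitLogService commitLog, CacheService caches, SchemaService schema)
        {
            this.commitLog = commitLog;
            this.caches = caches;
            this.schema = schema;
        }
    }

    public static void main(String[] args)
    {
        // A unit test can wire only what it needs, with no messaging or gossip bootstrap.
        CassandraDataStore store =
            new CassandraDataStore(new CommitLogService() {}, new CacheService() {}, new SchemaService() {});
        System.out.println("wired: " + store);
    }
}
{code}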



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10283) Create central class that represents node's local data store

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10283:
-
Component/s: Local Write-Read Paths

> Create central class that represents node's local data store
> 
>
> Key: CASSANDRA-10283
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10283
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core, Local Write-Read Paths
>Reporter: Yuki Morishita
>Priority: Minor
>  Labels: refactoring
> Fix For: 4.x
>
>
> This is related to CASSANDRA-7837, which aims to take down static 
> initializations and singletons. Instead of doing it all at once, we can / 
> should divide and conquer by splitting the internals into several parts, as discussed in that 
> ticket. I'm just throwing in my thoughts for further discussion.
> The node-local data store (commitlog, memtable, SSTable etc.) is at the core 
> of Cassandra, and I think it is a good place to start.
> The central class for the local data store (I refer to it as 
> {{CassandraDataStore}} from here) manages the following:
> * CommitLog
> * CacheService
> * IndexSummaryManager
> * Keyspace / ColumnFamilyStore
> * CompactionManager
> * Schema / SchemaKeyspace
> * MemtablePool
> and possibly others, basically those that don't use {{MessagingService}}. 
> These classes won't have static initializers/accessors or singletons, as 
> {{CassandraDataStore}} initializes and wires them as necessary. We can also 
> take access to {{DatabaseDescriptor}} away from these classes, since 
> {{CassandraDataStore}} does the initialization (but this can be done separately).
> Of course, even after this, we still need one singleton instance, 
> {{CassandraDataStore}} itself, to be accessed from other modules, but it 
> will eventually go away as we take down the other singletons.
> Benefits of doing this include:
> * A more explicit startup and cleanup procedure. It is hard to tell what is 
> initialized when right now, and sometimes that creates unpredictable results.
> * Simpler unit test setup. We don't need to bootstrap messaging or gossip to 
> test local-only functionality.
> Even though the scope of the change is limited compared to CASSANDRA-7837, it 
> still needs a lot of effort and the change to the code is still huge. But I 
> believe this is worth a shot, and I appreciate any feedback regarding 
> feasibility.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10226) Support multiple non-PK cols in MV clustering key when partition key is shared

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10226:
-
Component/s: Materialized Views

> Support multiple non-PK cols in MV clustering key when partition key is shared
> --
>
> Key: CASSANDRA-10226
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10226
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Materialized Views
>Reporter: Tyler Hobbs
>Priority: Major
>  Labels: materializedviews
> Fix For: 4.x
>
>
> This issue is similar to CASSANDRA-9928, but with one key limitation: the MV 
> partition key must match the base table's partition key.  This limitation 
> results in the base replica always pairing with itself as the MV replica.  
> Because of this pairing, if the base replica is lost, any MV rows that would 
> otherwise be ambiguous are also lost.  This allows us to avoid the problem 
> described in 9928 of not knowing which MV row to delete.
> Although this limitation has the potential to be a bit confusing for users, I 
> believe this improvement is still worthwhile because:
> * The base table's partition key will often be a good choice for the MV 
> partition key as well.  I expect it to be common for users to partition data 
> the same way, but use a different clustering order to optimize for (or allow 
> for) different queries.
> * It may take a long time to solve the problems presented in 9928 in general 
> (if we can solve them at all).  On the other hand, this is straightforward 
> and is a significant improvement to the usability of MVs.
> I have a minimal prototype of this that works well, so I should be able to 
> upload a patch with thorough tests within the next few days.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10307) Avoid always locking the partition key when a table has a materialized view

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10307:
-
Component/s: Materialized Views

> Avoid always locking the partition key when a table has a materialized view
> ---
>
> Key: CASSANDRA-10307
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10307
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Materialized Views
>Reporter: T Jake Luciani
>Priority: Major
>  Labels: materializedviews
> Fix For: 4.x
>
>
> When a table has associated materialized views we must restrict other 
> concurrent changes to the affected rows.  We currently lock the entire 
> partition.  
> The issue is that many updates to the same partition on the base table are now 
> effectively serialized.
> We can't lock the primary key instead, because range tombstones cover a range 
> of rows.
> If we created (or perhaps reused, if one already exists) a clustering range class, 
> we could lock at that level. 
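A sketch of what locking on a (partition key, clustering range) pair could look like, using Guava's Striped locks; the key class and its fields are invented purely for illustration.

{code}
import com.google.common.util.concurrent.Striped;
import java.util.Objects;
import java.util.concurrent.locks.Lock;

public class ViewLockSketch
{
    // Key on (partition key, clustering range) rather than on the partition alone.
    static final class ClusteringRangeKey
    {
        final String partitionKey;
        final String rangeStart;
        final String rangeEnd;

        ClusteringRangeKey(String partitionKey, String rangeStart, String rangeEnd)
        {
            this.partitionKey = partitionKey;
            this.rangeStart = rangeStart;
            this.rangeEnd = rangeEnd;
        }

        @Override public boolean equals(Object o)
        {
            if (!(o instanceof ClusteringRangeKey)) return false;
            ClusteringRangeKey k = (ClusteringRangeKey) o;
            return partitionKey.equals(k.partitionKey)
                && rangeStart.equals(k.rangeStart)
                && rangeEnd.equals(k.rangeEnd);
        }

        @Override public int hashCode() { return Objects.hash(partitionKey, rangeStart, rangeEnd); }
    }

    private static final Striped<Lock> LOCKS = Striped.lazyWeakLock(1024);

    static void applyUpdate(ClusteringRangeKey key, Runnable update)
    {
        Lock lock = LOCKS.get(key);
        lock.lock();
        try
        {
            // Updates to different clustering ranges of the same partition usually
            // land on different stripes, so they no longer all serialize on one lock.
            update.run();
        }
        finally
        {
            lock.unlock();
        }
    }

    public static void main(String[] args)
    {
        applyUpdate(new ClusteringRangeKey("pk1", "a", "m"), () -> System.out.println("update applied"));
    }
}
{code}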



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10245) Provide after the fact visibility into the reliability of the environment C* operates in

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10245:
-
Component/s: Observability

> Provide after the fact visibility into the reliability of the environment C* 
> operates in
> 
>
> Key: CASSANDRA-10245
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10245
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Observability
>Reporter: Ariel Weisberg
>Priority: Major
> Fix For: 4.x
>
>
> I think that by default databases should not be completely dependent on 
> operator provided tools for monitoring node and network health.
> The database should be able to detect and report on several dimensions of 
> performance in its environment, and more specifically report on deviations 
> from acceptable performance.
> * Node wide pauses
> * JVM wide pauses
> * Latency, and roundtrip time to all endpoints
> * Block device IO latency
> If flight recorder were available for use in production I would say as a 
> start just turn that on, add jHiccup (inside and outside the server process), 
> and a daemon inside the server to measure network performance between 
> endpoints.
> FR is not available (requires a license in production) so instead focus on 
> adding instrumentation for the most useful facets of flight recorder in 
> diagnosing performance issues. I think we can get pretty far because what we 
> need to do is not quite as undirected as the exploration FR and JMC 
> facilitate.
> Until we dial in how we measure and how to signal without false positives I 
> would expect this kind of logging to be in the background for post-hoc 
> analysis.
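As a starting point for the "JVM wide pauses" item, a minimal jHiccup-style detector that only needs the JDK; the interval and threshold values are arbitrary illustration choices.

{code}
// Sleep for a fixed interval and record how much longer than that the wakeup
// actually took; a large excess indicates a process- or JVM-wide pause.
public class PauseDetectorSketch
{
    public static void main(String[] args) throws InterruptedException
    {
        final long intervalMillis = 1;
        final long reportThresholdMillis = 50;

        while (true)
        {
            long start = System.nanoTime();
            Thread.sleep(intervalMillis);
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            long excess = elapsedMillis - intervalMillis;
            if (excess >= reportThresholdMillis)
                // In a real implementation this would feed a histogram or a background
                // log for post-hoc analysis rather than printing.
                System.out.printf("pause of ~%d ms detected%n", excess);
        }
    }
}
{code}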



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10154) Uncompressed data files should serialize inline checksums, over smaller blocks

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10154:
-
Component/s: Core

> Uncompressed data files should serialize inline checksums, over smaller blocks
> --
>
> Key: CASSANDRA-10154
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10154
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Priority: Major
> Fix For: 4.x
>
>
> As has been raised on numerous occasions, we currently perform no "bitrot" 
> detection on uncompressed data file reads. Currently we serialize an 
> incremental checksum, but this is placed in a separate file, and covers a 64K 
> region. Both of these are undesirable characteristics.
> Ideally following on from CASSANDRA-10153 (or perhaps as part thereof), we 
> should somewhat unify the read paths of uncompressed and compressed data with 
> regard to checksums.
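A toy sketch of inline per-block checksumming for an uncompressed file; the 4 KiB block size and the framing are illustrative choices, not the format this ticket would settle on.

{code}
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.CRC32;

public class InlineChecksumSketch
{
    static final int BLOCK_SIZE = 4096;

    // Every fixed-size block of data is immediately followed by its CRC32, so the
    // checksum lives in the same file, next to the region it covers.
    static byte[] writeWithChecksums(byte[] data) throws IOException
    {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        for (int offset = 0; offset < data.length; offset += BLOCK_SIZE)
        {
            int length = Math.min(BLOCK_SIZE, data.length - offset);
            CRC32 crc = new CRC32();
            crc.update(data, offset, length);
            out.write(data, offset, length);
            out.writeInt((int) crc.getValue());
        }
        out.flush();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException
    {
        byte[] data = new byte[10_000];
        byte[] framed = writeWithChecksums(data);
        System.out.printf("payload=%d bytes, with inline checksums=%d bytes%n", data.length, framed.length);
    }
}
{code}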



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10175) cassandra-stress should be tolerant when a remote node shutdown

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10175:
-
Component/s: Stress

> cassandra-stress should be tolerant when a remote node shutdown 
> 
>
> Key: CASSANDRA-10175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10175
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Stress
>Reporter: Alan Boudreault
>Priority: Major
>  Labels: stress
> Fix For: 4.x
>
>
> Currently, if we start a stress session with 3 nodes and shut down one node, 
> stress will crash. It is caused by the lost JMX connection to that node, which 
> is used to collect some gc stats IIRC. 
> backtrace: https://gist.github.com/aboudreault/6cd82bb0acc681992414
> Stress should handle the lost JMX connection more gracefully so the session 
> can continue. Ideally, it should try to *reconnect* to JMX once the node is 
> back online.
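A sketch of the tolerant behaviour being asked for: keep the GC-stats JMX connection optional and retry it, rather than letting its loss abort the whole session; the host, port and retry handling are placeholder choices, not the stress tool's actual code.

{code}
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;
import java.io.IOException;

public class TolerantJmxPoller
{
    private final String serviceUrl;
    private volatile MBeanServerConnection connection;

    TolerantJmxPoller(String host, int port)
    {
        this.serviceUrl = "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    // Returns null while the node is down; the caller simply skips GC stats for that
    // node instead of aborting the stress session, and a later call retries.
    MBeanServerConnection connectionOrNull()
    {
        if (connection != null)
            return connection;
        try
        {
            JMXConnector connector = JMXConnectorFactory.connect(new JMXServiceURL(serviceUrl));
            connection = connector.getMBeanServerConnection();
        }
        catch (IOException e)
        {
            System.err.println("node unreachable over JMX, will retry later: " + e.getMessage());
        }
        return connection;
    }

    public static void main(String[] args)
    {
        TolerantJmxPoller poller = new TolerantJmxPoller("127.0.0.1", 7199);
        System.out.println("connected: " + (poller.connectionOrNull() != null));
    }
}
{code}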



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-6633) Dynamic Resize of Bloom Filters

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-6633:

Component/s: Core

> Dynamic Resize of Bloom Filters
> ---
>
> Key: CASSANDRA-6633
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6633
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Priority: Minor
>  Labels: performance
> Fix For: 4.x
>
>
> Dynamic resizing would be useful. The simplest way to achieve this is to have 
> separate address spaces for each hash function, so that we may 
> increase/decrease accuracy by simply loading/unloading another function (we 
> could even do interesting stuff in future like alternating the functions we 
> select if we find we're getting more false positives than should be expected);
> Faster loading/unloading would help this, and we could achieve this by 
> mmapping the bloom filter representation on systems that we can mlock.
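A toy sketch of the "separate address space per hash function" idea: each hash function owns its own bit set, so accuracy can be dialled down by simply unloading one of them; the hashing and sizes are simplistic placeholders.

{code}
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class ResizableBloomSketch
{
    private final int bitsPerFunction;
    private final List<BitSet> perFunctionBits = new ArrayList<>();

    ResizableBloomSketch(int bitsPerFunction, int functions)
    {
        this.bitsPerFunction = bitsPerFunction;
        for (int i = 0; i < functions; i++)
            perFunctionBits.add(new BitSet(bitsPerFunction));
    }

    private int bucket(Object key, int function)
    {
        // Placeholder mixing; a real filter would use independent strong hash functions.
        int h = key.hashCode() * (31 * (function + 1)) + function;
        return Math.floorMod(h, bitsPerFunction);
    }

    void add(Object key)
    {
        for (int f = 0; f < perFunctionBits.size(); f++)
            perFunctionBits.get(f).set(bucket(key, f));
    }

    boolean mightContain(Object key)
    {
        for (int f = 0; f < perFunctionBits.size(); f++)
            if (!perFunctionBits.get(f).get(bucket(key, f)))
                return false;
        return true;
    }

    // "Dynamic resize" downward: unload the last hash function's address space.
    void dropOneFunction()
    {
        if (perFunctionBits.size() > 1)
            perFunctionBits.remove(perFunctionBits.size() - 1);
    }

    public static void main(String[] args)
    {
        ResizableBloomSketch bloom = new ResizableBloomSketch(1 << 16, 4);
        bloom.add("key-1");
        System.out.println(bloom.mightContain("key-1")); // true
        bloom.dropOneFunction();                          // accuracy drops, memory shrinks
        System.out.println(bloom.mightContain("key-1")); // still true: dropping a function never clears bits
    }
}
{code}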



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-6506) counters++ split counter context shards into separate cells

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-6506:

Component/s: Core

> counters++ split counter context shards into separate cells
> ---
>
> Key: CASSANDRA-6506
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6506
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Aleksey Yeschenko
>Priority: Major
>  Labels: counters
> Fix For: 4.x
>
>
> This change is related to, but somewhat orthogonal to CASSANDRA-6504.
> Currently all the shard tuples for a given counter cell are packed, in sorted 
> order, in one binary blob. Thus reconciling N counter cells requires 
> allocating a new byte buffer capable of holding the union of the two 
> contexts' shards N-1 times.
> For writes, in post CASSANDRA-6504 world, it also means reading more data 
> than we have to (the complete context, when all we need is the local node's 
> global shard).
> Splitting the context into separate cells, one cell per shard, will help to 
> improve this. We did a similar thing with super columns for CASSANDRA-3237. 
> Incidentally, doing this split is now possible thanks to CASSANDRA-3237.
> Doing this would also simplify counter reconciliation logic. Getting rid of 
> old contexts altogether can be done trivially with upgradesstables.
> In fact, we should be able to put the logical clock into the cell's 
> timestamp, and use regular Cell-s and regular Cell reconcile() logic for the 
> shards, especially once we get rid of the local/remote shards some time in 
> the future (until then we still have to differentiate between 
> global/remote/local shards and their priority rules).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-6812) Iterative Memtable->SSTable Replacement

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-6812:

Component/s: Local Write-Read Paths

> Iterative Memtable->SSTable Replacement
> ---
>
> Key: CASSANDRA-6812
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6812
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Benedict
>Priority: Major
>  Labels: performance
> Fix For: 4.x
>
>
> In an ideal world we wouldn't flush any memtable until we were almost 
> completely out of room. The problem with this approach (and in fact whenever 
> we currently *do* run out of room) is that flushing an entire memtable is a 
> slow process, and so write latencies spike dramatically during this interval.
> The solution to this is, in principle, quite straightforward: as we write 
> chunks of the new sstable and its index, open them up immediately for 
> reading, and free the memory associated with the portion of the file that has 
> been written so that it can be reused immediately for writing. This way 
> whilst latency will increase for the duration of the flush, the max latency 
> experienced during this time should be no greater than the time taken to 
> flush a few chunks, which should still be on the order of milliseconds, not 
> seconds.
> This depends on CASSANDRA-6689 and CASSANDRA-6694, so that we can reclaim 
> arbitrary portions of the allocated memory prior to a complete flush.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-7029) Investigate alternative transport protocols for both client and inter-server communications

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-7029:

Component/s: Streaming and Messaging
 CQL

> Investigate alternative transport protocols for both client and inter-server 
> communications
> ---
>
> Key: CASSANDRA-7029
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7029
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL, Streaming and Messaging
>Reporter: Benedict
>Priority: Major
>  Labels: performance
> Fix For: 4.x
>
>
> There are a number of reasons to think we can do better than TCP for our 
> communications:
> 1) We can actually tolerate sporadic small message losses, so guaranteed 
> delivery isn't essential (although for larger messages it probably is)
> 2) As shown in \[1\] and \[2\], Linux can behave quite suboptimally with 
> regard to TCP message delivery when the system is under load. Judging from 
> the theoretical description, this is likely to apply even when the 
> system-load is not high, but the number of processes to schedule is high. 
> Cassandra generally has a lot of threads to schedule, so this is quite 
> pertinent for us. UDP performs substantially better here.
> 3) Even when the system is not under load, UDP has a lower CPU burden, and 
> that burden is constant regardless of the number of connections it processes. 
> 4) On a simple benchmark on my local PC, using non-blocking IO for UDP and 
> busy spinning on IO I can actually push 20-40% more throughput through 
> loopback (where TCP should be optimal, as no latency), even for very small 
> messages. Since we can see networking taking multiple CPUs' worth of time 
> during a stress test, using a busy-spin for ~100micros after last message 
> receipt is almost certainly acceptable, especially as we can (ultimately) 
> process inter-server and client communications on the same thread/socket in 
> this model.
> 5) We can optimise the threading model heavily: since we generally process 
> very small messages (200 bytes not at all implausible), the thread signalling 
> costs on the processing thread can actually dramatically impede throughput. 
> In general it costs ~10micros to signal (and passing the message to another 
> thread for processing in the current model requires signalling). For 200-byte 
> messages this caps our throughput at 20MB/s.
> I propose to knock up a highly naive UDP-based connection protocol with 
> super-trivial congestion control over the course of a few days, with the only 
> initial goal being maximum possible performance (not fairness, reliability, 
> or anything else), and trial it in Netty (possibly making some changes to 
> Netty to mitigate thread signalling costs). The reason for knocking up our 
> own here is to get a ceiling on what the absolute limit of potential for this 
> approach is. Assuming this pans out with performance gains in C* proper, we 
> then look to contributing to/forking the udt-java project and see how easy it 
> is to bring performance in line with what we can get with our naive approach 
> (I don't suggest starting here, as the project is using blocking old-IO, and 
> modifying it with latency in mind may be challenging, and we won't know for 
> sure what the best case scenario is).
> \[1\] 
> http://test-docdb.fnal.gov/0016/001648/002/Potential%20Performance%20Bottleneck%20in%20Linux%20TCP.PDF
> \[2\] 
> http://cd-docdb.fnal.gov/cgi-bin/RetrieveFile?docid=1968;filename=Performance%20Analysis%20of%20Linux%20Networking%20-%20Packet%20Receiving%20(Official).pdf;version=2
> Further related reading:
> http://public.dhe.ibm.com/software/commerce/doc/mft/cdunix/41/UDTWhitepaper.pdf
> https://mospace.umsystem.edu/xmlui/bitstream/handle/10355/14482/ChoiUndPerTcp.pdf?sequence=1
> https://access.redhat.com/site/documentation/en-US/JBoss_Enterprise_Web_Platform/5/html/Administration_And_Configuration_Guide/jgroups-perf-udpbuffer.html
> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.153.3762=rep1=pdf
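Regarding point 4, a bare-bones sketch of a non-blocking UDP receive loop that busy-spins briefly after the last datagram before yielding; the port, spin window and buffer size are arbitrary values, not figures from the benchmark described.

{code}
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;

public class UdpBusySpinSketch
{
    public static void main(String[] args) throws Exception
    {
        DatagramChannel channel = DatagramChannel.open();
        channel.bind(new InetSocketAddress(9042));
        channel.configureBlocking(false);

        ByteBuffer buffer = ByteBuffer.allocateDirect(64 * 1024);
        final long spinWindowNanos = 100_000; // ~100 micros of spinning after the last message
        long lastReceive = System.nanoTime();

        while (!Thread.currentThread().isInterrupted())
        {
            buffer.clear();
            SocketAddress sender = channel.receive(buffer); // null immediately if nothing is queued
            if (sender != null)
            {
                buffer.flip();
                lastReceive = System.nanoTime();
                // handle the (small) message inline on this thread: no cross-thread signalling cost
            }
            else if (System.nanoTime() - lastReceive > spinWindowNanos)
            {
                Thread.yield(); // back off once the spin window has elapsed
            }
        }
    }
}
{code}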



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-6038) Support schema changes in mixed version clusters

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-6038:

Component/s: Distributed Metadata

> Support schema changes in mixed version clusters
> 
>
> Key: CASSANDRA-6038
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6038
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Distributed Metadata
>Reporter: Aleksey Yeschenko
>Priority: Major
>  Labels: client-impacting
> Fix For: 4.x
>
>
> CASSANDRA-5845 made it so that schema changes in a major-mixed cluster are 
> not propagated to the minorer-major nodes. This lets us perform 
> backwards-incompatible schema changes in major releases safely - like adding 
> the schema_triggers table, moving all the aliases to schema_columns, getting 
> rid of the deprecated schema columns, etc.
> Even this limitation might be too strict, however, and should be avoided if 
> possible (starting with at least versioning schema separately from messaging 
> service versioning, and resorting to major->minor schema block only when 
> truly necessary and not for every x->y pair). 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-5992) Add a logger.trace call to Tracing

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-5992:

Component/s: Observability

> Add a logger.trace call to Tracing
> --
>
> Key: CASSANDRA-5992
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5992
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability
>Reporter: Jeremiah Jordan
>Assignee: Ryan Magnusson
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: CASSANDRA-5992.txt
>
>
> A bunch of stuff is now written to Tracing, and there are no logging trace 
> calls any more.  If a node is having issues on the read/write path and 
> tracing can't actually save data, there is no way to see through logging what 
> is going on.  Would be nice if we made all Tracing messages also go out at 
> logger.trace so that you could enable that to debug stuff.
> Being able to change the RF of the system_traces KS might also help here, but 
> there would still be classes of problems that it would be good to have the 
> logging there for. Added CASSANDRA-6016 for that.
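A minimal sketch of the proposal: mirror every tracing message to the normal logger at TRACE level; the trace(...) method shown is a hypothetical stand-in, not the actual org.apache.cassandra.tracing.Tracing entry point.

{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class TracingLogSketch
{
    private static final Logger logger = LoggerFactory.getLogger(TracingLogSketch.class);

    public static void trace(String format, Object... args)
    {
        // Mirrored to the log, independent of whether system_traces can persist data.
        if (logger.isTraceEnabled())
            logger.trace(format, args);
        // ... existing behaviour: record the message to the tracing session ...
    }

    public static void main(String[] args)
    {
        trace("Executing single-partition query on {}", "users");
    }
}
{code}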



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-6474) Compaction strategy based on MinHash

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-6474:

Component/s: Compaction

> Compaction strategy based on MinHash
> 
>
> Key: CASSANDRA-6474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6474
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction
>Reporter: Yuki Morishita
>Assignee: sankalp kohli
>Priority: Major
>  Labels: compaction
> Fix For: 4.x
>
>
> We can consider an SSTable as a set of partition keys, and 'compaction' as 
> de-duplication of those partition keys.
> We want to find compaction candidates from SSTables that have as many keys 
> in common as possible. If we can group similar SSTables based on some measurement, 
> we can achieve more efficient compaction.
> One such measurement is [Jaccard 
> Distance|http://en.wikipedia.org/wiki/Jaccard_index],
> !http://upload.wikimedia.org/math/1/8/6/186c7f4e83da32e889d606140fae25a0.png!
> which we can estimate using technique called 
> [MinHash|http://en.wikipedia.org/wiki/MinHash].
> In Cassandra, we can calculate and store a MinHash signature when writing an 
> SSTable. A new compaction strategy would use the signature to find groups of 
> similar SSTables as compaction candidates. We can always fall back to STCS 
> when no such candidates exist.
> This is just an idea floating around my head, but before I forget, I dump it 
> here. For introduction to this technique, [Chapter 3 of 'Mining of Massive 
> Datasets'|http://infolab.stanford.edu/~ullman/mmds/ch3.pdf] is a good start.
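A minimal, self-contained MinHash illustration (not a proposed on-disk format): build a k-slot signature per SSTable's key set and estimate Jaccard similarity as the fraction of matching slots.

{code}
import java.util.Arrays;
import java.util.Set;

public class MinHashSketch
{
    static long[] signature(Set<String> partitionKeys, int k)
    {
        long[] mins = new long[k];
        Arrays.fill(mins, Long.MAX_VALUE);
        for (String key : partitionKeys)
        {
            for (int i = 0; i < k; i++)
            {
                // Cheap per-slot mixing of the key's hash; real code would use
                // independent hash functions (e.g. Murmur3 with different seeds).
                long h = (key.hashCode() * 0x9E3779B97F4A7C15L) ^ (i * 0xC2B2AE3D27D4EB4FL);
                if (h < mins[i])
                    mins[i] = h;
            }
        }
        return mins;
    }

    static double estimatedJaccard(long[] sigA, long[] sigB)
    {
        int matches = 0;
        for (int i = 0; i < sigA.length; i++)
            if (sigA[i] == sigB[i])
                matches++;
        return (double) matches / sigA.length;
    }

    public static void main(String[] args)
    {
        Set<String> sstableA = Set.of("k1", "k2", "k3", "k4", "k5", "k6", "k7", "k8");
        Set<String> sstableB = Set.of("k1", "k2", "k3", "k4", "k9", "k10", "k11", "k12");
        long[] a = signature(sstableA, 128);
        long[] b = signature(sstableB, 128);
        // True Jaccard here is 4/12; with proper hash functions the estimate converges to it as k grows.
        System.out.printf("estimated Jaccard similarity: %.2f%n", estimatedJaccard(a, b));
    }
}
{code}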



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-6680) Clock skew detection via gossip

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-6680:

Component/s: Distributed Metadata

> Clock skew detection via gossip
> ---
>
> Key: CASSANDRA-6680
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6680
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Distributed Metadata
>Reporter: Brandon Williams
>Priority: Minor
> Fix For: 4.x
>
>
> Gossip's HeartbeatState keeps the generation (local timestamp the node was 
> started) and version (monotonically increasing per gossip interval) which 
> could be used to roughly calculate the node's current time, enabling 
> detection of gossip messages too far in the future for the clocks to be 
> synced.
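A back-of-envelope sketch of the detection idea, assuming the roughly one-second gossip round and an arbitrary 60-second tolerance; generation and version have the meanings described above.

{code}
// generation is the peer's start time in epoch seconds and version increments once per
// gossip round, so generation + version approximates the peer's current clock.
public class GossipClockSkewSketch
{
    static final long GOSSIP_ROUND_SECONDS = 1;
    static final long MAX_TOLERATED_SKEW_SECONDS = 60;

    static long estimatedPeerEpochSeconds(long generation, long version)
    {
        return generation + version * GOSSIP_ROUND_SECONDS;
    }

    static boolean looksSkewed(long generation, long version)
    {
        long localNow = System.currentTimeMillis() / 1000;
        long skew = Math.abs(localNow - estimatedPeerEpochSeconds(generation, version));
        return skew > MAX_TOLERATED_SKEW_SECONDS;
    }

    public static void main(String[] args)
    {
        long localNow = System.currentTimeMillis() / 1000;
        // A peer that started 10 minutes ago and has gossiped ~600 rounds looks fine...
        System.out.println(looksSkewed(localNow - 600, 600));
        // ...while one whose heartbeat implies a clock 5 minutes in the future does not.
        System.out.println(looksSkewed(localNow - 600, 900));
    }
}
{code}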



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-8943) Accept reads while bootstrapping for available ranges

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-8943:

Component/s: Lifecycle

> Accept reads while bootstrapping for available ranges
> -
>
> Key: CASSANDRA-8943
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8943
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Lifecycle
>Reporter: Yuki Morishita
>Assignee: Yuki Morishita
>Priority: Minor
> Fix For: 4.x
>
>
> The last step for CASSANDRA-8494 is to make the bootstrapping node accept read 
> requests and serve data for the available keyspaces/ranges.
> If the requested data is not available, the bootstrapping node can either 1) send back 
> 'Not Available' and let the coordinator forward the request to a current replica node, 
> or 2) send a proxy request to the replica node and let it respond to the coordinator.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-6441) Explore merging memtables directly with L1

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-6441:

Component/s: Compaction

> Explore merging memtables directly with L1
> --
>
> Key: CASSANDRA-6441
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6441
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Jonathan Ellis
>Priority: Minor
>  Labels: compaction
> Fix For: 4.x
>
>
> Currently, memtables flush to L0 and are then compacted with L1, so you 
> automatically have 100% write amplification for unique cells right off the 
> bat.
> http://dl.acm.org/citation.cfm?id=2213862 suggests splitting the memtable 
> into pieces corresponding to the ranges of the sstables in L1 and turning the 
> flush + compact into a single write -- that is, we'd "compact" the data in 
> the L1 sstable with the corresponding data in the memtable.
> This would add some complexity around blocking memtable sections until the 
> corresponding L1 piece is no longer involved in its own compaction with L2, 
> and probably a "panic dump" to the old L0 behavior if we run low on memory.  
> But in theory it sounds like a promising optimization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-6814) We should ensure that AbstractReadExecutor.makeDigestRequests never calls the local machine

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-6814:

Component/s: Coordination

> We should ensure that AbstractReadExecutor.makeDigestRequests never calls the 
> local machine
> ---
>
> Key: CASSANDRA-6814
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6814
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Benedict
>Assignee: Vijay
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 0001-CASSANDRA-6814.patch
>
>
> It should always opt to perform the actual read locally, when possible, and 
> this code path should assert that it does not read locally.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-6246) EPaxos

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-6246:

Component/s: Coordination

> EPaxos
> --
>
> Key: CASSANDRA-6246
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6246
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Jonathan Ellis
>Assignee: Blake Eggleston
>Priority: Major
>  Labels: LWT, messaging-service-bump-required
> Fix For: 4.x
>
>
> One reason we haven't optimized our Paxos implementation with Multi-paxos is 
> that Multi-paxos requires leader election and hence, a period of 
> unavailability when the leader dies.
> EPaxos is a Paxos variant that requires (1) less messages than multi-paxos, 
> (2) is particularly useful across multiple datacenters, and (3) allows any 
> node to act as coordinator: 
> http://sigops.org/sosp/sosp13/papers/p358-moraru.pdf
> However, there is substantial additional complexity involved if we choose to 
> implement it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-5745) Minor compaction tombstone-removal deadlock

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-5745:

Component/s: Compaction

> Minor compaction tombstone-removal deadlock
> ---
>
> Key: CASSANDRA-5745
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5745
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Jonathan Ellis
>Priority: Minor
> Fix For: 4.x
>
>
> From a discussion with Axel Liljencrantz,
> If you have two SSTables that have temporally overlapping data, you can get 
> lodged into a state where a compaction of SSTable A can't drop tombstones 
> because SSTable B contains older data *and vice versa*. Once that's happened, 
> Cassandra should be wedged into a state where CASSANDRA-4671 no longer helps 
> with tombstone removal. The only way to break the wedge would be to perform a 
> compaction containing both SSTable A and SSTable B. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13481) Provide a method for flushing a CQLSSTableWriter

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-13481:
-
Component/s: Libraries

> Provide a method for flushing a CQLSSTableWriter
> 
>
> Key: CASSANDRA-13481
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13481
> Project: Cassandra
>  Issue Type: Wish
>  Components: Libraries
>Reporter: Gabriel Garcia
>Priority: Minor
> Fix For: 4.x
>
>
> The buffer size estimation in SSTableSimpleUnsortedWriter is not very 
> accurate and causes OOM errors quite often (or lots of GC pressure) for my 
> specific use case (rows that vary greatly in # of columns). I guess if the 
> user knows their data well enough, they can schedule flushes when really 
> needed.
> It's just an idea, not sure if flushing would actually help. However, playing 
> with bufferSizeInMB is not of much help.
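For context, a sketch of today's API surface (paths, schema and values are made up); the only buffer-related knob is withBufferSizeInMB(), and the flush() call this ticket wishes for is shown commented out as the hypothetical addition.

{code}
import org.apache.cassandra.io.sstable.CQLSSTableWriter;

public class WriterFlushSketch
{
    public static void main(String[] args) throws Exception
    {
        String schema = "CREATE TABLE ks.events (id int PRIMARY KEY, payload text)";
        String insert = "INSERT INTO ks.events (id, payload) VALUES (?, ?)";

        CQLSSTableWriter writer = CQLSSTableWriter.builder()
                                                  .inDirectory("/tmp/sstables/ks/events")
                                                  .forTable(schema)
                                                  .using(insert)
                                                  .withBufferSizeInMB(64)   // today's only lever
                                                  .build();
        try
        {
            for (int i = 0; i < 1_000_000; i++)
            {
                writer.addRow(i, "payload-" + i);
                // if (i % 100_000 == 0) writer.flush();   // the method this ticket asks for (hypothetical)
            }
        }
        finally
        {
            writer.close();
        }
    }
}
{code}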



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13936) RangeTombstoneTest (compressed) failure - assertTimes expected:<1000> but was:<999>

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-13936:
-
Component/s: Testing

> RangeTombstoneTest (compressed) failure - assertTimes expected:<1000> but 
> was:<999>
> ---
>
> Key: CASSANDRA-13936
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13936
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Jeff Jirsa
>Priority: Major
>  Labels: Testing
> Fix For: 4.x
>
>
> In circleci run 
> [here|https://circleci.com/gh/jeffjirsa/cassandra/367#tests/containers/2] :
> {code}
> [junit] Testsuite: org.apache.cassandra.db.RangeTombstoneTest-compression
> [junit] Testsuite: org.apache.cassandra.db.RangeTombstoneTest-compression 
> Tests run: 14, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 10.945 sec
> [junit] 
> [junit] Testcase: 
> testTrackTimesRangeTombstoneWithData(org.apache.cassandra.db.RangeTombstoneTest)-compression:
>FAILED
> [junit] expected:<1000> but was:<999>
> [junit] junit.framework.AssertionFailedError: expected:<1000> but 
> was:<999>
> [junit]   at 
> org.apache.cassandra.db.RangeTombstoneTest.assertTimes(RangeTombstoneTest.java:314)
> [junit]   at 
> org.apache.cassandra.db.RangeTombstoneTest.testTrackTimesRangeTombstoneWithData(RangeTombstoneTest.java:308)
> [junit] 
> [junit] 
> [junit] Test org.apache.cassandra.db.RangeTombstoneTest FAILED
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13951) Release Java and Python drivers and re-enable build.xml driver dep

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-13951:
-
Component/s: Libraries
 Build

> Release Java and Python drivers and re-enable build.xml driver dep
> --
>
> Key: CASSANDRA-13951
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13951
> Project: Cassandra
>  Issue Type: Task
>  Components: Build, Libraries
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Blocker
> Fix For: 4.0, 4.x
>
>
> During [CASSANDRA-10786], driver changes were introduced that required a 
> snapshot release of both drivers. 
> When the stable driver version is published, we have to update the driver 
> dependencies in {{lib}} and un-comment {{build.xml}} sections with 
> {{java-driver}} information for correct publish of 4.0. This manipulation was 
> required in order to account for the chicken-or-egg problem between Cassandra 
> itself and the drivers. 
> This absolutely has to be done before 4.0, and there should be no release 
> without making sure the right drivers are in place.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13989) Update security docs for 4.0

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-13989:
-
Component/s: Documentation and Website

> Update security docs for 4.0
> 
>
> Key: CASSANDRA-13989
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13989
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: Jason Brown
>Assignee: Jason Brown
>Priority: Minor
> Fix For: 4.x
>
>
> CASSANDRA-8457 and CASSANDRA-10404 have brought changes to the way SSL works 
> for both internode messaging and the native protocol. Update the docs to 
> reflect information that is important to users/operators.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10496) Make DTCS/TWCS split partitions based on time during compaction

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10496:
-
Component/s: Compaction

> Make DTCS/TWCS split partitions based on time during compaction
> ---
>
> Key: CASSANDRA-10496
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10496
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Marcus Eriksson
>Priority: Major
>  Labels: dtcs, pull-request-available
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> To avoid getting old data in new time windows with DTCS (or related, like 
> [TWCS|CASSANDRA-9666]), we need to split out old data into its own sstable 
> during compaction.
> My initial idea is to just create two sstables, when we create the compaction 
> task we state the start and end times for the window, and any data older than 
> the window will be put in its own sstable.
> By creating a single sstable with old data, we will incrementally get the 
> windows correct - say we have an sstable with these timestamps:
> {{[100, 99, 98, 97, 75, 50, 10]}}
> and we are compacting in window {{[100, 80]}} - we would create two sstables:
> {{[100, 99, 98, 97]}}, {{[75, 50, 10]}}, and the first window is now 
> 'correct'. The next compaction would compact in window {{[80, 60]}} and 
> create sstables {{[75]}}, {{[50, 10]}} etc.
> We will probably also want to base the windows on the newest data in the 
> sstables so that we actually have older data than the window.
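A toy version of the split described above, using the example timestamps and window from the description; real code would of course operate on sstable contents rather than bare longs.

{code}
import java.util.ArrayList;
import java.util.List;

public class TimeWindowSplitSketch
{
    // Timestamps inside the window go to one output "sstable"; everything older is
    // split out into a second one.
    static List<List<Long>> splitByWindow(List<Long> timestamps, long windowStart, long windowEnd)
    {
        List<Long> inWindow = new ArrayList<>();
        List<Long> olderThanWindow = new ArrayList<>();
        for (long ts : timestamps)
        {
            if (ts <= windowStart && ts >= windowEnd)
                inWindow.add(ts);
            else
                olderThanWindow.add(ts);
        }
        return List.of(inWindow, olderThanWindow);
    }

    public static void main(String[] args)
    {
        List<Long> timestamps = List.of(100L, 99L, 98L, 97L, 75L, 50L, 10L);
        // Window [100, 80] -> [100, 99, 98, 97] and [75, 50, 10], matching the description.
        System.out.println(splitByWindow(timestamps, 100, 80));
    }
}
{code}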



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10414) Add Internal Support for Reading Multiple Token Ranges with Single Command

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10414:
-
Component/s: Coordination

> Add Internal Support for Reading Multiple Token Ranges with Single Command
> --
>
> Key: CASSANDRA-10414
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10414
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Tyler Hobbs
>Priority: Major
>  Labels: performance
> Fix For: 4.x
>
>
> Since CASSANDRA-1337, we've parallelized fetches of multiple token ranges 
> when handling range slices or secondary index queries.  However, a separate 
> command still needs to be created, issued, and handled for (almost) every 
> token range.  With vnodes enabled, the number of commands to handle becomes 
> quite large, introducing a lot of extra overhead.  If we allow ReadCommands 
> to contain multiple token ranges instead of a single range, we could 
> significantly improve this.
> A secondary bonus of doing this is that we will reduce over-fetching of rows. 
>  Right now, each command uses the same total limit.  If some of the token 
> ranges have more rows than we expect (based on the average partition size and 
> key estimates), the replicas will over-fetch rows, only for them to be 
> discarded by the coordinator.  In the worst case scenario, each token range 
> could contain LIMIT rows.  With a multi-range command we could still have 
> overfetching, but it would be more tightly bounded: each node could return at 
> most LIMIT rows (across all token ranges combined).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10495) Improve the way we do streaming with vnodes

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10495:
-
Component/s: Streaming and Messaging

> Improve the way we do streaming with vnodes
> ---
>
> Key: CASSANDRA-10495
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10495
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Streaming and Messaging
>Reporter: Marcus Eriksson
>Priority: Major
> Fix For: 4.x
>
>
> Streaming with vnodes usually creates a large number of sstables on the 
> target node - for example, if each source node has 100 sstables and we use 
> num_tokens = 256, a bootstrapping node might get 100*256 = 25,600 
> sstables.
> One approach could be to do an on-the-fly compaction on the source node, 
> meaning we would only stream out one sstable per range. Note that we will 
> want the compaction strategy to decide how to combine the sstables, for 
> example LCS will not want to mix sstables from different levels while STCS 
> can probably just combine everything
> cc [~yukim]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10467) QoS for Queries

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10467:
-
Component/s: CQL

> QoS for Queries
> ---
>
> Key: CASSANDRA-10467
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10467
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Benjamin Coverston
>Priority: Major
> Fix For: 4.x
>
>
> Background: As an OLTP system based on pipelined execution, Cassandra has worker 
> pools that can be saturated with long(er)-running calls. When the system is under 
> stress, those long-running calls can make requests that should be short-lived 
> take a much longer period of time.
> Introduce the concept of QoS into Cassandra for client queries. A few ideas:
> 1. Allow clients to specify a QoS to be sent to Cassandra from the driver as 
> part of the protocol.
> 2. Allow different requests to be tagged based on some simple criteria 
> (perhaps configured) (i.e. ALLOW FILTERING was part of the query)
> 3. QOS based on the LUN that is accessed (SSDs get higher QOS than the RAID5)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10405) MV updates should optionally wait for acknowledgement from view replicas

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10405:
-
Component/s: Materialized Views

> MV updates should optionally wait for acknowledgement from view replicas
> 
>
> Key: CASSANDRA-10405
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10405
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Materialized Views
>Reporter: Carl Yeksigian
>Priority: Minor
>  Labels: materializedviews
> Fix For: 4.x
>
>
> MV updates are currently completely asynchronous in order to provide 
> parallelism of updates trying to acquire the partition lock. For some use 
> cases, leaving the MV updates asynchronous is exactly what's needed.
> However, there are some use cases where knowing that the update has either 
> succeeded or failed on the view is necessary, especially when trying to allow 
> read-your-write behavior. In those cases, we would follow the same code path 
> as asynchronous writes, but at the end wait on the acknowledgements from the 
> view replicas before acknowledging our write. This option should be set for each 
> MV separately, since MVs which need the synchronous properties might be mixed 
> with MVs which do not need this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10383) Disable auto snapshot on selected tables.

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10383:
-
Component/s: Configuration

> Disable auto snapshot on selected tables.
> -
>
> Key: CASSANDRA-10383
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10383
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Tommy Stendahl
>Assignee: Tommy Stendahl
>Priority: Major
>  Labels: doc-impacting, messaging-service-bump-required
> Fix For: 4.x
>
> Attachments: 10383.txt
>
>
> I have a use case where I would like to turn off auto snapshot for selected 
> tables, but I don't want to turn it off completely since it's a good feature. 
> Looking at the code I think it would be relatively easy to fix.
> My plan is to create a new table property named something like 
> "disable_auto_snapshot". If set to false, it will prevent auto snapshot on the 
> table; if set to true, auto snapshot will be controlled by the "auto_snapshot" 
> property in cassandra.yaml. The default would be true.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9829) Dynamically adjust LCS level sizes

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9829:

Component/s: Compaction

> Dynamically adjust LCS level sizes
> --
>
> Key: CASSANDRA-9829
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9829
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction
>Reporter: Jonathan Ellis
>Priority: Major
>  Labels: compaction, lcs, performance
> Fix For: 4.x
>
>
> LCS works best when the top level is full.  Then 90% of reads can be served 
> from a single sstable.  By contrast if the top level is only 10% full then 
> 90% of reads will be served from two.  This results in worse performance as 
> well as confused users.
> To address this, we can adjust the ideal top level size to how much data is 
> actually in it (and set each corresponding lower level to 1/10 of the next 
> one above).
> (This is an idea [from 
> rocksdb|https://www.reddit.com/r/IAmA/comments/3de3cv/we_are_rocksdb_engineering_team_ask_us_anything/ct4asen].)
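> As a rough illustration of the arithmetic (illustrative names only, not code from this ticket):
> {code}
> class LevelTargets
> {
>     // Derive per-level targets from the actual size of the top level,
>     // instead of fixed 10^level multiples of a base size.
>     static long[] levelTargets(long topLevelBytes, int maxLevel)
>     {
>         long[] targets = new long[maxLevel + 1];
>         targets[maxLevel] = topLevelBytes;                           // top level target = what is actually there
>         for (int level = maxLevel - 1; level >= 1; level--)
>             targets[level] = Math.max(targets[level + 1] / 10, 1);   // each level 1/10 of the one above
>         return targets;
>     }
> }
> {code}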



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9796) Give 8099's like treatment to partition keys

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9796:

Component/s: Local Write-Read Paths

> Give 8099's like treatment to partition keys
> 
>
> Key: CASSANDRA-9796
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9796
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Sylvain Lebresne
>Priority: Major
> Fix For: 4.x
>
>
> Post-8099, we properly distinguish clustering columns at the engine level, 
> which allows a somewhat more efficient encoding: we don't write the size of 
> values of fixed width types, and we can properly store null values (which 
> will likely prove useful for CASSANDRA-6477 for instance).
> Partition keys, however, have had no such love: the storage engine still 
> manipulates them like a single blob and their encoding is not terribly 
> efficient: we always store the size of every value (even fixed width ones) 
> and for compound values we even store the size of the full partition key even 
> though it's redundant with the individual value sizes. The encoding also 
> doesn't allow nulls, which is inconvenient at least for CASSANDRA-6477.
> So I'd like to improve on this by:
> # making the {{DecoratedKey}} API (which I'd personally rename into 
> {{PartitionKey}}) expose the fact that we can have more than one value.  
> Typically by adding {{size()}} and {{get\(i\)}} methods like for 
> {{Clustering}}.  This would simplify a couple of places in the code where we 
> still manually decompose such values in particular.
> # improve their encoding. An easy/consistent solution for that would be to reuse 
> the same encoding as for {{Clustering}} (they are the same kind of beast), 
> though I'm open to other options.
> One small subtlety to be aware of is that whatever we do to the internal 
> encoding/implementation, we must make sure we still compute the same tokens.  
> But that's not particularly hard either.
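> A sketch of the accessor shape point 1 describes (hypothetical interface, mirroring {{Clustering}}):
> {code}
> import java.nio.ByteBuffer;
> 
> // Hypothetical shape only, not actual C* code.
> public interface PartitionKey
> {
>     int size();            // number of components; 1 for a non-compound key
>     ByteBuffer get(int i); // i-th component value (could be null once nulls are supported)
> }
> {code}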



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9772) Bound the number of concurrent range requests

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9772:

Component/s: Coordination

> Bound the number of concurrent range requests
> -
>
> Key: CASSANDRA-9772
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9772
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Tyler Hobbs
>Priority: Major
> Fix For: 4.x
>
>
> After CASSANDRA-1337, we will execute requests for many token ranges 
> concurrently based on our estimate of how many ranges will be required to 
> meet the requested LIMIT.  For queries with a lot of results this is 
> generally fine, because it will only take a few ranges to satisfy the limit.  
> However, for queries with very few results, this may result in the 
> coordinator concurrently requesting all token ranges.  On large vnode 
> clusters, this will be particularly problematic.
> Placing a simple bound on the number of concurrent requests is a good first 
> step.  Long-term, we should look into creating a new range command that 
> supports requesting multiple ranges.  This would eliminate the overhead of 
> serializing and handling hundreds of separate commands.
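> A minimal sketch of the first step - a simple concurrency bound via a semaphore (illustrative names, not actual C* code):
> {code}
> import java.util.List;
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Semaphore;
> 
> class BoundedRangeFetcher
> {
>     private final Semaphore inFlight;
> 
>     BoundedRangeFetcher(int maxConcurrentRanges)
>     {
>         this.inFlight = new Semaphore(maxConcurrentRanges);
>     }
> 
>     void fetchAll(List<Runnable> rangeRequests, ExecutorService executor) throws InterruptedException
>     {
>         for (Runnable request : rangeRequests)
>         {
>             inFlight.acquire();              // blocks once maxConcurrentRanges requests are outstanding
>             executor.execute(() -> {
>                 try
>                 {
>                     request.run();           // issue the per-range read
>                 }
>                 finally
>                 {
>                     inFlight.release();
>                 }
>             });
>         }
>     }
> }
> {code}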



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9929) More efficient memory handling for tiny off-heap cache entries

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9929:

Component/s: Core

> More efficient memory handling for tiny off-heap cache entries
> --
>
> Key: CASSANDRA-9929
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9929
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 4.x
>
>
> The cache from CASSANDRA-9739 usually has tiny cache entries. The overhead of any 
> general-purpose memory allocator is probably too high.
> [~benedict] proposed a chunked allocation strategy for OHC that allows 
> allocating the whole memory up-front - thus reducing the malloc/free overhead 
> to 0.
> (EDIT: Key cache does not have tiny cache entries - at least not now)
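> A toy sketch of the chunked idea - fixed-size slots carved out of a single up-front allocation (the real OHC design is considerably more involved):
> {code}
> import java.nio.ByteBuffer;
> import java.util.ArrayDeque;
> import java.util.Deque;
> 
> class SlabAllocator
> {
>     private final ByteBuffer slab;       // one up-front allocation for all entries
>     private final int slotSize;
>     private final Deque<Integer> freeSlots = new ArrayDeque<>();
> 
>     SlabAllocator(int slotSize, int slotCount)
>     {
>         this.slotSize = slotSize;
>         this.slab = ByteBuffer.allocateDirect(slotSize * slotCount);
>         for (int i = 0; i < slotCount; i++)
>             freeSlots.push(i * slotSize);
>     }
> 
>     // Returns the offset of a free fixed-size slot, or -1 if the slab is full.
>     int allocate()
>     {
>         Integer offset = freeSlots.poll();
>         return offset == null ? -1 : offset;
>     }
> 
>     // Freeing is just handing the offset back - no malloc/free per cache entry.
>     void free(int offset)
>     {
>         freeSlots.push(offset);
>     }
> 
>     // View of one slot, for reading/writing the cached entry.
>     ByteBuffer slot(int offset)
>     {
>         ByteBuffer view = slab.duplicate();
>         view.position(offset);
>         view.limit(offset + slotSize);
>         return view.slice();
>     }
> }
> {code}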



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9919) Cleanup closeable iterator usage

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9919:

Component/s: Core

> Cleanup closeable iterator usage
> 
>
> Key: CASSANDRA-9919
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9919
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Priority: Major
> Fix For: 4.x
>
>
> Our iterators are all sensibly AutoCloseable, but we still have a lot of 
> places where we close the iterator if we reach the end in next() / hasNext(). 
> This behaviour will only mask problems, as any exceptions in normal 
> processing will miss these pathways. Since we much more heavily depend on 
> iterators now, we need to be certain they are rock solid. So I would prefer 
> we remove our crutches and work through any pain earlier.
> CASSANDRA-9918 can then help us catch any misuse more easily.
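> Schematically (hypothetical types, not the actual C* interfaces):
> {code}
> import java.util.Iterator;
> 
> interface CloseableIterator<T> extends Iterator<T>, AutoCloseable
> {
>     void close(); // narrowed: no checked exception
> }
> 
> class Example
> {
>     // Discouraged: closing from inside hasNext() only covers the happy path;
>     // an exception thrown mid-iteration skips the close entirely.
>     static <T> boolean selfClosingHasNext(CloseableIterator<T> iter)
>     {
>         if (iter.hasNext())
>             return true;
>         iter.close();
>         return false;
>     }
> 
>     // Preferred: the caller owns the lifecycle, so the iterator is closed
>     // both on success and when an exception escapes the loop.
>     static <T> void consume(CloseableIterator<T> iter)
>     {
>         try (CloseableIterator<T> it = iter)
>         {
>             while (it.hasNext())
>                 it.next();
>         }
>     }
> }
> {code}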



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9911) Remove ConcurrentLinkedHashCache

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9911:

Component/s: Core

> Remove ConcurrentLinkedHashCache
> 
>
> Key: CASSANDRA-9911
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9911
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Robert Stupp
>Priority: Trivial
> Fix For: 4.x
>
>
> ... when CASSANDRA-9738 + CASSANDRA-9739 have landed



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9892) Add support for unsandboxed UDF

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9892:

Component/s: Core

> Add support for unsandboxed UDF
> ---
>
> Key: CASSANDRA-9892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9892
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Robert Stupp
>Priority: Minor
>  Labels: UDF, doc-impacting
> Fix For: 4.x
>
>
> From discussion on CASSANDRA-9402,
> The approach postgresql takes is to distinguish between "trusted" (sandboxed) 
> and "untrusted" (anything goes) UDF languages. 
> Creating an untrusted language always requires superuser mode. Once that is 
> done, creating functions in it requires nothing special.
> Personally I would be fine with this approach, but I think it would be more 
> useful to have the extra permission on creating the function, and it also 
> wouldn't require adding an explicit CREATE LANGUAGE.
> So I'd suggest just providing different CQL permissions for trusted and 
> untrusted, i.e. if you have CREATE FUNCTION permission that allows you to 
> create sandboxed UDF, but you can only create unsandboxed if you have CREATE 
> UNTRUSTED.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9834) Merge Hints/CommitLog/BatchLog

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9834:

Component/s: Core

> Merge Hints/CommitLog/BatchLog
> --
>
> Key: CASSANDRA-9834
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9834
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Priority: Major
> Fix For: 4.x
>
>
> As discussed briefly on CASSANDRA-6230, it should be quite possible to 
> construct a single log that can serve as commit log, hints log and batch log. 
> The basic idea would be to write sequentially, marking messages as members of 
> one or more logical logs. We have a separate efficient (possibly embedded) 
> ledger for invalidation of log records. As entire log segments become 
> invalidated, we simply delete them; the rest we accumulate until we hit a 
> high watermark, and have segments that are at least half empty, at which 
> point we begin rewriting the emptiest.
> This absolutely bounds our worst case sequential IO at 2x that used by _just_ 
> the commit log, with normal operation under sufficiently high watermark 
> having zero overhead. The upper bound for space utilisation is the smaller of 
> 2x the actual amount of data stored, and our high watermark. This gives us 
> batch and hints for free, and eliminates OOMs from hints.
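> Schematically, each record in the single physical log would carry membership flags for the logical logs it belongs to (illustrative only):
> {code}
> // Schematic record header flags: one physical log, per-record logical membership.
> final class LogRecordFlags
> {
>     static final int COMMIT_LOG = 1 << 0;
>     static final int HINT       = 1 << 1;
>     static final int BATCH_LOG  = 1 << 2;
> 
>     static boolean isMember(int flags, int logicalLog)
>     {
>         return (flags & logicalLog) != 0;
>     }
> 
>     // A segment becomes deletable once every record in it has been invalidated
>     // in all logical logs it was written to.
> }
> {code}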



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9810) Lift some counters limitations in C* 3.X

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9810:

Component/s: Core

> Lift some counters limitations in C* 3.X
> 
>
> Key: CASSANDRA-9810
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9810
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Aleksey Yeschenko
>Priority: Major
> Fix For: 4.x
>
>
> A ticket to aggregate issues for removal of some of the counters limitations 
> we currently have:
> 1. Counters are not reusable after deletion. It's true that deletion doesn't 
> commute, and an increment coming soon after a deletion would lead to 
> undefined behavior, but the current behavior is the result of a bug in 1.1. We 
> should revert CASSANDRA-7346.
> 2. Once that is done, we can start allowing counters and non-counters to be mixed 
> in the same tables. For read requests it means data locality, being able to 
> satisfy more requests with a single query. We would still forbid 
> updating counter columns and regular columns in a single {{UPDATE}}, however, 
> and would not allow counter updates in {{INSERT}}.
> See CASSANDRA-8878 for some discussion of the subject. In particular, 
> [this|https://issues.apache.org/jira/browse/CASSANDRA-8878?focusedCommentId=14346748=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14346748]
>  and 
> [this|https://issues.apache.org/jira/browse/CASSANDRA-8878?focusedCommentId=14351091=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14351091]
>  comments.
> This seems to be a pre-requisite for CASSANDRA-9778.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9786) Our index file seeks can be directed to the page most likely to contain the record, and read exactly one page

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9786:

Component/s: Local Write-Read Paths

> Our index file seeks can be directed to the page most likely to contain the 
> record, and read exactly one page
> -
>
> Key: CASSANDRA-9786
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9786
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Benedict
>Priority: Major
>  Labels: performance
> Fix For: 4.x
>
>
> (Perhaps two, if it is likely to cross a page boundary).
> This only works if the partitioner is hash based, but if so we can expect the 
> keys to be uniformly distributed, so we can easily calculate the likelihood 
> of finding it on a given page. With CASSANDRA-8931, we can also expect the 
> data to be of a uniform size.
> This will require a little bit of index file restructuring, so that each page 
> can be read as a single unit. A follow-up ticket may then permit in-memory 
> binary search within the page for the desired record (with variable key sizes 
> this may be difficult, so perhaps a follow-up).
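> The page estimate itself is simple under the uniform-hash assumption (sketch with made-up names; a real version must also guard against long overflow for full-range tokens):
> {code}
> class PageEstimate
> {
>     // Estimate which index page most likely contains `token`, assuming tokens
>     // are uniformly distributed over [minToken, maxToken].
>     static long likelyPage(long token, long minToken, long maxToken, long pageCount)
>     {
>         double fraction = (double) (token - minToken) / ((double) (maxToken - minToken) + 1.0);
>         long page = (long) (fraction * pageCount);
>         return Math.min(Math.max(page, 0L), pageCount - 1);
>     }
> }
> {code}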



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9756) Cleanup UDA code after 6717

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9756:

Component/s: Core

> Cleanup UDA code after 6717
> ---
>
> Key: CASSANDRA-9756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9756
> Project: Cassandra
>  Issue Type: Task
>  Components: Core
>Reporter: Robert Stupp
>Priority: Minor
> Fix For: 4.x
>
>
> After CASSANDRA-6717 has landed, there should be some cleanup of UDF/UDA code 
> wrt load from schema tables and handling broken functions.
> /cc [~iamaleksey] 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-7393) Replace counter cache with optimised counter read path

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-7393:

Component/s: Core

> Replace counter cache with optimised counter read path
> --
>
> Key: CASSANDRA-7393
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7393
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Aleksey Yeschenko
>Priority: Major
> Fix For: 4.x
>
>
> As mentioned in CASSANDRA-7366, we can utilize the optimizations in 
> CollationController#collectTimeOrderedData() now for the read-before-write 
> counter path, and get most/all of the benefits of the counter cache for 
> 'free' + get better 'cold' read performance as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-7447) New sstable format with support for columnar layout

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-7447:

Component/s: Local Write-Read Paths

> New sstable format with support for columnar layout
> ---
>
> Key: CASSANDRA-7447
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7447
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Benedict
>Assignee: sankalp kohli
>Priority: Major
>  Labels: performance, storage
> Fix For: 4.x
>
> Attachments: ngcc-storage.odp, storage_format.pdf
>
>
> h2. Storage Format Proposal
> C* has come a long way over the past few years, and unfortunately our storage 
> format hasn't kept pace with the data models we are now encouraging people to 
> utilise. This ticket proposes a collections of storage primitives that can be 
> combined to serve these data models more optimally.
> It would probably help to first state the data model at the most abstract 
> level. We have a fixed three-tier structure: We have the partition key, the 
> clustering columns, and the data columns. Each have their own characteristics 
> and so require their own specialised treatment.
> I should note that these changes will necessarily be delivered in stages, and 
> that we will be making some assumptions about what the most useful features 
> to support initially will be. Any features not supported will require 
> sticking with the old format until we extend support to all C* functionality.
> h3. Partition Key
> * This really has two components: the partition, and the value. Although the 
> partition is primarily used to distribute across nodes, it can also be used 
> to optimise lookups for a given key within a node
> * Generally partitioning is by hash, and for the moment I want to focus this 
> ticket on the assumption that this is the case
> * Given this, it makes sense to optimise our storage format to permit O(1) 
> searching of a given partition. It may be possible to achieve this with 
> little overhead based on the fact we store the hashes in order and know they 
> are approximately randomly distributed, as this effectively forms an 
> immutable contiguous split-ordered list (see Shalev/Shavit, or 
> CASSANDRA-7282), so we only need to store an amount of data based on how 
> imperfectly distributed the hashes are, or at worst a single value per block.
> * This should completely obviate the need for a separate key-cache, which 
> will be relegated to supporting the old storage format only
> h3. Primary Key / Clustering Columns
> * Given we have a hierarchical data model, I propose the use of a 
> cache-oblivious trie
> * The main advantage of the trie is that it is extremely compact and 
> _supports optimally efficient merges with other tries_ so that we can support 
> more efficient reads when multiple sstables are touched
> * The trie will be preceded by a small amount of related data; the full 
> partition key, a timestamp epoch (for offset-encoding timestamps) and any 
> other partition level optimisation data, such as (potentially) a min/max 
> timestamp to abort merges earlier
> * Initially I propose to limit the trie to byte-order comparable data types 
> only (the number of which we can expand through translations of the important 
> types that are not currently byte-order comparable)
> * Crucially the trie will also encapsulate any range tombstones, so that 
> these are merged early in the process and avoids re-iterating the same data
> * Results in true bidirectional streaming without having to read entire range 
> into memory
> h3. Values
> There are generally two approaches to storing rows of data: columnar, or 
> row-oriented. The above two data structures can be combined with a value 
> storage scheme that is based on either. However, given the current model we 
> have of reading large 64Kb blocks for any read, I am inclined to focus on 
> columnar support first, as this delivers order-of-magnitude benefits to those 
> users with the correct workload, while for most workloads our 64Kb blocks are 
> large enough to store row-oriented data in a column-oriented fashion without 
> any performance degradation (I'm happy to consign very large row support to 
> phase 2). 
> Since we will most likely target both behaviours eventually, I am currently 
> inclined to suggest that static columns, sets and maps be targeted for a 
> row-oriented release, as they don't naturally fit in a columnar layout 
> without secondary heap-blocks. This may be easier than delivering heap-blocks 
> also, as it keeps both implementations relatively clean. This is certainly 
> open to debate, and I have no doubt there will be conflicting opinions here.
> Focusing on our columnar layout, the goals 

[jira] [Updated] (CASSANDRA-7738) Permit CL overuse to be explicitly bounded

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-7738:

Component/s: Core

> Permit CL overuse to be explicitly bounded
> --
>
> Key: CASSANDRA-7738
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7738
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Priority: Minor
> Fix For: 4.x
>
>
> As mentioned in CASSANDRA-7554, we do not currently offer any way to 
> explicitly bound CL growth, which can be problematic in some scenarios (e.g. 
> EC2 where the system drive is quite small). We should offer a configurable 
> amount of headroom, beyond which we stop accepting writes until the backlog 
> clears.
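> The check itself is trivial once a headroom setting exists (illustrative only; the setting and names are made up):
> {code}
> class CommitLogHeadroom
> {
>     // Hypothetical check before allocating a new commit log segment: refuse writes
>     // once the configured ceiling would be exceeded; resume once the backlog clears.
>     static boolean canAllocateSegment(long currentBytes, long segmentBytes, long maxTotalBytes)
>     {
>         return currentBytes + segmentBytes <= maxTotalBytes;
>     }
> }
> {code}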



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-7402) Add metrics to track memory used by client requests

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-7402:

Component/s: Metrics

> Add metrics to track memory used by client requests
> ---
>
> Key: CASSANDRA-7402
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7402
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Metrics
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
>  Labels: ops, performance, stability
> Fix For: 4.x
>
> Attachments: 7402.txt
>
>
> When running a production cluster one common operational issue is quantifying 
> GC pauses caused by ongoing requests.
> Since different queries return varying amounts of data, you can easily get 
> yourself into a situation where a couple of bad actors in the system stop the 
> world.  Or, more likely, the aggregate garbage generated on a single node 
> across all in-flight requests causes a GC.
> It would be very useful for operators to see how much garbage the system is 
> generating to handle in-flight mutations and queries. 
> It would also be nice to have a log of the queries which generate the most 
> garbage, as well as a histogram, so operators can track this.
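> One possible way to approximate per-request garbage, sketched with made-up names (this relies on the com.sun.management extension, so it is HotSpot-specific and illustrative only):
> {code}
> import java.lang.management.ManagementFactory;
> import com.sun.management.ThreadMXBean;
> 
> class AllocationSampler
> {
>     private static final ThreadMXBean THREADS =
>         (ThreadMXBean) ManagementFactory.getThreadMXBean();
> 
>     // Sample the bytes allocated by the handling thread around one request.
>     static long allocatedBytes(Runnable request)
>     {
>         long threadId = Thread.currentThread().getId();
>         long before = THREADS.getThreadAllocatedBytes(threadId);
>         request.run();
>         return THREADS.getThreadAllocatedBytes(threadId) - before;
>     }
> }
> {code}
> The sampled value could then feed both a histogram and a "worst offenders" log.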



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-7662) Implement templated CREATE TABLE functionality (CREATE TABLE LIKE)

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-7662:

Component/s: CQL

> Implement templated CREATE TABLE functionality (CREATE TABLE LIKE)
> --
>
> Key: CASSANDRA-7662
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7662
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Aleksey Yeschenko
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 7662.patch, CASSANDRA-7662.patch
>
>
> Implement templated CREATE TABLE functionality (CREATE TABLE LIKE) to 
> simplify creating new tables duplicating existing ones (see parent_table part 
> of  http://www.postgresql.org/docs/9.1/static/sql-createtable.html).
> CREATE TABLE <newtable> LIKE <existingtable>; - would create a new table with 
> the same columns and options as <existingtable>.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-7542) Reduce CAS contention

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-7542:

Component/s: Coordination

> Reduce CAS contention
> -
>
> Key: CASSANDRA-7542
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7542
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: sankalp kohli
>Priority: Minor
>  Labels: LWT
> Fix For: 4.x
>
>
> CAS updates on the same CQL partition can lead to heavy contention inside C*. I 
> am looking for simple ways (no algorithmic changes) to reduce contention, as 
> the penalty of it is high in terms of latency, especially for reads. 
> We can put some sort of synchronization on the CQL partition at the StorageProxy 
> level. This will reduce contention at least for all requests landing on one 
> box for the same partition. 
> Here is an example of why it will help:
> 1) Say 1 write and 2 read CAS requests for the same partition key are sent to 
> C* in parallel. 
> 2) Since the client is token-aware, it sends these 3 requests to the same C* 
> instance A. (Let's assume that all 3 requests go to the same instance A.) 
> 3) In this C* instance A, all 3 CAS requests will contend with each other in 
> Paxos. (This is bad.)
> To reduce the contention in 3), what I am proposing is to add a lock on the 
> partition key, similar to what we do in PaxosState.java, to serialize these 3 
> requests. This will remove the contention and improve performance, as these 3 
> requests will not collide with each other.
> Another improvement we can do in the client is to pick a deterministic live 
> replica for a given partition doing CAS.
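> A sketch of the kind of per-partition serialization proposed, using Guava striped locks (illustrative only, not the actual StorageProxy change):
> {code}
> import java.nio.ByteBuffer;
> import java.util.concurrent.locks.Lock;
> import java.util.function.Supplier;
> import com.google.common.util.concurrent.Striped;
> 
> class CasSerializer
> {
>     private final Striped<Lock> locks = Striped.lock(1024); // bounded number of lock stripes
> 
>     // CAS requests for the same partition key on this coordinator queue up
>     // behind one another instead of colliding in Paxos.
>     <T> T withPartitionLock(ByteBuffer partitionKey, Supplier<T> casOperation)
>     {
>         Lock lock = locks.get(partitionKey);
>         lock.lock();
>         try
>         {
>             return casOperation.get();
>         }
>         finally
>         {
>             lock.unlock();
>         }
>     }
> }
> {code}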



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-6936) Make all byte representations of types comparable by their unsigned byte representation only

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-6936:

Component/s: Core

> Make all byte representations of types comparable by their unsigned byte 
> representation only
> 
>
> Key: CASSANDRA-6936
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6936
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Priority: Major
>  Labels: compaction, performance
> Fix For: 4.x
>
>
> This could be a painful change, but is necessary for implementing a 
> trie-based index, and settling for less would be suboptimal; it also should 
> make comparisons cheaper all-round, and since comparison operations are 
> pretty much the majority of C*'s business, this should be easily felt (see 
> CASSANDRA-6553 and CASSANDRA-6934 for an example of some minor changes with 
> major performance impacts). No copying/special casing/slicing should mean 
> fewer opportunities to introduce performance regressions as well.
> Since I have slated for 3.0 a lot of non-backwards-compatible sstable 
> changes, hopefully this shouldn't be too much more of a burden.
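> The target comparison is then just an unsigned lexicographic byte compare (sketch; types would need to be re-encoded so this ordering matches their semantic order):
> {code}
> class UnsignedBytes
> {
>     static int compareUnsigned(byte[] a, byte[] b)
>     {
>         int len = Math.min(a.length, b.length);
>         for (int i = 0; i < len; i++)
>         {
>             int cmp = (a[i] & 0xff) - (b[i] & 0xff);   // treat each byte as unsigned
>             if (cmp != 0)
>                 return cmp;
>         }
>         return a.length - b.length;                    // shorter prefix sorts first
>     }
> }
> {code}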



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-6853) Allow filtering on primary key expressions in 2i queries

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-6853:

Component/s: Secondary Indexes

> Allow filtering on primary key expressions in 2i queries
> 
>
> Key: CASSANDRA-6853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6853
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Secondary Indexes
>Reporter: Jonathan Ellis
>Assignee: Sylvain Lebresne
>Priority: Minor
>  Labels: indexes
> Fix For: 4.x
>
>
> We allow
> {code}
> SELECT a, d FROM t.t WHERE b = 'b1' AND a = 'a14521'
> {code}
> and
> {code}
> SELECT a, d FROM t.t WHERE b = 'b1' AND token(a)  > token( 'a14521')
> {code}
> but not
> {code}
> SELECT a, d FROM t.t WHERE b = 'b1' AND a  > 'a14521'
> {code}
> (given an index on {{b}}, with primary key {{a}})
> we allow combining other predicates with an indexed one and filtering those 
> in a nested loop; we should allow the same for primary keys



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-7282) Faster Memtable map

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-7282:

Component/s: Local Write-Read Paths

> Faster Memtable map
> ---
>
> Key: CASSANDRA-7282
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Benedict
>Assignee: Michael Burman
>Priority: Major
>  Labels: performance
> Fix For: 4.x
>
> Attachments: jasobrown-sample-run.txt, profile.yaml, reads.svg, 
> run1.svg, writes.svg
>
>
> Currently we maintain a ConcurrentSkipListMap of DecoratedKey -> Partition in 
> our memtables. Maintaining this is an O(lg(n)) operation; since the vast 
> majority of users use a hash partitioner, it occurs to me we could maintain a 
> hybrid ordered list / hash map. The list would impose the normal order on the 
> collection, but a hash index would live alongside as part of the same data 
> structure, simply mapping into the list and permitting O(1) lookups and 
> inserts.
> I've chosen to implement this initial version as a linked-list node per item, 
> but we can optimise this in future by storing fatter nodes that permit a 
> cache-line's worth of hashes to be checked at once,  further reducing the 
> constant factor costs for lookups.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-7040) Replace read/write stage with per-disk access coordination

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-7040:

Component/s: Local Write-Read Paths

> Replace read/write stage with per-disk access coordination
> --
>
> Key: CASSANDRA-7040
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7040
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Benedict
>Priority: Major
>  Labels: performance
> Fix For: 4.x
>
>
> As discussed in CASSANDRA-6995, current coordination of access to disk is 
> suboptimal: instead of ensuring disk accesses alone are coordinated, we 
> instead coordinate at the level of operations that may touch the disks, 
> ensuring only so many are proceeding at once. As such, tuning is difficult, 
> and we incur unnecessary delays for operations that would not touch the 
> disk(s).
> Ideally we would instead simply use a shared coordination primitive to gate 
> access to the disk when we perform a rebuffer. This work would dovetail very 
> nicely with any work in CASSANDRA-5863, as we could prevent any blocking or 
> context switching for data that we know to be cached. It also, as far as I 
> can tell, obviates the need for CASSANDRA-5239.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-7056) Add RAMP transactions

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-7056:

Component/s: Core

> Add RAMP transactions
> -
>
> Key: CASSANDRA-7056
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7056
> Project: Cassandra
>  Issue Type: Wish
>  Components: Core
>Reporter: Tupshin Harper
>Priority: Minor
>  Labels: LWT
> Fix For: 4.x
>
>
> We should take a look at 
> [RAMP|http://www.bailis.org/blog/scalable-atomic-visibility-with-ramp-transactions/]
>  transactions, and figure out if they can be used to provide more efficient 
> LWT (or LWT-like) operations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-7168) Add repair aware consistency levels

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-7168:

Component/s: Coordination

> Add repair aware consistency levels
> ---
>
> Key: CASSANDRA-7168
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7168
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: T Jake Luciani
>Priority: Major
>  Labels: performance, repair
> Fix For: 4.x
>
>
> With CASSANDRA-5351 and CASSANDRA-2424 I think there is an opportunity to 
> avoid a lot of extra disk I/O when running queries with higher consistency 
> levels.  
> Since repaired data is by definition consistent and we know which sstables 
> are repaired, we can optimize the read path by having a REPAIRED_QUORUM which 
> breaks reads into two phases:
>  
>   1) Read from one replica the result from the repaired sstables. 
>   2) Read from a quorum only the un-repaired data.
> For the node performing 1) we can pipeline the call so it's a single hop.
> In the long run (assuming data is repaired regularly) we will end up with 
> much closer to CL.ONE performance while maintaining consistency.
> Some things to figure out:
>   - If repairs fail on some nodes we can have a situation where we don't have 
> a consistent repaired state across the replicas.  
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-7349) Use Native Frame in CommitLog

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-7349:

Component/s: Core

> Use Native Frame in CommitLog
> -
>
> Key: CASSANDRA-7349
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7349
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: T Jake Luciani
>Priority: Major
>  Labels: performance
> Fix For: 4.x
>
>
> We currently read mutations off the wire and then re-serialize them to the commit 
> log.  In the case of the native protocol we have the native Frame available 
> in the mutation, so we should be able to special-case this and use it in the 
> commit log.  I would expect a decent performance boost.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-6123) Break timestamp ties consistently for a given user requests

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-6123:

Component/s: CQL

> Break timestamp ties consistently for a given user requests
> ---
>
> Key: CASSANDRA-6123
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6123
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL
>Reporter: Sylvain Lebresne
>Priority: Major
> Fix For: 4.x
>
>
> The basic goal of this issue is to fix the fact that if 2 different clients 
> issue "simultaneously" the 2 following updates:
> {noformat}
> INSERT INTO foo(k, v1, v2) VALUES (0, 1, -1); // client1
> INSERT INTO foo(k, v1, v2) VALUES (0, -1, 1); // client2
> {noformat}
> then, if both updates get the same timestamp, then currently, we don't 
> guarantee that at the end the sum of {{v1}} and {{v2}} will be 0 (it won't be 
> in that case).
> The idea to solve this is to make sure 2 updates *never* get the same 
> "timestamp" by making the timestamp be the sum of the current time (and we 
> can relatively easily make sure no 2 updates coordinated by the same node have 
> the same current time) and a small ID unique to each server node. We can 
> generate this small unique server id thanks to CAS (see CASSANDRA-6108).
> Let's note that this solution is only for server-side generated timestamps.  
> Client-provided timestamps will still be allowed, but in that case it will be 
> the job of the client to synchronize so as not to generate 2 identical timestamps if 
> they care about this behavior.
> Note: see CASSANDRA-6106 for some related discussion on this issue.
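> One possible server-side encoding, for illustration only (it packs the node id into low bits rather than literally summing, trades away some timestamp resolution, and is not necessarily the exact scheme intended here):
> {code}
> class TimestampGenerator
> {
>     static final int NODE_ID_BITS = 10; // assumed size of the per-node id
> 
>     // nodeId would be the small unique server id assigned once (e.g. via CAS).
>     static long uniqueTimestamp(long lastIssued, long nowMicros, int nodeId)
>     {
>         long candidate = (nowMicros << NODE_ID_BITS) | (nodeId & ((1 << NODE_ID_BITS) - 1));
>         return Math.max(candidate, lastIssued + 1); // also never go backwards on this node
>     }
> }
> {code}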



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-5051) Allow automatic cleanup after gc_grace

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-5051:

Component/s: Core

> Allow automatic cleanup after gc_grace
> --
>
> Key: CASSANDRA-5051
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5051
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Brandon Williams
>Assignee: Vijay
>Priority: Major
>  Labels: vnodes
> Fix For: 4.x
>
> Attachments: 0001-5051-v4.patch, 0001-5051-v6.patch, 
> 0001-5051-with-test-fixes.patch, 0001-CASSANDRA-5051.patch, 
> 0002-5051-remove-upgradesstable-v4.patch, 
> 0002-5051-remove-upgradesstable.patch, 0004-5051-additional-test-v4.patch, 
> 5051-v2.txt
>
>
> When using vnodes, after adding a new node you have to run cleanup on all the 
> machines, because you don't know which are affected and chances are it was 
> most if not all of them.  As an alternative to this intensive process, we 
> could allow cleanup during compaction if the data is older than gc_grace (or 
> perhaps some other time period since people tend to use gc_grace hacks to get 
> rid of tombstones.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14612) Please add OWASP Dependency Check to the build (pom.xml)

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-14612:
-
Component/s: (was: Testing)
 (was: Repair)
 (was: Observability)
 (was: Lifecycle)

> Please add OWASP Dependency Check to the build (pom.xml)
> 
>
> Key: CASSANDRA-14612
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14612
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Build
> Environment: All development, build, test, environments.
>Reporter: Albert Baker
>Priority: Major
>  Labels: build, easyfix, security
> Fix For: 3.11.x, 4.x
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Please add OWASP Dependency Check to the build (pom.xml). OWASP DC makes an 
> outbound REST call to MITRE Common Vulnerabilities & Exposures (CVE) to 
> perform a lookup for each dependent .jar to list any/all known 
> vulnerabilities for each jar. This step is needed because a manual MITRE CVE 
> lookup/check on the main component does not include checking for 
> vulnerabilities in components or in dependent libraries.
> OWASP Dependency check : 
> https://www.owasp.org/index.php/OWASP_Dependency_Check has plug-ins for most 
> Java build/make types (ant, maven, ivy, gradle).
> Also, add the appropriate command to the nightly build to generate a report 
> of all known vulnerabilities in any/all third party libraries/dependencies 
> that get pulled in. example : mvn -Powasp -Dtest=false -DfailIfNoTests=false 
> clean aggregate
> Generating this report nightly/weekly will help inform the project's 
> development team if any dependent libraries have reported known 
> vulnerabilities. Project teams that keep up with removing vulnerabilities on a 
> weekly basis will help protect businesses that rely on these open source 
> components.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14327) Compact new sstables together before importing in nodetool refresh

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-14327:
-
Component/s: Compaction

> Compact new sstables together before importing in nodetool refresh
> --
>
> Key: CASSANDRA-14327
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14327
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Marcus Eriksson
>Priority: Major
> Fix For: 4.x
>
>
> {{nodetool refresh}} just randomly puts sstables in a directory, then relies 
> on compaction to move the tokens to their correct directory. Instead we 
> should consider adding an option to compact all new sstables together and put 
> the tokens in the correct places.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10537) CONTAINS and CONTAINS KEY support for Lightweight Transactions

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10537:
-
Component/s: CQL

> CONTAINS and CONTAINS KEY support for Lightweight Transactions
> --
>
> Key: CASSANDRA-10537
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10537
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Nimi Wariboko Jr.
>Priority: Major
>  Labels: CQL
> Fix For: 4.x
>
>
> Conditional updates currently do not support CONTAINS and CONTAINS KEY 
> conditions. Queries such as 
> {{UPDATE mytable SET somefield = 4 WHERE pk = 'pkv' IF set_column CONTAINS 
> 5;}}
> are not possible.
> Would it also be possible to support the negation of these (ex. testing that 
> a value does not exist inside a set)?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14491) Determine how to test cqlsh in a Python 2.7 environment, including dtests

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-14491:
-
Component/s: Tools

> Determine how to test cqlsh in a Python 2.7 environment, including dtests
> -
>
> Key: CASSANDRA-14491
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14491
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Testing, Tools
> Environment:  
>  
> We need to test with at least two versions of Python:
>  * Python 2.7
>  * Python 3.x (need to determine what versions of Python 3 are available by 
> default on Ubuntu and RHEL/CentOS)
> Additionally, it is recommended to test on at least three platforms:
>  * Ubuntu or other Debian derivative
>  * RHEL, CentOS, or other Red Hat derivative
>  * Windows (unless a consensus has formed around not testing on Windows?)
>Reporter: Patrick Bannister
>Priority: Major
>  Labels: cqlsh, test
> Fix For: 4.x
>
>
> It appears that a consensus is forming around maintaining Python 2.7 
> compatibility for cqlsh. However, the dtests now run in a Python 3 
> environment. We need to identify a testing-infrastructure option for 
> testing cqlsh on Python 2.7, including the dtests.
> Based on experience updating the cqlsh dtests, it is strongly recommended to 
> test in more than one environment - for example, for Linux, we should test on 
> a Debian derivative as well as a Red Hat derivative.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14347) Read payload metrics

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-14347:
-
Component/s: Metrics

> Read payload metrics
> 
>
> Key: CASSANDRA-14347
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14347
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Metrics
>Reporter: Sumanth Pasupuleti
>Assignee: Sumanth Pasupuleti
>Priority: Major
> Fix For: 4.x
>
>
> We currently have MutationSizeHistogram that gives an idea of write payloads. 
> This JIRA is about adding a similar capability to reads, which can benefit us 
> with the following:
>  * Histogram of payload sizes at Prepared Statement level
>  * Count of queries that meet vs do not meet SLO w.r.t. payload size
> We could also log queries that result in payload exceeding SLO. This can 
> prove useful in fire-fighting an incident, and can help resolve the issue 
> since it's easy to narrow down to the specific culprit prepared statement.
> Read payload metrics could potentially be derived from network statistics; 
> however, it is difficult to isolate the payloads due to reads, given that there 
> could be a bunch of other operations in C* (like repairs, streaming, etc.) that 
> contribute to the observed network traffic.
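> A sketch of what per-prepared-statement recording could look like with the Dropwizard metrics library C* already uses (names and the SLO handling are made up):
> {code}
> import com.codahale.metrics.Histogram;
> import com.codahale.metrics.MetricRegistry;
> 
> class ReadPayloadMetrics
> {
>     private final MetricRegistry registry = new MetricRegistry();
>     private final long sloBytes;
> 
>     ReadPayloadMetrics(long sloBytes)
>     {
>         this.sloBytes = sloBytes;
>     }
> 
>     // One payload-size histogram per prepared statement, plus a counter of SLO violations.
>     void record(String preparedStatementId, long payloadBytes)
>     {
>         Histogram sizes = registry.histogram("ReadPayload." + preparedStatementId);
>         sizes.update(payloadBytes);
>         if (payloadBytes > sloBytes)
>             registry.counter("ReadPayload.overSlo." + preparedStatementId).inc();
>     }
> }
> {code}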



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10500) Allow Schema Change responses and events to encode multiple changes

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10500:
-
Component/s: CQL

> Allow Schema Change responses and events to encode multiple changes
> ---
>
> Key: CASSANDRA-10500
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10500
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL
>Reporter: Adam Holmberg
>Priority: Minor
>  Labels: client-impacting, native_protocol, protocolv5
> Fix For: 4.x
>
>
> In some cases it would be useful to specify multiple change/targets in a 
> single [SCHEMA_CHANGE event or 
> response|https://github.com/apache/cassandra/blob/cassandra-3.0.0-rc1/doc/native_protocol_v4.spec#L706-L715].
> For example, when altering a table that has materialized views, it would be 
> useful to get all changing entities (table + views) in the same response.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-11004) LWT results '[applied]' column name collision

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-11004:
-
Component/s: CQL

> LWT results '[applied]' column name collision
> -
>
> Key: CASSANDRA-11004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11004
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Adam Holmberg
>Priority: Minor
> Fix For: 4.x
>
>
> LWT requests return a not-well-documented single row result with a boolean 
> {{\[applied]}} column and optional column states.
> If the table happens to have a column named {{\[applied]}}, this causes a 
> name collision. There is no error, but the {{\[applied]}} flag is not 
> available.
> {code}
> cassandra@cqlsh:test> CREATE TABLE test (k int PRIMARY KEY , "[applied]" int);
> cassandra@cqlsh:test> INSERT INTO test (k, "[applied]") VALUES (2, 3) IF NOT 
> EXISTS ;
>  [applied]
> ---
>   True
> cassandra@cqlsh:test> INSERT INTO test (k, "[applied]") VALUES (2, 3) IF NOT 
> EXISTS ;
>  [applied] | k
> ---+---
>  3 | 2
> {code}
> I doubt this comes up much (at all) in practice, but thought I'd mention it. 
> One alternative approach might be to add a LWT result type 
> ([flag|https://github.com/apache/cassandra/blob/cassandra-3.0/doc/native_protocol_v4.spec#L518-L522])
>  that segregates the "applied" flag information from the optional row results.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-11106) Experiment with strategies for picking compaction candidates in LCS

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-11106:
-
Component/s: Compaction

> Experiment with strategies for picking compaction candidates in LCS
> ---
>
> Key: CASSANDRA-11106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11106
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Marcus Eriksson
>Assignee: Dikang Gu
>Priority: Major
>  Labels: lcs
> Fix For: 4.x
>
>
> Ideas taken here: http://rocksdb.org/blog/2921/compaction_pri/
> The current strategy in LCS is that we keep track of the token that was last 
> compacted, and then we start a compaction with the sstable containing the next 
> token (kOldestSmallestSeqFirst in the blog post above).
> The rocksdb blog post introduces a few ideas for how this could be improved:
> * pick the 'coldest' sstable (sstable with the oldest max timestamp) - we 
> want to keep the hot data (recently updated) in the lower levels to avoid 
> write amplification
> * pick the sstable with the highest tombstone ratio, we want to get 
> tombstones to the top level as quickly as possible.
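> The two picking strategies boil down to simple comparators over per-sstable stats (sketch with a hypothetical stats holder, not actual C* code):
> {code}
> import java.util.Comparator;
> import java.util.List;
> 
> class SSTableStats
> {
>     long maxTimestamp;      // newest cell in the sstable
>     double tombstoneRatio;  // estimated droppable tombstone ratio
> }
> 
> class CandidatePickers
> {
>     // "coldest first": the sstable with the oldest max timestamp compacts up first
>     static final Comparator<SSTableStats> COLDEST_FIRST =
>         Comparator.comparingLong(s -> s.maxTimestamp);
> 
>     // "most tombstones first": push deletions to the top level quickly
>     static final Comparator<SSTableStats> MOST_TOMBSTONES_FIRST =
>         Comparator.comparingDouble((SSTableStats s) -> s.tombstoneRatio).reversed();
> 
>     static SSTableStats pick(List<SSTableStats> level, Comparator<SSTableStats> strategy)
>     {
>         return level.stream().min(strategy).orElse(null);
>     }
> }
> {code}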



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-11022) Use SHA hashing to store password in the credentials cache

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-11022:
-
Component/s: Auth

> Use SHA hashing to store password in the credentials cache
> --
>
> Key: CASSANDRA-11022
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11022
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Auth
>Reporter: Mike Adamson
>Priority: Major
> Fix For: 4.x
>
>
> In CASSANDRA-7715 a credentials cache has been added to the 
> {{PasswordAuthenticator}} to improve performance when multiple 
> authentications occur for the same user. 
> Unfortunately, the bcrypt hash is what is being cached, so every authentication 
> still performs a bcrypt comparison - one of the major 
> performance overheads in password authentication. 
> I propose that the cache is changed to use a SHA hash to store the user 
> password. As long as the cache is cleared for the user on an unsuccessful 
> authentication, this won't significantly increase the ability of an attacker 
> to mount a brute-force attack, because every other attempt will use bcrypt.
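> A minimal sketch of the idea (the cache class and its keying by role name are 
> hypothetical; only the hashing scheme is the point): cache a SHA-256 digest of the 
> password, compare digests on a hit, and fall back to bcrypt - evicting the entry - 
> on a miss or mismatch.
> {code}
> import java.nio.charset.StandardCharsets;
> import java.security.MessageDigest;
> import java.security.NoSuchAlgorithmException;
> import java.util.concurrent.ConcurrentHashMap;
> 
> class CachedCredentials
> {
>     private final ConcurrentHashMap<String, byte[]> cache = new ConcurrentHashMap<>();
> 
>     private static byte[] sha256(String password)
>     {
>         try
>         {
>             return MessageDigest.getInstance("SHA-256")
>                                 .digest(password.getBytes(StandardCharsets.UTF_8));
>         }
>         catch (NoSuchAlgorithmException e)
>         {
>             throw new AssertionError(e); // SHA-256 is always present in the JRE
>         }
>     }
> 
>     // called after a successful bcrypt authentication
>     void store(String role, String password)
>     {
>         cache.put(role, sha256(password));
>     }
> 
>     // true on a cache hit; on a miss or mismatch the caller falls back to the
>     // (expensive) bcrypt check and the possibly stale entry is evicted
>     boolean matches(String role, String password)
>     {
>         byte[] cached = cache.get(role);
>         if (cached != null && MessageDigest.isEqual(cached, sha256(password)))
>             return true;
>         cache.remove(role);
>         return false;
>     }
> }
> {code}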



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-11584) Add stats about index-entries to per sstable-stats

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-11584:
-
Component/s: Metrics

> Add stats about index-entries to per sstable-stats
> --
>
> Key: CASSANDRA-11584
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11584
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Metrics
>Reporter: Robert Stupp
>Priority: Minor
> Fix For: 4.x
>
>
> Knowing how big index entries (indexed or not) are would be useful for tuning 
> data modeling or the .yaml config - especially after 
> CASSANDRA-11206.
> It would be nice to have:
> * histogram of the serialized sizes of RowIndexEntry
> * histogram of the number of IndexInfo per indexed entry
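> A small sketch of what the bookkeeping could look like, assuming the 
> Dropwizard/Codahale metrics library Cassandra already uses; the class and metric 
> names are made up for illustration:
> {code}
> import com.codahale.metrics.Histogram;
> import com.codahale.metrics.MetricRegistry;
> 
> class IndexEntryMetrics
> {
>     final Histogram rowIndexEntrySize; // serialized size of each RowIndexEntry
>     final Histogram indexInfoCount;    // number of IndexInfo per indexed entry
> 
>     IndexEntryMetrics(MetricRegistry registry, String table)
>     {
>         rowIndexEntrySize = registry.histogram(table + ".RowIndexEntrySize");
>         indexInfoCount = registry.histogram(table + ".IndexInfoCount");
>     }
> 
>     // called once per RowIndexEntry written to an sstable
>     void onEntryWritten(long serializedSize, int numIndexInfo)
>     {
>         rowIndexEntrySize.update(serializedSize);
>         indexInfoCount.update(numIndexInfo);
>     }
> }
> {code}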



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10540) RangeAwareCompaction

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10540:
-
Component/s: Compaction

> RangeAwareCompaction
> 
>
> Key: CASSANDRA-10540
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10540
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Major
>  Labels: 4.0-feature-freeze-review-requested, compaction, lcs, 
> vnodes
> Fix For: 4.x
>
>
> Broken out from CASSANDRA-6696, we should split sstables based on ranges 
> during compaction.
> Requirements:
> * don't create tiny sstables - keep them bunched together until a single vnode 
> is big enough (how big that is should be configurable)
> * make it possible to run existing compaction strategies on the per-range 
> sstables
> We should probably add a global compaction strategy parameter that states 
> whether this should be enabled or not.
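> A very rough sketch of the first requirement (all names are hypothetical; the sizes 
> stand in for the sstables owned by each vnode range): adjacent vnode ranges stay 
> bunched in one compaction group until the group reaches a configurable minimum size, 
> so we never produce tiny per-range sstables.
> {code}
> import java.util.ArrayList;
> import java.util.List;
> 
> class RangeGrouper
> {
>     static List<List<Long>> group(List<Long> perVnodeSizes, long minGroupSize)
>     {
>         List<List<Long>> groups = new ArrayList<>();
>         List<Long> current = new ArrayList<>();
>         long currentSize = 0;
> 
>         for (long size : perVnodeSizes)
>         {
>             current.add(size);
>             currentSize += size;
>             if (currentSize >= minGroupSize) // group is big enough to stand alone
>             {
>                 groups.add(current);
>                 current = new ArrayList<>();
>                 currentSize = 0;
>             }
>         }
>         if (!current.isEmpty())
>             groups.add(current); // leftover vnodes stay bunched together
>         return groups;
>     }
> }
> {code}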



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10699) Make schema alterations strongly consistent

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-10699:
-
Component/s: Distributed Metadata

> Make schema alterations strongly consistent
> ---
>
> Key: CASSANDRA-10699
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10699
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Distributed Metadata
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
>Priority: Major
> Fix For: 4.x
>
>
> Schema changes do not necessarily commute. This was already the case before 
> CASSANDRA-5202, but it is now particularly problematic.
> We should employ a strongly consistent protocol instead of relying on 
> marshalling {{Mutation}} objects with schema changes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13019) Improve clearsnapshot command to delete the snapshot files slowly.

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-13019:
-
Component/s: Core

> Improve clearsnapshot command to delete the snapshot files slowly.
> --
>
> Key: CASSANDRA-13019
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13019
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>Priority: Major
> Fix For: 4.x
>
>
> In our environment, we create snapshots for backup; after we finish the 
> backup, we run {{clearsnapshot}} to delete the snapshot files. At 
> that time we may have thousands of files to delete, which causes a sudden 
> disk usage spike. As a result, we experience a spike of dropped messages 
> from Cassandra.
> I think we should implement something like {{slowrm}} to delete the snapshot 
> files slowly and avoid the sudden disk usage spike.
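> A hedged sketch of such a {{slowrm}}, using the Guava {{RateLimiter}} that Cassandra 
> already ships; the files-per-second rate would be a hypothetical config or nodetool 
> option:
> {code}
> import com.google.common.util.concurrent.RateLimiter;
> import java.io.IOException;
> import java.nio.file.Files;
> import java.nio.file.Path;
> import java.util.List;
> 
> class SlowSnapshotRemover
> {
>     private final RateLimiter limiter;
> 
>     SlowSnapshotRemover(double filesPerSecond)
>     {
>         this.limiter = RateLimiter.create(filesPerSecond);
>     }
> 
>     void delete(List<Path> snapshotFiles) throws IOException
>     {
>         for (Path file : snapshotFiles)
>         {
>             limiter.acquire();           // block until we may delete the next file
>             Files.deleteIfExists(file);
>         }
>     }
> }
> {code}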



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9711) Refactor AbstractBounds hierarchy

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9711:

Component/s: Core

> Refactor AbstractBounds hierarchy
> -
>
> Key: CASSANDRA-9711
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9711
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Sylvain Lebresne
>Priority: Major
> Fix For: 4.x
>
>
> As has been remarked in CASSANDRA-9462 and other tickets, the API of 
> {{AbstractBounds}} is pretty messy. In particular, it's not terribly 
> consistent nor clear in its handling of wrapping ranges. It also doesn't 
> make it easily identifiable whether an {{AbstractBounds}} can be wrapping or not, 
> and there are a lot of places where the code assumes it's not without 
> really checking, which is error prone. It's also not a very nice API to 
> use (the fact that there are 4 different classes that don't even always support the 
> same methods is annoying).
> So we should refactor that API. How exactly is up for discussion, however.
> At the very least we probably want to stick to a single concrete class that 
> knows whether its bounds are inclusive or not. But one other thing I'd personally 
> like to explore is separating ranges that can wrap from the ones that cannot 
> into 2 separate classes (which doesn't mean they can't share code; they may 
> even be subtypes). Having 2 separate types would:
> # make it obvious what parts of the code expect what.
> # probably simplify the actual code: we unwrap things reasonably 
> quickly in the code, so there are probably a lot of operations that we don't 
> care about on wrapping ranges, and lots of operations are easier to write if 
> we don't have to deal with wrapping.
> # for the non-wrapping class, we could trivially use a different value for 
> the min and max values, which will simplify stuff a lot. It might be harder 
> to do the same for wrapping ranges (especially since a single "wrapping" 
> value is what IPartitioner assumes; of course we can change IPartitioner but 
> I'm not sure blowing the scope of this ticket is a good idea).
> As a side note, Guava has a 
> [Range|http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/collect/Range.html].
>  If we do separate wrapping and non-wrapping ranges, we might (emphasis on 
> "might") be able to reuse it for the non-wrapping case, which could be nice 
> (they have a 
> [RangeMap|http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/collect/RangeMap.html]
>  in particular that could maybe replace our custom {{IntervalTree}}).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9739) Migrate counter-cache to be fully off-heap

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9739:

Component/s: Core

> Migrate counter-cache to be fully off-heap
> --
>
> Key: CASSANDRA-9739
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9739
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Major
> Fix For: 4.x
>
>
> The counter cache still uses a concurrent map on-heap. It could move off-heap, 
> which feels doable now after CASSANDRA-8099.
> Evaluation should be done in advance based on a POC to prove that a pure 
> off-heap counter cache buys a performance and/or GC-pressure improvement.
> In theory, elimination of on-heap management of the map should buy us some 
> benefit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9754) Make index info heap friendly for large CQL partitions

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9754:

Component/s: Core

> Make index info heap friendly for large CQL partitions
> --
>
> Key: CASSANDRA-9754
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9754
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: sankalp kohli
>Assignee: Michael Kjellman
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 0f8e28c220fd5af6c7b5dd2d3dab6936c4aa4b6b.patch, 
> gc_collection_times_with_birch.png, gc_collection_times_without_birch.png, 
> gc_counts_with_birch.png, gc_counts_without_birch.png, 
> perf_cluster_1_with_birch_read_latency_and_counts.png, 
> perf_cluster_1_with_birch_write_latency_and_counts.png, 
> perf_cluster_2_with_birch_read_latency_and_counts.png, 
> perf_cluster_2_with_birch_write_latency_and_counts.png, 
> perf_cluster_3_without_birch_read_latency_and_counts.png, 
> perf_cluster_3_without_birch_write_latency_and_counts.png
>
>
>  Looking at a heap dump of a 2.0 cluster, I found that the majority of the objects 
> are IndexInfo and its ByteBuffers. This is especially bad on endpoints with 
> large CQL partitions. If a CQL partition is, say, 6.4GB, it will have 100K 
> IndexInfo objects and 200K ByteBuffers. This creates a lot of churn for the 
> GC. Can this be improved by not creating so many objects?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9712) Refactor CFMetaData

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9712:

Component/s: Core

> Refactor CFMetaData
> ---
>
> Key: CASSANDRA-9712
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9712
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Aleksey Yeschenko
>Priority: Major
> Fix For: 4.x
>
>
> As part of CASSANDRA-9425, a follow-up to CASSANDRA-9665, and a 
> prerequisite for the new schema change protocol, this ticket will do the 
> following:
> 1. Make the triggers {{HashMap}} immutable (new {{Triggers}} class)
> 2. Allow multiple 2i definitions per column in CFMetaData
> 3. 
> 4. Rename and move {{config.CFMetaData}} to {{schema.TableMetadata}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9424) Schema Improvements

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9424:

Component/s: Distributed Metadata

> Schema Improvements
> ---
>
> Key: CASSANDRA-9424
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9424
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Distributed Metadata
>Reporter: Aleksey Yeschenko
>Priority: Major
> Fix For: 4.x
>
>
> C* schema code is both more brittle and less efficient than I'd like it to 
> be. This ticket will aggregate the improvement tickets to go into 3.X and 4.X 
> to improve the situation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9520) Track per statement metrics for select prepared statements

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9520:

Component/s: Metrics

> Track per statement metrics for select prepared statements
> --
>
> Key: CASSANDRA-9520
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9520
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Metrics
>Reporter: Aleksey Yeschenko
>Priority: Major
>  Labels: client-impacting, metrics
> Fix For: 4.x
>
>
> It would be nice to provide users with more granular metrics for the queries 
> they want to measure.
> I suggest extending the native protocol to allow setting an extra flag that 
> would indicate whether or not the statement should have its own separate 
> metrics, and measuring latency and counts for each such statement individually.
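> A minimal sketch of the server-side bookkeeping, assuming the Codahale metrics 
> library already in use; keying by a statement id string and the metric names are 
> purely illustrative:
> {code}
> import com.codahale.metrics.MetricRegistry;
> import com.codahale.metrics.Timer;
> import java.util.concurrent.ConcurrentHashMap;
> 
> class PerStatementMetrics
> {
>     private final MetricRegistry registry = new MetricRegistry();
>     private final ConcurrentHashMap<String, Timer> timers = new ConcurrentHashMap<>();
> 
>     // called when a statement is prepared with the proposed "track me" flag set
>     void register(String statementId)
>     {
>         timers.computeIfAbsent(statementId, id -> registry.timer("statement." + id));
>     }
> 
>     // returns null for statements that did not opt in, so they cost nothing extra
>     Timer.Context start(String statementId)
>     {
>         Timer timer = timers.get(statementId);
>         return timer == null ? null : timer.time();
>     }
> 
>     void stop(Timer.Context context)
>     {
>         if (context != null)
>             context.stop();
>     }
> }
> {code}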



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9587) Serialize table schema as a sstable component

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9587:

Component/s: Core

> Serialize table schema as a sstable component
> -
>
> Key: CASSANDRA-9587
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9587
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Sylvain Lebresne
>Priority: Major
> Fix For: 4.x
>
>
> Having the schema with each sstable would be tremendously useful for offline 
> tools and for debugging purposes.
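> A minimal sketch of the idea (the {{-Schema.cql}} component name is made up here; 
> only the principle of writing the CREATE statement next to the data file is the 
> point):
> {code}
> import java.io.IOException;
> import java.nio.charset.StandardCharsets;
> import java.nio.file.Files;
> import java.nio.file.Path;
> 
> class SchemaComponentWriter
> {
>     // write the table's CREATE statement as an extra sstable component,
>     // e.g. ma-1-big-Data.db -> ma-1-big-Schema.cql
>     static void write(Path dataFile, String createTableCql) throws IOException
>     {
>         String name = dataFile.getFileName().toString().replace("-Data.db", "-Schema.cql");
>         Files.write(dataFile.resolveSibling(name), createTableCql.getBytes(StandardCharsets.UTF_8));
>     }
> }
> {code}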



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9736) Add alter statement for MV

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9736:

Component/s: Materialized Views

> Add alter statement for MV
> --
>
> Key: CASSANDRA-9736
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9736
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Materialized Views
>Reporter: Carl Yeksigian
>Assignee: ZhaoYang
>Priority: Major
>  Labels: materializedviews
> Fix For: 4.x
>
>
> {{ALTER MV}} would allow us to drop columns in the base table without first 
> dropping the materialized views, since we'd be able to later drop columns in 
> the MV.
> Also, we should be able to add new columns to the MV; a new builder would 
> have to run to copy the values for these additional columns.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9691) Liberate cassandra.yaml parameter names from their units

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9691:

Component/s: Configuration

> Liberate cassandra.yaml parameter names from their units
> 
>
> Key: CASSANDRA-9691
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9691
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Benedict
>Priority: Minor
> Fix For: 4.x
>
>
> The arbitrary units have bugged me for too long. 
> I would like to see, instead of {{trickle_fsync_interval_in_kb=1024}}, 
> {{trickle_fsync_interval=1MiB}}; and instead of {{max_hint_window_in_ms = 3 * 
> 3600 * 1000}}, {{max_hint_window=3h}}.
> We aren't currently even consistent: {{counter_cache_save_period}} doesn't 
> specify its units, and {{stream_throughput_outbound_megabits_per_sec=200}} 
>  is tremendously ugly (vs {{inter_dc_stream_throughput_outbound=25MiB/s}} or 
> {{inter_dc_stream_throughput_outbound=200MBit/s}}).
> This is really trivial, and would make our config file look a lot less ugly. 
> Of course, we'll have to support both in parallel for a period.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9745) Better validation for values

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9745:

Component/s: CQL

> Better validation for values
> 
>
> Key: CASSANDRA-9745
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9745
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL
>Reporter: Sylvain Lebresne
>Priority: Minor
> Fix For: 4.x
>
>
> The only server-side validation we currently do for values is that they match 
> their type. Now that we have UDFs, we could allow optionally specifying a 
> (boolean) value validation function on a per-column basis. This would give us 
> a pretty generic way of validating values, which sounds useful to me.
> Once we have that, we could even add some syntactic sugar for some common 
> types of validation, like {{v int[0..100]}} for numbers between 0 and 100, or 
> {{v text[20]}} for a string that is no longer than 20, ... That's more of a 
> follow-up, however.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9367) Expose builtin-in CQL functions via schema data dictionary

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9367:

Component/s: CQL

> Expose builtin-in CQL functions via schema data dictionary
> --
>
> Key: CASSANDRA-9367
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9367
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: CQL
>Reporter: Michaël Figuière
>Priority: Minor
> Fix For: 4.x
>
>
> As CASSANDRA-6717 will not be part of Cassandra 2.2, my comment there needs to 
> be considered for the current {{system.schema_functions}} table, to avoid 
> creating a special case just for 2.2:
> {quote}
> Builtin CQL functions are not described in the {{system}} keyspace in its 
> current representation, this new schema should include them next to the user 
> defined ones as:
> * The {{system}} keyspace and its tables are described in the {{system}} 
> keyspace, therefore it would be consistent.
> * Having builtin CQL functions described there would allow external tools to 
> manipulate all the functions in a similar way.
> * This would document the available builtin functions for users that don't 
> remember the ones available in their current version of Cassandra.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9468) Improvements to BufferPool

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9468:

Component/s: Core

> Improvements to BufferPool
> --
>
> Key: CASSANDRA-9468
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9468
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Priority: Minor
> Fix For: 4.x
>
>
> Following up from CASSANDRA-8897, there are further improvements that can be 
> made to the BufferPool:
> # The common code paths can be made non-atomic.
> # The chunk pool can be turned into a Stack, instead of a Queue, to improve 
> the likelihood of cache presence
> # The chunk pool can be made processor-local, using e.g. 
> [https://github.com/OpenHFT/Java-Thread-Affinity]
> # We can support smaller allocations by creating micro-chunks within each 
> local pool, by allocating a single unit from the current chunk (or multiple 
> units if we're about to discard a chunk that is not fully utilised).
> #* It should be possible to generalise this approach to make the entire 
> allocation stack tiered, so that whenever you want a new chunk you go to the 
> parent chunk that is an order of magnitude larger, and allocate a small slice 
> (which you convert into a Chunk). Slices below a certain size can be taken 
> exclusive ownership of, and above a certain size they remain shared.
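> A rough sketch of ideas 1-3 taken together (sizes and names are illustrative, not 
> the real BufferPool; a per-thread stack is shown as a simpler stand-in for the 
> processor-local pool): the most recently freed - and therefore cache-warm - chunk 
> is reused first, with no atomics on the common path.
> {code}
> import java.nio.ByteBuffer;
> import java.util.ArrayDeque;
> 
> class LocalChunkStack
> {
>     private static final int CHUNK_SIZE = 64 * 1024;
> 
>     private static final ThreadLocal<ArrayDeque<ByteBuffer>> STACK =
>             ThreadLocal.withInitial(ArrayDeque::new);
> 
>     static ByteBuffer take()
>     {
>         ByteBuffer chunk = STACK.get().pollFirst();   // LIFO: warmest chunk first
>         return chunk != null ? chunk : ByteBuffer.allocateDirect(CHUNK_SIZE);
>     }
> 
>     static void recycle(ByteBuffer chunk)
>     {
>         chunk.clear();
>         STACK.get().addFirst(chunk);
>     }
> }
> {code}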



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9555) Don't let offline tools run while cassandra is running

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9555:

Component/s: Tools

> Don't let offline tools run while cassandra is running
> --
>
> Key: CASSANDRA-9555
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9555
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Marcus Eriksson
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 4.x
>
>
> We should not let offline tools that modify sstables run while Cassandra is 
> running. 
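> One possible mechanism, sketched with purely hypothetical names: the running daemon 
> holds an exclusive OS-level lock on a well-known file, and offline tools refuse to 
> start if they cannot acquire it.
> {code}
> import java.io.IOException;
> import java.nio.channels.FileChannel;
> import java.nio.channels.FileLock;
> import java.nio.file.Path;
> import java.nio.file.StandardOpenOption;
> 
> class InstanceLock
> {
>     static FileLock acquireOrFail(Path dataDir) throws IOException
>     {
>         Path lockFile = dataDir.resolve(".cassandra.lock"); // hypothetical lock file
>         FileChannel channel = FileChannel.open(lockFile,
>                                                StandardOpenOption.CREATE,
>                                                StandardOpenOption.WRITE);
>         FileLock lock = channel.tryLock();                  // null if another process holds it
>         if (lock == null)
>             throw new IOException("Cassandra appears to be running; refusing to modify sstables");
>         return lock;
>     }
> }
> {code}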



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-8354) A better story for dealing with empty values

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-8354:

Component/s: CQL

> A better story for dealing with empty values
> 
>
> Key: CASSANDRA-8354
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8354
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Sylvain Lebresne
>Priority: Major
> Fix For: 4.x
>
>
> In CQL, a value of any type can be "empty", even for types for which such 
> values don't make any sense (int, uuid, ...). Note that it's different from 
> having no value (i.e. a {{null}}). This is due to historical reasons, and we 
> can't entirely disallow it for backward compatibility, but it's pretty 
> painful when working with CQL since you always need to be defensive about 
> such largely nonsensical values.
> This is particularly annoying with UDFs: those empty values are represented as 
> {{null}} for UDFs, and that plays weirdly with UDFs that use unboxed native 
> types.
> So I would suggest that we introduce variations of the types that don't 
> accept empty byte buffers, for those types for which it's not a particularly 
> sensible value.
> Ideally we'd use those variants by default, that is:
> {noformat}
> CREATE TABLE foo (k text PRIMARY KEY, v int)
> {noformat}
> would not accept empty values for {{v}}. But
> {noformat}
> CREATE TABLE foo (k text PRIMARY KEY, v int ALLOW EMPTY)
> {noformat}
> would.
> Similarly, for UDF, a function like:
> {noformat}
> CREATE FUNCTION incr(v int) RETURNS int LANGUAGE JAVA AS 'return v + 1';
> {noformat}
> would be guaranteed to only be applied where no empty values are allowed. 
> A
> function that wants to handle empty values could be created with:
> {noformat}
> CREATE FUNCTION incr(v int ALLOW EMPTY) RETURNS int ALLOW EMPTY LANGUAGE JAVA 
> AS 'return (v == null) ? null : v + 1';
> {noformat}
> Of course, doing that has the problem of backward compatibility. One option 
> could be to say that if a type doesn't accept empties, but we do have an 
> empty internally, then we convert it to some reasonably sensible default 
> value (0 for numeric values, the smallest possible uuid for uuids, etc...). 
> This way, we could allow conversion of types to and from 'ALLOW EMPTY'. And 
> maybe we'd say that existing compact tables get the 'ALLOW EMPTY' flag for 
> their types by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-8449) Allow zero-copy reads again

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-8449:

Component/s: Local Write-Read Paths

> Allow zero-copy reads again
> ---
>
> Key: CASSANDRA-8449
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8449
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: T Jake Luciani
>Priority: Minor
>  Labels: performance
> Fix For: 4.x
>
>
> We disabled zero-copy reads in CASSANDRA-3179 due to in-flight reads 
> accessing a ByteBuffer when the data was unmapped by compaction.  Currently 
> this code path is only used for uncompressed reads.
> The actual bytes are in fact copied to the client output buffers for both 
> netty and thrift before being sent over the wire, so the only issue really is 
> the time it takes to process the read internally.  
> This patch adds a slow network read test and changes the tidy() method to 
> actually delete an sstable once the readTimeout has elapsed, giving plenty of 
> time to serialize the read.
> Removing this copy causes significantly less GC on the read path and improves 
> the tail latencies:
> http://cstar.datastax.com/graph?stats=c0c8ce16-7fea-11e4-959d-42010af0688f=gc_count=2_read=1_aggregates=true=0=109.34=0=5.5
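> A sketch of the tidy() change described above (names are illustrative, not the 
> actual patch): instead of deleting the sstable files immediately, the deletion is 
> scheduled for after the read timeout, so any in-flight zero-copy read has finished 
> with the mapped buffer.
> {code}
> import java.io.File;
> import java.util.concurrent.Executors;
> import java.util.concurrent.ScheduledExecutorService;
> import java.util.concurrent.TimeUnit;
> 
> class DelayedTidy
> {
>     private static final ScheduledExecutorService EXECUTOR =
>             Executors.newSingleThreadScheduledExecutor();
> 
>     static void scheduleDelete(File sstableFile, long readTimeoutMillis)
>     {
>         // give in-flight reads the full read timeout before the file disappears
>         EXECUTOR.schedule(() -> sstableFile.delete(), readTimeoutMillis, TimeUnit.MILLISECONDS);
>     }
> }
> {code}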



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-8465) Phase 1: Break static methods into classes

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-8465:

Component/s: Core

> Phase 1: Break static methods into classes
> --
>
> Key: CASSANDRA-8465
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8465
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Joshua McKenzie
>Priority: Major
> Fix For: 4.x
>
>
> 1:  Writes
> * Regular
> * Counter
> * RegularBatch
> * CounterBatch
> * AtomicBatch
> 2:  Reads
> * Regular
> * Range
> 3:  LightweightTransaction
> * Write
> * Read
> 4: Truncate



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-8132) Save or stream hints to a safe place in node replacement

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-8132:

Component/s: Hints

> Save or stream hints to a safe place in node replacement
> 
>
> Key: CASSANDRA-8132
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8132
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Hints
>Reporter: Minh Do
>Assignee: Minh Do
>Priority: Major
> Fix For: 4.x
>
>
> Often, we need to replace a node with a new instance in a cloud environment 
> where all nodes are still alive. To be safe and avoid losing data, we 
> usually make sure all hints are gone before we do this operation.
> Replacement means we just want to shut down the C* process on a node and bring up 
> another instance to take over that node's token.
> However, if the node to be replaced has a lot of stored hints, its 
> HintedHandOffManager seems very slow to send the hints to other nodes.  In our 
> case, we tried to replace a node and had to wait for several days before its 
> stored hints were cleared out.  As mentioned above, we need all hints on this 
> node to be cleared out before we can terminate it and replace it with a new 
> instance/machine.
> Since this is not a decommission, I am proposing that we have the same 
> hints-streaming mechanism as in the decommission code.  Furthermore, there 
> needs to be a nodetool command to trigger this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-8163) Re-introduce DESCRIBE permission

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-8163:

Component/s: CQL

> Re-introduce DESCRIBE permission
> 
>
> Key: CASSANDRA-8163
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8163
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Vishy Kasar
>Priority: Minor
> Fix For: 4.x
>
>
> We have a cluster like this:
> project1_keyspace
> table101
> table102
> project2_keyspace
> table201
> table202
> We have set up the following users and grants:
> project1_user has all access to project1_keyspace 
> project2_user has all access to project2_keyspace
> However, project1_user can still do a 'describe schema' and get the schema for 
> project2_keyspace as well. We do not want project1_user to have any knowledge 
> of project2 in any way (cqlsh/java-driver etc.).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9264) Cassandra should not persist files without checksums

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9264:

Component/s: Core

> Cassandra should not persist files without checksums
> 
>
> Key: CASSANDRA-9264
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9264
> Project: Cassandra
>  Issue Type: Wish
>  Components: Core
>Reporter: Ariel Weisberg
>Priority: Major
> Fix For: 4.x
>
>
> Even if checksums aren't validated on the read side every time, it is helpful 
> to have files persisted with checksums so that if a corrupted file is 
> encountered you can at least validate that the issue is corruption and not an 
> application-level error that generated a corrupt file.
> We should standardize on conventions for how to checksum a file and which 
> checksums to use so we can ensure we get the best performance possible.
> For a small checksum I think we should use CRC32 because the hardware support 
> appears quite good.
> For cases where a 4-byte checksum is not enough I think we can look at either 
> xxhash64 or MurmurHash3.
> The problem with xxhash64 is that its output is only 8 bytes. The problem with 
> MurmurHash3 is that the Java implementation is slow. If we can live with 
> 8 bytes and make it easy to switch hash implementations, I think xxhash64 is a 
> good choice because we already ship a good implementation with LZ4.
> I would also like to see hashes always prefixed by a type so that we can swap 
> hashes without running into pain trying to figure out what hash 
> implementation is present. I would also like to avoid making assumptions 
> about the number of bytes in a hash field where possible, keeping in mind 
> compatibility and space issues.
> Hashing after compression is also desirable over hashing before compression.
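> A small sketch of the "type-prefixed checksum" idea; only CRC32 is shown, and the 
> one-byte type codes are made up for illustration:
> {code}
> import java.nio.ByteBuffer;
> import java.util.zip.CRC32;
> 
> class TypedChecksum
> {
>     static final byte TYPE_CRC32 = 1;      // 4-byte digest
>     // a TYPE_XXHASH64 = 2 would carry an 8-byte digest instead
> 
>     // checksum the (already compressed) bytes and prefix the digest with its type
>     static ByteBuffer crc32(ByteBuffer data)
>     {
>         CRC32 crc = new CRC32();
>         crc.update(data.duplicate());
>         ByteBuffer out = ByteBuffer.allocate(1 + Integer.BYTES);
>         out.put(TYPE_CRC32);
>         out.putInt((int) crc.getValue());
>         out.flip();
>         return out;
>     }
> }
> {code}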



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-8094) Heavy writes in RangeSlice read requests

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-8094:

Component/s: Coordination

> Heavy writes in RangeSlice read  requests 
> --
>
> Key: CASSANDRA-8094
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8094
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Minh Do
>Priority: Major
>  Labels: lhf
> Fix For: 4.x
>
>
> RangeSlice requests always do a scheduled read repair when coordinators try 
> to resolve replicas' responses, no matter whether read_repair_chance is set or not.
> Because of this, in low-write, high-read clusters, there is a very high 
> volume of write requests going on between nodes.  
> We should have an option to turn this off, and it can be separate from 
> read_repair_chance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-8100) Remove payload size from intra-cluster messages

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-8100:

Component/s: Streaming and Messaging

> Remove payload size from intra-cluster messages
> ---
>
> Key: CASSANDRA-8100
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8100
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Streaming and Messaging
>Reporter: Sylvain Lebresne
>Priority: Major
> Fix For: 4.x
>
>
> Intra-cluster messages ship with the [size of their 
> payload|https://github.com/apache/cassandra/blob/8d8fed52242c34b477d0384ba1d1ce3978efbbe8/src/java/org/apache/cassandra/net/MessageOut.java#L118]
>  before said payload. We mostly don't need it however as deserializers don't 
> rely on it to know when to stop. The [only reason we need 
> it|https://github.com/apache/cassandra/blob/8d8fed52242c34b477d0384ba1d1ce3978efbbe8/src/java/org/apache/cassandra/net/MessageIn.java#L86]
>  is that all response messages use the same {{Verb}}, and so in that case we 
> use the message callback to find out the proper serializer; but in the 
> (unlikely) case where the callback has expired, we don't know which 
> deserializer to use, and so we use the payload size to skip the payload.
> Having to ship the payload size means we need to be able to compute it, which 
> means we have to implement serializedSize for all of our serializers (not 
> a huge deal but annoying). More importantly, this makes it impossible to 
> write a request response truly incrementally (CASSANDRA-8099), since you have 
> to buffer everything in memory just to compute the serialized size.
> So I propose we remove the payload size from messages. Instead, we can assign a 
> specific {{Verb}} to each response type (getting rid of 
> {{CallbackDeterminedSerializer}}). We can then get rid of all the 
> serializedSize methods (for upgrade, when we need to generate messages to old 
> nodes, we could just write the message in a ByteBuffer, write the resulting 
> size and then write the buffer).
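> A sketch of that upgrade path (the {{PayloadWriter}} interface is a hypothetical 
> stand-in for a message serializer): when talking to an old node that still expects 
> a size, serialize into a scratch buffer first, then write the length followed by 
> the bytes.
> {code}
> import java.io.ByteArrayOutputStream;
> import java.io.DataOutputStream;
> import java.io.IOException;
> 
> interface PayloadWriter
> {
>     void write(DataOutputStream out) throws IOException;
> }
> 
> class LegacyMessageWriter
> {
>     static void writeWithSize(DataOutputStream out, PayloadWriter payload) throws IOException
>     {
>         ByteArrayOutputStream scratch = new ByteArrayOutputStream();
>         payload.write(new DataOutputStream(scratch));  // buffer only for old-version peers
>         byte[] bytes = scratch.toByteArray();
>         out.writeInt(bytes.length);                    // the size prefix old nodes expect
>         out.write(bytes);
>     }
> }
> {code}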



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-9094) Check for fully expired sstables more often

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-9094:

Component/s: Compaction

> Check for fully expired sstables more often
> ---
>
> Key: CASSANDRA-9094
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9094
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Marcus Eriksson
>Priority: Minor
>  Labels: dtcs
> Fix For: 4.x
>
>
> CASSANDRA-8359 added an extra check for expired sstables to DTCS since that 
> is where it is most likely to happen.
> We should refactor this a bit and check more often for all compaction 
> strategies (and avoid checking twice like we do now with DTCS).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-7975) Get rid of pre-2.1 created local and remote counter shards

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-7975:

Component/s: Core

> Get rid of pre-2.1 created local and remote counter shards
> --
>
> Key: CASSANDRA-7975
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7975
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Aleksey Yeschenko
>Priority: Major
>  Labels: counters
> Fix For: 4.x
>
>
> As a prerequisite for CASSANDRA-6506, we must get rid of the legacy local 
> and remote shards. Collapse them, and convert to global ones - although 
> simply collapsing all the local ones could be enough, if we then simply 
> ignore the shard type during 6506. This is easy to do with SizeTiered, but 
> for Leveled we need CASSANDRA-7019 in, first.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-8304) Explore evicting replacement state sooner

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-8304:

Component/s: Lifecycle
 Distributed Metadata

> Explore evicting replacement state sooner
> -
>
> Key: CASSANDRA-8304
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8304
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Distributed Metadata, Lifecycle
>Reporter: Brandon Williams
>Assignee: Brandon Williams
>Priority: Major
> Fix For: 4.x
>
>
> As a follow-up to CASSANDRA-8260, Tyler suggests we evict and quarantine in 
> step 2.  I don't want to change any core gossip logic in 2.0 at this point, 
> but his theory seems feasible and we should explore it for 3.0.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org


