[jira] [Commented] (CASSANDRA-19661) Cannot restart Cassandra 5 after creating a vector table and index

2024-05-24 Thread Jonathan Ellis (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849380#comment-17849380
 ] 

Jonathan Ellis commented on CASSANDRA-19661:


> It looks like perhaps {{computeRowIds()}} is being called multiple times when 
> it shouldn't

Agreed that it's not intended to be called multiple times.

> Is it possible for multiple keys in {{postingsMap}} to point to the same 
> {{VectorPostings}} instance?

I don't think that's possible: the only mutation against {{postingsMap}} is done 
with a freshly instantiated {{VectorPostings}}.
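
For readers landing on the stack trace below, here is a minimal sketch, assuming a 
hypothetical {{VectorPostings}} with a one-shot guard (this is not the actual 
Cassandra source), of how {{Preconditions.checkState}} would surface as 
{{IllegalStateException: null}} if {{computeRowIds()}} ran twice against the same 
instance:

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

import com.google.common.base.Preconditions;

// Hypothetical sketch: names mirror the discussion above, not the real implementation.
class VectorPostings<T>
{
    private final List<T> postings = new ArrayList<>();
    private List<Integer> rowIds; // stays null until computeRowIds() has run exactly once

    void computeRowIds(ToIntFunction<T> postingToRowId)
    {
        // A second call trips this guard and surfaces as "IllegalStateException: null"
        Preconditions.checkState(rowIds == null);

        List<Integer> ids = new ArrayList<>(postings.size());
        for (T posting : postings)
            ids.add(postingToRowId.applyAsInt(posting));
        rowIds = ids;
    }
}
{code}

Under that reading, either a repeated call on the same instance or two 
{{postingsMap}} keys aliasing one instance would trip the guard; the comments 
above argue the aliasing case shouldn't happen.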

> Cannot restart Cassandra 5 after creating a vector table and index
> --
>
> Key: CASSANDRA-19661
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19661
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SAI
>Reporter: Sergio Rua
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> I'm using llama-index and llama3 to train a model. I'm using a very simple 
> script that reads some *.txt files from local disk, uploads them to Cassandra, 
> and then creates the index:
>  
> {code:python}
> # Create the index from documents
> index = VectorStoreIndex.from_documents(
>     documents,
>     service_context=vector_store.service_context,
>     storage_context=storage_context,
>     show_progress=True,
> )
> {code}
> This works well and I'm able to use a chat app to get responses from the 
> Cassandra data. However, right after that, I cannot restart Cassandra. It'll 
> break with the following error:
>  
> {code:java}
> INFO  [PerDiskMemtableFlushWriter_0:7] 2024-05-23 08:23:20,102 
> Flushing.java:179 - Completed flushing 
> /data/cassandra/data/gpt/docs_20240523-10c8eaa018d811ef8dadf75182f3e2b4/da-6-bti-Data.db
>  (124.236MiB) for commitlog position 
> CommitLogPosition(segmentId=1716452305636, position=15336)
> [...]
> WARN  [MemtableFlushWriter:1] 2024-05-23 08:28:29,575 
> MemtableIndexWriter.java:92 - [gpt.docs.idx_vector_docs] Aborting index 
> memtable flush for 
> /data/cassandra/data/gpt/docs-aea77a80184b11ef8dadf75182f3e2b4/da-3-bti...{code}
> {code:java}
> java.lang.IllegalStateException: null
>         at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:496)
>         at 
> org.apache.cassandra.index.sai.disk.v1.vector.VectorPostings.computeRowIds(VectorPostings.java:76)
>         at 
> org.apache.cassandra.index.sai.disk.v1.vector.OnHeapGraph.writeData(OnHeapGraph.java:313)
>         at 
> org.apache.cassandra.index.sai.memory.VectorMemoryIndex.writeDirect(VectorMemoryIndex.java:272)
>         at 
> org.apache.cassandra.index.sai.memory.MemtableIndex.writeDirect(MemtableIndex.java:110)
>         at 
> org.apache.cassandra.index.sai.disk.v1.MemtableIndexWriter.flushVectorIndex(MemtableIndexWriter.java:192)
>         at 
> org.apache.cassandra.index.sai.disk.v1.MemtableIndexWriter.complete(MemtableIndexWriter.java:117)
>         at 
> org.apache.cassandra.index.sai.disk.StorageAttachedIndexWriter.complete(StorageAttachedIndexWriter.java:185)
>         at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
>         at 
> java.base/java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1085)
>         at 
> org.apache.cassandra.io.sstable.format.SSTableWriter.commit(SSTableWriter.java:289)
>         at 
> org.apache.cassandra.db.compaction.unified.ShardedMultiWriter.commit(ShardedMultiWriter.java:219)
>         at 
> org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1323)
>         at 
> org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1222)
>         at 
> org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
>         at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>         at java.base/java.lang.Thread.run(Thread.java:829) {code}
> The table created by the script is as follows:
>  
> {noformat}
> CREATE TABLE gpt.docs (
> partition_id text,
> row_id text,
> attributes_blob text,
> body_blob text,
> vector vector,
> metadata_s map,
> PRIMARY KEY (partition_id, row_id)
> ) WITH CLUSTERING ORDER BY (row_id ASC)
> AND additional_write_policy = '99p'
> AND allow_auto_snapshot = true
> AND bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND cdc = false
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.UnifiedCompactionStrategy', 
> 'scaling_parameters': 'T4', 'target_sstable_size': '1GiB'}
> 

[jira] [Commented] (CASSANDRA-18715) Add support for vector search in SAI

2023-10-26 Thread Jonathan Ellis (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779948#comment-17779948
 ] 

Jonathan Ellis commented on CASSANDRA-18715:


As the primary author of the new JVector dependency, I can also verify that my 
code, while not a contribution to ASF, is my original work. It also includes 
complete details of any third-party licenses or restrictions I am aware of, in 
line with the spirit of clauses #5 and #7 of the ASF ICLA.

> Add support for vector search in SAI
> 
>
> Key: CASSANDRA-18715
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18715
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Vector Search
>Reporter: Mike Adamson
>Assignee: Mike Adamson
>Priority: Normal
> Fix For: 5.0-beta, 5.x
>
>  Time Spent: 24h 50m
>  Remaining Estimate: 0h
>
> The patch associated with this ticket adds a new vector index to SAI. This 
> introduces the following new elements and changes to SAI:
>  * VectorMemtableIndex - the in-memory representation of the vector indexes 
> that writes data to a DiskANN instance
>  * VectorSegmentBuilder - that writes a DiskANN graph to the following 
> on-disk components:
>  ** VECTOR - contains the floating point vectors associated with the graph
>  ** TERMS - contains the HNSW graph on-disk representation written by a 
> HnswGraphWriter
>  ** POSTINGS - contains the index postings as written by a 
> VectorPostingsWriter
>  * VectorIndexSegmentSearcher - used to search the on-disk DiskANN graph






[jira] [Updated] (CASSANDRA-18715) Add support for vector search in SAI

2023-10-12 Thread Jonathan Ellis (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-18715:
---
Reviewers: Andres de la Peña, Jonathan Ellis  (was: Andres de la Peña)

> Add support for vector search in SAI
> 
>
> Key: CASSANDRA-18715
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18715
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Vector Search
>Reporter: Mike Adamson
>Assignee: Mike Adamson
>Priority: Normal
> Fix For: 5.0-beta, 5.x
>
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> The patch associated with this ticket adds a new vector index to SAI. This 
> introduces the following new elements and changes to SAI:
>  * VectorMemtableIndex - the in-memory representation of the vector indexes 
> that writes data to a CassandraOnHeapHnsw instance
>  * VectorSegmentBuilder - that writes a HNSW graph to the following on-disk 
> components:
>  ** VECTOR - contains the floating point vectors associated with the graph
>  ** TERMS - contains the HNSW graph on-disk representation written by a 
> HnswGraphWriter
>  ** POSTINGS - contains the index postings as written by a 
> VectorPostingsWriter
>  * VectorIndexSegmentSearcher - used to search the on-disk HNSW index






[jira] [Created] (CASSANDRA-18793) Fix or remove nodetool compaction completion time estimate

2023-08-23 Thread Jonathan Ellis (Jira)
Jonathan Ellis created CASSANDRA-18793:
--

 Summary: Fix or remove nodetool compaction completion time estimate
 Key: CASSANDRA-18793
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18793
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis


It looks like, many years ago, the author of this code thought that the 
compactionthroughput reported by the server was the current rate, when it is 
actually the configured maximum.  So using it to estimate the time remaining is 
not useful.
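
A back-of-the-envelope illustration of the distortion (all numbers and names 
below are assumed for illustration, not taken from nodetool):

{code:java}
// Illustrative arithmetic only; variable names are hypothetical, not nodetool's.
public class CompactionEtaSketch
{
    public static void main(String[] args)
    {
        long remainingBytes = 100L * 1024 * 1024 * 1024;    // 100 GiB left to compact
        long configuredCapBytesPerSec = 64L * 1024 * 1024;  // configured maximum (what the estimate uses)
        long observedBytesPerSec = 8L * 1024 * 1024;        // rate actually being achieved

        long reportedEta = remainingBytes / configuredCapBytesPerSec; // 1,600 s (~27 minutes) reported
        long realisticEta = remainingBytes / observedBytesPerSec;     // 12,800 s (~3.5 hours) in practice

        System.out.println(reportedEta + "s reported vs " + realisticEta + "s realistic");
    }
}
{code}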

 






[jira] [Commented] (CASSANDRA-18280) Investigate initial size of GrowableByteArrayDataOutput in RAMIndexOutput

2023-06-28 Thread Jonathan Ellis (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17738283#comment-17738283
 ] 

Jonathan Ellis commented on CASSANDRA-18280:


(This has moved to {{ResettableByteBuffersIndexOutput}}.)

> Investigate initial size of GrowableByteArrayDataOutput in RAMIndexOutput
> -
>
> Key: CASSANDRA-18280
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18280
> Project: Cassandra
>  Issue Type: Task
>  Components: Feature/SAI
>Reporter: Mike Adamson
>Priority: Normal
>
> The GrowableByteArrayDataOutput in RAMIndexOutput is currently initialized 
> with a size of 128 bytes. There is no explanation as to why this size was 
> chosen. 
> The GrowableByteArrayDataOutput does not lazily allocate memory but only ever 
> allocates enough for each write operation. This can lead to a lot of fresh 
> allocations and calls to System.arraycopy. 
> Since RAMIndexOutput is used to build the on-disk postings in SAI, it is 
> likely that the in-memory array is going to grow considerably beyond 128 
> bytes.
> We should investigate changing this initial value to something higher, and 
> possibly changing the GrowableByteArrayDataOutput class to allocate in blocks 
> rather than in write-sized increments.
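
A minimal sketch of the two allocation strategies being compared, using a 
hypothetical buffer class rather than the actual {{GrowableByteArrayDataOutput}}:

{code:java}
import java.util.Arrays;

// Hypothetical buffer used only to illustrate the allocation patterns discussed above.
class GrowableBuffer
{
    private byte[] buf = new byte[128]; // the initial size in question
    private int len;

    // Grows only to the exact size each write needs: once the buffer is full,
    // nearly every write triggers a fresh allocation plus an arraycopy of all prior bytes.
    void writeExactGrowth(byte[] src)
    {
        if (len + src.length > buf.length)
            buf = Arrays.copyOf(buf, len + src.length);
        System.arraycopy(src, 0, buf, len, src.length);
        len += src.length;
    }

    // Grows in larger blocks (doubling here), so reallocations are amortized across many writes.
    void writeBlockGrowth(byte[] src)
    {
        if (len + src.length > buf.length)
            buf = Arrays.copyOf(buf, Math.max(buf.length * 2, len + src.length));
        System.arraycopy(src, 0, buf, len, src.length);
        len += src.length;
    }
}
{code}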






[jira] [Commented] (CASSANDRA-18615) CREATE INDEX Modifications for Initial Release of SAI

2023-06-26 Thread Jonathan Ellis (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737381#comment-17737381
 ] 

Jonathan Ellis commented on CASSANDRA-18615:


I don't think the name {{cassandra}} is very descriptive to end users.  How 
about {{hash_legacy}}, {{equalityonly_legacy}}, or something like that?

> CREATE INDEX Modifications for Initial Release of SAI
> -
>
> Key: CASSANDRA-18615
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18615
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL/Syntax, Feature/SAI
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: NA
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> After a lengthy discussion on the dev list, the community seems to have 
> arrived at the following list of TODOs before we release SAI in 5.0:
> 1.) CREATE INDEX should be expanded to support {{USING … WITH OPTIONS…}}
> Essentially, we should be able to do something like {{CREATE INDEX ON tbl(v) 
> USING ’sai’ WITH OPTIONS = ...}} and {{CREATE INDEX ON tbl(v) USING 
> ‘cassandra’}} as a more specific/complete way to emulate the current behavior 
> of {{CREATE INDEX}}.
> 2.) Allow operators to configure, in the YAML, a.) whether an index 
> implementation must be specified w/ USING and {{CREATE INDEX}} and b.) what 
> the default implementation will be, if {{USING}} isn’t required.
> 3.) The defaults we ship w/ will avoid breaking existing {{CREATE INDEX}} 
> usage. (i.e. A default is allowed, and that default will remain ‘cassandra’, 
> or the legacy 2i)
> With all this in place, users should be able to create SAI indexes w/ the 
> simplest possible syntax, no defaults will change, and operators will have 
> the ability to change defaults to favor SAI whenever they like.






[jira] [Created] (CASSANDRA-18557) CEP-30 ANN Vector Search with SAI

2023-05-30 Thread Jonathan Ellis (Jira)
Jonathan Ellis created CASSANDRA-18557:
--

 Summary: CEP-30 ANN Vector Search with SAI
 Key: CASSANDRA-18557
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18557
 Project: Cassandra
  Issue Type: Epic
Reporter: Jonathan Ellis


[CEP-30|https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-30%3A+Approximate+Nearest+Neighbor%28ANN%29+Vector+Search+via+Storage-Attached+Indexes]
 # Implement approximate nearest neighbor (ANN) vector search capability in 
Apache Cassandra using storage-attached indexes (SAI).
 # Support a vector of float32 embeddings as a new CQL type.
 # Add ANN search to work with normal Cassandra data flow (insertion, updating, 
and deleting rows). The implementation should support adding a new vector in 
log(N) time, and ANN queries in M log(N) time where N is the number of vectors 
and M is the number of sstables.
 # Compose with other SAI predicates.
 # Scatter/gather across replicas, combining topK from each to get global topK.
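
To make the complexity targets in item 3 concrete, a rough illustration with 
assumed values for N and M (these numbers are placeholders, not measurements):

{code:java}
// Rough illustration of the stated asymptotic goals, with assumed values.
public class AnnCostSketch
{
    public static void main(String[] args)
    {
        int n = 1_000_000;  // N: vectors in the index
        int m = 4;          // M: sstables consulted by one query

        double log2N = Math.log(n) / Math.log(2);               // ~20
        System.out.printf("insert ~ %.0f hops%n", log2N);       // O(log N) to add one vector
        System.out.printf("query  ~ %.0f hops%n", m * log2N);   // O(M log N) per ANN query
    }
}
{code}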






[jira] [Updated] (CASSANDRA-18517) Make Cassandra more user-friendly (admin-friendly) by cleaning up logging

2023-05-10 Thread Jonathan Ellis (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-18517:
---
Fix Version/s: 5.0

> Make Cassandra more user-friendly (admin-friendly) by cleaning up logging
> -
>
> Key: CASSANDRA-18517
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18517
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Logging
>Reporter: Jonathan Ellis
>Assignee: Jonathan Ellis
>Priority: Normal
> Fix For: 5.0
>
>
> At a very high level we see two types of users:
> 1. Early adopters generally want to understand Cassandra's design, how the 
> pieces fit together, and why
> 2. Mainstream users just want it to work, preferably with as little attention 
> from them as possible
>  
> Group 1 loves verbose logs because it helps them figure out what's going on 
> under the hood.  Group 2 sees verbose logs as intimidating ("this is going to 
> be hard") if not scary ("this isn't debugged enough yet").
>  
> Early on, group 1 users predominate.  But there's way more group 2 users out 
> there now and as Cassandra sees more adoption it will primarily come from 
> them.
>  
> It's time to start optimizing for group 2.  Group 1 is, after all, completely 
> capable of adjusting the log levels themselves when necessary.
>  
> A good rule of thumb is, "is this necessary information for my day-to-day 
> operation of the system?"  If not, it should be at debug (or sometimes trace).
> Compare [our startup 
> logging|https://gist.githubusercontent.com/jbellis/a1f324c145c4d46c3409969393b24077/raw/a767e4b92c29a51130940326de619a3c418096b7/gistfile1.txt]
>  with 
> [postgresql's.|https://gist.githubusercontent.com/jbellis/a9585d3f04f49c7e3685c9122975d51b/raw/beea37fd0834d1c8f38e5c1667623aa37598d3dd/gistfile1.txt]






[jira] [Updated] (CASSANDRA-18517) Make Cassandra more user-friendly (admin-friendly) by cleaning up logging

2023-05-10 Thread Jonathan Ellis (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-18517:
---
  Workflow: Copy of Cassandra Default Workflow  (was: Copy of Cassandra Bug 
Workflow)
Issue Type: Improvement  (was: Bug)

> Make Cassandra more user-friendly (admin-friendly) by cleaning up logging
> -
>
> Key: CASSANDRA-18517
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18517
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Logging
>Reporter: Jonathan Ellis
>Assignee: Jonathan Ellis
>Priority: Normal
>
> At a very high level we see two types of users:
> 1. Early adopters generally want to understand Cassandra's design, how the 
> pieces fit together, and why
> 2. Mainstream users just want it to work, preferably with as little attention 
> from them as possible
>  
> Group 1 loves verbose logs because it helps them figure out what's going on 
> under the hood.  Group 2 sees verbose logs as intimidating ("this is going to 
> be hard") if not scary ("this isn't debugged enough yet").
>  
> Early on, group 1 users predominate.  But there's way more group 2 users out 
> there now and as Cassandra sees more adoption it will primarily come from 
> them.
>  
> It's time to start optimizing for group 2.  Group 1 is, after all, completely 
> capable of adjusting the log levels themselves when necessary.
>  
> A good rule of thumb is, "is this necessary information for my day-to-day 
> operation of the system?"  If not, it should be at debug (or sometimes trace).
> Compare [our startup 
> logging|https://gist.githubusercontent.com/jbellis/a1f324c145c4d46c3409969393b24077/raw/a767e4b92c29a51130940326de619a3c418096b7/gistfile1.txt]
>  with 
> [postgresql's.|https://gist.githubusercontent.com/jbellis/a9585d3f04f49c7e3685c9122975d51b/raw/beea37fd0834d1c8f38e5c1667623aa37598d3dd/gistfile1.txt]






[jira] [Created] (CASSANDRA-18517) Make Cassandra more user-friendly (admin-friendly) by cleaning up logging

2023-05-10 Thread Jonathan Ellis (Jira)
Jonathan Ellis created CASSANDRA-18517:
--

 Summary: Make Cassandra more user-friendly (admin-friendly) by 
cleaning up logging
 Key: CASSANDRA-18517
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18517
 Project: Cassandra
  Issue Type: Bug
  Components: Observability/Logging
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis


At a very high level we see two types of users:

1. Early adopters generally want to understand Cassandra's design, how the 
pieces fit together, and why
2. Mainstream users just want it to work, preferably with as little attention 
from them as possible
 
Group 1 loves verbose logs because it helps them figure out what's going on 
under the hood.  Group 2 sees verbose logs as intimidating ("this is going to 
be hard") if not scary ("this isn't debugged enough yet").
 
Early on, group 1 users predominate.  But there's way more group 2 users out 
there now and as Cassandra sees more adoption it will primarily come from them.
 
It's time to start optimizing for group 2.  Group 1 is, after all, completely 
capable of adjusting the log levels themselves when necessary.
 
A good rule of thumb is, "is this necessary information for my day-to-day 
operation of the system?"  If not, it should be at debug (or sometimes trace).

Compare [our startup 
logging|https://gist.githubusercontent.com/jbellis/a1f324c145c4d46c3409969393b24077/raw/a767e4b92c29a51130940326de619a3c418096b7/gistfile1.txt]
 with 
[postgresql's.|https://gist.githubusercontent.com/jbellis/a9585d3f04f49c7e3685c9122975d51b/raw/beea37fd0834d1c8f38e5c1667623aa37598d3dd/gistfile1.txt]






[jira] [Updated] (CASSANDRA-18504) Added support for type VECTOR

2023-05-06 Thread Jonathan Ellis (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-18504:
---
Reviewers: Mike Adamson

> Added support for type VECTOR
> --
>
> Key: CASSANDRA-18504
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18504
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Schema, CQL/Syntax
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Based on several mailing list threads (see "[POLL] Vector type for ML”, 
> "[DISCUSS] New data type for vector search”, and "Adding vector search to SAI 
> with heirarchical navigable small world graph index”), it's desirable to add a 
> new type “VECTOR” that has the following properties:
> 1) fixed length array
> 2) elements may not be null
> 3) flattened array (aka multi-cell = false)
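
A small sketch of what those three properties mean for an individual value, using 
a hypothetical validator (not the committed type implementation):

{code:java}
import java.util.List;

// Hypothetical check illustrating the three properties listed above.
final class VectorValueSketch
{
    static void validate(List<Float> value, int dimension)
    {
        // 1) fixed length: the dimension is part of the type, every value must match it
        if (value.size() != dimension)
            throw new IllegalArgumentException("expected exactly " + dimension + " elements");

        // 2) elements may not be null
        for (Float element : value)
            if (element == null)
                throw new IllegalArgumentException("vector elements may not be null");

        // 3) multi-cell = false: the whole vector is written and read as one frozen cell,
        //    so there is no per-element update path to validate here.
    }
}
{code}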






[jira] [Commented] (CASSANDRA-13981) Enable Cassandra for Persistent Memory

2023-05-02 Thread Jonathan Ellis (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17718713#comment-17718713
 ] 

Jonathan Ellis commented on CASSANDRA-13981:


I'm not aware of any mainstream persistent memory now that Optane is dead.

> Enable Cassandra for Persistent Memory 
> ---
>
> Key: CASSANDRA-13981
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13981
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Core
>Reporter: Preetika Tyagi
>Assignee: Preetika Tyagi
>Priority: Normal
> Fix For: 5.x
>
> Attachments: in-mem-cassandra-1.0.patch, in-mem-cassandra-2.0.patch, 
> in-mem-cassandra-2.1.patch, readme.txt, readme2.1.txt, readme2_0.txt
>
>
> Currently, Cassandra relies on disks for data storage and hence it needs data 
> serialization, compaction, bloom filters and partition summary/index for 
> speedy access of the data. However, with persistent memory, data can be 
> stored directly in the form of Java objects and collections, which can 
> greatly simplify the retrieval mechanism of the data. What we are proposing 
> is to make use of faster and scalable B+ tree-based data collections built 
> for persistent memory in Java (PCJ: https://github.com/pmem/pcj) and enable a 
> complete in-memory version of Cassandra, while still keeping the data 
> persistent.






[jira] [Commented] (CASSANDRA-9829) Dynamically adjust LCS level sizes

2023-04-24 Thread Jonathan Ellis (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-9829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17715858#comment-17715858
 ] 

Jonathan Ellis commented on CASSANDRA-9829:
---

Do we think UCS replaces substantially all LCS use cases, or just many of them?

> Dynamically adjust LCS level sizes
> --
>
> Key: CASSANDRA-9829
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9829
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Local/Compaction
>Reporter: Jonathan Ellis
>Priority: Normal
>  Labels: compaction, lcs, performance
> Fix For: 5.x
>
>
> LCS works best when the top level is full.  Then 90% of reads can be served 
> from a single sstable.  By contrast if the top level is only 10% full then 
> 90% of reads will be served from two.  This results in worse performance as 
> well as confused users.
> To address this, we can adjust the ideal top level size to how much data is 
> actually in it (and set each corresponding lower level to 1/10 of the next 
> one above).
> (This is an idea [from 
> rocksdb|https://www.reddit.com/r/IAmA/comments/3de3cv/we_are_rocksdb_engineering_team_ask_us_anything/ct4asen].)
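
A compact sketch of the sizing rule described above, with hypothetical names; 
this illustrates the idea rather than any eventual patch:

{code:java}
// Hypothetical illustration: anchor the ideal size of the top level to the bytes
// actually present there, then make each lower level 1/10 of the level above it.
final class LcsLevelSizingSketch
{
    static long[] idealLevelSizes(int levels, long bytesInTopLevel)
    {
        long[] ideal = new long[levels];
        ideal[levels - 1] = bytesInTopLevel;      // top level: whatever data it really holds
        for (int i = levels - 2; i >= 1; i--)
            ideal[i] = ideal[i + 1] / 10;         // each lower level is 10x smaller
        return ideal;                             // L0 is count-triggered, so it is left at 0 here
    }
}
{code}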






[jira] [Commented] (CASSANDRA-18329) Upgrade jamm

2023-04-18 Thread Jonathan Ellis (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17713812#comment-17713812
 ] 

Jonathan Ellis commented on CASSANDRA-18329:


As a more-actively-maintained alternative to jamm, Lucene has this: 
https://lucene.apache.org/core/9_5_0/core/org/apache/lucene/util/RamUsageEstimator.html
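
For reference, a minimal usage sketch of that Lucene utility, assuming 
lucene-core is on the classpath and using an arbitrary example object:

{code:java}
import org.apache.lucene.util.RamUsageEstimator;

public class RamUsageSketch
{
    public static void main(String[] args)
    {
        long[] postings = new long[1024];

        // Size of a primitive array: object header plus 1024 * 8 bytes, with padding.
        long bytes = RamUsageEstimator.sizeOf(postings);

        System.out.println(RamUsageEstimator.humanReadableUnits(bytes));
    }
}
{code}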

> Upgrade jamm
> 
>
> Key: CASSANDRA-18329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18329
> Project: Cassandra
>  Issue Type: Task
>  Components: Jamm
>Reporter: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.x
>
>
> Jamm is currently undergoing maintenance that will solve the JDK11 issues and 
> enable it to work with versions after JDK11, up to JDK17.
> This ticket will serve as a placeholder for upgrading Jamm in Cassandra when 
> the new Jamm release is out. 






[jira] [Created] (CASSANDRA-18226) Document how multi-dc replication works

2023-02-03 Thread Jonathan Ellis (Jira)
Jonathan Ellis created CASSANDRA-18226:
--

 Summary: Document how multi-dc replication works
 Key: CASSANDRA-18226
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18226
 Project: Cassandra
  Issue Type: Improvement
  Components: Documentation
Reporter: Jonathan Ellis


I can't find a good overview of how multi-dc replication works from an 
operator's perspective.






[jira] [Commented] (CASSANDRA-14248) SSTableIndex should not use Ref#globalCount() to determine when to delete index file

2020-05-04 Thread Jonathan Ellis (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17099309#comment-17099309
 ] 

Jonathan Ellis commented on CASSANDRA-14248:


discoverComponentsFor does find it.

 

Jira is breaking the link by including the close paren, clean link here: 
https://github.com/jbellis/cassandra/tree/14248

> SSTableIndex should not use Ref#globalCount() to determine when to delete 
> index file
> 
>
> Key: CASSANDRA-14248
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14248
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
> Fix For: 3.11.x
>
>
> {{SSTableIndex}} instances maintain a {{Ref}} to the underlying 
> {{SSTableReader}} instance. When determining whether or not to delete the 
> file after the last {{SSTableIndex}} reference is released, the 
> implementation uses {{sstableRef.globalCount()}}: 
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/index/sasi/SSTableIndex.java#L135.]
>  This is incorrect because {{sstableRef.globalCount()}} returns the number of 
> references to the specific instance of {{SSTableReader}}. However, in cases 
> like index summary redistribution, there can be more than one instance of 
> {{SSTableReader}}. Further, since the reader is shared across multiple 
> indexes, not all indexes see the count go to 0. This can lead to cases where 
> the {{SSTableIndex}} file is incorrectly deleted or not deleted when it 
> should be.
>  
> A more correct implementation would be to either:
>  * Tie into the existing {{SSTableTidier}}. SASI indexes already are SSTable 
> components but are not cleaned up by the {{SSTableTidier}} because they are 
> not found by the current cleanup implementation
>  * Revamp {{SSTableIndex}} reference counting to use {{Ref}} and implement a 
> new tidier. 






[jira] [Commented] (CASSANDRA-14248) SSTableIndex should not use Ref#globalCount() to determine when to delete index file

2020-04-23 Thread Jonathan Ellis (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090947#comment-17090947
 ] 

Jonathan Ellis commented on CASSANDRA-14248:


Hi Jordan,

It looks to me like SSTableTidier does in fact include SASI components and the 
only thing stopping it is that SSTableIndex usually deletes them first.  I 
created a patch to remove that redundancy 
([https://github.com/jbellis/cassandra/tree/14248]) and now SSTT is doing its 
job.  Am I missing something?

> SSTableIndex should not use Ref#globalCount() to determine when to delete 
> index file
> 
>
> Key: CASSANDRA-14248
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14248
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
> Fix For: 3.11.x
>
>
> {{SSTableIndex}} instances maintain a {{Ref}} to the underlying 
> {{SSTableReader}} instance. When determining whether or not to delete the 
> file after the last {{SSTableIndex}} reference is released, the 
> implementation uses {{sstableRef.globalCount()}}: 
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/index/sasi/SSTableIndex.java#L135.]
>  This is incorrect because {{sstableRef.globalCount()}} returns the number of 
> references to the specific instance of {{SSTableReader}}. However, in cases 
> like index summary redistribution, there can be more than one instance of 
> {{SSTableReader}}. Further, since the reader is shared across multiple 
> indexes, not all indexes see the count go to 0. This can lead to cases where 
> the {{SSTableIndex}} file is incorrectly deleted or not deleted when it 
> should be.
>  
> A more correct implementation would be to either:
>  * Tie into the existing {{SSTableTidier}}. SASI indexes already are SSTable 
> components but are not cleaned up by the {{SSTableTidier}} because they are 
> not found by the current cleanup implementation
>  * Revamp {{SSTableIndex}} reference counting to use {{Ref}} and implement a 
> new tidier. 






[jira] [Commented] (CASSANDRA-14825) Expose table schema for drivers

2019-04-18 Thread Jonathan Ellis (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16821297#comment-16821297
 ] 

Jonathan Ellis commented on CASSANDRA-14825:


IMO the right UX for interacting with "what tables and keyspaces are defined" 
is a system view and SELECT.  SELECT composes by design with everything else 
(WHERE, even UDF) in a way that DESCRIBE does not.

(It's fine to then say "we're not going to give you the schema in that view, 
you need to use DESCRIBE if you want that" but I don't think that's necessarily 
either better or worse.)

I think it's more important to get the composability right at the CQL level 
than to give an easy way to get a full schema dump.  The latter scenario is 
much more niche, and I'm fine with requiring a for loop to get it.
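
A hedged sketch of that "for loop" approach with the Java driver (4.x style); the 
{{system_schema}} table and column names are the standard ones, but treat the 
snippet as illustrative rather than a recommended implementation:

{code:java}
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.Row;

public class SchemaDumpSketch
{
    public static void main(String[] args)
    {
        try (CqlSession session = CqlSession.builder().build())
        {
            // Enumerate tables via a plain SELECT, which composes with WHERE and the rest
            // of CQL; fetch whatever per-table detail you need in the loop body.
            for (Row row : session.execute(
                    "SELECT keyspace_name, table_name FROM system_schema.tables"))
            {
                System.out.println(row.getString("keyspace_name") + "." + row.getString("table_name"));
            }
        }
    }
}
{code}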

> Expose table schema for drivers
> ---
>
> Key: CASSANDRA-14825
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14825
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/CQL
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently the drivers recreate the CQL for the tables by putting together the 
> system table values. This is very difficult to keep up to date and buggy 
> enough that it's only even supported in the Java and Python drivers. Cassandra 
> already has some limited output available for snapshots that we could provide 
> in a virtual table or new query that the drivers can fetch. This can greatly 
> reduce the complexity of drivers while also reducing bugs like 
> CASSANDRA-14822 as the underlying schema and properties change.






[jira] [Commented] (CASSANDRA-14737) Limit the dependencies used by UDFs/UDAs

2018-09-12 Thread Jonathan Ellis (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612242#comment-16612242
 ] 

Jonathan Ellis commented on CASSANDRA-14737:


I can confirm that DS is contributing a license to the Java driver code 
included here to the ASF.

> Limit the dependencies used by UDFs/UDAs
> 
>
> Key: CASSANDRA-14737
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14737
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
>  Labels: UDF
> Fix For: 4.0
>
>
> In an effort to clean up our hygiene and limit the dependencies used by 
> UDFs/UDAs, I think we should refactor the UDF code parts and remove the 
> dependency on the Java Driver in that area without breaking existing 
> UDFs/UDAs.
>   
> The patch is in [this 
> branch|https://github.com/snazy/cassandra/tree/feature/remove-udf-driver-dep-trunk].
>  The changes are rather trivial and provide 100% backwards compatibility for 
> existing UDFs.
>   
>  The prototype copies the necessary parts from the Java Driver into the C* 
> source tree to {{org.apache.cassandra.cql3.functions.types}} and adopts its 
> usages - i.e. UDF/UDA code plus {{CQLSSTableWriter}} + 
> {{StressCQLSSTableWriter}}. The latter two classes have a reference to UDF's 
> {{UDHelper}} and had to be changed as well.
>   
>  Some functionality, like type parsing & handling, is duplicated in the code 
> base with this prototype - once in the "current" source tree and once for 
> UDFs. However, unifying the code paths is not trivial, since the UDF sandbox 
> prohibits the use of internal classes (direct and likely indirect 
> dependencies).
>   
>  /cc [~jbellis] 
>   






[jira] [Commented] (CASSANDRA-7544) Allow storage port to be configurable per node

2018-02-01 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348816#comment-16348816
 ] 

Jonathan Ellis commented on CASSANDRA-7544:
---

{quote}I'm not sure how we intended that to work. We don't have trunk releases 
so what is the expectation there from the perspective of clients?
{quote}
ISTM that a discussion on the dev list is warranted.  That's a pretty big "side 
effect" of this patch.

> Allow storage port to be configurable per node
> --
>
> Key: CASSANDRA-7544
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7544
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sam Overton
>Assignee: Ariel Weisberg
>Priority: Major
> Fix For: 4.0
>
>
> Currently storage_port must be configured identically on all nodes in a 
> cluster and it is assumed that this is the case when connecting to a remote 
> node.
> This prevents running in any environment that requires multiple nodes to be 
> able to bind to the same network interface, such as with many automatic 
> provisioning/deployment frameworks.
> The current solutions seem to be
> * use a separate network interface for each node deployed to the same box. 
> This puts a big requirement on IP allocation at large scale.
> * allow multiple clusters to be provisioned from the same resource pool, but 
> restrict allocation to a maximum of one node per host from each cluster, 
> assuming each cluster is running on a different storage port.
> It would make operations much simpler in these kind of environments if the 
> environment provisioning the resources could assign the ports to be used when 
> bringing up a new node on shared hardware.
> The changes required would be at least the following:
> 1. configure seeds as IP:port instead of just IP
> 2. gossip the storage port as part of a node's ApplicationState
> 3. refer internally to nodes by hostID instead of IP, since there will be 
> multiple nodes with the same IP
> (1) & (2) are mostly trivial and I already have a patch for these. The bulk 
> of the work to enable this is (3), and I would structure this as a separate 
> pre-requisite patch. 






[jira] [Commented] (CASSANDRA-7544) Allow storage port to be configurable per node

2018-01-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345873#comment-16345873
 ] 

Jonathan Ellis commented on CASSANDRA-7544:
---

I see that the protocol version is incremented but there are no edits to the 
native_protocol spec.  Oversight?

Also, is this the right place to change the default protocol?  Shouldn't that 
be a separate discussion?

> Allow storage port to be configurable per node
> --
>
> Key: CASSANDRA-7544
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7544
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sam Overton
>Assignee: Ariel Weisberg
>Priority: Major
> Fix For: 4.0
>
>
> Currently storage_port must be configured identically on all nodes in a 
> cluster and it is assumed that this is the case when connecting to a remote 
> node.
> This prevents running in any environment that requires multiple nodes to be 
> able to bind to the same network interface, such as with many automatic 
> provisioning/deployment frameworks.
> The current solutions seem to be
> * use a separate network interface for each node deployed to the same box. 
> This puts a big requirement on IP allocation at large scale.
> * allow multiple clusters to be provisioned from the same resource pool, but 
> restrict allocation to a maximum of one node per host from each cluster, 
> assuming each cluster is running on a different storage port.
> It would make operations much simpler in these kind of environments if the 
> environment provisioning the resources could assign the ports to be used when 
> bringing up a new node on shared hardware.
> The changes required would be at least the following:
> 1. configure seeds as IP:port instead of just IP
> 2. gossip the storage port as part of a node's ApplicationState
> 3. refer internally to nodes by hostID instead of IP, since there will be 
> multiple nodes with the same IP
> (1) & (2) are mostly trivial and I already have a patch for these. The bulk 
> of the work to enable this is (3), and I would structure this as a separate 
> pre-requisite patch. 






[jira] [Comment Edited] (CASSANDRA-12126) CAS Reads Inconsistencies

2017-04-19 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974834#comment-15974834
 ] 

Jonathan Ellis edited comment on CASSANDRA-12126 at 4/19/17 2:56 PM:
-

I see.  So you are saying that

1: Write
2: Read -> Nothing
3: Read -> Something

Is broken because to go from Nothing to Something [in a linearized system] 
there needs to be a write in between.


was (Author: jbellis):
I see.  So you are saying that

1: Write
2: Read -> Nothing
3: Read -> Something

Is broken because to go from Nothing to Something there needs to be a write in 
between.

> CAS Reads Inconsistencies 
> --
>
> Key: CASSANDRA-12126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: sankalp kohli
>Assignee: Stefan Podkowinski
>
> While looking at the CAS code in Cassandra, I found a potential issue with 
> CAS Reads. Here is how it can happen with RF=3
> 1) You issue a CAS Write and it fails in the propose phase. Machine A replies 
> true to a propose and saves the commit in the accepted field. The other two 
> machines, B and C, do not get to the accept phase. 
> The current state is that machine A has this commit in the paxos table as 
> accepted but not committed, and B and C do not. 
> 2) Issue a CAS Read and it goes to only B and C. You won't be able to read the 
> value written in step 1. This step is as if nothing is in flight. 
> 3) Issue another CAS Read and it goes to A and B. Now we will discover that 
> there is something inflight from A and will propose and commit it with the 
> current ballot. Now we can read the value written in step 1 as part of this 
> CAS read.
> If we skip step 3 and instead run step 4, we will never learn about value 
> written in step 1. 
> 4. Issue a CAS Write and it involves only B and C. This will succeed and 
> commit a different value than step 1. Step 1 value will never be seen again 
> and was never seen before. 
> If you read the Lamport “paxos made simple” paper and read section 2.3. It 
> talks about this issue which is how learners can find out if majority of the 
> acceptors have accepted the proposal. 
> In step 3, it is correct that we propose the value again since we don't know 
> if it was accepted by a majority of acceptors. When we ask a majority of 
> acceptors, and more than one acceptor (but not a majority) has something in 
> flight, we have no way of knowing if it was accepted by a majority of acceptors. 
> So this behavior is correct. 
> However we need to fix step 2, since it caused reads to not be linearizable 
> with respect to writes and other reads. In this case, we know that majority 
> of acceptors have no inflight commit which means we have majority that 
> nothing was accepted by majority. I think we should run a propose step here 
> with empty commit and that will cause write written in step 1 to not be 
> visible ever after. 
> With this fix, we will either see data written in step 1 on next serial read 
> or will never see it which is what we want. 





[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies

2017-04-19 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974834#comment-15974834
 ] 

Jonathan Ellis commented on CASSANDRA-12126:


I see.  So you are saying that

1: Write
2: Read -> Nothing
3: Read -> Something

Is broken because to go from Nothing to Something there needs to be a write in 
between.

> CAS Reads Inconsistencies 
> --
>
> Key: CASSANDRA-12126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: sankalp kohli
>Assignee: Stefan Podkowinski
>
> While looking at the CAS code in Cassandra, I found a potential issue with 
> CAS Reads. Here is how it can happen with RF=3
> 1) You issue a CAS Write and it fails in the propose phase. Machine A replies 
> true to a propose and saves the commit in the accepted field. The other two 
> machines, B and C, do not get to the accept phase. 
> The current state is that machine A has this commit in the paxos table as 
> accepted but not committed, and B and C do not. 
> 2) Issue a CAS Read and it goes to only B and C. You won't be able to read the 
> value written in step 1. This step is as if nothing is in flight. 
> 3) Issue another CAS Read and it goes to A and B. Now we will discover that 
> there is something inflight from A and will propose and commit it with the 
> current ballot. Now we can read the value written in step 1 as part of this 
> CAS read.
> If we skip step 3 and instead run step 4, we will never learn about value 
> written in step 1. 
> 4. Issue a CAS Write and it involves only B and C. This will succeed and 
> commit a different value than step 1. Step 1 value will never be seen again 
> and was never seen before. 
> If you read the Lamport “paxos made simple” paper and read section 2.3. It 
> talks about this issue which is how learners can find out if majority of the 
> acceptors have accepted the proposal. 
> In step 3, it is correct that we propose the value again since we don't know 
> if it was accepted by a majority of acceptors. When we ask a majority of 
> acceptors, and more than one acceptor (but not a majority) has something in 
> flight, we have no way of knowing if it was accepted by a majority of acceptors. 
> So this behavior is correct. 
> However we need to fix step 2, since it caused reads to not be linearizable 
> with respect to writes and other reads. In this case, we know that majority 
> of acceptors have no inflight commit which means we have majority that 
> nothing was accepted by majority. I think we should run a propose step here 
> with empty commit and that will cause write written in step 1 to not be 
> visible ever after. 
> With this fix, we will either see data written in step 1 on next serial read 
> or will never see it which is what we want. 





[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies

2017-04-18 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973117#comment-15973117
 ] 

Jonathan Ellis commented on CASSANDRA-12126:


I'm confused, because it sounds like you're saying "all operations should be 
visible once finished."  Of course that's not actually what you mean; that would 
require participation from all replicas to finish in-flight requests, not just a 
majority.  What is the distinction you are proposing?

> CAS Reads Inconsistencies 
> --
>
> Key: CASSANDRA-12126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: sankalp kohli
>Assignee: Stefan Podkowinski
>
> While looking at the CAS code in Cassandra, I found a potential issue with 
> CAS Reads. Here is how it can happen with RF=3
> 1) You issue a CAS Write and it fails in the propose phase. Machine A replies 
> true to a propose and saves the commit in the accepted field. The other two 
> machines, B and C, do not get to the accept phase. 
> The current state is that machine A has this commit in the paxos table as 
> accepted but not committed, and B and C do not. 
> 2) Issue a CAS Read and it goes to only B and C. You won't be able to read the 
> value written in step 1. This step is as if nothing is in flight. 
> 3) Issue another CAS Read and it goes to A and B. Now we will discover that 
> there is something inflight from A and will propose and commit it with the 
> current ballot. Now we can read the value written in step 1 as part of this 
> CAS read.
> If we skip step 3 and instead run step 4, we will never learn about value 
> written in step 1. 
> 4. Issue a CAS Write and it involves only B and C. This will succeed and 
> commit a different value than step 1. Step 1 value will never be seen again 
> and was never seen before. 
> If you read the Lamport “paxos made simple” paper and read section 2.3. It 
> talks about this issue which is how learners can find out if majority of the 
> acceptors have accepted the proposal. 
> In step 3, it is correct that we propose the value again since we don't know 
> if it was accepted by a majority of acceptors. When we ask a majority of 
> acceptors, and more than one acceptor (but not a majority) has something in 
> flight, we have no way of knowing if it was accepted by a majority of acceptors. 
> So this behavior is correct. 
> However we need to fix step 2, since it caused reads to not be linearizable 
> with respect to writes and other reads. In this case, we know that majority 
> of acceptors have no inflight commit which means we have majority that 
> nothing was accepted by majority. I think we should run a propose step here 
> with empty commit and that will cause write written in step 1 to not be 
> visible ever after. 
> With this fix, we will either see data written in step 1 on next serial read 
> or will never see it which is what we want. 





[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies

2017-04-18 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972886#comment-15972886
 ] 

Jonathan Ellis commented on CASSANDRA-12126:


Bailis:

# once a write completes, all later reads (where “later” is defined by 
wall-clock start time) should return the value of that write or the value of a 
later write. 
# Once a read returns a particular value, all later reads should return that 
value or the value of a later write.

I think we all agree that our current behavior satisfies (2).  I am saying that 
we actually also satisfy (1) because the write is not complete until Sankalp's 
step 3.


> CAS Reads Inconsistencies 
> --
>
> Key: CASSANDRA-12126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: sankalp kohli
>Assignee: Stefan Podkowinski
>
> While looking at the CAS code in Cassandra, I found a potential issue with 
> CAS Reads. Here is how it can happen with RF=3
> 1) You issue a CAS Write and it fails in the propose phase. Machine A replies 
> true to a propose and saves the commit in the accepted field. The other two 
> machines, B and C, do not get to the accept phase. 
> The current state is that machine A has this commit in the paxos table as 
> accepted but not committed, and B and C do not. 
> 2) Issue a CAS Read and it goes to only B and C. You won't be able to read the 
> value written in step 1. This step is as if nothing is in flight. 
> 3) Issue another CAS Read and it goes to A and B. Now we will discover that 
> there is something inflight from A and will propose and commit it with the 
> current ballot. Now we can read the value written in step 1 as part of this 
> CAS read.
> If we skip step 3 and instead run step 4, we will never learn about value 
> written in step 1. 
> 4. Issue a CAS Write and it involves only B and C. This will succeed and 
> commit a different value than step 1. Step 1 value will never be seen again 
> and was never seen before. 
> If you read the Lamport “paxos made simple” paper and read section 2.3. It 
> talks about this issue which is how learners can find out if majority of the 
> acceptors have accepted the proposal. 
> In step 3, it is correct that we propose the value again since we don't know 
> if it was accepted by a majority of acceptors. When we ask a majority of 
> acceptors, and more than one acceptor (but not a majority) has something in 
> flight, we have no way of knowing if it was accepted by a majority of acceptors. 
> So this behavior is correct. 
> However we need to fix step 2, since it caused reads to not be linearizable 
> with respect to writes and other reads. In this case, we know that majority 
> of acceptors have no inflight commit which means we have majority that 
> nothing was accepted by majority. I think we should run a propose step here 
> with empty commit and that will cause write written in step 1 to not be 
> visible ever after. 
> With this fix, we will either see data written in step 1 on next serial read 
> or will never see it which is what we want. 





[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies

2017-04-18 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972883#comment-15972883
 ] 

Jonathan Ellis commented on CASSANDRA-12126:


But we stipulated that the write in step 1 timed out and did not complete.  (If 
it had completed, it would of course be guaranteed to be visible to any 
majority.)

> CAS Reads Inconsistencies 
> --
>
> Key: CASSANDRA-12126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: sankalp kohli
>Assignee: Stefan Podkowinski
>
> While looking at the CAS code in Cassandra, I found a potential issue with 
> CAS Reads. Here is how it can happen with RF=3
> 1) You issue a CAS Write and it fails in the propose phase. Machine A replies 
> true to a propose and saves the commit in the accepted field. The other two 
> machines, B and C, do not get to the accept phase. 
> The current state is that machine A has this commit in the paxos table as 
> accepted but not committed, and B and C do not. 
> 2) Issue a CAS Read and it goes to only B and C. You won't be able to read the 
> value written in step 1. This step is as if nothing is in flight. 
> 3) Issue another CAS Read and it goes to A and B. Now we will discover that 
> there is something inflight from A and will propose and commit it with the 
> current ballot. Now we can read the value written in step 1 as part of this 
> CAS read.
> If we skip step 3 and instead run step 4, we will never learn about value 
> written in step 1. 
> 4. Issue a CAS Write and it involves only B and C. This will succeed and 
> commit a different value than step 1. Step 1 value will never be seen again 
> and was never seen before. 
> If you read the Lamport “paxos made simple” paper and read section 2.3. It 
> talks about this issue which is how learners can find out if majority of the 
> acceptors have accepted the proposal. 
> In step 3, it is correct that we propose the value again since we don't know 
> if it was accepted by a majority of acceptors. When we ask a majority of 
> acceptors, and more than one acceptor (but not a majority) has something in 
> flight, we have no way of knowing if it was accepted by a majority of acceptors. 
> So this behavior is correct. 
> However we need to fix step 2, since it caused reads to not be linearizable 
> with respect to writes and other reads. In this case, we know that majority 
> of acceptors have no inflight commit which means we have majority that 
> nothing was accepted by majority. I think we should run a propose step here 
> with empty commit and that will cause write written in step 1 to not be 
> visible ever after. 
> With this fix, we will either see data written in step 1 on next serial read 
> or will never see it which is what we want. 





[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies

2017-04-18 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972815#comment-15972815
 ] 

Jonathan Ellis commented on CASSANDRA-12126:


I see you outlining two series of steps:

1 -> 2 -> 3.  The value V from 1 is not seen in 2, but once it is seen in 3 it 
is always seen.

1 -> 2 -> 4.  V is never seen.

It seems to me that both of these maintain linearizability.  What am I missing?

> CAS Reads Inconsistencies 
> --
>
> Key: CASSANDRA-12126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: sankalp kohli
>Assignee: Stefan Podkowinski
>
> While looking at the CAS code in Cassandra, I found a potential issue with 
> CAS Reads. Here is how it can happen with RF=3:
> 1) You issue a CAS Write and it fails in the propose phase. Machine A replies 
> true to a propose and saves the commit in its accepted field. The other two 
> machines, B and C, do not get to the accept phase. 
> The current state is that machine A has this commit in its paxos table as 
> accepted but not committed, and B and C do not. 
> 2) Issue a CAS Read and it goes to only B and C. You won't be able to read the 
> value written in step 1. To this read it is as if nothing is in flight. 
> 3) Issue another CAS Read and it goes to A and B. Now we will discover that 
> there is something in flight from A and will propose and commit it with the 
> current ballot. Now we can read the value written in step 1 as part of this 
> CAS read.
> If we skip step 3 and instead run step 4, we will never learn about the value 
> written in step 1. 
> 4) Issue a CAS Write and it involves only B and C. This will succeed and 
> commit a different value than step 1. The step 1 value will never be seen 
> again and was never seen before. 
> If you read section 2.3 of Lamport's "Paxos Made Simple" paper, it discusses 
> this issue: how learners can find out whether a majority of the acceptors 
> have accepted a proposal. 
> In step 3, it is correct that we propose the value again, since we don't know 
> whether it was accepted by a majority of acceptors. When we ask a majority of 
> acceptors and one or more of them, but not a majority, have something in 
> flight, we have no way of knowing whether it was accepted by a majority. So 
> this behavior is correct. 
> However, we need to fix step 2, since it causes reads to not be linearizable 
> with respect to writes and other reads. In this case, we know that a majority 
> of acceptors have no in-flight commit, which means we have a majority 
> confirming that nothing was accepted by a majority. I think we should run a 
> propose step here with an empty commit, which will ensure the write from 
> step 1 is never visible afterward. 
> With this fix, we will either see the data written in step 1 on the next 
> serial read, or we will never see it, which is what we want. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13442) Support a means of strongly consistent highly available replication with storage requirements approximating RF=2

2017-04-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972042#comment-15972042
 ] 

Jonathan Ellis commented on CASSANDRA-13442:


Man, this makes me really nervous.  Reducing replication automagically is 
ripping the safety guard off a chainsaw and handing it to a ten-year-old.  
Remember that those replicas aren't just for consistency but in case your 
hardware fails: if you have three copies and you lose one, no big deal, you 
still have two to restore from.  Just two copies?  If anything goes wrong with 
that other copy while you repair, you are SOL.

Optimizing away redundant queries a la 7168?  Sign me up.  But I think removing 
that "redundant" data and making RF not actually mean RF is going too far.

> The topology of the cluster would also have a new dimension that the drivers 
> would need to consider. Since for CL.ONE queries you would need to only use 
> one of the replicas with all the data on it.

Yes, I think there are going to be multiple places where this gets more 
complicated than it looks at first.

> Support a means of strongly consistent highly available replication with 
> storage requirements approximating RF=2
> 
>
> Key: CASSANDRA-13442
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13442
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction, Coordination, Distributed Metadata, Local 
> Write-Read Paths
>Reporter: Ariel Weisberg
>
> Replication factors like RF=2 can't provide strong consistency and 
> availability because if a single node is lost it's impossible to reach a 
> quorum of replicas. Stepping up to RF=3 will allow you to lose a node and 
> still achieve quorum for reads and writes, but requires committing additional 
> storage.
> The requirement of a quorum for writes/reads doesn't seem to be something 
> that can be relaxed without additional constraints on queries, but it seems 
> like it should be possible to relax the requirement that 3 full copies of the 
> entire data set are kept. What is actually required is a covering data set 
> for the range and we should be able to achieve a covering data set and high 
> availability without having three full copies. 
> After a repair we know that some subset of the data set is fully replicated. 
> At that point we don't have to read from a quorum of nodes for the repaired 
> data. It is sufficient to read from a single node for the repaired data and a 
> quorum of nodes for the unrepaired data.
> One way to exploit this would be to have N replicas (say, the last N replicas 
> in the preference list, where N varies with RF) delete all repaired data 
> after a repair completes. Subsequent quorum reads will be able to retrieve 
> the repaired data from either of the two full replicas, and the unrepaired 
> data from a quorum read of any replica, including the "transient" replicas.
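A rough illustration of the read-path split described above, with illustrative names rather than Cassandra APIs: repaired data only needs one full replica once repair has completed, while unrepaired data still needs a quorum that may include the transient replicas.

{code:java}
// Illustrative sketch only, assuming RF=3 with two "full" replicas and one transient one.
import java.util.List;
import java.util.stream.Collectors;

class TransientReadPlanSketch
{
    static class Replica
    {
        final String endpoint;
        final boolean full;   // a full replica keeps both repaired and unrepaired data
        Replica(String endpoint, boolean full) { this.endpoint = endpoint; this.full = full; }
    }

    // Repaired data: a single full replica is a covering source.
    static List<Replica> repairedTargets(List<Replica> preferenceList)
    {
        return preferenceList.stream().filter(r -> r.full).limit(1).collect(Collectors.toList());
    }

    // Unrepaired data: still read from a quorum (2 of 3 here), transient replicas allowed.
    static List<Replica> unrepairedTargets(List<Replica> preferenceList)
    {
        return preferenceList.stream().limit(2).collect(Collectors.toList());
    }
}
{code}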



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users

2017-03-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903379#comment-15903379
 ] 

Jonathan Ellis commented on CASSANDRA-13315:


I like this idea a lot.  We have a lot more experience now with how people use 
and misuse CL in the wild so I am comfortable getting a lot more opinionated in 
how we push people towards certain options and away from others.

1/2: The dual CL for Serial isn't for what to do w/ no condition, it's for the 
"commit" to EC land from the Paxos sandbox.  So mandating a condition (don't we 
already?) doesn't make that go away.  But, I think we could make that default 
to Q and call it good.  (I'm having trouble thinking of a situation where you 
would need LWT, which requires a quorum to participate already, but also need 
lower CL on commit.)

3: I would bikeshed this to

# EVENTUAL
# STRONG
# SERIAL

4. It sounds like we can do all of this at the drivers level except for adding 
some aliases to CQLSH.  I don't see any benefit to adding synonyms at the 
protocol level. 

5. How do we give power users the ability to use classic CL if they need it?
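As a sketch of what item 4 could look like at the driver level (illustrative only, not an actual driver API), the simplified names would just alias the existing consistency levels:

{code:java}
// Hypothetical driver-side aliases mapping onto existing ConsistencyLevel values.
enum SimpleConsistency
{
    EVENTUAL,   // would map to LOCAL_ONE reads and writes
    STRONG,     // would map to LOCAL_QUORUM reads and writes
    SERIAL;     // would map to LOCAL_SERIAL reads and writes

    String toClassicLevel()
    {
        switch (this)
        {
            case EVENTUAL: return "LOCAL_ONE";
            case STRONG:   return "LOCAL_QUORUM";
            default:       return "LOCAL_SERIAL";
        }
    }
}
{code}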

> Consistency is confusing for new users
> --
>
> Key: CASSANDRA-13315
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13315
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Ryan Svihla
>
> New users really struggle with consistency level and fall into a large number 
> of tarpits trying to decide on the right one.
> 1. There are a LOT of consistency levels and it's up to the end user to 
> reason about which combinations are valid and whether the result is really 
> what they intend. Is there any reason why writing at ALL and reading at CL 
> TWO is better than reading at CL ONE? 
> 2. They require a good understanding of failure modes to do well. It's not 
> uncommon for people to use CL one and wonder why their data is missing.
> 3. The serial consistency level "bucket" is confusing to even write about and 
> easy to get wrong even for experienced users.
> So I propose the following steps:
> 1. Remove the "serial consistency" level of consistency levels and just have 
> all consistency levels in one bucket at the protocol level.
> 2. To enable #1 just reject writes or updates done without a condition when 
> SERIAL/LOCAL_SERIAL is specified.
> 3. Add 3 new consistency levels that point to existing ones but convey 
> intent much more clearly:
>* EVENTUALLY_CONSISTENT = LOCAL_ONE reads and writes
>* HIGHLY_CONSISTENT = LOCAL_QUORUM reads and writes
>* TRANSACTIONALLY_CONSISTENT = LOCAL_SERIAL reads and writes
> For global versions of these I propose keeping the old ones around; they're 
> rarely used in the field except by accident or by particularly opinionated 
> and advanced users.
> Drivers should put the new consistency levels in a new package and docs 
> should be updated to suggest their use. Likewise, setting a default CL should 
> only offer those three settings, applied to reads and writes at the same 
> time.
> I'm going to suggest CQLSH should default to HIGHLY_CONSISTENT. New sysadmins 
> get surprised by the current default frequently, and I can think of a couple 
> of very major escalations because people were confused about what the default 
> behavior was.
> The benefit of all this change is that we greatly shrink the surface area one 
> has to understand when learning Cassandra, and we have far fewer bad initial 
> experiences and surprises. New users will be able to wrap their brains around 
> those 3 ideas more readily than they can "what happens when I have RF=2, 
> QUORUM writes and ONE reads". Advanced users still get access to everything, 
> while new users don't have to learn all the ins and outs of distributed 
> theory just to write data and be able to read it back.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (CASSANDRA-8844) Change Data Capture (CDC)

2017-02-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-8844:
-

Assignee: Joshua McKenzie  (was: Yasuharu Goto)

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination, Local Write-Read Paths
>Reporter: Tupshin Harper
>Assignee: Joshua McKenzie
>Priority: Critical
> Fix For: 3.8
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]) propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden of users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similar to a commitlog, 
> with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred to a subsequent release to gauge actual 
> interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters, 
> would make Cassandra a much more versatile feeder into other systems and, 
> again, reduce complexity that would otherwise need to be built into the 
> daemons.
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it. 
> - Cleaning up consumed logfiles would be the client daemon's responsibility.
> - Logfile size should probably be configurable.
> - Logfiles should be named with a predictable naming schema, making it 
> trivial to process them in order.
> - Daemons should be able to checkpoint their work, and resume from where they 
> left off. This means they would have to leave some file artifact in the CDC 
> log's directory.
> - A sophisticated daemon should be able to be written that could 
> -- Catch up, in written-order, even when it is multiple logfiles behind in 
> processing
> -- Be able to continuously "tail" the most recent logfile and get 
> low-latency(ms?) access to the data as it is written.
> h2. Alternate approach
> In order to make consuming a change log easy and efficient to do with low 
> latency, the following could supplement the approach outlined above
> - Instead of writing to a logfile, by default, Cassandra could expose a 
> socket for a daemon to connect 
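A rough sketch of the consumer-daemon loop envisioned in the Format and Consumption section, under the stated assumptions about predictable file naming and a checkpoint artifact left in the CDC directory; file names, the ".log" suffix, and the checkpoint format below are purely illustrative.

{code:java}
// Illustrative CDC consumer: process logfiles in name order (assumed to match
// written order), checkpoint progress in a marker file, resume after restart.
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CdcConsumerSketch
{
    public static void main(String[] args) throws IOException
    {
        Path cdcDir = Paths.get(args[0]);
        Path checkpoint = cdcDir.resolve("consumer.checkpoint");
        // resume from the last logfile we finished with, if any
        String lastProcessed = Files.exists(checkpoint) ? Files.readString(checkpoint).trim() : "";

        List<Path> logs;
        try (Stream<Path> files = Files.list(cdcDir))
        {
            logs = files.filter(p -> p.getFileName().toString().endsWith(".log"))
                        .sorted()                       // predictable naming => written order
                        .collect(Collectors.toList());
        }

        for (Path log : logs)
        {
            String name = log.getFileName().toString();
            if (name.compareTo(lastProcessed) <= 0)
                continue;                               // already consumed before a restart
            // hand this file's mutations to the downstream system here, then checkpoint it
            Files.writeString(checkpoint, name, StandardCharsets.UTF_8);
        }
    }
}
{code}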

[jira] [Commented] (CASSANDRA-12921) Parallel opening of sstables slows down startup

2016-11-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15674682#comment-15674682
 ] 

Jonathan Ellis commented on CASSANDRA-12921:


IMO we should be optimizing for SSD now, not HDD.

> Parallel opening of sstables slows down startup
> ---
>
> Key: CASSANDRA-12921
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12921
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Branimir Lambov
>Priority: Minor
>
> [{{SSTableReader.openAll}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java#L545],
>  used by initial startup, spawns multiple threads to open sstables.
> This is a bad idea, as the longest step in the opening process is loading 
> the bloom filters, and when this is done in parallel the drives have to do a 
> lot of unnecessary seeking.
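Purely as an illustration of the tradeoff being discussed (not the actual {{SSTableReader.openAll}} code), opening could be driven by a configurable degree of parallelism, with 1 approximating sequential behavior for spinning disks and larger values exploiting SSDs:

{code:java}
// Illustrative sketch: run the per-sstable open tasks on a pool whose size is
// chosen for the storage type, rather than always spawning multiple threads.
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class OpenAllSketch
{
    static void openAll(List<Runnable> openTasks, int threads) throws InterruptedException
    {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (Runnable task : openTasks)
            pool.submit(task);   // each task loads one sstable's bloom filter, summaries, etc.
        pool.shutdown();
        pool.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS);
    }
}
{code}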



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10993) Make read and write requests paths fully non-blocking, eliminate related stages

2016-08-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425038#comment-15425038
 ] 

Jonathan Ellis commented on CASSANDRA-10993:


Isn't this a good example of a case where we can start with an off-the-shelf 
API (Rx or Reactor) and optimize later?  Given that the approach won't change 
qualitatively, we can always refactor method names to a different library 
easily enough.

If we have strong evidence that off the shelf is crap (and I think Tyler's 
tests show it's decent enough to be competitive with state machine, at least), 
AND that we know enough to build it "right," we could start doing that in 
parallel with the refactoring work to off-the-shelf.  But if we don't know 
enough yet, then imo that's again an indication that we should start with 
off-the-shelf.

Am I missing something?

> Make read and write requests paths fully non-blocking, eliminate related 
> stages
> ---
>
> Key: CASSANDRA-10993
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10993
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Coordination, Local Write-Read Paths
>Reporter: Aleksey Yeschenko
>Assignee: Tyler Hobbs
> Fix For: 3.x
>
> Attachments: 10993-reads-no-evloop-integration-six-node-stress.svg, 
> tpc-benchmarks-2.txt, tpc-benchmarks.txt
>
>
> Building on work done by [~tjake] (CASSANDRA-10528), [~slebresne] 
> (CASSANDRA-5239), and others, convert read and write request paths to be 
> fully non-blocking, to enable the eventual transition from SEDA to TPC 
> (CASSANDRA-10989)
> Eliminate {{MUTATION}}, {{COUNTER_MUTATION}}, {{VIEW_MUTATION}}, {{READ}}, 
> and {{READ_REPAIR}} stages, move read and write execution directly to Netty 
> context.
> For lack of decent async I/O options on Linux, we’ll still have to retain an 
> extra thread pool for serving read requests for data not residing in our page 
> cache (CASSANDRA-5863), however.
> Implementation-wise, we only have two options available to us: explicit FSMs 
> and chained futures. Fibers would be the third, and easiest option, but 
> aren’t feasible in Java without resorting to direct bytecode manipulation 
> (ourselves or using [quasar|https://github.com/puniverse/quasar]).
> I have seen 4 implementations based on chained futures/promises now - three 
> in Java and one in C++ - and I’m not convinced that it’s the optimal (or 
> sane) choice for representing our complex logic - think 2i quorum read 
> requests with timeouts at all levels, read repair (blocking and 
> non-blocking), and speculative retries in the mix, {{SERIAL}} reads and 
> writes.
> I’m currently leaning towards an implementation based on explicit FSMs, and 
> intend to provide a prototype - soonish - for comparison with 
> {{CompletableFuture}}-like variants.
> Either way, the transition is a relatively boring, straightforward refactoring.
> There are, however, some extension points on both write and read paths that 
> we do not control:
> - authorisation implementations will have to be non-blocking. We have control 
> over built-in ones, but for any custom implementation we will have to execute 
> them in a separate thread pool
> - 2i hooks on the write path will need to be non-blocking
> - any trigger implementations will not be allowed to block
> - UDFs and UDAs
> We are further limited by API compatibility restrictions in the 3.x line, 
> forbidding us to alter, or add any non-{{default}} interface methods to those 
> extension points, so these pose a problem.
> Depending on logistics, expecting to get this done in time for 3.4 or 3.6 
> feature release.
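For comparison purposes only, here is a toy example of the chained-futures style mentioned above, using plain JDK {{CompletableFuture}} rather than Cassandra's actual read path; the replica names, timeout, and speculative-retry shape are made up for illustration.

{code:java}
// Illustrative chained-futures read: query two replicas asynchronously, take
// whichever answers first, and bound the whole thing with a timeout. Further
// stages (read repair, result merging) would chain on the returned future.
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

class ChainedReadSketch
{
    static CompletableFuture<String> readFromReplica(String replica)
    {
        return CompletableFuture.supplyAsync(() -> "row from " + replica);
    }

    static CompletableFuture<String> read(String primary, String speculative)
    {
        CompletableFuture<String> primaryRead = readFromReplica(primary);
        CompletableFuture<String> speculativeRead = readFromReplica(speculative);
        return primaryRead.applyToEither(speculativeRead, row -> row)   // first replica to answer wins
                          .orTimeout(100, TimeUnit.MILLISECONDS)        // overall request timeout
                          .thenApply(row -> row);                       // read repair / merging would chain here
    }
}
{code}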



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10993) Make read and write requests paths fully non-blocking, eliminate related stages

2016-08-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404795#comment-15404795
 ] 

Jonathan Ellis commented on CASSANDRA-10993:


My gut is that we stand to realize more of a performance win from e.g. not 
needing to use threadsafe memtables than we can wring from hand-coding state 
machines vs reactive streams.  (Especially if the commenter on 10528 is right 
that Reactor outperforms RxJava already.)  So I'd be inclined to move forward 
with the approach that lets us ship v1 and start working on those next-gen 
optimizations faster.

Separately, do you have any intuition for why state machines do better at 
mean latency but worse at the tail?

> Make read and write requests paths fully non-blocking, eliminate related 
> stages
> ---
>
> Key: CASSANDRA-10993
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10993
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Coordination, Local Write-Read Paths
>Reporter: Aleksey Yeschenko
>Assignee: Tyler Hobbs
> Fix For: 3.x
>
> Attachments: 10993-reads-no-evloop-integration-six-node-stress.svg, 
> tpc-benchmarks-2.txt, tpc-benchmarks.txt
>
>
> Building on work done by [~tjake] (CASSANDRA-10528), [~slebresne] 
> (CASSANDRA-5239), and others, convert read and write request paths to be 
> fully non-blocking, to enable the eventual transition from SEDA to TPC 
> (CASSANDRA-10989)
> Eliminate {{MUTATION}}, {{COUNTER_MUTATION}}, {{VIEW_MUTATION}}, {{READ}}, 
> and {{READ_REPAIR}} stages, move read and write execution directly to Netty 
> context.
> For lack of decent async I/O options on Linux, we’ll still have to retain an 
> extra thread pool for serving read requests for data not residing in our page 
> cache (CASSANDRA-5863), however.
> Implementation-wise, we only have two options available to us: explicit FSMs 
> and chained futures. Fibers would be the third, and easiest option, but 
> aren’t feasible in Java without resorting to direct bytecode manipulation 
> (ourselves or using [quasar|https://github.com/puniverse/quasar]).
> I have seen 4 implementations based on chained futures/promises now - three 
> in Java and one in C++ - and I’m not convinced that it’s the optimal (or 
> sane) choice for representing our complex logic - think 2i quorum read 
> requests with timeouts at all levels, read repair (blocking and 
> non-blocking), and speculative retries in the mix, {{SERIAL}} reads and 
> writes.
> I’m currently leaning towards an implementation based on explicit FSMs, and 
> intend to provide a prototype - soonish - for comparison with 
> {{CompletableFuture}}-like variants.
> Either way, the transition is a relatively boring, straightforward refactoring.
> There are, however, some extension points on both write and read paths that 
> we do not control:
> - authorisation implementations will have to be non-blocking. We have control 
> over built-in ones, but for any custom implementation we will have to execute 
> them in a separate thread pool
> - 2i hooks on the write path will need to be non-blocking
> - any trigger implementations will not be allowed to block
> - UDFs and UDAs
> We are further limited by API compatibility restrictions in the 3.x line, 
> forbidding us to alter, or add any non-{{default}} interface methods to those 
> extension points, so these pose a problem.
> Depending on logistics, expecting to get this done in time for 3.4 or 3.6 
> feature release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12151) Audit logging for database activity

2016-07-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399584#comment-15399584
 ] 

Jonathan Ellis commented on CASSANDRA-12151:


Remember that people almost always use Cassandra to drive applications at 
scale, not to do interactive analytics.  I can't see that logging 100,000 ops 
per second of the same ten queries is going to add much value.  I don't want to 
load that gun for people to blow their feet off with...

Generally auditing is most useful to see "who *changed* what" not "who *asked 
for* what."  (Again, the "who" for most of the latter is going to be "the 
application server.")  And again, it's not super useful to know that the app 
server inserted 10,000 new user accounts today, but it IS useful to know when 
Jonathan added a new column to the users table.  

(I would also include user logins as an interesting event.  This will be 
dominated by app servers still but much much less noise than logging every 
query or update.)

Besides changes over CQL, this could also include JMX changes, although there 
are so many entry points to JMX mbeans that this would be ugly to do by hand.  
Perhaps we could inject this with byteman?
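A small sketch of the kind of filtering argued for here, with event categories that are assumptions for illustration rather than an actual Cassandra API: audit logins and schema/role changes, and skip the flood of regular application reads and writes.

{code:java}
// Illustrative audit filter: only "who changed what" plus logins get recorded.
import java.util.EnumSet;
import java.util.Set;

class AuditFilterSketch
{
    enum Category { LOGIN, DDL, DCL, DML_READ, DML_WRITE }

    private static final Set<Category> AUDITED = EnumSet.of(Category.LOGIN, Category.DDL, Category.DCL);

    static boolean shouldAudit(Category category)
    {
        return AUDITED.contains(category);   // e.g. ALTER TABLE yes, SELECT/INSERT no
    }
}
{code}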

> Audit logging for database activity
> ---
>
> Key: CASSANDRA-12151
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12151
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: stefan setyadi
> Fix For: 3.x
>
> Attachments: 12151.txt
>
>
> We would like a way to enable Cassandra to log database activity being done 
> on our server.
> It should show username, remote address, timestamp, action type, keyspace, 
> column family, and the query statement.
> It should also be able to log connection attempts and changes to 
> users/roles.
> I was thinking of making a new keyspace and inserting an entry for every 
> activity that occurs.
> Then it would be possible to query for specific activity, or for queries 
> targeting a specific keyspace and column family.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12151) Audit logging for database activity

2016-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15397542#comment-15397542
 ] 

Jonathan Ellis commented on CASSANDRA-12151:


Can we back up a little and talk about design?

Some questions in my mind:

# Do we want a global audit log, or server-local?  If the former (easier for 
users to query), it should go in system_distributed keyspace; otherwise just in 
system (higher performance).
# Is there a use case where you'd want to log every query?  That seems like it 
would entail a prohibitive performance penalty.  I would think most users would 
be better served by logging meta-changes (adding roles, altering tables, etc)

> Audit logging for database activity
> ---
>
> Key: CASSANDRA-12151
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12151
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: stefan setyadi
> Fix For: 3.x
>
> Attachments: 12151.txt
>
>
> We would like a way to enable Cassandra to log database activity being done 
> on our server.
> It should show username, remote address, timestamp, action type, keyspace, 
> column family, and the query statement.
> It should also be able to log connection attempts and changes to 
> users/roles.
> I was thinking of making a new keyspace and inserting an entry for every 
> activity that occurs.
> Then it would be possible to query for specific activity, or for queries 
> targeting a specific keyspace and column family.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12151) Audit logging for database activity

2016-07-28 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12151:
---
Status: Open  (was: Patch Available)

> Audit logging for database activity
> ---
>
> Key: CASSANDRA-12151
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12151
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: stefan setyadi
> Fix For: 3.x
>
> Attachments: 12151.txt
>
>
> We would like a way to enable Cassandra to log database activity being done 
> on our server.
> It should show username, remote address, timestamp, action type, keyspace, 
> column family, and the query statement.
> It should also be able to log connection attempts and changes to 
> users/roles.
> I was thinking of making a new keyspace and inserting an entry for every 
> activity that occurs.
> Then it would be possible to query for specific activity, or for queries 
> targeting a specific keyspace and column family.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12334) HP Fortify Analysis

2016-07-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396423#comment-15396423
 ] 

Jonathan Ellis commented on CASSANDRA-12334:


Hi [~EdAInWestOC], thanks for taking the time to report your findings!  Please 
create any further tickets as subtasks of this one so we can track them better.

> HP Fortify Analysis
> ---
>
> Key: CASSANDRA-12334
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12334
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12326) Use of getByAddress() to retrieve InetAddress object

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12326:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Use of getByAddress() to retrieve InetAddress object
> 
>
> Key: CASSANDRA-12326
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12326
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> There are four places in the Cassandra source code that rely upon a call to 
> getByAddress() to retrieve an InetAddress object. The information returned by 
> getByAddress() is not trustworthy. Attackers can spoof DNS entries and 
> depending on getByAddress alone invites DNS spoofing attacks.
> The four places in the Cassandra source code where getByAddress() is used:
> MutationVerbHandler.java Line 58
> CompactEndpointSerializationHelper.java Line 38
> InetAddressSerializer.java Line 38, 58
> MutationVerbHandler.java, lines 49-59:
> {code:java}
> 49 if (from == null)
> 50 {
> 51 replyTo = message.from;
> 52 byte[] forwardBytes = message.parameters.get(Mutation.FORWARD_TO);
> 53 if (forwardBytes != null)
> 54 forwardToLocalNodes(message.payload, message.verb, forwardBytes, 
> message.from);
> 55 }
> 56 else
> 57 {
> 58 replyTo = InetAddress.getByAddress(from);
> 59 }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12299) Privacy Violation - Heap Inspection

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12299:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Privacy Violation - Heap Inspection
> ---
>
> Key: CASSANDRA-12299
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12299
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> In the file CqlConfigHelper.java on lines 508, 533, 534 and 592 a string 
> object is used to store sensitive data. String objects are immutable and 
> should not be used to store sensitive data. Sensitive data should be stored 
> in char or byte arrays and the contents of those arrays should be cleared 
> ASAP. Operations performed on string objects will require that the original 
> object be copied and the operation be applied in the new copy of the string 
> object. This results in the likelihood that multiple copies of sensitive data 
> will be present in the heap until garbage collection takes place.
> The snippet below shows the issue on line 508:
> CqlConfigHelper.java, lines 505-518:
> {code:java}
> 505 private static Optional<AuthProvider> getDefaultAuthProvider(Configuration conf)
> 506 {
> 507 Optional<String> username = getStringSetting(USERNAME, conf);
> 508 Optional<String> password = getStringSetting(PASSWORD, conf);
> 509 
> 510 if (username.isPresent() && password.isPresent())
> 511 {
> 512 return Optional.of(new PlainTextAuthProvider(username.get(), 
> password.get()));
> 513 }
> 514 else
> 515 {
> 516 return Optional.absent();
> 517 }
> 518 }
> {code}
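For illustration, one way to reduce the exposure window the report describes is to keep the secret in a char[] and wipe it after use; this is a generic sketch, not a drop-in change to CqlConfigHelper.

{code:java}
// Illustrative handling: accept the secret as a char[] and clear it in a
// finally block so the only copy under our control does not linger on the heap.
import java.util.Arrays;

class PasswordHandlingSketch
{
    static void useCredentials(String username, char[] password)
    {
        try
        {
            // hand the credentials to whatever consumer needs them here
        }
        finally
        {
            Arrays.fill(password, '\0');   // wipe the array as soon as it has been used
        }
    }
}
{code}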



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12328) Path Manipulation

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12328:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Path Manipulation
> -
>
> Key: CASSANDRA-12328
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12328
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> There are multiple places in the Cassandra source code where a string that 
> determines the path of a file is not examined prior to use. Path traversal 
> vulnerabilities are common software security problems and failure to validate 
> the path prior to open/creating a file may result in operating in a directory 
> that is outside the intended control sphere.
> Path manipulation issues were found in the following locations:
> CompactionManager.java Line 637
> Descriptor.java Line 224
> MetadataSerializer.java Line 83, 153
> CommitLog.java Line 199
> LogTransaction.java Line 311
> WindowsFailedSnapshotTracker.java Line 51, 55, 60, 78, 84, 95
> LegacyMetadataSerializer.java Line 84
> FileUtils.java Line 116, 172, 354, 368, 386, 437
> RewindableDataInputStreamPlus.java Line 226
> CassandraDaemon.java Line 557
> NodeTool.java Line 261
> CustomClassLoader.java Line 77
> CoalescingStrategies.java Line 54, 150
> FBUtilities.java Line 309, 748
> The following snippet is from CompactionManager.java where unvalidated input 
> is parsed and used to create a new File object on line 637:
> {code:java}
> CompactionManager.java, lines 621-638:
> 621 public void forceUserDefinedCompaction(String dataFiles)
> 622 {
> 623 String[] filenames = dataFiles.split(",");
> 624 Multimap<ColumnFamilyStore, Descriptor> descriptors = ArrayListMultimap.create();
> 625 
> 626 for (String filename : filenames)
> 627 {
> 628 // extract keyspace and columnfamily name from filename
> 629 Descriptor desc = Descriptor.fromFilename(filename.trim());
> 630 if (Schema.instance.getCFMetaData(desc) == null)
> 631 {
> 632 logger.warn("Schema does not exist for file {}. Skipping.", 
> filename);
> 633 continue;
> 634 }
> 635 // group by keyspace/columnfamily
> 636 ColumnFamilyStore cfs = 
> Keyspace.open(desc.ksname).getColumnFamilyStore(desc.cfname);
> 637 descriptors.put(cfs, cfs.getDirectories().find(new 
> File(filename.trim()).getName()));
> 638 }
> {code}
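A sketch of the kind of validation the report asks for: canonicalize the supplied filename and confirm it stays under an expected base directory before using it. The base-directory check and error handling below are illustrative assumptions, not existing Cassandra behavior.

{code:java}
// Illustrative path check: reject any supplied filename that resolves outside
// the configured data directory.
import java.io.File;
import java.io.IOException;

class PathValidationSketch
{
    static File checkedDataFile(File baseDir, String userSuppliedName) throws IOException
    {
        File candidate = new File(baseDir, userSuppliedName.trim()).getCanonicalFile();
        if (!candidate.getPath().startsWith(baseDir.getCanonicalPath() + File.separator))
            throw new IOException("Path escapes data directory: " + userSuppliedName);
        return candidate;
    }
}
{code}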



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12324) Use of Dynamic Class Loading, Use of Externally-Controlled Input to Select Classes or Code

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12324:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Use of Dynamic Class Loading, Use of Externally-Controlled Input to Select 
> Classes or Code
> --
>
> Key: CASSANDRA-12324
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12324
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> Dynamically loaded code has the potential to be malicious. The application 
> uses external input to select which classes or code to use, but it does not 
> sufficiently prevent the input from selecting improper classes or code.
> The snippet below shows the issue, which ends on line 436 by returning a 
> Class object looked up by name.
> {code:java}
> FBUtilities.java, lines 432-442:
> 432 public static <T> Class<T> classForName(String classname, String readable) throws ConfigurationException
> 433 {
> 434 try
> 435 {
> 436 return (Class<T>)Class.forName(classname);
> 437 }
> 438 catch (ClassNotFoundException | NoClassDefFoundError e)
> 439 {
> 440 throw new ConfigurationException(String.format("Unable to find %s 
> class '%s'", readable, classname), e);
> 441 }
> 442 }
> {code}
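One possible mitigation, sketched with illustrative package prefixes rather than anything Cassandra actually enforces, is to restrict dynamic class loading to known packages before delegating to Class.forName().

{code:java}
// Illustrative allow-list: only load classes whose names fall under expected
// package prefixes; anything else is rejected.
class RestrictedClassForNameSketch
{
    private static final String[] ALLOWED_PREFIXES = { "org.apache.cassandra.", "com.example.plugins." };

    static Class<?> classForName(String classname) throws ClassNotFoundException
    {
        for (String prefix : ALLOWED_PREFIXES)
            if (classname.startsWith(prefix))
                return Class.forName(classname);
        throw new ClassNotFoundException("Class name not in allowed packages: " + classname);
    }
}
{code}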



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12325) Access Specifier Manipulation

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12325:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Access Specifier Manipulation
> -
>
> Key: CASSANDRA-12325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12325
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> There are 18 instances in the Cassandra source code where setAccessible() is 
> used to suppress Java language access checking. Static analysis automation 
> tools, like Fortify, will log every instance of the use of setAccessible() 
> and its use represents a possible security issue.
> The use of setAccessible() can cause security problems if the Java access 
> checking is suppressed longer than required or another approach could be 
> taken other than suppressing access checking. This issue will list all 18 
> instances where setAccessible() is used and the usage of this method should 
> be reviewed and checked to make sure it is not used inappropriately.
> setAccessible() is used in the following places:
> UDHelper.java Line 49
> HadoopCompat.java Line 109, 113, 118, 150, 152, 154
> Memory.java Line 42
> GCInspector.java Line 68
> Locks.java Line 33
> Ref.java Line 626
> FastByteOperations.java Line 150
> FBUtilities.java Line 539
> Hex.java Line 128
> MemoryUtil.java Line 61
> SyncUtil.java Line 33, 45, 57
> UDHelper.java, lines 45-56:
> {code:java}
> 45 try
> 46 {
> 47 Class<?> cls = Class.forName("com.datastax.driver.core.DataTypeClassNameParser");
> 48 Method m = cls.getDeclaredMethod("parseOne", String.class, 
> ProtocolVersion.class, CodecRegistry.class);
> 49 m.setAccessible(true);
> 50 methodParseOne = MethodHandles.lookup().unreflect(m);
> 51 codecRegistry = new CodecRegistry();
> 52 }
> 53 catch (Exception e)
> 54 {
> 55 throw new RuntimeException(e);
> 56 }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12303) Privacy Violation - Heap Inspection

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12303:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Privacy Violation - Heap Inspection
> ---
>
> Key: CASSANDRA-12303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12303
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> In the file AbstractJmxClient.java on lines 69 and 147 a string object is 
> used to store sensitive data. String objects are immutable and should not be 
> used to store sensitive data. Sensitive data should be stored in char or byte 
> arrays and the contents of those arrays should be cleared ASAP. Operations 
> performed on string objects will require that the original object be copied 
> and the operation be applied in the new copy of the string object. This 
> results in the likelihood that multiple copies of sensitive data will be 
> present in the heap until garbage collection takes place.
> The snippet below shows the issue on line 69:
> AbstractJmxClient.java, lines 51-71:
> {code:java}
> 51 protected final String password;
> 52 protected JMXConnection jmxConn;
> 53 protected PrintStream out = System.out;
> . . .
> 64 public AbstractJmxClient(String host, Integer port, String username, 
> String password) throws IOException
> 65 {
> 66 this.host = (host != null) ? host : DEFAULT_HOST;
> 67 this.port = (port != null) ? port : DEFAULT_JMX_PORT;
> 68 this.username = username;
> 69 this.password = password;
> 70 jmxConn = new JMXConnection(this.host, this.port, username, password);
> 71 }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12298) Privacy Violation - Heap Inspection

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12298:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Privacy Violation - Heap Inspection
> ---
>
> Key: CASSANDRA-12298
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12298
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>Assignee: Jason Brown
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included 
> an automated analysis using HP Fortify v4.21 SCA and a manual analysis 
> utilizing SciTools Understand v4. The results of that 
> analysis include the issue below.
> Issue:
> In the file RoleOptions.java on line 89 a string object is used to store 
> sensitive data. String objects are immutable and should not be used to store 
> sensitive data. Sensitive data should be stored in char or byte arrays and 
> the contents of those arrays should be cleared ASAP. Operations performed on 
> string objects will require that the original object be copied and the 
> operation be applied in the new copy of the string object. This results in 
> the likelihood that multiple copies of sensitive data will be present in the 
> heap until garbage collection takes place.
> The snippet below shows the issue on line 89:
> RoleOptions.java, lines 87-90:
> {code:java}
> 87 public Optional<String> getPassword()
> 88 {
> 89 return 
> Optional.fromNullable((String)options.get(IRoleManager.Option.PASSWORD));
> 90 }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12301) Privacy Violation - Heap Inspection

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12301:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Privacy Violation - Heap Inspection
> ---
>
> Key: CASSANDRA-12301
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12301
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> In the file SSLTransportFactory.java on lines 72 and 76 a string object is 
> used to store sensitive data. String objects are immutable and should not be 
> used to store sensitive data. Sensitive data should be stored in char or byte 
> arrays and the contents of those arrays should be cleared ASAP. Operations 
> performed on string objects will require that the original object be copied 
> and the operation be applied in the new copy of the string object. This 
> results in the likelihood that multiple copies of sensitive data will be 
> present in the heap until garbage collection takes place.
> The snippet below shows the issue on lines 72 and 76:
> SSLTransportFactory.java, lines 47-81:
> {code:java}
> 47 private String truststore;
> 48 private String truststorePassword;
> 49 private String keystore;
> 50 private String keystorePassword;
> 51 private String protocol;
> 52 private String[] cipherSuites;
> . . .
> 66 @Override
> 67 public void setOptions(Map<String, String> options)
> 68 {
> 69 if (options.containsKey(TRUSTSTORE))
> 70 truststore = options.get(TRUSTSTORE);
> 71 if (options.containsKey(TRUSTSTORE_PASSWORD))
> 72 truststorePassword = options.get(TRUSTSTORE_PASSWORD);
> 73 if (options.containsKey(KEYSTORE))
> 74 keystore = options.get(KEYSTORE);
> 75 if (options.containsKey(KEYSTORE_PASSWORD))
> 76 keystorePassword = options.get(KEYSTORE_PASSWORD);
> 77 if (options.containsKey(PROTOCOL))
> 78 protocol = options.get(PROTOCOL);
> 79 if (options.containsKey(CIPHER_SUITES))
> 80 cipherSuites = options.get(CIPHER_SUITES).split(",");
> 81 }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12327) Use of getAllByName() to retrieve IP addresses

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12327:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Use of getAllByName() to retrieve IP addresses
> --
>
> Key: CASSANDRA-12327
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12327
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> Use of getAllByName() to retrieve IP addresses is not trustworthy. 
> Attackers can spoof DNS entries.
> The file LimitedLocalNodeFirstLocalBalancingPolicy.java calls getAllByName() 
> on line 66.
> LimitedLocalNodeFirstLocalBalancingPolicy.java, lines 64-72:
> {code:java}
> 64 try
> 65 {
> 66 InetAddress[] addresses = InetAddress.getAllByName(replica);
> 67 Collections.addAll(replicaAddresses, addresses);
> 68 }
> 69 catch (UnknownHostException e)
> 70 {
> 71 logger.warn("Invalid replica host name: {}, skipping it", replica);
> 72 }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12332) Weak SecurityManager Check: Overridable Method

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12332:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Weak SecurityManager Check: Overridable Method
> --
>
> Key: CASSANDRA-12332
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12332
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> Non-final methods that perform security checks may be overridden in ways that 
> bypass security checks.
> {code:java}
> CassandraDaemon.java, lines 155-165:
> 155 protected void setup()
> 156 {
> 157 // Delete any failed snapshot deletions on Windows - see 
> CASSANDRA-9658
> 158 if (FBUtilities.isWindows())
> 159 WindowsFailedSnapshotTracker.deleteOldSnapshots();
> 160 
> 161 ThreadAwareSecurityManager.install();
> 162 
> 163 logSystemInfo();
> 164 
> 165 CLibrary.tryMlockall();
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12331) Unreleased Resource: Sockets

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12331:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Unreleased Resource: Sockets
> 
>
> Key: CASSANDRA-12331
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12331
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> Sockets are low level resources that must be explicitly released so 
> subsequent callers will have access to previously used sockets. In the file 
> RMIServerSocketFactoryImpl.java on lines 15-16 a socket is acquired and 
> eventually returned to the caller on line 18.
> If an exception is thrown by the code on line 17 the socket acquired on lines 
> 15-16 will not be released for subsequent reuse.
> RMIServerSocketFactoryImpl.java, lines 13-19:
> {code:java}
> 13 public ServerSocket createServerSocket(final int pPort) throws IOException
> 14 {
> 15 ServerSocket socket = ServerSocketFactory.getDefault()
> 16  .createServerSocket(pPort, 0, 
> InetAddress.getLoopbackAddress());
> 17 socket.setReuseAddress(true);
> 18 return socket;
> 19 }
> {code}
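A sketch of the straightforward fix: if configuring the socket fails, close it before propagating the exception so the port is not leaked. The error handling shown is illustrative.

{code:java}
// Illustrative fix: release the server socket if setReuseAddress() (or any
// later configuration step) throws, instead of leaking it.
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import javax.net.ServerSocketFactory;

class RmiServerSocketSketch
{
    public ServerSocket createServerSocket(final int pPort) throws IOException
    {
        ServerSocket socket = ServerSocketFactory.getDefault()
                                                 .createServerSocket(pPort, 0, InetAddress.getLoopbackAddress());
        try
        {
            socket.setReuseAddress(true);
            return socket;
        }
        catch (IOException | RuntimeException e)
        {
            socket.close();   // make the port available again before rethrowing
            throw e;
        }
    }
}
{code}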



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12329) Unreleased Resource: Sockets

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12329:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Unreleased Resource: Sockets
> 
>
> Key: CASSANDRA-12329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12329
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> Sockets are low level resources that must be explicitly released so 
> subsequent callers will have access to previously used sockets. In the file 
> SSLFactory.java on line 62 a SSL server socket is acquired and eventually 
> returned to the caller on line 69.
> If an exception is thrown by any of the code between lines 62 and 69 the 
> socket acquired on line 62 will not be released for subsequent reuse.
> {code:java}
> SSLFactory.java, lines 59-70:
> 59 public static SSLServerSocket getServerSocket(EncryptionOptions options, 
> InetAddress address, int port) throws IOException
> 60 {
> 61 SSLContext ctx = createSSLContext(options, true);
> 62 SSLServerSocket serverSocket = 
> (SSLServerSocket)ctx.getServerSocketFactory().createServerSocket();
> 63 serverSocket.setReuseAddress(true);
> 64 String[] suites = 
> filterCipherSuites(serverSocket.getSupportedCipherSuites(), 
> options.cipher_suites);
> 65 serverSocket.setEnabledCipherSuites(suites);
> 66 serverSocket.setNeedClientAuth(options.require_client_auth);
> 67 serverSocket.setEnabledProtocols(ACCEPTED_PROTOCOLS);
> 68 serverSocket.bind(new InetSocketAddress(address, port), 500);
> 69 return serverSocket;
> 70 }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12330) Unreleased Resource: Sockets

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12330:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Unreleased Resource: Sockets
> 
>
> Key: CASSANDRA-12330
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12330
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> Sockets are low level resources that must be explicitly released so 
> subsequent callers will have access to previously used sockets. In the file 
> DefaultConnectionFactory.java on line 52 a socket is acquired and eventually 
> returned to the caller on line 55.
> If an exception is thrown by any of the code between lines 52 and 55 the 
> socket acquired on line 52 will not be released for subsequent reuse.
> DefaultConnectionFactory.java, lines 50-73:
> {code:java}
> 50 try
> 51 {
> 52 Socket socket = OutboundTcpConnectionPool.newSocket(peer);
> 53 
> socket.setSoTimeout(DatabaseDescriptor.getStreamingSocketTimeout());
> 54 socket.setKeepAlive(true);
> 55 return socket;
> 56 }
> 57 catch (IOException e)
> 58 {
> 59 if (++attempts >= MAX_CONNECT_ATTEMPTS)
> 60 throw e;
> 61 
> 62 long waitms = DatabaseDescriptor.getRpcTimeout() * 
> (long)Math.pow(2, attempts);
> 63 logger.warn("Failed attempt {} to connect to {}. Retrying in {} 
> ms. ({})", attempts, peer, waitms, e);
> 64 try
> 65 {
> 66 Thread.sleep(waitms);
> 67 }
> 68 catch (InterruptedException wtf)
> 69 {
> 70 throw new IOException("interrupted", wtf);
> 71 }
> 72 }
> 73 }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12333) Password Management: Hardcoded Password

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12333:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Password Management: Hardcoded Password
> ---
>
> Key: CASSANDRA-12333
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12333
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> Hardcoded passwords may compromise system security in a way that cannot be 
> easily remedied. In CassandraRoleManager.java on line 77 the default 
> superuser password is set to "cassandra".
> CassandraRoleManager.java, lines 72-77:
> {code:java}
> 72 public class CassandraRoleManager implements IRoleManager
> 73 {
> 74 private static final Logger logger = 
> LoggerFactory.getLogger(CassandraRoleManager.class);
> 75 
> 76 static final String DEFAULT_SUPERUSER_NAME = "cassandra";
> 77 static final String DEFAULT_SUPERUSER_PASSWORD = "cassandra";
> CassandraRoleManager.java, lines 326-338:
> 326 private static void setupDefaultRole()
> 327 {
> 328 try
> 329 {
> 330 if (!hasExistingRoles())
> 331 {
> 332 QueryProcessor.process(String.format("INSERT INTO %s.%s 
> (role, is_superuser, can_login, salted_hash) " +
> 333  "VALUES ('%s', true, 
> true, '%s')",
> 334  AuthKeyspace.NAME,
> 335  AuthKeyspace.ROLES,
> 336  DEFAULT_SUPERUSER_NAME,
> 337  
> escape(hashpw(DEFAULT_SUPERUSER_PASSWORD))),
> 338
> consistencyForRole(DEFAULT_SUPERUSER_NAME));
> {code}
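> A hedged illustration of the general remediation suggested for hardcoded 
> credentials: source the bootstrap secret from an external location instead of 
> a compile-time constant. The environment variable below is hypothetical, and 
> the shipped "cassandra" default is intentional bootstrap behavior.
> {code:java}
> // Illustration only, not the project's code or an existing setting.
> static char[] defaultSuperuserPassword()
> {
>     String fromEnv = System.getenv("CASSANDRA_DEFAULT_SUPERUSER_PASSWORD");
>     if (fromEnv == null || fromEnv.isEmpty())
>         throw new IllegalStateException("no default superuser password configured");
>     return fromEnv.toCharArray();
> }
> {code}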



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12310) Use of getByName() to retrieve IP address

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12310:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Use of getByName() to retrieve IP address
> -
>
> Key: CASSANDRA-12310
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12310
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> There are many places in the Cassandra source code that rely upon a call to 
> getByName() to retrieve an IP address. The information returned by 
> getByName() is not trustworthy. Attackers can spoof DNS entries, and relying 
> on getByName() alone invites DNS spoofing attacks.
> getByName() is used in multiple locations within the Cassandra source code:
> DatabaseDescriptor.java Line 193, 213, 233, 254, 947, 949
> RingCache.java Line 82
> InetAddressType.java Line 52
> FailureDetector.java Line 186
> Gossiper.java Line 228, 571, 1517, 1522
> CqlBulkRecordWriter.java Line 142, 301
> HintsService.java Line 265
> DynamicEndpointSnitch.java Line 320
> Ec2MultiRegionSnitch.java Line 49
> EndpointSnitchInfo.java Line 46, 51
> PropertyFileSnitch.java Line 175
> ReconnectableSnitchHelper.java Line 52
> SimpleSeedProvider.java Line 55
> MessagingService.java Line 943
> StorageService.java Line 1766, 1835, 2526
> ProgressInfoCompositeData.java Line 96
> SessionInfoCompositeData.java Line 126, 127
> BulkLoader.java Line 399, 422
> SetHostStat.java Line 50
> The snippet below is from the file DatabaseDescriptor.java, which calls 
> getByName() on lines 193, 213, 233, 254, 947 and 949.
> DatabaseDescriptor.java, lines 231-238:
> {code:java}
> 231 try
> 232 {
> 233 rpcAddress = InetAddress.getByName(config.rpc_address);
> 234 }
> 235 catch (UnknownHostException e)
> 236 {
> 237 throw new ConfigurationException("Unknown host in rpc_address " + 
> config.rpc_address, false);
> 238 }
> {code}
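> An illustrative mitigation sketch (not the project's code): parse literal IPs 
> without a DNS round trip, and treat anything that still requires a lookup as 
> untrusted input.
> {code:java}
> import java.net.InetAddress;
> import java.net.UnknownHostException;
> import com.google.common.net.InetAddresses;
> 
> final class AddressResolution
> {
>     // If the configured value is already a literal IP, no DNS lookup is
>     // performed at all; otherwise the hostname path is taken knowingly.
>     static InetAddress resolveConfigured(String value) throws UnknownHostException
>     {
>         if (InetAddresses.isInetAddress(value))
>             return InetAddresses.forString(value);   // literal parse, no lookup
>         return InetAddress.getByName(value);         // DNS lookup; do not blindly trust the result
>     }
> }
> {code}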



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12322) Use of Dynamic Class Loading, Use of Externally-Controlled Input to Select Classes or Code

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12322:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Use of Dynamic Class Loading, Use of Externally-Controlled Input to Select 
> Classes or Code
> --
>
> Key: CASSANDRA-12322
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12322
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> Dynamically loaded code has the potential to be malicious. The application 
> uses external input to select which classes or code to use, but it does not 
> sufficiently prevent the input from selecting improper classes or code.
> The snippet below shows the issue on lines 112-116 by instantiating a class 
> by name.
> FastByteOperations.java, lines 103-127:
> {code:java}
> 103 static ByteOperations getBest()
> 104 {
> 105 String arch = System.getProperty("os.arch");
> 106 boolean unaligned = arch.equals("i386") || arch.equals("x86")
> 107 || arch.equals("amd64") || 
> arch.equals("x86_64") || arch.equals("s390x");
> 108 if (!unaligned)
> 109 return new PureJavaOperations();
> 110 try
> 111 {
> 112 Class theClass = Class.forName(UNSAFE_COMPARER_NAME);
> 113 
> 114 // yes, UnsafeComparer does implement Comparer
> 115 @SuppressWarnings("unchecked")
> 116 ByteOperations comparer = (ByteOperations) 
> theClass.getConstructor().newInstance();
> 117 return comparer;
> 118 }
> 119 catch (Throwable t)
> 120 {
> 121 JVMStabilityInspector.inspectThrowable(t);
> 122 // ensure we really catch *everything*
> 123 return new PureJavaOperations();
> 124 }
> 125 }
> 126 
> 127 }
> {code}
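> For this finding and the related class-loading findings below, the usual 
> mitigation is to restrict the loadable names to a fixed allow-list before 
> calling Class.forName(). A hedged sketch; the listed names are hypothetical 
> examples only.
> {code:java}
> import java.util.Arrays;
> import java.util.HashSet;
> import java.util.Set;
> 
> final class AllowListedLoader
> {
>     // Example entries only; a real list would name the implementations the
>     // operator actually intends to permit.
>     private static final Set<String> ALLOWED = new HashSet<>(Arrays.asList(
>         "com.example.UnsafeByteOperations",
>         "com.example.PureJavaByteOperations"));
> 
>     static Class<?> loadAllowed(String className) throws ClassNotFoundException
>     {
>         if (!ALLOWED.contains(className))
>             throw new IllegalArgumentException("class name not permitted: " + className);
>         return Class.forName(className);
>     }
> }
> {code}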



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12318) Use of Dynamic Class Loading, Use of Externally-Controlled Input to Select Classes or Code

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12318:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Use of Dynamic Class Loading, Use of Externally-Controlled Input to Select 
> Classes or Code
> --
>
> Key: CASSANDRA-12318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12318
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> Dynamically loaded code has the potential to be malicious. The application 
> uses external input to select which classes or code to use, but it does not 
> sufficiently prevent the input from selecting improper classes or code.
> The snippet below shows the issue on lines 144-146 by instantiating an object 
> associated with a class by name.
> CacheService.java, lines 135-162:
> {code:java}
> 135 private AutoSavingCache initRowCache()
> 136 {
> 137 logger.info("Initializing row cache with capacity of {} MBs", 
> DatabaseDescriptor.getRowCacheSizeInMB());
> 138 
> 139 CacheProvider cacheProvider;
> 140 String cacheProviderClassName = 
> DatabaseDescriptor.getRowCacheSizeInMB() > 0
> 141 ? 
> DatabaseDescriptor.getRowCacheClassName() : 
> "org.apache.cassandra.cache.NopCacheProvider";
> 142 try
> 143 {
> 144 Class> 
> cacheProviderClass =
> 145 (Class>) 
> Class.forName(cacheProviderClassName);
> 146 cacheProvider = cacheProviderClass.newInstance();
> 147 }
> 148 catch (Exception e)
> 149 {
> 150 throw new RuntimeException("Cannot find configured row cache 
> provider class " + DatabaseDescriptor.getRowCacheClassName());
> 151 }
> 152 
> 153 // cache object
> 154 ICache rc = cacheProvider.create();
> 155 AutoSavingCache rowCache = new 
> AutoSavingCache<>(rc, CacheType.ROW_CACHE, new RowCacheSerializer());
> 156 
> 157 int rowCacheKeysToSave = DatabaseDescriptor.getRowCacheKeysToSave();
> 158 
> 159 rowCache.scheduleSaving(DatabaseDescriptor.getRowCacheSavePeriod(), 
> rowCacheKeysToSave);
> 160 
> 161 return rowCache;
> 162 }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12317) Use of Dynamic Class Loading, Use of Externally-Controlled Input to Select Classes or Code

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12317:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Use of Dynamic Class Loading, Use of Externally-Controlled Input to Select 
> Classes or Code
> --
>
> Key: CASSANDRA-12317
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12317
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> Dynamically loaded code has the potential to be malicious. The application 
> uses external input to select which classes or code to use, but it does not 
> sufficiently prevent the input from selecting improper classes or code.
> The snippet below shows the issue which ends on line 198 by returning an 
> object associated with a class by name.
> CompressionParams.java, lines 190-204:
> {code:java}
> 190 private static Class parseCompressorClass(String className) throws 
> ConfigurationException
> 191 {
> 192 if (className == null || className.isEmpty())
> 193 return null;
> 194 
> 195 className = className.contains(".") ? className : 
> "org.apache.cassandra.io.compress." + className;
> 196 try
> 197 {
> 198 return Class.forName(className);
> 199 }
> 200 catch (Exception e)
> 201 {
> 202 throw new ConfigurationException("Could not create Compression 
> for type " + className, e);
> 203 }
> 204 }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12321) Use of Dynamic Class Loading, Use of Externally-Controlled Input to Select Classes or Code

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12321:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Use of Dynamic Class Loading, Use of Externally-Controlled Input to Select 
> Classes or Code
> --
>
> Key: CASSANDRA-12321
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12321
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> Dynamically loaded code has the potential to be malicious. The application 
> uses external input to select which classes or code to use, but it does not 
> sufficiently prevent the input from selecting improper classes or code.
> The snippet below shows the issue on lines 523-532 by instantiating a class 
> by name.
> CoalescingStrategies.java, lines 494-538:
> {code:java}
> 494 @VisibleForTesting
> 495 static CoalescingStrategy newCoalescingStrategy(String strategy,
> 496 int coalesceWindow,
> 497 Parker parker,
> 498 Logger logger,
> 499 String displayName)
> 500 {
> 501 String classname = null;
> 502 String strategyCleaned = strategy.trim().toUpperCase();
> 503 switch(strategyCleaned)
> 504 {
> 505 case "MOVINGAVERAGE":
> 506 classname = MovingAverageCoalescingStrategy.class.getName();
> 507 break;
> 508 case "FIXED":
> 509 classname = FixedCoalescingStrategy.class.getName();
> 510 break;
> 511 case "TIMEHORIZON":
> 512 classname = 
> TimeHorizonMovingAverageCoalescingStrategy.class.getName();
> 513 break;
> 514 case "DISABLED":
> 515 classname = DisabledCoalescingStrategy.class.getName();
> 516 break;
> 517 default:
> 518 classname = strategy;
> 519 }
> 520 
> 521 try
> 522 {
> 523 Class clazz = Class.forName(classname);
> 524 
> 525 if (!CoalescingStrategy.class.isAssignableFrom(clazz))
> 526 {
> 527 throw new RuntimeException(classname + " is not an instance 
> of CoalescingStrategy");
> 528 }
> 529 
> 530 Constructor constructor = clazz.getConstructor(int.class, 
> Parker.class, Logger.class, String.class);
> 531 
> 532 return 
> (CoalescingStrategy)constructor.newInstance(coalesceWindow, parker, logger, 
> displayName);
> 533 }
> 534 catch (Exception e)
> 535 {
> 536 throw new RuntimeException(e);
> 537 }
> 538 }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12319) Use of Dynamic Class Loading, Use of Externally-Controlled Input to Select Classes or Code

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12319:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Use of Dynamic Class Loading, Use of Externally-Controlled Input to Select 
> Classes or Code
> --
>
> Key: CASSANDRA-12319
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12319
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> Dynamically loaded code has the potential to be malicious. The application 
> uses external input to select which classes or code to use, but it does not 
> sufficiently prevent the input from selecting improper classes or code.
> The snippet below shows the issue which ends on line 63 by instantiating a 
> class by name.
> TServerCustomFactory.java, lines 41-73:
> {code:java}
> 41 public TServer buildTServer(TServerFactory.Args args)
> 42 {
> 43 TServer server;
> 44 if (ThriftServer.SYNC.equalsIgnoreCase(serverType))
> 45 {
> 46 server = new CustomTThreadPoolServer.Factory().buildTServer(args);
> 47 }
> 48 else if(ThriftServer.ASYNC.equalsIgnoreCase(serverType))
> 49 {
> 50 server = new CustomTNonBlockingServer.Factory().buildTServer(args);
> 51 logger.info(String.format("Using non-blocking/asynchronous thrift 
> server on %s : %s", args.addr.getHostName(), args.addr.getPort()));
> 52 }
> 53 else if(ThriftServer.HSHA.equalsIgnoreCase(serverType))
> 54 {
> 55 server = new THsHaDisruptorServer.Factory().buildTServer(args);
> 56 logger.info(String.format("Using custom half-sync/half-async 
> thrift server on %s : %s", args.addr.getHostName(), args.addr.getPort()));
> 57 }
> 58 else
> 59 {
> 60 TServerFactory serverFactory;
> 61 try
> 62 {
> 63 serverFactory = (TServerFactory) 
> Class.forName(serverType).newInstance();
> 64 }
> 65 catch (Exception e)
> 66 {
> 67 throw new RuntimeException("Failed to instantiate server 
> factory:" + serverType, e);
> 68 }
> 69 server = serverFactory.buildTServer(args);
> 70 logger.info(String.format("Using custom thrift server %s on %s : 
> %s", server.getClass().getName(), args.addr.getHostName(), 
> args.addr.getPort()));
> 71 }
> 72 return server;
> 73 }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12320) Use of Dynamic Class Loading, Use of Externally-Controlled Input to Select Classes or Code

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12320:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Use of Dynamic Class Loading, Use of Externally-Controlled Input to Select 
> Classes or Code
> --
>
> Key: CASSANDRA-12320
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12320
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> Dynamically loaded code has the potential to be malicious. The application 
> uses external input to select which classes or code to use, but it does not 
> sufficiently prevent the input from selecting improper classes or code.
> The snippet below shows the issue on lines 537-539 and 568 by instantiating a 
> class by name.
> BulkLoader.java, lines 521-577:
> {code:java}
> 521 public LoaderOptions validateArguments()
> 522 {
> 523 // Both username and password need to be provided
> 524 if ((user != null) != (passwd != null))
> 525 errorMsg("Username and password must both be provided", 
> getCmdLineOptions());
> 526 
> 527 if (user != null)
> 528 {
> 529 // Support for 3rd party auth providers that support plain text 
> credentials.
> 530 // In this case the auth provider must provide a constructor of 
> the form:
> 531 //
> 532 // public MyAuthProvider(String username, String password)
> 533 if (authProviderName != null)
> 534 {
> 535 try
> 536 {
> 537 Class authProviderClass = Class.forName(authProviderName);
> 538 Constructor constructor = 
> authProviderClass.getConstructor(String.class, String.class);
> 539 authProvider = 
> (AuthProvider)constructor.newInstance(user, passwd);
> 540 }
> 541 catch (ClassNotFoundException e)
> 542 {
> 543 errorMsg("Unknown auth provider: " + e.getMessage(), 
> getCmdLineOptions());
> 544 }
> 545 catch (NoSuchMethodException e)
> 546 {
> 547 errorMsg("Auth provider does not support plain text 
> credentials: " + e.getMessage(), getCmdLineOptions());
> 548 }
> 549 catch (InstantiationException | IllegalAccessException | 
> IllegalArgumentException | InvocationTargetException e)
> 550 {
> 551 errorMsg("Could not create auth provider with plain text 
> credentials: " + e.getMessage(), getCmdLineOptions());
> 552 }
> 553 }
> 554 else
> 555 {
> 556 // If a 3rd party auth provider wasn't provided use the 
> driver plain text provider
> 557 authProvider = new PlainTextAuthProvider(user, passwd);
> 558 }
> 559 }
> 560 // Alternate support for 3rd party auth providers that don't use 
> plain text credentials.
> 561 // In this case the auth provider must provide a nullary constructor 
> of the form:
> 562 //
> 563 // public MyAuthProvider()
> 564 else if (authProviderName != null)
> 565 {
> 566 try
> 567 {
> 568 authProvider = 
> (AuthProvider)Class.forName(authProviderName).newInstance();
> 569 }
> 570 catch (ClassNotFoundException | InstantiationException | 
> IllegalAccessException e)
> 571 {
> 572 errorMsg("Unknown auth provider" + e.getMessage(), 
> getCmdLineOptions());
> 573 }
> 574 }
> 575 
> 576 return this;
> 577 }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12304) Privacy Violation - Heap Inspection

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12304:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Privacy Violation - Heap Inspection
> ---
>
> Key: CASSANDRA-12304
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12304
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> In the file BulkLoader.java on line 387 a string object is used to store 
> sensitive data. String objects are immutable and should not be used to store 
> sensitive data. Sensitive data should be stored in char or byte arrays and 
> the contents of those arrays should be cleared ASAP. Operations performed on 
> string objects will require that the original object be copied and the 
> operation be applied in the new copy of the string object. This results in 
> the likelihood that multiple copies of sensitive data will be present in the 
> heap until garbage collection takes place.
> The snippet below shows the issue on line 387:
> BulkLoader.java, lines 318-387:
> {code:java}
> 318 public String passwd;
> . . .
> 337 public static LoaderOptions parseArgs(String cmdArgs[])
> 338 {
> 339 CommandLineParser parser = new GnuParser();
> 340 CmdLineOptions options = getCmdLineOptions();
> 341 try
> 342 {
> . . .
> 386 if (cmd.hasOption(PASSWD_OPTION))
> 387 opts.passwd = cmd.getOptionValue(PASSWD_OPTION);
> {code}
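> A short sketch of the char[]-and-wipe pattern this and the related heap 
> inspection findings recommend (illustration only, not the tool's actual code):
> {code:java}
> import java.io.Console;
> import java.util.Arrays;
> 
> final class PasswordPromptExample
> {
>     static void promptAndUse()
>     {
>         Console console = System.console();
>         if (console == null)
>             throw new IllegalStateException("no interactive console available");
> 
>         char[] password = console.readPassword("Password: ");
>         try
>         {
>             // ... hand the char[] to whatever needs it ...
>         }
>         finally
>         {
>             Arrays.fill(password, '\0');   // wipe the copy we control as soon as it is used
>         }
>     }
> }
> {code}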



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12306) Privacy VIolation - Heap Inspection

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12306:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Privacy VIolation - Heap Inspection
> ---
>
> Key: CASSANDRA-12306
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12306
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> In the file NodeTool.java on lines 239, 242 and 291 a string object is used 
> to store sensitive data. String objects are immutable and should not be used 
> to store sensitive data. Sensitive data should be stored in char or byte 
> arrays and the contents of those arrays should be cleared ASAP. Operations 
> performed on string objects will require that the original object be copied 
> and the operation be applied in the new copy of the string object. This 
> results in the likelihood that multiple copies of sensitive data will be 
> present in the heap until garbage collection takes place.
> The snippet below shows the issue on line 239 and 242:
> NodeTool.java, lines 229-243:
> {code:java}
> 229 private String password = EMPTY;
> 230 
> 231 @Option(type = OptionType.GLOBAL, name = {"-pwf", "--password-file"}, 
> description = "Path to the JMX password file")
> 232 private String passwordFilePath = EMPTY;
> 233 
> 234 @Override
> 235 public void run()
> 236 {
> 237 if (isNotEmpty(username)) {
> 238 if (isNotEmpty(passwordFilePath))
> 239 password = readUserPasswordFromFile(username, 
> passwordFilePath);
> 240 
> 241 if (isEmpty(password))
> 242 password = promptAndReadPassword();
> 243 }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12309) Use of Dynamic Class Loading, Use of Externally-Controlled Input to Select Classes or Code

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12309:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Use of Dynamic Class Loading, Use of Externally-Controlled Input to Select 
> Classes or Code
> --
>
> Key: CASSANDRA-12309
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12309
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> Dynamically loaded code has the potential to be malicious. The application 
> uses external input to select which classes or code to use, but it does not 
> sufficiently prevent the input from selecting improper classes or code.
> The snippet below shows the issue on line 588 and the method returns a new 
> instance on line 594 or 598.
> CqlConfigHelper.java, lines 584-605:
> {code:java}
> 584 private static AuthProvider getClientAuthProvider(String 
> factoryClassName, Configuration conf)
> 585 {
> 586 try
> 587 {
> 588 Class c = Class.forName(factoryClassName);
> 589 if (PlainTextAuthProvider.class.equals(c))
> 590 {
> 591 String username = getStringSetting(USERNAME, conf).or("");
> 592 String password = getStringSetting(PASSWORD, conf).or("");
> 593 return (AuthProvider) c.getConstructor(String.class, 
> String.class)
> 594 .newInstance(username, password);
> 595 }
> 596 else
> 597 {
> 598 return (AuthProvider) c.newInstance();
> 599 }
> 600 }
> 601 catch (Exception e)
> 602 {
> 603 throw new RuntimeException("Failed to instantiate auth provider:" 
> + factoryClassName, e);
> 604 }
> 605 }
> {code}
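> A sketch of one way to at least constrain the dynamically loaded type before 
> instantiating it; this is not the project's actual fix, and AuthProvider here 
> is assumed to be the driver interface used in the snippet above.
> {code:java}
> import com.datastax.driver.core.AuthProvider;
> 
> // asSubclass() throws ClassCastException if the named class does not
> // implement AuthProvider, so an arbitrary class name cannot yield an
> // arbitrary object.
> static AuthProvider loadAuthProvider(String factoryClassName) throws Exception
> {
>     Class<? extends AuthProvider> c =
>         Class.forName(factoryClassName).asSubclass(AuthProvider.class);
>     return c.getDeclaredConstructor().newInstance();
> }
> {code}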



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12308) Use of Dynamic Class Loading, Use of Externally-Controlled Input to Select Classes or Code

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12308:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Use of Dynamic Class Loading, Use of Externally-Controlled Input to Select 
> Classes or Code
> --
>
> Key: CASSANDRA-12308
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12308
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> Dynamically loaded code has the potential to be malicious. The application 
> uses external input to select which classes or code to use, but it does not 
> sufficiently prevent the input from selecting improper classes or code.
> The snippet below shows the issue which ends on line 585 by instantiating a 
> class by name.
> ConfigHelper.java, lines 558-591:
> {code:java}
> 558 @SuppressWarnings("resource")
> 559 public static Cassandra.Client createConnection(Configuration conf, 
> String host, Integer port) throws IOException
> 560 {
> 561 try
> 562 {
> 563 TTransport transport = 
> getClientTransportFactory(conf).openTransport(host, port);
> 564 return new Cassandra.Client(new TBinaryProtocol(transport, true, 
> true));
> 565 }
> 566 catch (Exception e)
> 567 {
> 568 throw new IOException("Unable to connect to server " + host + ":" 
> + port, e);
> 569 }
> 570 }
> 571 
> 572 public static ITransportFactory getClientTransportFactory(Configuration 
> conf)
> 573 {
> 574 String factoryClassName = conf.get(ITransportFactory.PROPERTY_KEY, 
> TFramedTransportFactory.class.getName());
> 575 ITransportFactory factory = 
> getClientTransportFactory(factoryClassName);
> 576 Map options = getOptions(conf, 
> factory.supportedOptions());
> 577 factory.setOptions(options);
> 578 return factory;
> 579 }
> 580 
> 581 private static ITransportFactory getClientTransportFactory(String 
> factoryClassName)
> 582 {
> 583 try
> 584 {
> 585 return (ITransportFactory) 
> Class.forName(factoryClassName).newInstance();
> 586 }
> 587 catch (Exception e)
> 588 {
> 589 throw new RuntimeException("Failed to instantiate transport 
> factory:" + factoryClassName, e);
> 590 }
> 591 }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12295) Double check locking pattern

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12295:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Double check locking pattern
> 
>
> Key: CASSANDRA-12295
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12295
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> The file Keyspace.java includes a double-checked locking pattern. The 
> double-checked locking pattern is an incorrect idiom that does not achieve its 
> intended effect. For more information see LCK10-J in the CERT Oracle Coding 
> Standard for Java: 
> https://www.securecoding.cert.org/confluence/display/java/LCK10-J.+Use+a+correct+form+of+the+double-checked+locking+idiom
> The snippet below shows the double check locking pattern:
> Keyspace.java, lines 115-135:
> {code:java}
> 115 private static Keyspace open(String keyspaceName, Schema schema, boolean 
> loadSSTables)
> 116 {
> 117 Keyspace keyspaceInstance = schema.getKeyspaceInstance(keyspaceName);
> 118 
> 119 if (keyspaceInstance == null)
> 120 {
> 121 // instantiate the Keyspace.  we could use putIfAbsent but it's 
> important to making sure it is only done once
> 122 // per keyspace, so we synchronize and re-check before doing it.
> 123 synchronized (Keyspace.class)
> 124 {
> 125 keyspaceInstance = schema.getKeyspaceInstance(keyspaceName);
> 126 if (keyspaceInstance == null)
> 127 {
> 128 // open and store the keyspace
> 129 keyspaceInstance = new Keyspace(keyspaceName, 
> loadSSTables);
> 130 schema.storeKeyspaceInstance(keyspaceInstance);
> 131 }
> 132 }
> 133 }
> 134 return keyspaceInstance;
> 135 }
> {code}
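> A generic illustration of the LCK10-J-compliant form (not Keyspace.java 
> itself): the field read outside the lock must be volatile for the idiom to be 
> safe.
> {code:java}
> final class LazySingleton
> {
>     private static volatile LazySingleton instance;  // volatile is what makes the idiom correct
> 
>     static LazySingleton get()
>     {
>         LazySingleton local = instance;               // single volatile read
>         if (local == null)
>         {
>             synchronized (LazySingleton.class)
>             {
>                 local = instance;                     // re-check under the lock
>                 if (local == null)
>                     instance = local = new LazySingleton();
>             }
>         }
>         return local;
>     }
> }
> {code}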



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12297) Privacy VIolation - Heap Inspection

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12297:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Privacy VIolation - Heap Inspection
> ---
>
> Key: CASSANDRA-12297
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12297
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>Assignee: Jason Brown
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included 
> an automated analysis using HP Fortify v4.21 SCA and a manual analysis 
> utilizing SciTools Understand v4. The results of that 
> analysis include the issue below.
> Issue:
> In the file PasswordAuthenticator.java on line 129, 164 and 222 a string 
> object is used to store sensitive data. String objects are immutable and 
> should not be used to store sensitive data. Sensitive data should be stored 
> in char or byte arrays and the contents of those arrays should be cleared 
> ASAP. Operations performed on string objects will require that the original 
> object be copied and the operation be applied in the new copy of the string 
> object. This results in the likelihood that multiple copies of sensitive data 
> will be present in the heap until garbage collection takes place.
> The snippet below shows the issue on line 129:
> PasswordAuthenticator.java, lines 123-134:
> {code:java}
> 123 public AuthenticatedUser legacyAuthenticate(Map 
> credentials) throws AuthenticationException
> 124 {
> 125 String username = credentials.get(USERNAME_KEY);
> 126 if (username == null)
> 127 throw new AuthenticationException(String.format("Required key 
> '%s' is missing", USERNAME_KEY));
> 128 
> 129 String password = credentials.get(PASSWORD_KEY);
> 130 if (password == null)
> 131 throw new AuthenticationException(String.format("Required key 
> '%s' is missing", PASSWORD_KEY));
> 132 
> 133 return authenticate(username, password);
> 134 }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12305) Privacy VIolation - Heap Inspection

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12305:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Privacy VIolation - Heap Inspection
> ---
>
> Key: CASSANDRA-12305
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12305
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
> Fix For: 3.0.x
>
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> In the file NodeProbe.java on lines 139 and 181 a string object is used to 
> store sensitive data. String objects are immutable and should not be used to 
> store sensitive data. Sensitive data should be stored in char or byte arrays 
> and the contents of those arrays should be cleared ASAP. Operations performed 
> on string objects will require that the original object be copied and the 
> operation be applied in the new copy of the string object. This results in 
> the likelihood that multiple copies of sensitive data will be present in the 
> heap until garbage collection takes place.
> The snippet below shows the issue on line 139:
> NodeProbe.java, lines 105-141:
> {code:java}
> 105 private String password;
> . . .
> 131 public NodeProbe(String host, int port, String username, String password) 
> throws IOException
> 132 {
> 133 assert username != null && !username.isEmpty() && password != null && 
> !password.isEmpty()
> 134: "neither username nor password can be blank";
> 135 
> 136 this.host = host;
> 137 this.port = port;
> 138 this.username = username;
> 139 this.password = password;
> 140 connect();
> 141 }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12307) Command Injection

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12307:
---
Issue Type: Sub-task  (was: Bug)
Parent: CASSANDRA-12334

> Command Injection
> -
>
> Key: CASSANDRA-12307
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12307
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Eduardo Aguinaga
>Priority: Critical
>
> Overview:
> In May through June of 2016 a static analysis was performed on version 3.0.5 
> of the Cassandra source code. The analysis included an automated analysis 
> using HP Fortify v4.21 SCA and a manual analysis utilizing SciTools 
> Understand v4. The results of that analysis include the issue below.
> Issue:
> Two commands, archiveCommand and restoreCommand, are stored as string 
> properties and retrieved on lines 91 and 92 of CommitLogArchiver.java. The 
> only processing performed on the command strings is that tokens are replaced 
> by data available at runtime. 
> A malicious command could be entered into the system by storing the malicious 
> command in place of the valid archiveCommand or restoreCommand. The malicious 
> command would then be executed on line 265 within the exec method.
> Any commands that are stored and retrieved should be verified prior to 
> execution. Assuming that the command is safe because it is stored as a local 
> property invites security issues.
> {code:java}
> CommitLogArchiver.java, lines 91-92:
> 91 String archiveCommand = commitlog_commands.getProperty("archive_command");
> 92 String restoreCommand = commitlog_commands.getProperty("restore_command");
> CommitLogArchiver.java, lines 261-266:
> 261 private void exec(String command) throws IOException
> 262 {
> 263 ProcessBuilder pb = new ProcessBuilder(command.split(" "));
> 264 pb.redirectErrorStream(true);
> 265 FBUtilities.exec(pb);
> 266 }
> CommitLogArchiver.java, lines 152-166:
> 152 public void maybeArchive(final String path, final String name)
> 153 {
> 154 if (Strings.isNullOrEmpty(archiveCommand))
> 155 return;
> 156 
> 157 archivePending.put(name, executor.submit(new WrappedRunnable()
> 158 {
> 159 protected void runMayThrow() throws IOException
> 160 {
> 161 String command = archiveCommand.replace("%name", name);
> 162 command = command.replace("%path", path);
> 163 exec(command);
> 164 }
> 165 }));
> 166 }
> {code}
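> A hypothetical hardening sketch along the lines the write-up suggests: verify 
> that the stored command's executable resolves inside an operator-controlled 
> directory before running it. The trusted directory path below is an assumption 
> made purely for illustration.
> {code:java}
> import java.io.IOException;
> import java.nio.file.Path;
> import java.nio.file.Paths;
> 
> final class CheckedExec
> {
>     // Hypothetical trusted location for archive/restore scripts.
>     private static final Path TRUSTED_DIR = Paths.get("/usr/local/cassandra/archive-scripts");
> 
>     static void execChecked(String command) throws IOException
>     {
>         String[] argv = command.split(" ");
>         Path exe = Paths.get(argv[0]).toAbsolutePath().normalize();
>         if (!exe.startsWith(TRUSTED_DIR))
>             throw new SecurityException("command executable outside trusted directory: " + exe);
>         new ProcessBuilder(argv).redirectErrorStream(true).start();
>     }
> }
> {code}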



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12334) HP Fortify Analysis

2016-07-27 Thread Jonathan Ellis (JIRA)
Jonathan Ellis created CASSANDRA-12334:
--

 Summary: HP Fortify Analysis
 Key: CASSANDRA-12334
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12334
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-7021) With tracing enabled, queries should still be recorded when using prepared and batch statements

2016-07-27 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-7021.
---
   Resolution: Duplicate
Fix Version/s: (was: 2.1.x)

> With tracing enabled, queries should still be recorded when using prepared 
> and batch statements
> ---
>
> Key: CASSANDRA-7021
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7021
> Project: Cassandra
>  Issue Type: Improvement
> Environment: C* 2.0.6 running on Ubuntu 12.04
>Reporter: Bill Joyce
>Priority: Minor
>
> I've enabled tracing on my cluster and am analyzing data in the 
> system_traces.sessions table. Single statement, non-prepared queries show up 
> with data in the 'parameters' field like 'query=select * from tablename where 
> x=1' and the request field is execute_cql3_query. But batches have null in 
> the parameters field and prepared statements just have 'page size=5000' in 
> the parameters field (the request field values are 'Execute batch of CQL3 
> queries' and 'Execute CQL3 prepared query'). Please include the actual query 
> text with prepared and batch statements. This will make performance analysis 
> much easier so I can do things like sort by duration and find my most 
> expensive queries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10243) Warn or fail when changing cluster topology live

2016-07-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15394003#comment-15394003
 ] 

Jonathan Ellis commented on CASSANDRA-10243:


I edited cassandra.yaml as follows:

{code}
# CASSANDRA WILL NOT ALLOW YOU TO SWITCH TO AN INCOMPATIBLE SNITCH
# ONCE DATA IS INSERTED INTO THE CLUSTER.  This would cause data loss.
# This means that if you start with the default SimpleSnitch, which
# locates every node on "rack1" in "datacenter1", your only options
# if you need to add another datacenter are GossipingPropertyFileSnitch
# (and the older PFS).  From there, if you want to migrate to an
# incompatible snitch like Ec2Snitch you can do it by adding new nodes
# under Ec2Snitch (which will locate them in a new "datacenter") and
# decommissioning the old ones.
{code}

> Warn or fail when changing cluster topology live
> 
>
> Key: CASSANDRA-10243
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10243
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
> Fix For: 2.1.12, 2.2.4, 3.0.1, 3.1, 3.2
>
>
> Moving a node from one rack to another in the snitch, while it is alive, is 
> almost always the wrong thing to do.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-7793) Utilize BATCH statements in COPY FROM

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-7793.
---
   Resolution: Duplicate
Fix Version/s: (was: 2.1.x)

> Utilize BATCH statements in COPY FROM
> -
>
> Key: CASSANDRA-7793
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7793
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Mikhail Stepura
>Priority: Minor
>  Labels: cqlsh
>
> If we assume that a significant subset of COPY FROM CSVs are going to be 
> the results of a COPY TO command, then rows will be grouped by the partition key. 
> In that case we'd win from batching (until another partition key is met, and 
> constrained by some limit of rows per batch, we don't want huge batches)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-11738) Re-think the use of Severity in the DynamicEndpointSnitch calculation

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-11738.

   Resolution: Fixed
Fix Version/s: (was: 3.x)
   3.10

Committed with the tweak to the getSeverity logic you suggested. Thanks!

> Re-think the use of Severity in the DynamicEndpointSnitch calculation
> -
>
> Key: CASSANDRA-11738
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11738
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jeremiah Jordan
>Assignee: Jonathan Ellis
>Priority: Minor
> Fix For: 3.10
>
> Attachments: 11738.txt
>
>
> CASSANDRA-11737 was opened to allow completely disabling the use of severity 
> in the DynamicEndpointSnitch calculation, but that is a pretty big hammer.  
> There is probably something we can do to better use the score.
> The issue seems to be that severity is given equal weight with latency in the 
> current code, also that severity is only based on disk io.  If you have a 
> node that is CPU bound on something (say catching up on LCS compactions 
> because of bootstrap/repair/replace) the IO wait can be low, but the latency 
> to the node is high.
> Some ideas I had are:
> 1. Allowing a yaml parameter to tune how much impact the severity score has 
> in the calculation.
> 2. Taking CPU load into account as well as IO Wait (this would probably help 
> in the cases I have seen things go sideways)
> 3. Move the -D from CASSANDRA-11737 to being a yaml level setting
> 4. Go back to just relying on Latency and get rid of severity all together.  
> Now that we have rapid read protection, maybe just using latency is enough, 
> as it can help where the predictive nature of IO wait would have been useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-8965) Cassandra retains a file handle to the directory its writing to for each writer instance

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-8965.
---
   Resolution: Won't Fix
Fix Version/s: (was: 2.2.x)
   (was: 2.1.x)
   (was: 3.x)

Closing as Won't Fix since 2.1 is close to EOL and this is not a problem in 3.0.

> Cassandra retains a file handle to the directory its writing to for each 
> writer instance
> 
>
> Key: CASSANDRA-8965
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8965
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Priority: Trivial
>
> We could either share this amongst the CF object, or have a shared 
> ref-counted cache that opens a reference and shares it amongst all writer 
> instances, closing it once they all close.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7021) With tracing enabled, queries should still be recorded when using prepared and batch statements

2016-07-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15390283#comment-15390283
 ] 

Jonathan Ellis commented on CASSANDRA-7021:
---

[~snazy] was this part of what you did in 3.8?

> With tracing enabled, queries should still be recorded when using prepared 
> and batch statements
> ---
>
> Key: CASSANDRA-7021
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7021
> Project: Cassandra
>  Issue Type: Improvement
> Environment: C* 2.0.6 running on Ubuntu 12.04
>Reporter: Bill Joyce
>Priority: Minor
> Fix For: 2.1.x
>
>
> I've enabled tracing on my cluster and am analyzing data in the 
> system_traces.sessions table. Single statement, non-prepared queries show up 
> with data in the 'parameters' field like 'query=select * from tablename where 
> x=1' and the request field is execute_cql3_query. But batches have null in 
> the parameters field and prepared statements just have 'page size=5000' in 
> the parameters field (the request field values are 'Execute batch of CQL3 
> queries' and 'Execute CQL3 prepared query'). Please include the actual query 
> text with prepared and batch statements. This will make performance analysis 
> much easier so I can do things like sort by duration and find my most 
> expensive queries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7793) Utilize BATCH statements in COPY FROM

2016-07-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15390281#comment-15390281
 ] 

Jonathan Ellis commented on CASSANDRA-7793:
---

[~Stefania], I think COPY does this now?

> Utilize BATCH statements in COPY FROM
> -
>
> Key: CASSANDRA-7793
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7793
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Mikhail Stepura
>Priority: Minor
>  Labels: cqlsh
> Fix For: 2.1.x
>
>
> If we assume that a significant subset of COPY FROM CSVs are going to be 
> the results of a COPY TO command, then rows will be grouped by the partition key. 
> In that case we'd win from batching (until another partition key is met, and 
> constrained by some limit of rows per batch, we don't want huge batches)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-7794) Utilize a prepared statement in COPY FROM

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-7794.
---
   Resolution: Duplicate
Fix Version/s: (was: 2.1.x)

obsoleted by CASSANDRA-11053 and linked tickets

> Utilize a prepared statement in COPY FROM
> -
>
> Key: CASSANDRA-7794
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7794
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Mikhail Stepura
>Priority: Minor
>  Labels: cqlsh
>
> Switch to prepared statements for writes (assuming that python serialization 
> cost wouldn't outweigh the server-side benefits). It's a bit involved though, 
> but may be worth it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8956) Deprecate SSTableSimpleWriter, SSTableImport, and SSTableExport in 2.1

2016-07-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15390008#comment-15390008
 ] 

Jonathan Ellis commented on CASSANDRA-8956:
---

I see that SSTableSimpleWriter is still around in trunk...

> Deprecate SSTableSimpleWriter, SSTableImport, and SSTableExport in 2.1
> --
>
> Key: CASSANDRA-8956
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8956
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Aleksey Yeschenko
>Priority: Minor
>  Labels: docs-impacting
> Fix For: 2.1.x
>
>
> SSTableSimpleWriter doesn't make much sense in a post-CASSANDRA-8099 world, and 
> will be removed in 3.0.
> To do that we should deprecate it in 2.1 first, however.
> Same goes for both SSTableImport and SSTableExport - we should deprecate them 
> in 2.1.4, and eventually replace with CASSANDRA-7464 in 3.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12014) IndexSummary > 2G causes an assertion error

2016-07-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389938#comment-15389938
 ] 

Jonathan Ellis commented on CASSANDRA-12014:


Still an issue on 3.0.x.

> IndexSummary > 2G causes an assertion error
> ---
>
> Key: CASSANDRA-12014
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12014
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Brandon Williams
>Priority: Minor
> Fix For: 3.0.x, 3.x
>
>
> {noformat}
> ERROR [CompactionExecutor:1546280] 2016-06-01 13:21:00,444  
> CassandraDaemon.java:229 - Exception in thread 
> Thread[CompactionExecutor:1546280,1,main]
> java.lang.AssertionError: null
> at 
> org.apache.cassandra.io.sstable.IndexSummaryBuilder.maybeAddEntry(IndexSummaryBuilder.java:171)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.append(SSTableWriter.java:634)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.io.sstable.SSTableWriter.afterAppend(SSTableWriter.java:179)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:205) 
> ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:126)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:197)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:263)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_51]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_51]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_51]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_51]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
> {noformat}
> I believe this can be fixed by raising the min_index_interval, but we should 
> have a better method of coping with this than throwing the AE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12014) IndexSummary > 2G causes an assertion error

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12014:
---
Priority: Minor  (was: Major)

> IndexSummary > 2G causes an assertion error
> ---
>
> Key: CASSANDRA-12014
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12014
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Brandon Williams
>Priority: Minor
> Fix For: 3.0.x, 3.x
>
>
> {noformat}
> ERROR [CompactionExecutor:1546280] 2016-06-01 13:21:00,444  
> CassandraDaemon.java:229 - Exception in thread 
> Thread[CompactionExecutor:1546280,1,main]
> java.lang.AssertionError: null
> at 
> org.apache.cassandra.io.sstable.IndexSummaryBuilder.maybeAddEntry(IndexSummaryBuilder.java:171)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.append(SSTableWriter.java:634)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.io.sstable.SSTableWriter.afterAppend(SSTableWriter.java:179)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:205) 
> ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:126)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:197)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:263)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_51]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_51]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_51]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_51]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
> {noformat}
> I believe this can be fixed by raising the min_index_interval, but we should 
> have a better method of coping with this than throwing the AE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12014) IndexSummary > 2G causes an assertion error

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-12014:
---
Fix Version/s: (was: 2.1.x)
   3.x
   3.0.x

> IndexSummary > 2G causes an assertion error
> ---
>
> Key: CASSANDRA-12014
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12014
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Brandon Williams
> Fix For: 3.0.x, 3.x
>
>
> {noformat}
> ERROR [CompactionExecutor:1546280] 2016-06-01 13:21:00,444  
> CassandraDaemon.java:229 - Exception in thread 
> Thread[CompactionExecutor:1546280,1,main]
> java.lang.AssertionError: null
> at 
> org.apache.cassandra.io.sstable.IndexSummaryBuilder.maybeAddEntry(IndexSummaryBuilder.java:171)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.append(SSTableWriter.java:634)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.io.sstable.SSTableWriter.afterAppend(SSTableWriter.java:179)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:205) 
> ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:126)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:197)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:263)
>  ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_51]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_51]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_51]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_51]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
> {noformat}
> I believe this can be fixed by raising the min_index_interval, but we should 
> have a better method of coping with this than throwing the AE.
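> For example (the table name and value are placeholders, not from this report), the interval can be raised per table with:
> {code}
> ALTER TABLE myks.mytable WITH min_index_interval = 256;
> {code}
> A larger min_index_interval means fewer index summary entries are sampled per sstable, which keeps the summary below the 2G limit at the cost of slightly more index reads per lookup.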



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9639) size_estimates is inaccurate in multi-dc clusters

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9639:
--
Fix Version/s: (was: 2.2.x)
   (was: 2.1.x)
   3.0.x

> size_estimates is inaccurate in multi-dc clusters
> 
>
> Key: CASSANDRA-9639
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9639
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sebastian Estevez
>Priority: Minor
> Fix For: 3.0.x
>
>
> CASSANDRA-7688 introduced size_estimates to replace the thrift 
> describe_splits_ex command.
> Users have reported seeing estimates that are widely off in multi-dc clusters.
> system.size_estimates shows the wrong range_start / range_end
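> For illustration (not part of the original report), the stored estimates can be inspected per node with a query like the following; the keyspace and table names are just examples:
> {code}
> SELECT range_start, range_end, partitions_count, mean_partition_size
> FROM system.size_estimates
> WHERE keyspace_name = 'myks' AND table_name = 'mytable';
> {code}
> Comparing this output across nodes in different DCs is one way to see the skew described above.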



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11625) CFS.CANONICAL_SSTABLES adds compacting sstables without checking if they are still live

2016-07-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389933#comment-15389933
 ] 

Jonathan Ellis commented on CASSANDRA-11625:


Is this still a problem in 3.0.x?  If not I think we should wontfix.

> CFS.CANONICAL_SSTABLES adds compacting sstables without checking if they are 
> still live
> ---
>
> Key: CASSANDRA-11625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11625
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
> Fix For: 2.1.x, 2.2.x
>
>
> In 2.1 and 2.2 we blindly add all compacting sstables to 
> ColumnFamilyStore.CANONICAL_SSTABLES.
> This could cause issues as we unmark compacting after removing sstables from 
> the tracker and compaction strategies. For example, when creating scanners 
> for validation with LCS we might get overlap within a level, as both the old 
> sstables and the new ones could be in CANONICAL_SSTABLES.
> What we need to do is get the *version* of the sstable from the compacting 
> set, as it holds the original sstable without moved starts etc. (that is what 
> we do in 3.0+)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9639) size_estimates is inaccurate in multi-dc clusters

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9639:
--
Priority: Minor  (was: Major)

> size_estimates is inaccurate in multi-dc clusters
> 
>
> Key: CASSANDRA-9639
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9639
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sebastian Estevez
>Priority: Minor
> Fix For: 3.0.x
>
>
> CASSANDRA-7688 introduced size_estimates to replace the thrift 
> describe_splits_ex command.
> Users have reported seeing estimates that are widely off in multi-dc clusters.
> system.size_estimates shows the wrong range_start / range_end



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10983) Metrics for tracking offending queries

2016-07-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389915#comment-15389915
 ] 

Jonathan Ellis commented on CASSANDRA-10983:


Perhaps this should be added to Tracing instead of Metrics?

> Metrics for tracking offending queries
> --
>
> Key: CASSANDRA-10983
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10983
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sharvanath Pathak
>  Labels: github-import
> Fix For: 2.1.x
>
>
> I have seen big GC pauses leading to nodes being marked DOWN in our cluster. 
> The most common cause is someone adding a large range scan, and it is 
> difficult to pin-point the specific query. I have added a mechanism to 
> account for the memory allocation of a specific query. To allow aggregates 
> over a period of time I added a metric as well. Attached is the diff.
> I was wondering if something like this would be interesting for a more 
> general audience. There are some things which need to be fixed for a proper 
> release, for instance cleaning up existing metrics on server restart. 
> However, I wanted to check first whether something like this would be useful for others.
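> For illustration only (the actual change is in the attached diff, which is not shown here), a per-query allocation metric could be exposed through the Dropwizard/Codahale metrics library that Cassandra already ships; the class and metric names below are assumptions:
> {code:java}
> import com.codahale.metrics.Histogram;
> import com.codahale.metrics.MetricRegistry;
> 
> public class QueryAllocationMetrics
> {
>     private static final MetricRegistry registry = new MetricRegistry();
> 
>     // Distribution of bytes allocated while serving a single query,
>     // so offending queries show up in the high percentiles.
>     private static final Histogram allocatedBytes =
>             registry.histogram(MetricRegistry.name("QueryAllocation", "allocatedBytes"));
> 
>     public static void record(long bytes)
>     {
>         allocatedBytes.update(bytes);
>     }
> }
> {code}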



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-10370) upgrade_tests.cql_tests:TestCQL.static_columns_with_distinct_test fails in 2.1 nodes

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-10370.

   Resolution: Cannot Reproduce
Fix Version/s: (was: 2.1.x)

I think this report is obsolete now.

> upgrade_tests.cql_tests:TestCQL.static_columns_with_distinct_test fails in 
> 2.1 nodes
> 
>
> Key: CASSANDRA-10370
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10370
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Blake Eggleston
>
> When running the dtest 
> {{upgrade_tests.cql_tests:TestCQL.static_columns_with_distinct_test}} against 
> 2x2.1 nodes, the test fails due to a row being returned multiple times. This 
> isn't reproducible between 3.0 nodes, but is between 2.1<->3.0 nodes, so it's 
> safe to say it's not a 3.0 bug. To reproduce, run:
> {noformat}UPGRADE_MODE=none nosetests 
> upgrade_tests/cql_tests.py:TestCQL.static_columns_with_distinct_test{noformat}
>  with the environment variable {{OLD_CASSANDRA_DIR}} set to a repo with 
> cassandra-2.1 checked out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-9434) If a node loses schema_columns SSTables it could delete all secondary indexes from the schema

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-9434.
---
   Resolution: Won't Fix
Fix Version/s: (was: 2.2.x)
   (was: 2.1.x)

With 2.1 approaching EOL as well, closing as wontfix.

> If a node loses schema_columns SSTables it could delete all secondary indexes 
> from the schema
> -
>
> Key: CASSANDRA-9434
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9434
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Richard Low
>
> It is possible that a single bad node can delete all secondary indexes if it 
> restarts and cannot read its schema_columns SSTables. Here's a reproduction:
> * Create a 2 node cluster (we saw it on 2.0.11)
> * Create the schema:
> {code}
> create keyspace myks with replication = {'class':'SimpleStrategy', 
> 'replication_factor':1};
> use myks;
> create table mytable (a text, b text, c text, PRIMARY KEY (a, b) );
> create index myindex on mytable(b);
> {code}
> NB: the index must be on a clustering column to repro
> * Kill one node
> * Wipe its commitlog and system/schema_columns sstables.
> * Start it again
> * Run the following on this node and you'll see the index is null:
> {code}
> select index_name from system.schema_columns where keyspace_name = 'myks' and 
> columnfamily_name = 'mytable' and column_name = 'b';
> {code}
> * Run 'describe schema' on the other node. Sometimes it will not show the 
> index right away; you might need to bounce the node for it to disappear.
> I think the culprit is SystemKeyspace.copyAllAliasesToColumnsProper.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-10142) Protocol v1 and v2 don't deal with frozen type correctly

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-10142.

   Resolution: Won't Fix
Fix Version/s: (was: 2.2.x)
   (was: 2.1.x)

With 2.1 nearing EOL it's time to wontfix this.

> Protocol v1 and v2 don't deal with frozen type correctly
> 
>
> Key: CASSANDRA-10142
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10142
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Jim Witschey
>  Labels: client-impacting
> Attachments: repro.sh
>
>
> When trying to connect with the Python driver to a cluster running trunk, the 
> connection fails with a {{NoHostAvailable}} exception if the protocol version 
> is set to 1 or 2.
> The attached script reproduces the error and demonstrates that the connection 
> works if the protocol version is 3 or 4. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10195) TWCS experiments and improvement proposals

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-10195:
---
Priority: Minor  (was: Major)

> TWCS experiments and improvement proposals
> --
>
> Key: CASSANDRA-10195
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10195
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Antti Nissinen
>Priority: Minor
>  Labels: dtcs
> Attachments: 20150814_1027_compaction_hierarchy.txt, 
> node0_20150727_1250_time_graph.txt, node0_20150810_1017_time_graph.txt, 
> node0_20150812_1531_time_graph.txt, node0_20150813_0835_time_graph.txt, 
> node0_20150814_1054_time_graph.txt, node1_20150727_1250_time_graph.txt, 
> node1_20150810_1017_time_graph.txt, node1_20150812_1531_time_graph.txt, 
> node1_20150813_0835_time_graph.txt, node1_20150814_1054_time_graph.txt, 
> node2_20150727_1250_time_graph.txt, node2_20150810_1017_time_graph.txt, 
> node2_20150812_1531_time_graph.txt, node2_20150813_0835_time_graph.txt, 
> node2_20150814_1054_time_graph.txt, sstable_count_figure1.png, 
> sstable_count_figure2.png
>
>
> This JIRA item describes experiments with DateTieredCompactionStrategy (DTCS) 
> and TimeWindowCompactionStrategy (TWCS) and proposes modifications to TWCS. 
> In a test system several crashes were caused intentionally (and 
> unintentionally) and repair operations were executed, leading to a flood of 
> small SSTables. The target was to be able to compact those files and release 
> disk space reserved by duplicate data. The setup is as follows:
> - Three nodes
> - DateTieredCompactionStrategy, max_sstable_age_days = 5
> - Cassandra 2.1.2
> The setup and data format are documented in detail in 
> https://issues.apache.org/jira/browse/CASSANDRA-9644.
> The test was started by dumping a few days' worth of data to the database for 
> 100 000 signals. Time graphs of SSTables from different nodes indicate that 
> DTCS has been working as expected and SSTables are nicely ordered time-wise.
> See files:
> node0_20150727_1250_time_graph.txt
> node1_20150727_1250_time_graph.txt
> node2_20150727_1250_time_graph.txt
> {noformat}
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load       Tokens  Owns  Host ID                               Rack
> UN  139.66.43.170  188.87 GB  256     ?     dfc29863-c935-4909-9d7f-c59a47eda03d  rack1
> UN  139.66.43.169  198.37 GB  256     ?     12e7628b-7f05-48f6-b7e4-35a82010021a  rack1
> UN  139.66.43.168  191.88 GB  256     ?     26088392-f803-4d59-9073-c75f857fb332  rack1
> {noformat}
> All nodes crashed due to a power failure (known beforehand) and repair 
> operations were started for each node, one at a time. Below is the behavior 
> of the SSTable count on the different nodes. New data was dumped 
> simultaneously with the repair operations.
> SEE FIGURE: sstable_count_figure1.png
> Vertical lines indicate the following events:
> 1) Cluster was down due to the power shutdown and was restarted. At the first 
> vertical line a repair operation (nodetool repair -pr) was started for the 
> first node
> 2) Repair for the second node was started after the first node 
> was successfully repaired
> 3) Repair for the third node was started
> 4) The third repair operation finished
> 5) One of the nodes crashed (unknown reason at OS level)
> 6) A repair operation (nodetool repair -pr) was started for the first node
> 7) The repair operation for the second node was started
> 8) The repair operation for the third node was started
> 9) Repair operations finished
> These repair operations led to a huge number of small SSTables covering 
> the whole time span of the data. The compaction horizon of DTCS was limited 
> to 5 days (max_sstable_age_days) due to the size of the SSTables on disk. 
> Therefore, the small SSTables won't be compacted. Below are the time graphs of 
> the SSTables after the second round of repairs.
> {noformat}
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address       Load       Tokens  Owns  Host ID                               Rack
> UN  xx.xx.xx.170  663.61 GB  256     ?     dfc29863-c935-4909-9d7f-c59a47eda03d  rack1
> UN  xx.xx.xx.169  763.52 GB  256     ?     12e7628b-7f05-48f6-b7e4-35a82010021a  rack1
> UN  xx.xx.xx.168  651.59 GB  256     ?     26088392-f803-4d59-9073-c75f857fb332  rack1
> {noformat}
> See files:
> node0_20150810_1017_time_graph.txt
> node1_20150810_1017_time_graph.txt
> node2_20150810_1017_time_graph.txt
> To get rid of the SSTables, the TimeWindowCompactionStrategy was taken into 
> use. The window size was set to 5 days and the Cassandra version was updated 
> to 2.1.8. The figure below shows the behavior of the SSTable count. TWCS was 
> taken into use on 10.8.2015 at 13:10. The maximum amount of files to be compacted in one task 
> 

[jira] [Updated] (CASSANDRA-10195) TWCS experiments and improvement proposals

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-10195:
---
Fix Version/s: (was: 2.2.x)
   (was: 2.1.x)

> TWCS experiments and improvement proposals
> --
>
> Key: CASSANDRA-10195
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10195
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Antti Nissinen
>  Labels: dtcs
> Attachments: 20150814_1027_compaction_hierarchy.txt, 
> node0_20150727_1250_time_graph.txt, node0_20150810_1017_time_graph.txt, 
> node0_20150812_1531_time_graph.txt, node0_20150813_0835_time_graph.txt, 
> node0_20150814_1054_time_graph.txt, node1_20150727_1250_time_graph.txt, 
> node1_20150810_1017_time_graph.txt, node1_20150812_1531_time_graph.txt, 
> node1_20150813_0835_time_graph.txt, node1_20150814_1054_time_graph.txt, 
> node2_20150727_1250_time_graph.txt, node2_20150810_1017_time_graph.txt, 
> node2_20150812_1531_time_graph.txt, node2_20150813_0835_time_graph.txt, 
> node2_20150814_1054_time_graph.txt, sstable_count_figure1.png, 
> sstable_count_figure2.png
>
>
> This JIRA item describes experiments with DateTieredCompactionStrategy (DTCS) 
> and TimeWindowCompactionStrategy (TWCS) and proposes modifications to TWCS. 
> In a test system several crashes were caused intentionally (and 
> unintentionally) and repair operations were executed, leading to a flood of 
> small SSTables. The target was to be able to compact those files and release 
> disk space reserved by duplicate data. The setup is as follows:
> - Three nodes
> - DateTieredCompactionStrategy, max_sstable_age_days = 5
> - Cassandra 2.1.2
> The setup and data format are documented in detail in 
> https://issues.apache.org/jira/browse/CASSANDRA-9644.
> The test was started by dumping a few days' worth of data to the database for 
> 100 000 signals. Time graphs of SSTables from different nodes indicate that 
> DTCS has been working as expected and SSTables are nicely ordered time-wise.
> See files:
> node0_20150727_1250_time_graph.txt
> node1_20150727_1250_time_graph.txt
> node2_20150727_1250_time_graph.txt
> {noformat}
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load       Tokens  Owns  Host ID                               Rack
> UN  139.66.43.170  188.87 GB  256     ?     dfc29863-c935-4909-9d7f-c59a47eda03d  rack1
> UN  139.66.43.169  198.37 GB  256     ?     12e7628b-7f05-48f6-b7e4-35a82010021a  rack1
> UN  139.66.43.168  191.88 GB  256     ?     26088392-f803-4d59-9073-c75f857fb332  rack1
> {noformat}
> All nodes crashed due to a power failure (known beforehand) and repair 
> operations were started for each node, one at a time. Below is the behavior 
> of the SSTable count on the different nodes. New data was dumped 
> simultaneously with the repair operations.
> SEE FIGURE: sstable_count_figure1.png
> Vertical lines indicate the following events:
> 1) Cluster was down due to the power shutdown and was restarted. At the first 
> vertical line a repair operation (nodetool repair -pr) was started for the 
> first node
> 2) Repair for the second node was started after the first node 
> was successfully repaired
> 3) Repair for the third node was started
> 4) The third repair operation finished
> 5) One of the nodes crashed (unknown reason at OS level)
> 6) A repair operation (nodetool repair -pr) was started for the first node
> 7) The repair operation for the second node was started
> 8) The repair operation for the third node was started
> 9) Repair operations finished
> These repair operations led to a huge number of small SSTables covering 
> the whole time span of the data. The compaction horizon of DTCS was limited 
> to 5 days (max_sstable_age_days) due to the size of the SSTables on disk. 
> Therefore, the small SSTables won't be compacted. Below are the time graphs of 
> the SSTables after the second round of repairs.
> {noformat}
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address       Load       Tokens  Owns  Host ID                               Rack
> UN  xx.xx.xx.170  663.61 GB  256     ?     dfc29863-c935-4909-9d7f-c59a47eda03d  rack1
> UN  xx.xx.xx.169  763.52 GB  256     ?     12e7628b-7f05-48f6-b7e4-35a82010021a  rack1
> UN  xx.xx.xx.168  651.59 GB  256     ?     26088392-f803-4d59-9073-c75f857fb332  rack1
> {noformat}
> See files:
> node0_20150810_1017_time_graph.txt
> node1_20150810_1017_time_graph.txt
> node2_20150810_1017_time_graph.txt
> To get rid of the SSTables, the TimeWindowCompactionStrategy was taken into 
> use. The window size was set to 5 days and the Cassandra version was updated 
> to 2.1.8. The figure below shows the behavior of the SSTable count. TWCS was 
> taken into use on 10.8.2015 at 13:10. The maximum amount of files to be compacted in one 

[jira] [Resolved] (CASSANDRA-9273) Avoid calculating maxPurgableTimestamp for partitions containing non-expired TTL cells

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-9273.
---
   Resolution: Duplicate
Fix Version/s: (was: 2.1.x)

I think this is a duplicate of CASSANDRA-11834.

> Avoid calculating maxPurgableTimestamp for partitions containing non-expired 
> TTL cells
> --
>
> Key: CASSANDRA-9273
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9273
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
>
> It seems we still calculate maxPurgableTimestamp for partitions containing 
> TTL cells that are known not to be expired; we should try to avoid that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9384) Update jBCrypt dependency to version 0.4

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9384:
--
Assignee: (was: Marko Denda)

> Update jBCrypt dependency to version 0.4
> 
>
> Key: CASSANDRA-9384
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9384
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sam Tunnicliffe
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
>
> https://bugzilla.mindrot.org/show_bug.cgi?id=2097
> Although the bug tracker lists it as NEW/OPEN, the release notes for 0.4 
> indicate that this is now fixed, so we should update.
> Thanks to [~Bereng] for identifying the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-9323) Bulk loading is slow

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-9323.
---
   Resolution: Won't Fix
Fix Version/s: (was: 2.1.x)

The preferred way to bulk load is now COPY; see CASSANDRA-11053 and linked 
tickets.

> Bulk loading is slow
> 
>
> Key: CASSANDRA-9323
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9323
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Pierre N.
> Attachments: App.java
>
>
> When I bulk upload an sstable created with CQLSSTableWriter, it's very slow. 
> I tested on a fresh cassandra node (no keyspaces or tables) with good 
> hardware (8x2.8GHz, 32G RAM), but with a classic hard disk (performance won't 
> be improved with an SSD in this case, I think). 
> When I upload an sstable from a different server using sstableloader I get an 
> average of 3 MB/sec; in the attached example I managed to get 5 MB/sec, which 
> is still slow.
> During the streaming process I noticed that one core of the server is at full 
> CPU, so I think the operation is CPU bound server side. I quickly attached a 
> sampling profiler to the cassandra instance and got the following output: 
> https://i.imgur.com/IfLc2Ip.png
> So I think (but I may be wrong because it's inaccurate sampling) that during 
> streaming the table is deserialized and reserialized to another sstable, and 
> it's this deserialize/serialize process which is taking a big amount of 
> CPU, slowing down the insert speed.
> Can someone confirm the bulk load is slow? I also tested on my computer and 
> barely reached 1 MB/sec. 
> I don't understand the point of totally deserializing the table I just 
> built using the CQLSSTableWriter (because it's already a long process to 
> build and sort the table); couldn't it just copy the table from offset X to 
> offset Y (using index information, for example) without 
> deserializing/reserializing it?
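> For context (this sketch is not from the original report; the schema, path and statements are placeholder examples), a typical offline-writer setup looks roughly like this:
> {code:java}
> import java.io.File;
> import org.apache.cassandra.io.sstable.CQLSSTableWriter;
> 
> public class BulkWriteSketch
> {
>     public static void main(String[] args) throws Exception
>     {
>         // Minimal sketch: write rows into local sstable files, which can then
>         // be streamed to the cluster with sstableloader.
>         String schema = "CREATE TABLE myks.mytable (k text PRIMARY KEY, v text)";
>         String insert = "INSERT INTO myks.mytable (k, v) VALUES (?, ?)";
> 
>         CQLSSTableWriter writer = CQLSSTableWriter.builder()
>                 .inDirectory(new File("/tmp/myks/mytable"))
>                 .forTable(schema)
>                 .using(insert)
>                 .build();
> 
>         writer.addRow("key1", "value1");
>         writer.close();
>     }
> }
> {code}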



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11053) COPY FROM on large datasets: fix progress report and optimize performance part 4

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-11053:
---
Summary: COPY FROM on large datasets: fix progress report and optimize 
performance part 4  (was: COPY FROM on large datasets: fix progress report and 
debug performance)

> COPY FROM on large datasets: fix progress report and optimize performance 
> part 4
> 
>
> Key: CASSANDRA-11053
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11053
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Stefania
>Assignee: Stefania
>  Labels: doc-impacting
> Fix For: 2.1.14, 2.2.6, 3.0.5, 3.5
>
> Attachments: bisect_test.py, copy_from_large_benchmark.txt, 
> copy_from_large_benchmark_2.txt, parent_profile.txt, parent_profile_2.txt, 
> worker_profiles.txt, worker_profiles_2.txt
>
>
> h5. Description
> Running COPY FROM on a large dataset (20G divided into 20M records) revealed 
> two issues:
> * The progress report is incorrect: it is very slow until almost the end of 
> the test, at which point it catches up extremely quickly.
> * The performance in rows per second is similar to running smaller tests with 
> a smaller cluster locally (approx 35,000 rows per second). As a comparison, 
> cassandra-stress manages 50,000 rows per second under the same set-up, 
> i.e. roughly 1.5 times faster. 
> See attached file _copy_from_large_benchmark.txt_ for the benchmark details.
> h5. Doc-impacting changes to COPY FROM options
> * A new option was added: PREPAREDSTATEMENTS - it indicates if prepared 
> statements should be used; it defaults to true.
> * The default value of CHUNKSIZE changed from 1000 to 5000.
> * The default value of MINBATCHSIZE changed from 2 to 10.
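> As an illustration of the options above (the file name and table are placeholders, and the values shown are simply the new defaults made explicit):
> {noformat}
> cqlsh> COPY myks.mytable (k, v) FROM 'data.csv'
>        WITH PREPAREDSTATEMENTS = true AND CHUNKSIZE = 5000 AND MINBATCHSIZE = 10;
> {noformat}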



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-9659) Better messages/protection against bad paging state

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-9659.
---
   Resolution: Later
Fix Version/s: (was: 2.1.x)

> Better messages/protection against bad paging state
> ---
>
> Key: CASSANDRA-9659
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9659
> Project: Cassandra
>  Issue Type: Improvement
> Environment: CentOS 6.5
> Cassandra 2.1.3
>Reporter: Dave Decicco
> Attachments: CqlProxyInvocationHandler sets PagingState AFTER 1st 
> company call for results.PNG, Multiple companyid IN clause with secondary 
> index failure.PNG, rawmessage column family.PNG, trackingdevice column 
> family.PNG
>
>
> An issue exists where a java.lang.AssertionError occurs for a select number 
> of read queries from Cassandra within our application.
> It was suggested that a ticket be created to see if the error below is the 
> same as CASSANDRA-8949 which was fixed in version 2.1.5.
> Here is a portion of the Cassandra log file where the exception occurs:
> {code}
> INFO  [MemtableFlushWriter:50153] 2015-06-23 13:11:17,517 Memtable.java:385 - 
> Completed flushing; nothing needed to be retained.  Commitlog position was 
> ReplayPosition(segmentId=1425054853780, position=8886361)
> ERROR [SharedPool-Worker-1] 2015-06-23 13:11:29,047 Message.java:538 - 
> Unexpected exception during request; channel = [id: 0x8f1ca59e, 
> /10.30.43.68:33717 => /10.30.43.146:9042]
> java.lang.AssertionError: 
> [DecoratedKey(5747358200379796162, 
> 6462346538352d653235382d343130352d616131612d346230396635353965666364),DecoratedKey(3303996443194009861,
>  34623632646562322d626234332d346661642d613263312d356334613233633037353932)]
> at org.apache.cassandra.dht.Bounds.(Bounds.java:41) 
> ~[apache-cassandra-2.1.3.jar:2.1.3]
> at org.apache.cassandra.dht.Bounds.(Bounds.java:34) 
> ~[apache-cassandra-2.1.3.jar:2.1.3]
> at 
> org.apache.cassandra.service.pager.RangeSliceQueryPager.makeIncludingKeyBounds(RangeSliceQueryPager.java:123)
>  ~[apache-cassandra-2.1.3.jar:2.1.3]
> at 
> org.apache.cassandra.service.pager.RangeSliceQueryPager.queryNextPage(RangeSliceQueryPager.java:74)
>  ~[apache-cassandra-2.1.3.jar:2.1.3]
> at 
> org.apache.cassandra.service.pager.AbstractQueryPager.fetchPage(AbstractQueryPager.java:87)
>  ~[apache-cassandra-2.1.3.jar:2.1.3]
> at 
> org.apache.cassandra.service.pager.RangeSliceQueryPager.fetchPage(RangeSliceQueryPager.java:37)
>  ~[apache-cassandra-2.1.3.jar:2.1.3]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:219)
>  ~[apache-cassandra-2.1.3.jar:2.1.3]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:62)
>  ~[apache-cassandra-2.1.3.jar:2.1.3]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:238)
>  ~[apache-cassandra-2.1.3.jar:2.1.3]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:493)
>  ~[apache-cassandra-2.1.3.jar:2.1.3]
> at 
> org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:134)
>  ~[apache-cassandra-2.1.3.jar:2.1.3]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
>  [apache-cassandra-2.1.3.jar:2.1.3]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
>  [apache-cassandra-2.1.3.jar:2.1.3]
> at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at java.util.concurrent.Executors$RunnableAdapter.call(Unknown 
> Source) [na:1.7.0_76]
> at 
> org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
>  [apache-cassandra-2.1.3.jar:2.1.3]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [apache-cassandra-2.1.3.jar:2.1.3]
> at java.lang.Thread.run(Unknown Source) [na:1.7.0_76]
> INFO  [BatchlogTasks:1] 2015-06-23 13:12:17,521 ColumnFamilyStore.java:877 - 
> Enqueuing flush of batchlog: 27641 (0%) on-heap, 0 (0%) off-heap
> INFO  [MemtableFlushWriter:50154] 2015-06-23 

[jira] [Updated] (CASSANDRA-9652) Nodetool cleanup does not work for nodes taken out of replication

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9652:
--
Fix Version/s: (was: 2.1.x)
   3.x

> Nodetool cleanup does not work for nodes taken out of replication
> -
>
> Key: CASSANDRA-9652
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9652
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Erick Ramirez
> Fix For: 3.x
>
>
> After taking a node (DC) out of replication, running a cleanup does not get 
> rid of the data on the node. The SSTables remain on disk and no data is 
> cleared out.
> The following entry is recorded in {{system.log}}:
> {noformat}
>  INFO [CompactionExecutor:8] 2015-06-25 12:33:01,417 CompactionManager.java 
> (line 527) Cleanup cannot run before a node has joined the ring
> {noformat}
> *STEPS TO REPRODUCE*
> # Build a (C* 2.0.10) cluster with multiple DCs.
> # Run {{cassandra-stress -n1}}  to create schema.
> # Alter schema to replicate to all DCs.
> {noformat}
> cqlsh> ALTER KEYSPACE "Keyspace1" WITH replication = { 'class' : 
> 'NetworkTopologyStrategy', 'DC1' : 2, 'DC2' : 2, 'DC3' : 1 } ;
> {noformat}
> # Run {{cassandra-stress -n10}} to generate data.
> # Alter schema to stop replication to {{DC3}}.
> # On node in {{DC3}}, run {{nodetool cleanup}}.
> *WORKAROUND*
> # Stop Cassandra.
> # Manually delete the SSTables on disk.
> # Start Cassandra.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9652) Nodetool cleanup does not work for nodes taken out of replication

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9652:
--
Priority: Minor  (was: Major)

> Nodetool cleanup does not work for nodes taken out of replication
> -
>
> Key: CASSANDRA-9652
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9652
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Erick Ramirez
>Priority: Minor
> Fix For: 3.x
>
>
> After taking a node (DC) out of replication, running a cleanup does not get 
> rid of the data on the node. The SSTables remain on disk and no data is 
> cleared out.
> The following entry is recorded in {{system.log}}:
> {noformat}
>  INFO [CompactionExecutor:8] 2015-06-25 12:33:01,417 CompactionManager.java 
> (line 527) Cleanup cannot run before a node has joined the ring
> {noformat}
> *STEPS TO REPRODUCE*
> # Build a (C* 2.0.10) cluster with multiple DCs.
> # Run {{cassandra-stress -n1}}  to create schema.
> # Alter schema to replicate to all DCs.
> {noformat}
> cqlsh> ALTER KEYSPACE "Keyspace1" WITH replication = { 'class' : 
> 'NetworkTopologyStrategy', 'DC1' : 2, 'DC2' : 2, 'DC3' : 1 } ;
> {noformat}
> # Run {{cassandra-stress -n10}} to generate data.
> # Alter schema to stop replication to {{DC3}}.
> # On node in {{DC3}}, run {{nodetool cleanup}}.
> *WORKAROUND*
> # Stop Cassandra.
> # Manually delete the SSTables on disk.
> # Start Cassandra.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-9702) Repair running really slow

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-9702.
---
   Resolution: Cannot Reproduce
Fix Version/s: (was: 2.1.x)

> Repair running really slow
> --
>
> Key: CASSANDRA-9702
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9702
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
> Environment: C* 2.1.7, Debian Wheezy
>Reporter: mlowicki
> Attachments: db1.system.log
>
>
> We've been using 2.1.x since the very beginning and we've always had problems 
> with failing or slow repairs. In one data center we haven't been able to finish 
> a repair for many weeks (partially because of CASSANDRA-9681, as we needed to 
> reboot nodes periodically).
> I launched it this morning (12 hours ago now) and am monitoring it with 
> https://github.com/spotify/cassandra-opstools/blob/master/bin/spcassandra-repairstats.
>  For the first hour it progressed to 9.43%, but then it took ~10 hours to 
> reach 9.44%. I very rarely see logs related to repair (every 15-20 minutes, but 
> sometimes nothing new for an hour).
> Repair launched with:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc {keyspace}
> {code}
> Attached log file from today.
> We have ~4.1TB of data across 12 nodes with RF set to 3 (2 DCs with 6 nodes each).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-9729) CQLSH exception - OverflowError: normalized days too large to fit in a C int

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-9729.
---
   Resolution: Invalid
Fix Version/s: (was: 2.1.x)

> CQLSH exception - OverflowError: normalized days too large to fit in a C int
> 
>
> Key: CASSANDRA-9729
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9729
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
> Environment: OSX 10.10.2
>Reporter: Chandran Anjur Narasimhan
>  Labels: cqlsh
>
> Running a select command using CQLSH 2.1.5 or 2.1.7 throws an exception. This 
> works fine in version 2.0.14.
> Environment:
> 
> JAVA - 1.8
> Python - 2.7.6
> Cassandra Server - 2.1.7
> CQLSH - 5.0.1
> Logs:
> ==
> CQLSH - cassandra 2.0.14 - working with no issues
> -
> {noformat}
> NCHAN-M-D0LZ:apache nchan$ cd apache-cassandra-2.0.14/
> NCHAN-M-D0LZ:apache-cassandra-2.0.14 nchan$ bin/cqlsh
> Connected to CCC Multi-Region Cassandra Cluster at :9160.
> [cqlsh 4.1.1 | Cassandra 2.1.7 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
> Use HELP for help.
> cqlsh> use ccc;
> cqlsh:ccc> select count(*) from task_result where 
> submissionid='40f89a3d1f4711e5ac2b005056bb0e8b';
>  count
> ---
> 25
> (1 rows)
> cqlsh:ccc> select * from task_result where 
> submissionid='40f89a3d1f4711e5ac2b005056bb0e8b';
> < i get all the 25 values>
> {noformat}
> CQLSH - cassandra 2.1.5  - python exception
> -
> {noformat}
> NCHAN-M-D0LZ:apache-cassandra-2.1.5 nchan$ bin/cqlsh
> Connected to CCC Multi-Region Cassandra Cluster at :9042.
> [cqlsh 5.0.1 | Cassandra 2.1.7 | CQL spec 3.2.0 | Native protocol v3]
> Use HELP for help.
> cqlsh> use ccc;
> cqlsh:ccc> select count(*) from task_result where 
> submissionid='40f89a3d1f4711e5ac2b005056bb0e8b';
>  count
> ---
> 25
> (1 rows)
> cqlsh:ccc> select * from task_result where 
> submissionid='40f89a3d1f4711e5ac2b005056bb0e8b';
> Traceback (most recent call last):
>   File "bin/cqlsh", line 1001, in perform_simple_statement
> rows = self.session.execute(statement, trace=self.tracing_enabled)
>   File 
> "/Users/nchan/Programs/apache/apache-cassandra-2.1.5/bin/../lib/cassandra-driver-internal-only-2.5.0.zip/cassandra-driver-2.5.0/cassandra/cluster.py",
>  line 1404, in execute
> result = future.result(timeout)
>   File 
> "/Users/nchan/Programs/apache/apache-cassandra-2.1.5/bin/../lib/cassandra-driver-internal-only-2.5.0.zip/cassandra-driver-2.5.0/cassandra/cluster.py",
>  line 2974, in result
> raise self._final_exception
> OverflowError: normalized days too large to fit in a C int
> cqlsh:ccc> 
> {noformat}
> CQLSH - cassandra 2.1.7 - python exception
> -
> {noformat}
> NCHAN-M-D0LZ:apache-cassandra-2.1.7 nchan$ bin/cqlsh
> Connected to CCC Multi-Region Cassandra Cluster at 171.71.189.11:9042.
> [cqlsh 5.0.1 | Cassandra 2.1.7 | CQL spec 3.2.0 | Native protocol v3]
> Use HELP for help.
> cqlsh> use ccc;
> cqlsh:ccc> select count(*) from task_result where 
> submissionid='40f89a3d1f4711e5ac2b005056bb0e8b';
>  count
> ---
> 25
> (1 rows)
> cqlsh:ccc> select * from task_result where 
> submissionid='40f89a3d1f4711e5ac2b005056bb0e8b';
> Traceback (most recent call last):
>   File "bin/cqlsh", line 1041, in perform_simple_statement
> rows = self.session.execute(statement, trace=self.tracing_enabled)
>   File 
> "/Users/nchan/Programs/apache/apache-cassandra-2.1.7/bin/../lib/cassandra-driver-internal-only-2.5.1.zip/cassandra-driver-2.5.1/cassandra/cluster.py",
>  line 1405, in execute
> result = future.result(timeout)
>   File 
> "/Users/nchan/Programs/apache/apache-cassandra-2.1.7/bin/../lib/cassandra-driver-internal-only-2.5.1.zip/cassandra-driver-2.5.1/cassandra/cluster.py",
>  line 2976, in result
> raise self._final_exception
> OverflowError: normalized days too large to fit in a C int
> cqlsh:ccc> 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9772) Bound the number of concurrent range requests

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9772:
--
Fix Version/s: (was: 2.2.x)
   (was: 2.1.x)
   3.x

> Bound the number of concurrent range requests
> -
>
> Key: CASSANDRA-9772
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9772
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Tyler Hobbs
> Fix For: 3.x
>
>
> After CASSANDRA-1337, we will execute requests for many token ranges 
> concurrently based on our estimate of how many ranges will be required to 
> meet the requested LIMIT.  For queries with a lot of results this is 
> generally fine, because it will only take a few ranges to satisfy the limit.  
> However, for queries with very few results, this may result in the 
> coordinator concurrently requesting all token ranges.  On large vnode 
> clusters, this will be particularly problematic.
> Placing a simple bound on the number of concurrent requests is a good first 
> step.  Long-term, we should look into creating a new range command that 
> supports requesting multiple ranges.  This would eliminate the overhead of 
> serializing and handling hundreds of separate commands.
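> Purely as an illustration of the "simple bound" idea (not the actual patch; the limit and names are made up), the coordinator-side fan-out could be capped with a counting semaphore:
> {code:java}
> import java.util.List;
> import java.util.concurrent.Semaphore;
> 
> public class BoundedRangeFetcher
> {
>     // Hypothetical cap on how many token-range requests may be in flight at once.
>     private static final int MAX_CONCURRENT_RANGE_REQUESTS = 32;
>     private final Semaphore permits = new Semaphore(MAX_CONCURRENT_RANGE_REQUESTS);
> 
>     public void fetchRanges(List<Runnable> rangeRequests) throws InterruptedException
>     {
>         for (Runnable request : rangeRequests)
>         {
>             permits.acquire();                  // blocks once the bound is reached
>             new Thread(() -> {
>                 try { request.run(); }
>                 finally { permits.release(); }  // free a slot when the range completes
>             }).start();
>         }
>     }
> }
> {code}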



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9641) Occasional timeouts with blockFor=all for LOCAL_QUORUM query

2016-07-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389843#comment-15389843
 ] 

Jonathan Ellis commented on CASSANDRA-9641:
---

Were you able to track this down further, Richard?  What version are you on now?

> Occasional timeouts with blockFor=all for LOCAL_QUORUM query
> 
>
> Key: CASSANDRA-9641
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9641
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Richard Low
> Fix For: 2.1.x, 2.2.x, 3.0.x
>
>
> We have a keyspace using NetworkTopologyStrategy with options DC1:3, DC2:3. 
> Our tables have
> read_repair_chance = 0.0
> dclocal_read_repair_chance = 0.1
> speculative_retry = ’99.0PERCENTILE'
> and all reads are at LOCAL_QUORUM. On 2.0.11, we occasionally see this 
> timeout:
> Cassandra timeout during read query at consistency ALL (6 responses were 
> required but only 5 replica responded)
> (sometimes only 4 respond). The ALL is probably due to CASSANDRA-7947 if this 
> occurs during a digest mismatch, but what is interesting is it is expecting 6 
> responses i.e. blockFor is set to all replicas. I can’t see how this should 
> happen. From the code it should never set blockFor to more than 4 (although 4 
> is still wrong - I'll make a separate JIRA for that).
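> For reference, the settings described above correspond to something like the following (the keyspace and table names are placeholders):
> {code}
> CREATE KEYSPACE myks WITH replication =
>   { 'class' : 'NetworkTopologyStrategy', 'DC1' : 3, 'DC2' : 3 };
> ALTER TABLE myks.mytable WITH read_repair_chance = 0.0
>   AND dclocal_read_repair_chance = 0.1
>   AND speculative_retry = '99.0PERCENTILE';
> {code}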



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-9661) Endless compaction to a tiny, tombstoned SStable

2016-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-9661.
---
   Resolution: Duplicate
Fix Version/s: (was: 3.0.x)
   (was: 2.2.x)
   (was: 2.1.x)

Yes.

> Endless compaction to a tiny, tombstoned SStable
> 
>
> Key: CASSANDRA-9661
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9661
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: WeiFan
>  Labels: compaction, dtcs
>
> We deployed a 3-node cluster (with 2.1.5) which handled a stable write load 
> (about 2k wps) to a CF with DTCS, a default TTL of 43200s and 
> gc_grace of 21600s. The CF contained insert-only, complete time series 
> data. We found cassandra will occasionally keep writing logs like this:
> INFO  [CompactionExecutor:30551] 2015-06-26 18:10:06,195 
> CompactionTask.java:270 - Compacted 1 sstables to 
> [/home/cassandra/workdata/data/sen_vaas_test/nodestatus-f96c7c50155811e589f69752ac9b06c7/sen_vaas_test-nodestatus-ka-2516270,].
>   449 bytes to 449 (~100% of original) in 12ms = 0.035683MB/s.  4 total 
> partitions merged to 4.  Partition merge counts were {1:4, }
> INFO  [CompactionExecutor:30551] 2015-06-26 18:10:06,241 
> CompactionTask.java:140 - Compacting 
> [SSTableReader(path='/home/cassandra/workdata/data/sen_vaas_test/nodestatus-f96c7c50155811e589f69752ac9b06c7/sen_vaas_test-nodestatus-ka-2516270-Data.db')]
> INFO  [CompactionExecutor:30551] 2015-06-26 18:10:06,253 
> CompactionTask.java:270 - Compacted 1 sstables to 
> [/home/cassandra/workdata/data/sen_vaas_test/nodestatus-f96c7c50155811e589f69752ac9b06c7/sen_vaas_test-nodestatus-ka-2516271,].
>   449 bytes to 449 (~100% of original) in 12ms = 0.035683MB/s.  4 total 
> partitions merged to 4.  Partition merge counts were {1:4, }
> It seems that cassandra kept compacting a single SSTable, several 
> times per second, and this lasted for many hours. Tons of logs were thrown and one 
> CPU core was exhausted during this time. The endless compacting finally ended when 
> another compaction started with a group of SSTables (including the previous one). 
> All of our 3 nodes have been hit by this problem, but at different 
> times.
> We could not figure out how the problematic SSTable came up because the log 
> has wrapped around. 
> We have dumped the records in the SSTable and found it has the oldest data in 
> our CF (again, our data was time series), and all of the records in this 
> SSTable have been expired for more than 18 hours (12 hrs TTL + 6 hrs gc), so 
> they should be dropped. However, c* does nothing to this SSTable but compacts 
> it again and again, until more SSTables are out-dated enough to be considered 
> for compacting together with this one by DTCS.
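> For reference only (the keyspace, table and column names are placeholders), the table settings described above look roughly like:
> {code}
> CREATE TABLE myks.timeseries (
>   id text, ts timestamp, value double,
>   PRIMARY KEY (id, ts)
> ) WITH default_time_to_live = 43200
>   AND gc_grace_seconds = 21600
>   AND compaction = { 'class' : 'DateTieredCompactionStrategy' };
> {code}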



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

