[jira] [Commented] (CASSANDRA-10587) sstablemetadata NPE on cassandra 2.2

2016-02-16 Thread Roman Skvazh (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150067#comment-15150067
 ] 

Roman Skvazh commented on CASSANDRA-10587:
--

[~yukim], yeah, with absolute path it works! Thanks.
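
For anyone else who trips over this, a minimal illustration of the workaround (the paths below are placeholders for a real data directory, not taken from my cluster):
{noformat}
# 2.2.3, bare new-style filename (no keyspace/table in the name) -- fails with the NPE above
sstablemetadata ma-1-big-Data.db

# same sstable addressed by its absolute path -- works
sstablemetadata /var/lib/cassandra/data/my_keyspace/my_table-<table-id>/ma-1-big-Data.db
{noformat}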

> sstablemetadata NPE on cassandra 2.2
> 
>
> Key: CASSANDRA-10587
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10587
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Tiago Batista
>Assignee: Yuki Morishita
>Priority: Minor
> Fix For: 2.2.x, 3.x
>
>
> I have recently upgraded my cassandra cluster to 2.2, currently running 
> 2.2.3. After running the first repair, cassandra renames the sstables to the 
> new naming schema that does not contain the keyspace name.
>  This causes sstablemetadata to fail with the following stack trace:
> {noformat}
> Exception in thread "main" java.lang.NullPointerException
> at 
> org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:275)
> at 
> org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:172)
> at 
> org.apache.cassandra.tools.SSTableMetadataViewer.main(SSTableMetadataViewer.java:52)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Will Hayworth (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150049#comment-15150049
 ] 

Will Hayworth commented on CASSANDRA-11172:
---

debug.log? I can go hunt through my compressed files but this quickly pushes 
the other logs out of rotation. I've watched it happen. Thousands upon 
thousands of lines. And it's happened for a bunch of different nodes. The 
solution has been to kill them and restart, but that's only temporary.

I just checked--all 20 zipped system.logs are filled with 40,000+ lines like 
the ones in my sample file. I was going to upload some to you but they're just 
identical, including the earliest lines I have:
{noformat}
INFO  [CompactionExecutor:35] 2016-02-17 01:59:03,111 LeveledManifest.java:438 - Adding high-level (L0) BigTableReader(path='/var/lib/cassandra/data/segmentation/domain_events_by_event_domain_time-e81d74f0cd3a11e5aad8e7b84e29e52f/ma-3663-big-Data.db') to candidates
INFO  [CompactionExecutor:37] 2016-02-17 01:59:03,112 LeveledManifest.java:438 - Adding high-level (L3) BigTableReader(path='/var/lib/cassandra/data/segmentation/times_by_event_domain_user-e6112a30cd3a11e5ba896547d15a24f6/ma-4586-big-Data.db') to candidates
INFO  [CompactionExecutor:35] 2016-02-17 01:59:03,276 LeveledManifest.java:438 - Adding high-level (L0) BigTableReader(path='/var/lib/cassandra/data/segmentation/domain_events_by_event_domain_time-e81d74f0cd3a11e5aad8e7b84e29e52f/ma-3663-big-Data.db') to candidates
INFO  [CompactionExecutor:37] 2016-02-17 01:59:03,276 LeveledManifest.java:438 - Adding high-level (L3) BigTableReader(path='/var/lib/cassandra/data/segmentation/times_by_event_domain_user-e6112a30cd3a11e5ba896547d15a24f6/ma-4586-big-Data.db') to candidates
{noformat}

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
>Reporter: Jeff Ferland
>Assignee: Marcus Eriksson
> Attachments: beep.txt
>
>
> Observed that after a large repair on LCS, the system will sometimes 
> enter an infinite loop with vast amounts of log lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates"
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11124) Change default cqlsh encoding to utf-8

2016-02-16 Thread Lucas Amorim (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150041#comment-15150041
 ] 

Lucas Amorim commented on CASSANDRA-11124:
--

[~pauloricardomg] Can I work on it?

> Change default cqlsh encoding to utf-8
> --
>
> Key: CASSANDRA-11124
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11124
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Trivial
>  Labels: cqlsh
>
> Strange things can happen when utf-8 is not the default cqlsh encoding (see 
> CASSANDRA-11030). This ticket proposes changing the default cqlsh encoding to 
> utf-8.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11173) Add extension points in storage and streaming classes

2016-02-16 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-11173:

Reviewer: Marcus Eriksson

> Add extension points in storage and streaming classes
> -
>
> Key: CASSANDRA-11173
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11173
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
> Fix For: 3.4
>
>
> It would be useful if 3rd party classes could be notified when reads and 
> writes occur on a table+partition, when sstables are being streamed out/in, 
> and could also intercept the creation of row iterators from sstables. I have 
> a [v1 branch here|https://github.com/bdeggleston/cassandra/tree/hooksV1]. It 
> illustrates the extension points I'm looking for, but is not necessarily the 
> best api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149983#comment-15149983
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

Grepping the logs for ma-3663 would be helpful as well.
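
Something along these lines should pull the relevant entries out of the live and rotated logs (paths are just the package defaults, adjust as needed):
{noformat}
grep 'ma-3663' /var/log/cassandra/system.log
for f in /var/log/cassandra/system.log.*.zip; do unzip -p "$f" | grep 'ma-3663'; done
{noformat}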

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
>Reporter: Jeff Ferland
>Assignee: Marcus Eriksson
> Attachments: beep.txt
>
>
> Observed that after a large repair on LCS, the system will sometimes 
> enter an infinite loop with vast amounts of log lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates"
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149978#comment-15149978
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

[~_wsh] do you have the logs leading up to the breakage? The debug.log would be 
helpful

Does it happen every time for this node?

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
>Reporter: Jeff Ferland
>Assignee: Marcus Eriksson
> Attachments: beep.txt
>
>
> Observed that after a large repair on LCS, the system will sometimes 
> enter an infinite loop with vast amounts of log lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates"
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11169) [sasi] exception thrown when trying to index row with index on set

2016-02-16 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149912#comment-15149912
 ] 

Jason Brown commented on CASSANDRA-11169:
-

+1, lgtm

> [sasi] exception thrown when trying to index row with index on set
> 
>
> Key: CASSANDRA-11169
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11169
> Project: Cassandra
>  Issue Type: Bug
>  Components: sasi
>Reporter: Jon Haddad
>Assignee: Pavel Yaskevich
> Fix For: 3.4
>
>
> I have a brand new cluster, built off 1944bf507d66b5c103c136319caeb4a9e3767a69
> I created a new table with a set, then a SASI index on the set.  I 
> tried to insert a row with a set; Cassandra throws an exception and becomes 
> unavailable.
> {code}
> cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table a (id int PRIMARY KEY , s set );
> cqlsh:test> create CUSTOM INDEX on a(s) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex';
> cqlsh:test> insert into a (id, s) values (1, {'jon', 'haddad'});
> WriteTimeout: code=1100 [Coordinator node timed out waiting for replica 
> nodes' responses] message="Operation timed out - received only 0 responses." 
> info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
> {code}
> Cassandra stacktrace:
> {code}
> java.lang.AssertionError: null
>   at org.apache.cassandra.db.rows.BTreeRow.getCell(BTreeRow.java:212) 
> ~[main/:na]
>   at 
> org.apache.cassandra.index.sasi.conf.ColumnIndex.getValueOf(ColumnIndex.java:194)
>  ~[main/:na]
>   at 
> org.apache.cassandra.index.sasi.conf.ColumnIndex.index(ColumnIndex.java:95) 
> ~[main/:na]
>   at 
> org.apache.cassandra.index.sasi.SASIIndex$1.insertRow(SASIIndex.java:247) 
> ~[main/:na]
>   at 
> org.apache.cassandra.index.SecondaryIndexManager$WriteTimeTransaction.onInserted(SecondaryIndexManager.java:808)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.partitions.AtomicBTreePartition$RowUpdater.apply(AtomicBTreePartition.java:335)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.partitions.AtomicBTreePartition$RowUpdater.apply(AtomicBTreePartition.java:295)
>  ~[main/:na]
>   at org.apache.cassandra.utils.btree.BTree.buildInternal(BTree.java:136) 
> ~[main/:na]
>   at org.apache.cassandra.utils.btree.BTree.build(BTree.java:118) 
> ~[main/:na]
>   at org.apache.cassandra.utils.btree.BTree.update(BTree.java:177) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.partitions.AtomicBTreePartition.addAllWithSizeDelta(AtomicBTreePartition.java:156)
>  ~[main/:na]
>   at org.apache.cassandra.db.Memtable.put(Memtable.java:244) ~[main/:na]
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1216) 
> ~[main/:na]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:531) ~[main/:na]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:399) ~[main/:na]
>   at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:202) 
> ~[main/:na]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:214) ~[main/:na]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:228) ~[main/:na]
>   at 
> org.apache.cassandra.service.StorageProxy$$Lambda$201/413275033.run(Unknown 
> Source) ~[na:na]
>   at 
> org.apache.cassandra.service.StorageProxy$8.runMayThrow(StorageProxy.java:1343)
>  ~[main/:na]
>   at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2520)
>  ~[main/:na]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_45]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[main/:na]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [main/:na]
>   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [main/:na]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Will Hayworth (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149814#comment-15149814
 ] 

Will Hayworth edited comment on CASSANDRA-11172 at 2/17/16 4:16 AM:


Seeing this on C* 3.3 after running a full repair on another node. {{nodetool 
repair --full -pr -j 4 my_keyspace_name}} I can provide whatever further 
details you'd like, obviously.


was (Author: _wsh):
Seeing this on C* 3.3 after running a full repair on another node. {{nodetool 
repair --full -pr -j 4 my_keyspace_name}}

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
>Reporter: Jeff Ferland
>Assignee: Marcus Eriksson
> Attachments: beep.txt
>
>
> Observed that after a large repair on LCS, the system will sometimes 
> enter an infinite loop with vast amounts of log lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates"
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Will Hayworth (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Will Hayworth updated CASSANDRA-11172:
--
Comment: was deleted

(was: I'm seeing this exact problem with full repairs on C* 3.3.)

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
>Reporter: Jeff Ferland
>Assignee: Marcus Eriksson
> Attachments: beep.txt
>
>
> Observed that after a large repair on LCS, the system will sometimes 
> enter an infinite loop with vast amounts of log lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates"
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Will Hayworth (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Will Hayworth updated CASSANDRA-11172:
--
Attachment: beep.txt

Seeing this on C* 3.3 after running a full repair on another node. {{nodetool 
repair --full -pr -j 4 my_keyspace_name}}

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
>Reporter: Jeff Ferland
>Assignee: Marcus Eriksson
> Attachments: beep.txt
>
>
> Observed that after a large repair on LCS, the system will sometimes 
> enter an infinite loop with vast amounts of log lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates"
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Will Hayworth (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149812#comment-15149812
 ] 

Will Hayworth commented on CASSANDRA-11172:
---

I'm seeing this exact problem with full repairs on C* 3.3.

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
>Reporter: Jeff Ferland
>Assignee: Marcus Eriksson
>
> Observed that after a large repair on LCS, the system will sometimes 
> enter an infinite loop with vast amounts of log lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates"
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11169) [sasi] exception thrown when trying to index row with index on set

2016-02-16 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149775#comment-15149775
 ] 

Pavel Yaskevich commented on CASSANDRA-11169:
-

[~beobal] I've pushed index validation changes to 
[CASSANDRA-11169|https://github.com/xedin/cassandra/tree/CASSANDRA-11169]; it 
validates/rejects the case where the column in question is complex and/or the 
partitioner is not Murmur3Partitioner.

> [sasi] exception thrown when trying to index row with index on set
> 
>
> Key: CASSANDRA-11169
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11169
> Project: Cassandra
>  Issue Type: Bug
>  Components: sasi
>Reporter: Jon Haddad
>Assignee: Pavel Yaskevich
> Fix For: 3.4
>
>
> I have a brand new cluster, built off 1944bf507d66b5c103c136319caeb4a9e3767a69
> I created a new table with a set, then a SASI index on the set.  I 
> tried to insert a row with a set; Cassandra throws an exception and becomes 
> unavailable.
> {code}
> cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table a (id int PRIMARY KEY , s set );
> cqlsh:test> create CUSTOM INDEX on a(s) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex';
> cqlsh:test> insert into a (id, s) values (1, {'jon', 'haddad'});
> WriteTimeout: code=1100 [Coordinator node timed out waiting for replica 
> nodes' responses] message="Operation timed out - received only 0 responses." 
> info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
> {code}
> Cassandra stacktrace:
> {code}
> java.lang.AssertionError: null
>   at org.apache.cassandra.db.rows.BTreeRow.getCell(BTreeRow.java:212) 
> ~[main/:na]
>   at 
> org.apache.cassandra.index.sasi.conf.ColumnIndex.getValueOf(ColumnIndex.java:194)
>  ~[main/:na]
>   at 
> org.apache.cassandra.index.sasi.conf.ColumnIndex.index(ColumnIndex.java:95) 
> ~[main/:na]
>   at 
> org.apache.cassandra.index.sasi.SASIIndex$1.insertRow(SASIIndex.java:247) 
> ~[main/:na]
>   at 
> org.apache.cassandra.index.SecondaryIndexManager$WriteTimeTransaction.onInserted(SecondaryIndexManager.java:808)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.partitions.AtomicBTreePartition$RowUpdater.apply(AtomicBTreePartition.java:335)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.partitions.AtomicBTreePartition$RowUpdater.apply(AtomicBTreePartition.java:295)
>  ~[main/:na]
>   at org.apache.cassandra.utils.btree.BTree.buildInternal(BTree.java:136) 
> ~[main/:na]
>   at org.apache.cassandra.utils.btree.BTree.build(BTree.java:118) 
> ~[main/:na]
>   at org.apache.cassandra.utils.btree.BTree.update(BTree.java:177) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.partitions.AtomicBTreePartition.addAllWithSizeDelta(AtomicBTreePartition.java:156)
>  ~[main/:na]
>   at org.apache.cassandra.db.Memtable.put(Memtable.java:244) ~[main/:na]
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1216) 
> ~[main/:na]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:531) ~[main/:na]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:399) ~[main/:na]
>   at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:202) 
> ~[main/:na]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:214) ~[main/:na]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:228) ~[main/:na]
>   at 
> org.apache.cassandra.service.StorageProxy$$Lambda$201/413275033.run(Unknown 
> Source) ~[na:na]
>   at 
> org.apache.cassandra.service.StorageProxy$8.runMayThrow(StorageProxy.java:1343)
>  ~[main/:na]
>   at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2520)
>  ~[main/:na]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_45]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[main/:na]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [main/:na]
>   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [main/:na]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11169) [sasi] exception thrown when trying to index row with index on set

2016-02-16 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-11169:

 Reviewer: Sam Tunnicliffe
Fix Version/s: 3.4
  Component/s: sasi

> [sasi] exception thrown when trying to index row with index on set
> 
>
> Key: CASSANDRA-11169
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11169
> Project: Cassandra
>  Issue Type: Bug
>  Components: sasi
>Reporter: Jon Haddad
>Assignee: Pavel Yaskevich
> Fix For: 3.4
>
>
> I have a brand new cluster, built off 1944bf507d66b5c103c136319caeb4a9e3767a69
> I created a new table with a set, then a SASI index on the set.  I 
> tried to insert a row with a set; Cassandra throws an exception and becomes 
> unavailable.
> {code}
> cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table a (id int PRIMARY KEY , s set );
> cqlsh:test> create CUSTOM INDEX on a(s) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex';
> cqlsh:test> insert into a (id, s) values (1, {'jon', 'haddad'});
> WriteTimeout: code=1100 [Coordinator node timed out waiting for replica 
> nodes' responses] message="Operation timed out - received only 0 responses." 
> info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
> {code}
> Cassandra stacktrace:
> {code}
> java.lang.AssertionError: null
>   at org.apache.cassandra.db.rows.BTreeRow.getCell(BTreeRow.java:212) 
> ~[main/:na]
>   at 
> org.apache.cassandra.index.sasi.conf.ColumnIndex.getValueOf(ColumnIndex.java:194)
>  ~[main/:na]
>   at 
> org.apache.cassandra.index.sasi.conf.ColumnIndex.index(ColumnIndex.java:95) 
> ~[main/:na]
>   at 
> org.apache.cassandra.index.sasi.SASIIndex$1.insertRow(SASIIndex.java:247) 
> ~[main/:na]
>   at 
> org.apache.cassandra.index.SecondaryIndexManager$WriteTimeTransaction.onInserted(SecondaryIndexManager.java:808)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.partitions.AtomicBTreePartition$RowUpdater.apply(AtomicBTreePartition.java:335)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.partitions.AtomicBTreePartition$RowUpdater.apply(AtomicBTreePartition.java:295)
>  ~[main/:na]
>   at org.apache.cassandra.utils.btree.BTree.buildInternal(BTree.java:136) 
> ~[main/:na]
>   at org.apache.cassandra.utils.btree.BTree.build(BTree.java:118) 
> ~[main/:na]
>   at org.apache.cassandra.utils.btree.BTree.update(BTree.java:177) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.partitions.AtomicBTreePartition.addAllWithSizeDelta(AtomicBTreePartition.java:156)
>  ~[main/:na]
>   at org.apache.cassandra.db.Memtable.put(Memtable.java:244) ~[main/:na]
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1216) 
> ~[main/:na]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:531) ~[main/:na]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:399) ~[main/:na]
>   at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:202) 
> ~[main/:na]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:214) ~[main/:na]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:228) ~[main/:na]
>   at 
> org.apache.cassandra.service.StorageProxy$$Lambda$201/413275033.run(Unknown 
> Source) ~[na:na]
>   at 
> org.apache.cassandra.service.StorageProxy$8.runMayThrow(StorageProxy.java:1343)
>  ~[main/:na]
>   at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2520)
>  ~[main/:na]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_45]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[main/:na]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [main/:na]
>   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [main/:na]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (CASSANDRA-11169) [sasi] exception thrown when trying to index row with index on set

2016-02-16 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich reassigned CASSANDRA-11169:
---

Assignee: Pavel Yaskevich

> [sasi] exception thrown when trying to index row with index on set
> 
>
> Key: CASSANDRA-11169
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11169
> Project: Cassandra
>  Issue Type: Bug
>  Components: sasi
>Reporter: Jon Haddad
>Assignee: Pavel Yaskevich
> Fix For: 3.4
>
>
> I have a brand new cluster, built off 1944bf507d66b5c103c136319caeb4a9e3767a69
> I created a new table with a set, then a SASI index on the set.  I 
> tried to insert a row with a set; Cassandra throws an exception and becomes 
> unavailable.
> {code}
> cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> use test;
> cqlsh:test> create table a (id int PRIMARY KEY , s set );
> cqlsh:test> create CUSTOM INDEX on a(s) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex';
> cqlsh:test> insert into a (id, s) values (1, {'jon', 'haddad'});
> WriteTimeout: code=1100 [Coordinator node timed out waiting for replica 
> nodes' responses] message="Operation timed out - received only 0 responses." 
> info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
> {code}
> Cassandra stacktrace:
> {code}
> java.lang.AssertionError: null
>   at org.apache.cassandra.db.rows.BTreeRow.getCell(BTreeRow.java:212) 
> ~[main/:na]
>   at 
> org.apache.cassandra.index.sasi.conf.ColumnIndex.getValueOf(ColumnIndex.java:194)
>  ~[main/:na]
>   at 
> org.apache.cassandra.index.sasi.conf.ColumnIndex.index(ColumnIndex.java:95) 
> ~[main/:na]
>   at 
> org.apache.cassandra.index.sasi.SASIIndex$1.insertRow(SASIIndex.java:247) 
> ~[main/:na]
>   at 
> org.apache.cassandra.index.SecondaryIndexManager$WriteTimeTransaction.onInserted(SecondaryIndexManager.java:808)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.partitions.AtomicBTreePartition$RowUpdater.apply(AtomicBTreePartition.java:335)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.partitions.AtomicBTreePartition$RowUpdater.apply(AtomicBTreePartition.java:295)
>  ~[main/:na]
>   at org.apache.cassandra.utils.btree.BTree.buildInternal(BTree.java:136) 
> ~[main/:na]
>   at org.apache.cassandra.utils.btree.BTree.build(BTree.java:118) 
> ~[main/:na]
>   at org.apache.cassandra.utils.btree.BTree.update(BTree.java:177) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.partitions.AtomicBTreePartition.addAllWithSizeDelta(AtomicBTreePartition.java:156)
>  ~[main/:na]
>   at org.apache.cassandra.db.Memtable.put(Memtable.java:244) ~[main/:na]
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1216) 
> ~[main/:na]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:531) ~[main/:na]
>   at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:399) ~[main/:na]
>   at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:202) 
> ~[main/:na]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:214) ~[main/:na]
>   at org.apache.cassandra.db.Mutation.apply(Mutation.java:228) ~[main/:na]
>   at 
> org.apache.cassandra.service.StorageProxy$$Lambda$201/413275033.run(Unknown 
> Source) ~[na:na]
>   at 
> org.apache.cassandra.service.StorageProxy$8.runMayThrow(StorageProxy.java:1343)
>  ~[main/:na]
>   at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2520)
>  ~[main/:na]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_45]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[main/:na]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [main/:na]
>   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [main/:na]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11173) Add extension points in storage and streaming classes

2016-02-16 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149607#comment-15149607
 ] 

Blake Eggleston commented on CASSANDRA-11173:
-

With the streaming hooks, the goal is to allow the sender to make information 
available about the sstable being sent, which the receiver can then retrieve, 
sort of like an additional sstable component. I'm not sure the StreamEvents 
would be a good fit for this, since they're more like high-level notifications. 
Ideally, we could add a {{Map}} to {{OutgoingFileMessage}}, 
but an esoteric hook probably isn't going to warrant a streaming protocol 
version bump.
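
To make the shape of that concrete, here is a toy sketch of the kind of per-file attribute bag I have in mind (names and types are invented for illustration; this is not the real {{OutgoingFileMessage}}):
{code}
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Toy illustration only: a sender-populated bag of extra sstable metadata that
// would travel alongside each outgoing file and be readable on the receiving side.
public class CustomFileAttributes
{
    private final Map<String, ByteBuffer> attributes = new HashMap<>();

    // the sending-side hook fills this in before the file is streamed
    public void put(String key, ByteBuffer value)
    {
        attributes.put(key, value);
    }

    // the receiving-side hook reads it back once the file has arrived
    public ByteBuffer get(String key)
    {
        return attributes.get(key);
    }
}
{code}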

> Add extension points in storage and streaming classes
> -
>
> Key: CASSANDRA-11173
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11173
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
> Fix For: 3.4
>
>
> It would be useful if 3rd party classes could be notified when reads and 
> writes occur on a table+partition, when sstables are being streamed out/in, 
> and could also intercept the creation of row iterators from sstables. I have 
> a [v1 branch here|https://github.com/bdeggleston/cassandra/tree/hooksV1]. It 
> illustrates the extension points I'm looking for, but is not necessarily the 
> best api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11173) Add extension points in storage and streaming classes

2016-02-16 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149572#comment-15149572
 ] 

Yuki Morishita commented on CASSANDRA-11173:


For streaming, can we just extend existing 
{{StreamEvent}}/{{StreamEventHandler}}?
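
(For reference, a bare-bones listener against the existing interfaces would look roughly like the sketch below; the class name and what it does with the events are made up, and I believe it would be registered through {{StreamResultFuture#addEventListener}}.)
{code}
import org.apache.cassandra.streaming.StreamEvent;
import org.apache.cassandra.streaming.StreamEventHandler;
import org.apache.cassandra.streaming.StreamState;

// Rough sketch of a 3rd-party hook built on the existing stream event machinery.
public class SSTableStreamListener implements StreamEventHandler
{
    public void handleStreamEvent(StreamEvent event)
    {
        // react to session prepared / file progress / session complete events here
    }

    public void onSuccess(StreamState finalState)
    {
        // whole stream plan finished
    }

    public void onFailure(Throwable t)
    {
        // whole stream plan failed
    }
}
{code}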

> Add extension points in storage and streaming classes
> -
>
> Key: CASSANDRA-11173
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11173
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
> Fix For: 3.4
>
>
> It would be useful if 3rd party classes could be notified when reads and 
> writes occur on a table+partition, when sstables are being streamed out/in, 
> and could also intercept the creation of row iterators from sstables. I have 
> a [v1 branch here|https://github.com/bdeggleston/cassandra/tree/hooksV1]. It 
> illustrates the extension points I'm looking for, but is not necessarily the 
> best api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-11173) Add extension points in storage and streaming classes

2016-02-16 Thread Blake Eggleston (JIRA)
Blake Eggleston created CASSANDRA-11173:
---

 Summary: Add extension points in storage and streaming classes
 Key: CASSANDRA-11173
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11173
 Project: Cassandra
  Issue Type: New Feature
Reporter: Blake Eggleston
Assignee: Blake Eggleston
 Fix For: 3.4


It would be useful if 3rd party classes could be notified when reads and writes 
occur on a table+partition, when sstables are being streamed out/in, and could 
also intercept the creation of row iterators from sstables. I have a [v1 branch 
here|https://github.com/bdeggleston/cassandra/tree/hooksV1]. It illustrates the 
extension points I'm looking for, but is not necessarily the best api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11173) Add extension points in storage and streaming classes

2016-02-16 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149550#comment-15149550
 ] 

Blake Eggleston commented on CASSANDRA-11173:
-

/cc [~iamaleksey], [~krummas]

> Add extension points in storage and streaming classes
> -
>
> Key: CASSANDRA-11173
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11173
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
> Fix For: 3.4
>
>
> It would be useful if 3rd party classes could be notified when reads and 
> writes occur on a table+partition, when sstables are being streamed out/in, 
> and could also intercept the creation of row iterators from sstables. I have 
> a [v1 branch here|https://github.com/bdeggleston/cassandra/tree/hooksV1]. It 
> illustrates the extension points I'm looking for, but is not necessarily the 
> best api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11123) cqlsh pg-style-strings broken if line ends with ';'

2016-02-16 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149534#comment-15149534
 ] 

Stefania commented on CASSANDRA-11123:
--

Yes GTG, +1.

> cqlsh pg-style-strings broken if line ends with ';'
> ---
>
> Key: CASSANDRA-11123
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11123
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 2.2.x
>
>
> If a line in a multi-line pg-style string ends with a semicolon in 
> cqlsh, cqlsh incorrectly assumes that this is the end of the statement.
> {code}
> cqlsh:foo> insert into tab (pk, val) values (2, $$
>... wepofjef
>... wefoijew
>... ;
> SyntaxException:  message="line 4:0 mismatched character ';' expecting '$'">
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10992) Hanging streaming sessions

2016-02-16 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149487#comment-15149487
 ] 

Paulo Motta commented on CASSANDRA-10992:
-

Thanks!

> Hanging streaming sessions
> --
>
> Key: CASSANDRA-10992
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10992
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.12, Debian Wheezy
>Reporter: mlowicki
>Assignee: Paulo Motta
> Fix For: 2.1.12
>
> Attachments: apache-cassandra-2.1.12-SNAPSHOT.jar
>
>
> I've recently started running repair using [Cassandra 
> Reaper|https://github.com/spotify/cassandra-reaper] (built-in {{nodetool 
> repair}} doesn't work for me - CASSANDRA-9935). It behaves fine, but I've 
> noticed hanging streaming sessions:
> {code}
> root@db1:~# date
> Sat Jan  9 16:43:00 UTC 2016
> root@db1:~# nt netstats -H | grep total
> Receiving 5 files, 46.59 MB total. Already received 1 files, 11.32 MB 
> total
> Sending 7 files, 46.28 MB total. Already sent 7 files, 46.28 MB total
> Receiving 6 files, 64.15 MB total. Already received 1 files, 12.14 MB 
> total
> Sending 5 files, 61.15 MB total. Already sent 5 files, 61.15 MB total
> Receiving 4 files, 7.75 MB total. Already received 3 files, 7.58 MB 
> total
> Sending 4 files, 4.29 MB total. Already sent 4 files, 4.29 MB total
> Receiving 12 files, 13.79 MB total. Already received 11 files, 7.66 
> MB total
> Sending 5 files, 15.32 MB total. Already sent 5 files, 15.32 MB total
> Receiving 8 files, 20.35 MB total. Already received 1 files, 13.63 MB 
> total
> Sending 38 files, 125.34 MB total. Already sent 38 files, 125.34 MB 
> total
> root@db1:~# date
> Sat Jan  9 17:45:42 UTC 2016
> root@db1:~# nt netstats -H | grep total
> Receiving 5 files, 46.59 MB total. Already received 1 files, 11.32 MB 
> total
> Sending 7 files, 46.28 MB total. Already sent 7 files, 46.28 MB total
> Receiving 6 files, 64.15 MB total. Already received 1 files, 12.14 MB 
> total
> Sending 5 files, 61.15 MB total. Already sent 5 files, 61.15 MB total
> Receiving 4 files, 7.75 MB total. Already received 3 files, 7.58 MB 
> total
> Sending 4 files, 4.29 MB total. Already sent 4 files, 4.29 MB total
> Receiving 12 files, 13.79 MB total. Already received 11 files, 7.66 
> MB total
> Sending 5 files, 15.32 MB total. Already sent 5 files, 15.32 MB total
> Receiving 8 files, 20.35 MB total. Already received 1 files, 13.63 MB 
> total
> Sending 38 files, 125.34 MB total. Already sent 38 files, 125.34 MB 
> total
> {code}
> Such sessions are left behind even when the repair job has long since finished 
> (confirmed by checking Reaper's and Cassandra's logs). {{streaming_socket_timeout_in_ms}} 
> in cassandra.yaml is set to the default value (360).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11065) null pointer exception in CassandraDaemon.java:195

2016-02-16 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149485#comment-15149485
 ] 

Paulo Motta commented on CASSANDRA-11065:
-

Thanks for the review, [~yukim]. Please find below a patch with the suggested changes:

||2.2||3.0||trunk||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:2.2-11065]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-11065]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-11065]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-11065-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-11065-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-11065-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-11065-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-11065-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-11065-dtest/lastCompletedBuild/testReport/]|


* commit info: 2.2 has a merge conflict with 3.0, but 3.0 merges cleanly upwards 
to trunk.

> null pointer exception in CassandraDaemon.java:195
> --
>
> Key: CASSANDRA-11065
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11065
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Vassil Lunchev
>Assignee: Paulo Motta
>Priority: Minor
>
> Running Cassandra 3.0.1 installed from apt-get on Debian.
> I had a keyspace called 'tests'. I dropped it. Then I checked some nodes and 
> one of them still had that keyspace 'tests'. On a node that still has the 
> dropped keyspace I ran:
> nodetool repair tests;
> In the system logs of another node that did not have keyspace 'tests' I am 
> seeing a null pointer exception:
> {code:java}
> ERROR [AntiEntropyStage:2] 2016-01-25 15:02:46,323 
> RepairMessageVerbHandler.java:161 - Got error, removing parent repair session
> ERROR [AntiEntropyStage:2] 2016-01-25 15:02:46,324 CassandraDaemon.java:195 - 
> Exception in thread Thread[AntiEntropyStage:2,5,main]
> java.lang.RuntimeException: java.lang.NullPointerException
>   at 
> org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:164)
>  ~[apache-cassandra-3.0.1.jar:3.0.1]
>   at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) 
> ~[apache-cassandra-3.0.1.jar:3.0.1]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_66-internal]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  ~[na:1.8.0_66-internal]
>   at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_66-internal]
> Caused by: java.lang.NullPointerException: null
>   at 
> org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:69)
>  ~[apache-cassandra-3.0.1.jar:3.0.1]
>   ... 4 common frames omitted
> {code}
> The error appears every time I run:
> nodetool repair tests;
> I can see it in the logs of all nodes, including the node on which I run the 
> repair.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7423) Allow updating individual subfields of UDT

2016-02-16 Thread Tyler Hobbs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Hobbs updated CASSANDRA-7423:
---
Since Version:   (was: 2.1 beta1)

> Allow updating individual subfields of UDT
> --
>
> Key: CASSANDRA-7423
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7423
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Tupshin Harper
>Assignee: Tyler Hobbs
>  Labels: cql, docs-impacting
> Fix For: 3.x
>
>
> Since user defined types were implemented in CASSANDRA-5590 as blobs (you 
> have to rewrite the entire type in order to make any modifications), they 
> can't be safely used without LWT for any operation that wants to modify a 
> subset of the UDT's fields by any client process that is not authoritative 
> for the entire blob. 
> When trying to use UDTs to model complex records (particularly with nesting), 
> this is not an exceptional circumstance, this is the totally expected normal 
> situation. 
> The use of UDTs for anything non-trivial is harmful to either performance or 
> consistency or both.
> edit: to clarify, I believe that most potential uses of UDTs should be 
> considered anti-patterns until/unless we have field-level r/w access to 
> individual elements of the UDT, with individual timestamps and standard LWW 
> semantics
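
A minimal CQL sketch of the limitation described above and of the field-level update being requested (keyspace, table, type, and field names are invented for illustration):
{code}
-- Today the whole UDT value must be rewritten (or guarded with LWT) just to change one field:
UPDATE users SET address = {street: '1 Main St', city: 'Springfield', zip: '12345'} WHERE id = 1;

-- What this ticket asks for: updating a single field with its own timestamp and LWW semantics:
UPDATE users SET address.city = 'Springfield' WHERE id = 1;
{code}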



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7423) Allow updating individual subfields of UDT

2016-02-16 Thread Tyler Hobbs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Hobbs updated CASSANDRA-7423:
---
Labels: cql docs-impacting  (was: cql)

> Allow updating individual subfields of UDT
> --
>
> Key: CASSANDRA-7423
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7423
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Tupshin Harper
>Assignee: Tyler Hobbs
>  Labels: cql, docs-impacting
> Fix For: 3.x
>
>
> Since user defined types were implemented in CASSANDRA-5590 as blobs (you 
> have to rewrite the entire type in order to make any modifications), they 
> can't be safely used without LWT for any operation that wants to modify a 
> subset of the UDT's fields by any client process that is not authoritative 
> for the entire blob. 
> When trying to use UDTs to model complex records (particularly with nesting), 
> this is not an exceptional circumstance, this is the totally expected normal 
> situation. 
> The use of UDTs for anything non-trivial is harmful to either performance or 
> consistency or both.
> edit: to clarify, I believe that most potential uses of UDTs should be 
> considered anti-patterns until/unless we have field-level r/w access to 
> individual elements of the UDT, with individual timestamps and standard LWW 
> semantics



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10992) Hanging streaming sessions

2016-02-16 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149380#comment-15149380
 ] 

mlowicki commented on CASSANDRA-10992:
--

We'll upgrade our cluster this week or next (we've been waiting a bit after the 
release to make sure no critical issues have been introduced). Will let you 
know here when done.

> Hanging streaming sessions
> --
>
> Key: CASSANDRA-10992
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10992
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.12, Debian Wheezy
>Reporter: mlowicki
>Assignee: Paulo Motta
> Fix For: 2.1.12
>
> Attachments: apache-cassandra-2.1.12-SNAPSHOT.jar
>
>
> I've recently started running repair using [Cassandra 
> Reaper|https://github.com/spotify/cassandra-reaper] (built-in {{nodetool 
> repair}} doesn't work for me - CASSANDRA-9935). It behaves fine, but I've 
> noticed hanging streaming sessions:
> {code}
> root@db1:~# date
> Sat Jan  9 16:43:00 UTC 2016
> root@db1:~# nt netstats -H | grep total
> Receiving 5 files, 46.59 MB total. Already received 1 files, 11.32 MB 
> total
> Sending 7 files, 46.28 MB total. Already sent 7 files, 46.28 MB total
> Receiving 6 files, 64.15 MB total. Already received 1 files, 12.14 MB 
> total
> Sending 5 files, 61.15 MB total. Already sent 5 files, 61.15 MB total
> Receiving 4 files, 7.75 MB total. Already received 3 files, 7.58 MB 
> total
> Sending 4 files, 4.29 MB total. Already sent 4 files, 4.29 MB total
> Receiving 12 files, 13.79 MB total. Already received 11 files, 7.66 
> MB total
> Sending 5 files, 15.32 MB total. Already sent 5 files, 15.32 MB total
> Receiving 8 files, 20.35 MB total. Already received 1 files, 13.63 MB 
> total
> Sending 38 files, 125.34 MB total. Already sent 38 files, 125.34 MB 
> total
> root@db1:~# date
> Sat Jan  9 17:45:42 UTC 2016
> root@db1:~# nt netstats -H | grep total
> Receiving 5 files, 46.59 MB total. Already received 1 files, 11.32 MB 
> total
> Sending 7 files, 46.28 MB total. Already sent 7 files, 46.28 MB total
> Receiving 6 files, 64.15 MB total. Already received 1 files, 12.14 MB 
> total
> Sending 5 files, 61.15 MB total. Already sent 5 files, 61.15 MB total
> Receiving 4 files, 7.75 MB total. Already received 3 files, 7.58 MB 
> total
> Sending 4 files, 4.29 MB total. Already sent 4 files, 4.29 MB total
> Receiving 12 files, 13.79 MB total. Already received 11 files, 7.66 
> MB total
> Sending 5 files, 15.32 MB total. Already sent 5 files, 15.32 MB total
> Receiving 8 files, 20.35 MB total. Already received 1 files, 13.63 MB 
> total
> Sending 38 files, 125.34 MB total. Already sent 38 files, 125.34 MB 
> total
> {code}
> Such sessions are left behind even when the repair job has long since finished 
> (confirmed by checking Reaper's and Cassandra's logs). {{streaming_socket_timeout_in_ms}} 
> in cassandra.yaml is set to the default value (360).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10992) Hanging streaming sessions

2016-02-16 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149346#comment-15149346
 ] 

Paulo Motta commented on CASSANDRA-10992:
-

[~mlowicki] were you able to see if the problem still holds on 2.1.13?

> Hanging streaming sessions
> --
>
> Key: CASSANDRA-10992
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10992
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.12, Debian Wheezy
>Reporter: mlowicki
>Assignee: Paulo Motta
> Fix For: 2.1.12
>
> Attachments: apache-cassandra-2.1.12-SNAPSHOT.jar
>
>
> I've recently started running repair using [Cassandra 
> Reaper|https://github.com/spotify/cassandra-reaper] (built-in {{nodetool 
> repair}} doesn't work for me - CASSANDRA-9935). It behaves fine, but I've 
> noticed hanging streaming sessions:
> {code}
> root@db1:~# date
> Sat Jan  9 16:43:00 UTC 2016
> root@db1:~# nt netstats -H | grep total
> Receiving 5 files, 46.59 MB total. Already received 1 files, 11.32 MB 
> total
> Sending 7 files, 46.28 MB total. Already sent 7 files, 46.28 MB total
> Receiving 6 files, 64.15 MB total. Already received 1 files, 12.14 MB 
> total
> Sending 5 files, 61.15 MB total. Already sent 5 files, 61.15 MB total
> Receiving 4 files, 7.75 MB total. Already received 3 files, 7.58 MB 
> total
> Sending 4 files, 4.29 MB total. Already sent 4 files, 4.29 MB total
> Receiving 12 files, 13.79 MB total. Already received 11 files, 7.66 
> MB total
> Sending 5 files, 15.32 MB total. Already sent 5 files, 15.32 MB total
> Receiving 8 files, 20.35 MB total. Already received 1 files, 13.63 MB 
> total
> Sending 38 files, 125.34 MB total. Already sent 38 files, 125.34 MB 
> total
> root@db1:~# date
> Sat Jan  9 17:45:42 UTC 2016
> root@db1:~# nt netstats -H | grep total
> Receiving 5 files, 46.59 MB total. Already received 1 files, 11.32 MB 
> total
> Sending 7 files, 46.28 MB total. Already sent 7 files, 46.28 MB total
> Receiving 6 files, 64.15 MB total. Already received 1 files, 12.14 MB 
> total
> Sending 5 files, 61.15 MB total. Already sent 5 files, 61.15 MB total
> Receiving 4 files, 7.75 MB total. Already received 3 files, 7.58 MB 
> total
> Sending 4 files, 4.29 MB total. Already sent 4 files, 4.29 MB total
> Receiving 12 files, 13.79 MB total. Already received 11 files, 7.66 
> MB total
> Sending 5 files, 15.32 MB total. Already sent 5 files, 15.32 MB total
> Receiving 8 files, 20.35 MB total. Already received 1 files, 13.63 MB 
> total
> Sending 38 files, 125.34 MB total. Already sent 38 files, 125.34 MB 
> total
> {code}
> Such sessions are left behind even when the repair job has long since finished 
> (confirmed by checking Reaper's and Cassandra's logs). {{streaming_socket_timeout_in_ms}} 
> in cassandra.yaml is set to the default value (360).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9692) Print sensible units for all log messages

2016-02-16 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149335#comment-15149335
 ] 

Paulo Motta commented on CASSANDRA-9692:


I will ask an admin to add your username to the assignable users list. You can 
start working on this task in the meantime.

> Print sensible units for all log messages
> -
>
> Key: CASSANDRA-9692
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9692
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Priority: Minor
>  Labels: lhf
> Fix For: 3.x
>
>
> Like CASSANDRA-9691, this has bugged me for too long. It also adversely impacts 
> log analysis. I've introduced some improvements to the bits I touched for 
> CASSANDRA-9681, but we should do this across the codebase. It's a small 
> investment for a lot of long-term clarity in the logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9766) Bootstrap outgoing streaming speeds are much slower than during repair

2016-02-16 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149333#comment-15149333
 ] 

Eric Evans commented on CASSANDRA-9766:
---

So if I'm understanding this correctly (and I'm probably not), increasing the 
receiving buffer would get more of the data over the wire before blocking on 
the read from the buffer, decompression, etc. (up to however much the buffer was 
increased by). Is that right? If so, that wouldn't really help much; that 
would seem to imply that processing the compressed data is the bottleneck, and 
that the blocking is (rightfully) applying back-pressure to the network side.

> Bootstrap outgoing streaming speeds are much slower than during repair
> --
>
> Key: CASSANDRA-9766
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9766
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 2.1.2. more details in the pdf attached 
>Reporter: Alexei K
> Fix For: 2.1.x
>
> Attachments: problem.pdf
>
>
> I have a cluster in the Amazon cloud; it's described in detail in the attachment. 
> What I've noticed is that during bootstrap we never go above 12MB/sec 
> transmission speeds, and those speeds flatline almost as if we're hitting 
> some sort of limit (this remains true for other tests that I've run); 
> however, during repair we see much higher, variable sending rates. I've 
> provided network charts in the attachment as well. Is there an explanation 
> for this? Is something wrong with my configuration, or is it a possible bug?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9692) Print sensible units for all log messages

2016-02-16 Thread Giampaolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149318#comment-15149318
 ] 

Giampaolo commented on CASSANDRA-9692:
--

Ok Paulo, is it ok with you to assign the task to me? (to this account, not the 
other one that I wrongly used)

> Print sensible units for all log messages
> -
>
> Key: CASSANDRA-9692
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9692
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Priority: Minor
>  Labels: lhf
> Fix For: 3.x
>
>
> Like CASSANDRA-9691, this has bugged me too long. it also adversely impacts 
> log analysis. I've introduced some improvements to the bits I touched for 
> CASSANDRA-9681, but we should do this across the codebase. It's a small 
> investment for a lot of long term clarity in the logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-11159) SASI indexes don't switch memtable on flush

2016-02-16 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich resolved CASSANDRA-11159.
-
Resolution: Fixed
  Assignee: Pavel Yaskevich  (was: Sam Tunnicliffe)
  Reviewer: Sam Tunnicliffe

Added @VisibleForTesting and committed. Thank you for the review, [~beobal]!

> SASI indexes don't switch memtable on flush
> ---
>
> Key: CASSANDRA-11159
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11159
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
>Reporter: Sam Tunnicliffe
>Assignee: Pavel Yaskevich
>Priority: Critical
> Fix For: 3.4
>
>
> SASI maintains its own in-memory structures for indexing the contents of a 
> base Memtable. On flush, these are simply discarded & replaced with a new 
> instance, whilst the on-disk index is built as the base memtable is flushed 
> to SSTables. 
> SASIIndex implements INotificationHandler and this switching of the index 
> memtable is triggered by receipt of a MemtableRenewedNotification. In the 
> original SASI implementation, one of the necessary modifications to C* was to 
> emit this notification from DataTracker::switchMemtable, but this was 
> overlooked when porting to 3.0. The net result is that the index memtable is 
> never switched out, which eventually leads to OOME. 
> Simply applying the original modification isn't entirely appropriate though, 
> as it creates a window where it's possible for the index memtable to have 
> been switched, but the flushwriter is yet to finish writing the new index 
> sstables. During this window, index entries will be missing and query results 
> inaccurate. 
> I propose leaving Tracker::switchMemtable as is, so that 
> INotificationConsumers are only notified from there when truncating, but 
> adding similar notifications in Tracker::replaceFlushed, to fire after the 
> View is updated. 
> I'm leaning toward re-using MemtableRenewedNotification for this as 
> semantically I don't believe there's any meaningful difference between the 
> flush and truncation cases here. If anyone has a compelling argument for a 
> new notification type though to distinguish the two events, I'm open to hear 
> it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


cassandra git commit: fix SASI memtable switching of flush

2016-02-16 Thread xedin
Repository: cassandra
Updated Branches:
  refs/heads/trunk f9a1a80af -> 48815d4a1


fix SASI memtable switching of flush

patch by xedin; reviewed by beobal for CASSANDRA-11159


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/48815d4a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/48815d4a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/48815d4a

Branch: refs/heads/trunk
Commit: 48815d4a182915e852888cb35273b8e896cea440
Parents: f9a1a80
Author: Pavel Yaskevich 
Authored: Thu Feb 11 18:54:04 2016 -0800
Committer: Pavel Yaskevich 
Committed: Tue Feb 16 13:06:39 2016 -0800

--
 CHANGES.txt |  1 +
 .../apache/cassandra/db/lifecycle/Tracker.java  | 26 ++-
 .../apache/cassandra/index/sasi/SASIIndex.java  |  8 ++
 .../cassandra/index/sasi/conf/ColumnIndex.java  | 40 +-
 .../cassandra/index/sasi/conf/view/View.java|  4 +-
 .../index/sasi/plan/QueryController.java| 30 +++-
 .../MemtableDiscardedNotification.java  | 30 
 .../MemtableSwitchedNotification.java   | 30 
 .../cassandra/index/sasi/SASIIndexTest.java | 77 
 9 files changed, 220 insertions(+), 26 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/48815d4a/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index c3bfdc3..f20e983 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.4
+ * fix SASI memtable switching on flush (CASSANDRA-11159)
  * Remove duplicate offline compaction tracking (CASSANDRA-11148)
  * fix EQ semantics of analyzed SASI indexes (CASSANDRA-11130)
  * Support long name output for nodetool commands (CASSANDRA-7950)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/48815d4a/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
--
diff --git a/src/java/org/apache/cassandra/db/lifecycle/Tracker.java 
b/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
index 4c73472..dd07b19 100644
--- a/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
+++ b/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
@@ -318,6 +318,8 @@ public class Tracker
 Pair result = apply(View.switchMemtable(newMemtable));
 if (truncating)
 notifyRenewed(newMemtable);
+else
+notifySwitched(result.left.getCurrentMemtable());
 
 return result.left.getCurrentMemtable();
 }
@@ -349,6 +351,8 @@ public class Tracker
 // TODO: if we're invalidated, should we notifyadded AND removed, or 
just skip both?
 fail = notifyAdded(sstables, fail);
 
+notifyDiscarded(memtable);
+
 if (!isDummy() && !cfstore.isValid())
 dropSSTables();
 
@@ -441,16 +445,30 @@ public class Tracker
 subscriber.handleNotification(notification, this);
 }
 
-public void notifyRenewed(Memtable renewed)
+public void notifyTruncated(long truncatedAt)
 {
-INotification notification = new MemtableRenewedNotification(renewed);
+INotification notification = new TruncationNotification(truncatedAt);
 for (INotificationConsumer subscriber : subscribers)
 subscriber.handleNotification(notification, this);
 }
 
-public void notifyTruncated(long truncatedAt)
+public void notifyRenewed(Memtable renewed)
+{
+notify(new MemtableRenewedNotification(renewed));
+}
+
+public void notifySwitched(Memtable previous)
+{
+notify(new MemtableSwitchedNotification(previous));
+}
+
+public void notifyDiscarded(Memtable discarded)
+{
+notify(new MemtableDiscardedNotification(discarded));
+}
+
+private void notify(INotification notification)
 {
-INotification notification = new TruncationNotification(truncatedAt);
 for (INotificationConsumer subscriber : subscribers)
 subscriber.handleNotification(notification, this);
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/48815d4a/src/java/org/apache/cassandra/index/sasi/SASIIndex.java
--
diff --git a/src/java/org/apache/cassandra/index/sasi/SASIIndex.java 
b/src/java/org/apache/cassandra/index/sasi/SASIIndex.java
index d480b82..90cc72e 100644
--- a/src/java/org/apache/cassandra/index/sasi/SASIIndex.java
+++ b/src/java/org/apache/cassandra/index/sasi/SASIIndex.java
@@ -311,6 +311,14 @@ public class SASIIndex implements Index, 
INotificationConsumer
 {
 index.switchMemtable();
 }
+else if (notification instanceof 

[jira] [Commented] (CASSANDRA-9754) Make index info heap friendly for large CQL partitions

2016-02-16 Thread Michael Kjellman (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149269#comment-15149269
 ] 

Michael Kjellman commented on CASSANDRA-9754:
-

well - then you'd lose the cache-line aware logic I've implemented to make the 
B+ tree efficient...

> Make index info heap friendly for large CQL partitions
> --
>
> Key: CASSANDRA-9754
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9754
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: sankalp kohli
>Assignee: Michael Kjellman
>Priority: Minor
>
>  Looking at a heap dump of a 2.0 cluster, I found that the majority of the objects 
> are IndexInfo and its ByteBuffers. This is especially bad in endpoints with 
> large CQL partitions. If a CQL partition is, say, 6.4GB, it will have 100K 
> IndexInfo objects and 200K ByteBuffers. This will create a lot of churn for 
> GC. Can this be improved by not creating so many objects?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9754) Make index info heap friendly for large CQL partitions

2016-02-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149265#comment-15149265
 ] 

Jonathan Ellis commented on CASSANDRA-9754:
---

Would it make sense to always use buffered I/O for the B+ tree instead of mmap?

> Make index info heap friendly for large CQL partitions
> --
>
> Key: CASSANDRA-9754
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9754
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: sankalp kohli
>Assignee: Michael Kjellman
>Priority: Minor
>
>  Looking at a heap dump of a 2.0 cluster, I found that the majority of the objects 
> are IndexInfo and its ByteBuffers. This is especially bad in endpoints with 
> large CQL partitions. If a CQL partition is, say, 6.4GB, it will have 100K 
> IndexInfo objects and 200K ByteBuffers. This will create a lot of churn for 
> GC. Can this be improved by not creating so many objects?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9766) Bootstrap outgoing streaming speeds are much slower than during repair

2016-02-16 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149229#comment-15149229
 ] 

Eric Evans commented on CASSANDRA-9766:
---

I'm seeing something similar here; I get an eerily consistent 4.5MB/s _per 
stream_ (much less than the stream throughput limit, and the capability of the 
network).  We have large partitions, large SSTables, and a mixture of 256k and 
512k chunk lengths.

[~yukim] what would be the best test of this, would 
https://gist.github.com/eevans/81f02849eab7634871c9 do?

> Bootstrap outgoing streaming speeds are much slower than during repair
> --
>
> Key: CASSANDRA-9766
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9766
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 2.1.2. more details in the pdf attached 
>Reporter: Alexei K
> Fix For: 2.1.x
>
> Attachments: problem.pdf
>
>
> I have a cluster in Amazon cloud; it's described in detail in the attachment. 
> What I've noticed is that during bootstrap we never go above 12MB/sec 
> transmission speeds, and also those speeds flat-line almost like we're hitting 
> some sort of a limit (this remains true for other tests that I've run); 
> however, during repair we see much higher, variable sending rates. I've 
> provided network charts in the attachment as well. Is there an explanation 
> for this? Is something wrong with my configuration, or is it a possible bug?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149188#comment-15149188
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

You should definitely upgrade.

I have never seen this happen myself, but LCS with incremental repair was very 
broken before CASSANDRA-10831.

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
>Reporter: Jeff Ferland
>Assignee: Marcus Eriksson
>
> Observed that after a large repair on LCS, sometimes the system will 
> enter an infinite loop with vast amounts of log lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates".
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9692) Print sensible units for all log messages

2016-02-16 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149136#comment-15149136
 ] 

Paulo Motta commented on CASSANDRA-9692:


Sounds like a good place to start. You can work in the trunk branch. Please see 
https://wiki.apache.org/cassandra/HowToContribute for how to set up your 
environment and for the coding guidelines.

You can have a look at [this 
commit|https://github.com/apache/cassandra/commit/b757db1484473b264bf25ca5541f080d54a579a2]
 for an example of how this was done in the past, especially the 
{{FBUtilities.prettyPrintMemory}} method.
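
For orientation, a rough sketch of what such a unit-aware formatter does (this is not the 
actual {{FBUtilities}} code; the class name, rounding and unit choices are assumptions):

{code:java}
// Hypothetical stand-in for a unit-aware formatter; the real helper in
// FBUtilities may differ in name, signature and rounding behaviour.
public final class PrettyUnits
{
    private static final String[] UNITS = { "B", "KiB", "MiB", "GiB", "TiB" };

    public static String prettyPrintMemory(long bytes)
    {
        double value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < UNITS.length - 1)
        {
            value /= 1024;
            unit++;
        }
        return String.format("%.3f%s", value, UNITS[unit]);
    }

    public static void main(String[] args)
    {
        // a log line would then read e.g. "flushed 1.234MiB" instead of "flushed 1293942"
        System.out.println(prettyPrintMemory(1293942L));
    }
}
{code}

Whatever helper ends up being used, the point is that every size or duration in a log 
message should go through it rather than being printed as a raw long.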

> Print sensible units for all log messages
> -
>
> Key: CASSANDRA-9692
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9692
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Priority: Minor
>  Labels: lhf
> Fix For: 3.x
>
>
> Like CASSANDRA-9691, this has bugged me for too long. It also adversely impacts 
> log analysis. I've introduced some improvements to the bits I touched for 
> CASSANDRA-9681, but we should do this across the codebase. It's a small 
> investment for a lot of long term clarity in the logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7423) Allow updating individual subfields of UDT

2016-02-16 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-7423:
---
Assignee: Tyler Hobbs

> Allow updating individual subfields of UDT
> --
>
> Key: CASSANDRA-7423
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7423
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Tupshin Harper
>Assignee: Tyler Hobbs
>  Labels: cql
> Fix For: 3.x
>
>
> Since user defined types were implemented in CASSANDRA-5590 as blobs (you 
> have to rewrite the entire type in order to make any modifications), they 
> can't be safely used without LWT for any operation that wants to modify a 
> subset of the UDT's fields by any client process that is not authoritative 
> for the entire blob. 
> When trying to use UDTs to model complex records (particularly with nesting), 
> this is not an exceptional circumstance, this is the totally expected normal 
> situation. 
> The use of UDTs for anything non-trivial is harmful to either performance or 
> consistency or both.
> edit: to clarify, i believe that most potential uses of UDTs should be 
> considered anti-patterns until/unless we have field-level r/w access to 
> individual elements of the UDT, with individual timestamps and standard LWW 
> semantics



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-10546) Custom MV support

2016-02-16 Thread Tyler Hobbs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Hobbs resolved CASSANDRA-10546.
-
Resolution: Not A Problem

Closing this for now, as it doesn't look like it's going to be needed any time 
soon.

> Custom MV support
> -
>
> Key: CASSANDRA-10546
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10546
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Matthias Broecheler
>
> The MV implementation should be generalized to allow for custom materialized 
> view implementations. Like with MV, the logic would be triggered by a 
> mutation to some base table on which the custom MV is registered. A custom MV 
> would allow for custom logic to determine the "derived" mutations that need 
> to be applied as a result of the base table mutation. It would then ensure 
> that those derived mutations are applied (to other tables) as the current MV 
> implementation does.
> Note, that a custom MV implementation is responsible for ensuring that any 
> tables that derived mutations are written into exist. As such, a custom MV 
> implementation has an initialization logic which can create those tables upon 
> registration if needed. There should be no limit on what table a custom MV 
> can write derived records to (even existing ones).
> Example:
> (Note, that this example is somewhat construed for simplicity)
> We have a table in which we track user visits to certain properties with 
> timestamp:
> {code}
> CREATE TABLE visits (
>   userId bigint,
>   visitAt timestamp,
>   property varchar,
>   PRIMARY KEY (userId, visitAt)
> );
> {code}
> Every time a user visits a property, a record gets added to this table. 
> Records frequently come in out-of-order.
> At the same time, we would like to know who is currently visiting a 
> particular property (with their last entry time).
> For that, we create a custom MV registered against the {{visits}} table which 
> upon registration creates the following table:
> {code}
> CREATE TABLE currentlyVisiting (
>   property varchar,
>   userId bigint,
>   enteredOn timestamp,
>   PRIMARY KEY (property, userId)
> );
> {code}
> Now, when a record (u,v,p) gets inserted into the {{visits}} table the custom 
> MV logic gets invoked:
> # It reads the most recent visit record for user u: (u,v',p').
> # If no such record exists, it emits (p,u,v) targeting table 
> {{currentlyVisiting}} as a derived record to be persisted.
> # If such a record exists and v'>=v then it emits nothing. But if v'<v then it 
> emits records (p',u,v') to be deleted and (p,u,v) to be added to table 
> {{currentlyVisiting}}.
> The MV framework ensures that the emitted records get persisted correctly.
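
A compact toy model of the derived-mutation rule described above (plain Java, just to make 
steps 1-3 concrete; the class is made up for illustration and is in no way tied to how a 
custom MV API would actually look):

{code:java}
import java.util.HashMap;
import java.util.Map;

// Toy model of the "currentlyVisiting" logic; not an actual MV implementation.
public class CurrentlyVisitingSketch
{
    static final class Visit
    {
        final long visitAt;
        final String property;
        Visit(long visitAt, String property) { this.visitAt = visitAt; this.property = property; }
    }

    // latest visit per user (stands in for reading the base table)
    private final Map<Long, Visit> latestVisit = new HashMap<>();
    // derived table: (property, userId) -> enteredOn
    private final Map<String, Long> currentlyVisiting = new HashMap<>();

    void onBaseInsert(long userId, long visitAt, String property)
    {
        Visit prev = latestVisit.get(userId);
        if (prev != null && prev.visitAt >= visitAt)
            return;                                                  // out-of-order record: emit nothing
        if (prev != null)
            currentlyVisiting.remove(prev.property + ":" + userId);  // delete (p', u, v')
        currentlyVisiting.put(property + ":" + userId, visitAt);     // add (p, u, v)
        latestVisit.put(userId, new Visit(visitAt, property));
    }
}
{code}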



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9754) Make index info heap friendly for large CQL partitions

2016-02-16 Thread Michael Kjellman (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149034#comment-15149034
 ] 

Michael Kjellman commented on CASSANDRA-9754:
-

I was asked to update. Making good progress. A large number of tests pass. I'm 
basically just getting the math right for "Inception: The Cassandra Director's 
Cut" due to the need to make a B+ tree (disk-based by its very definition) 
work with SegmentedFile etc... 

> Make index info heap friendly for large CQL partitions
> --
>
> Key: CASSANDRA-9754
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9754
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: sankalp kohli
>Assignee: Michael Kjellman
>Priority: Minor
>
>  Looking at a heap dump of a 2.0 cluster, I found that the majority of the objects 
> are IndexInfo and its ByteBuffers. This is especially bad in endpoints with 
> large CQL partitions. If a CQL partition is, say, 6.4GB, it will have 100K 
> IndexInfo objects and 200K ByteBuffers. This will create a lot of churn for 
> GC. Can this be improved by not creating so many objects?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Jeff Ferland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149009#comment-15149009
 ] 

Jeff Ferland commented on CASSANDRA-11172:
--

Yes. It's after incremental repair that I'm seeing this. Next time around I'll 
check that the file listed doesn't exist before restart, but I think this is a 
duplicate.

Alternately, though, I'm also seeing the gossip thread lockup at times after 
incremental repair without mentioning higher level sstables, so that might be a 
new ticket to file next time around.

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
>Reporter: Jeff Ferland
>Assignee: Marcus Eriksson
>
> Observed that after a large repair on LCS, sometimes the system will 
> enter an infinite loop with vast amounts of log lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates".
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149005#comment-15149005
 ] 

Marcus Eriksson commented on CASSANDRA-11172:
-

do you run incremental repair?

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
>Reporter: Jeff Ferland
>Assignee: Marcus Eriksson
>
> Observed that after a large repair on LCS, sometimes the system will 
> enter an infinite loop with vast amounts of log lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates".
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Jeff Ferland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148994#comment-15148994
 ] 

Jeff Ferland commented on CASSANDRA-11172:
--

Possible duplicate of https://issues.apache.org/jira/browse/CASSANDRA-10831 ?

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
>Reporter: Jeff Ferland
>Assignee: Marcus Eriksson
>
> Observed that after a large repair on LCS, sometimes the system will 
> enter an infinite loop with vast amounts of log lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates".
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Jeff Ferland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148994#comment-15148994
 ] 

Jeff Ferland edited comment on CASSANDRA-11172 at 2/16/16 6:01 PM:
---

Possible duplicate of CASSANDRA-10831 ?


was (Author: autocracy):
Possible duplicate of https://issues.apache.org/jira/browse/CASSANDRA-10831 ?

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
>Reporter: Jeff Ferland
>Assignee: Marcus Eriksson
>
> Observed that after a large repair on LCS, sometimes the system will 
> enter an infinite loop with vast amounts of log lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates".
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-16 Thread Jeff Ferland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148988#comment-15148988
 ] 

Jeff Ferland commented on CASSANDRA-11172:
--

It's this: `INFO  [CompactionExecutor:12750] 2016-02-16 17:47:48,642  
LeveledManifest.java:415 - Adding high-level (L0) 
SSTableReader(path='/mnt/cassandra/data/youtube/youtube_videos-2d16275b7ff93269bea0aac894e1abaa/youtube-youtube_videos-ka-104968-Data.db')
 to candidates` repeating endlessly. That one's extra special this time because 
of the L0. I'll look in our aggregation server and try to pull anything useful from 
around the time when it starts.

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
>Reporter: Jeff Ferland
>Assignee: Marcus Eriksson
>
> Observed that after a large repair on LCS, sometimes the system will 
> enter an infinite loop with vast amounts of log lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates".
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10721) Altering a UDT might break UDA deserialisation

2016-02-16 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148979#comment-15148979
 ] 

Robert Stupp commented on CASSANDRA-10721:
--

Oh - heh - it seems that I coded the stuff on one machine but pushed to github 
from another. No, you didn't miss anything.

> Altering a UDT might break UDA deserialisation
> --
>
> Key: CASSANDRA-10721
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10721
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL, Distributed Metadata
>Reporter: Aleksey Yeschenko
>Assignee: Robert Stupp
> Fix For: 3.0.x
>
>
> CASSANDRA-10650 switched UDA's {{initcond}} serialisation in schema to its 
> CQL literal. This means that if any particular field is renamed in the UDT, 
> or if its type gets changed, we will not be able to parse the initcond back.
> We should either:
> 1) Forbid renames and type switches in UDTs that are being used in UDAs, or
> 2) Make sure we alter the UDAs in schema alongside the new UDT at all times



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10587) sstablemetadata NPE on cassandra 2.2

2016-02-16 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita updated CASSANDRA-10587:
---
Priority: Minor  (was: Major)
Reviewer: Paulo Motta

> sstablemetadata NPE on cassandra 2.2
> 
>
> Key: CASSANDRA-10587
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10587
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Tiago Batista
>Assignee: Yuki Morishita
>Priority: Minor
> Fix For: 2.2.x, 3.x
>
>
> I have recently upgraded my cassandra cluster to 2.2, currently running 
> 2.2.3. After running the first repair, cassandra renames the sstables to the 
> new naming schema that does not contain the keyspace name.
>  This causes sstablemetadata to fail with the following stack trace:
> {noformat}
> Exception in thread "main" java.lang.NullPointerException
> at 
> org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:275)
> at 
> org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:172)
> at 
> org.apache.cassandra.tools.SSTableMetadataViewer.main(SSTableMetadataViewer.java:52)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (CASSANDRA-10587) sstablemetadata NPE on cassandra 2.2

2016-02-16 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita reassigned CASSANDRA-10587:
--

Assignee: Yuki Morishita  (was: Paulo Motta)

> sstablemetadata NPE on cassandra 2.2
> 
>
> Key: CASSANDRA-10587
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10587
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Tiago Batista
>Assignee: Yuki Morishita
> Fix For: 2.2.x, 3.x
>
>
> I have recently upgraded my cassandra cluster to 2.2, currently running 
> 2.2.3. After running the first repair, cassandra renames the sstables to the 
> new naming schema that does not contain the keyspace name.
>  This causes sstablemetadata to fail with the following stack trace:
> {noformat}
> Exception in thread "main" java.lang.NullPointerException
> at 
> org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:275)
> at 
> org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:172)
> at 
> org.apache.cassandra.tools.SSTableMetadataViewer.main(SSTableMetadataViewer.java:52)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10587) sstablemetadata NPE on cassandra 2.2

2016-02-16 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148956#comment-15148956
 ] 

Yuki Morishita commented on CASSANDRA-10587:


This is caused by invoking {{sstablemetadata}} with a relative path like {{$ 
sstablemetadata la-1-big-Data.db}}.
You can work around it by giving the absolute path.
I will attach patches for the fix.
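
For anyone hitting this before the patch lands, the workaround looks like the following 
(the data directory and table directory are placeholders, not a real layout):

{noformat}
# NPE: relative path, run from inside the table directory
sstablemetadata la-1-big-Data.db

# works: same sstable addressed by its absolute path
sstablemetadata /var/lib/cassandra/data/mykeyspace/mytable-<table-uuid>/la-1-big-Data.db
{noformat}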

> sstablemetadata NPE on cassandra 2.2
> 
>
> Key: CASSANDRA-10587
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10587
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Tiago Batista
>Assignee: Paulo Motta
> Fix For: 2.2.x, 3.x
>
>
> I have recently upgraded my cassandra cluster to 2.2, currently running 
> 2.2.3. After running the first repair, cassandra renames the sstables to the 
> new naming schema that does not contain the keyspace name.
>  This causes sstablemetadata to fail with the following stack trace:
> {noformat}
> Exception in thread "main" java.lang.NullPointerException
> at 
> org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:275)
> at 
> org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:172)
> at 
> org.apache.cassandra.tools.SSTableMetadataViewer.main(SSTableMetadataViewer.java:52)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10371) Decommissioned nodes can remain in gossip

2016-02-16 Thread pavel (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148893#comment-15148893
 ] 

pavel commented on CASSANDRA-10371:
---

Logs. Dead node is 10.0.1.32

{noformat}
{"time":"2016-02-16T16:31:24.209Z","level":"TRACE","msg":"requestAll for 
/10.0.1.32"}
{"time":"2016-02-16T16:31:24.210Z","level":"TRACE","msg":"Received a 
GossipDigestAck2Message from /10.0.2.128"}
{"time":"2016-02-16T16:31:24.210Z","level":"TRACE","msg":"reporting 
/10.0.2.128"}
{"time":"2016-02-16T16:31:24.210Z","level":"TRACE","msg":"/10.0.2.128local 
generation 1454668510, remote generation 1454668510"}
{"time":"2016-02-16T16:31:24.210Z","level":"TRACE","msg":"Updating heartbeat 
state version to 2947566 from 2947563 for /10.0.2.128 ..."}
{"time":"2016-02-16T16:31:24.210Z","level":"TRACE","msg":"Ignoring gossip for 
/10.0.1.32 because it is quarantined"}
{"time":"2016-02-16T16:31:24.351Z","level":"TRACE","msg":"My heartbeat is now 
2947086"}
{"time":"2016-02-16T16:31:24.351Z","level":"TRACE","msg":"Gossip Digests are : 
/10.0.1.88:1454668666:2947086 /10.0.1.129:1454674252:2930159 
/10.0.2.128:1454668510:2947566 /10.0.1.12:1454675536:2926261 
/10.0.2.117:1454676714:2922695 /10.0.2.168:1455284954:1077795 "}
{"time":"2016-02-16T16:31:24.351Z","level":"TRACE","msg":"Sending a 
GossipDigestSyn to /10.0.1.12 ..."}
{"time":"2016-02-16T16:31:24.351Z","level":"TRACE","msg":"Performing status 
check ..."}
{"time":"2016-02-16T16:31:24.351Z","level":"TRACE","msg":"PHI for /10.0.2.128 : 
0.13901063851144269"}
{"time":"2016-02-16T16:31:24.351Z","level":"TRACE","msg":"PHI for /10.0.1.129 : 
0.21683730070438223"}
{"time":"2016-02-16T16:31:24.351Z","level":"TRACE","msg":"PHI for /10.0.2.168 : 
0.2206211272280096"}
{"time":"2016-02-16T16:31:24.351Z","level":"TRACE","msg":"PHI for /10.0.1.12 : 
0.2149248608862009"}
{"time":"2016-02-16T16:31:24.352Z","level":"TRACE","msg":"PHI for /10.0.2.117 : 
0.2130329987137985"}
{"time":"2016-02-16T16:31:24.352Z","level":"DEBUG","msg":"6 elapsed, 
/10.0.1.32 gossip quarantine over"}
{"time":"2016-02-16T16:31:24.352Z","level":"TRACE","msg":"Received a 
GossipDigestAckMessage from /10.0.1.12"}
{"time":"2016-02-16T16:31:24.352Z","level":"TRACE","msg":"Received ack with 2 
digests and 0 states"}
{"time":"2016-02-16T16:31:24.352Z","level":"TRACE","msg":"local heartbeat 
version 2947086 greater than 2947085 for /10.0.1.88"}
{"time":"2016-02-16T16:31:24.352Z","level":"TRACE","msg":"local heartbeat 
version 2947566 greater than 2947565 for /10.0.2.128"}
{"time":"2016-02-16T16:31:24.352Z","level":"TRACE","msg":"Sending a 
GossipDigestAck2Message to /10.0.1.12"}
{"time":"2016-02-16T16:31:24.542Z","level":"TRACE","msg":"local heartbeat 
version 2926261 greater than 2926258 for /10.0.1.12"}
{"time":"2016-02-16T16:31:24.542Z","level":"TRACE","msg":"Adding state 
SEVERITY: 0.0"}
{"time":"2016-02-16T16:31:24.542Z","level":"TRACE","msg":"local heartbeat 
version 2947086 greater than 2947080 for /10.0.1.88"}
{"time":"2016-02-16T16:31:24.542Z","level":"TRACE","msg":"Adding state 
SEVERITY: 0.0"}
{"time":"2016-02-16T16:31:24.542Z","level":"TRACE","msg":"local heartbeat 
version 2947566 greater than 2947563 for /10.0.2.128"}
{"time":"2016-02-16T16:31:24.542Z","level":"TRACE","msg":"Adding state 
SEVERITY: 0.0"}
{"time":"2016-02-16T16:31:24.543Z","level":"TRACE","msg":"Received a 
GossipDigestAck2Message from /10.0.1.129"}
{"time":"2016-02-16T16:31:24.543Z","level":"TRACE","msg":"reporting 
/10.0.1.129"}
{"time":"2016-02-16T16:31:24.543Z","level":"TRACE","msg":"/10.0.1.129local 
generation 1454674252, remote generation 1454674252"}
{"time":"2016-02-16T16:31:24.543Z","level":"TRACE","msg":"Updating heartbeat 
state version to 2930162 from 2930159 for /10.0.1.129 ..."}
{"time":"2016-02-16T16:31:25.209Z","level":"TRACE","msg":"local heartbeat 
version 2930162 greater than 2930159 for /10.0.1.129"}
{"time":"2016-02-16T16:31:25.209Z","level":"TRACE","msg":"Adding state 
SEVERITY: 0.0"}
{"time":"2016-02-16T16:31:25.209Z","level":"TRACE","msg":"local heartbeat 
version 2947086 greater than 2947085 for /10.0.1.88"}
{"time":"2016-02-16T16:31:25.209Z","level":"TRACE","msg":"Adding state 
SEVERITY: 0.0"}
{"time":"2016-02-16T16:31:25.209Z","level":"TRACE","msg":"requestAll for 
/10.0.1.32"}
{"time":"2016-02-16T16:31:25.210Z","level":"TRACE","msg":"Received a 
GossipDigestAck2Message from /10.0.2.128"}
{"time":"2016-02-16T16:31:25.210Z","level":"TRACE","msg":"reporting 
/10.0.2.128"}
{"time":"2016-02-16T16:31:25.210Z","level":"TRACE","msg":"reporting 
/10.0.2.117"}
{"time":"2016-02-16T16:31:25.210Z","level":"TRACE","msg":"/10.0.2.128local 
generation 1454668510, remote generation 1454668510"}
{"time":"2016-02-16T16:31:25.210Z","level":"TRACE","msg":"Updating heartbeat 
state version to 2947569 from 2947566 for /10.0.2.128 ..."}
{"time":"2016-02-16T16:31:25.210Z","level":"TRACE","msg":"reporting /10.0.1.32"}

[jira] [Comment Edited] (CASSANDRA-10587) sstablemetadata NPE on cassandra 2.2

2016-02-16 Thread Roman Skvazh (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148877#comment-15148877
 ] 

Roman Skvazh edited comment on CASSANDRA-10587 at 2/16/16 4:47 PM:
---

Same problem here on 2.2.5 after upgrading from 2.1.13, on all nodes, for the newly 
upgraded "short-named" files.
{noformat}
Exception in thread "main" java.lang.NullPointerException
at 
org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:271)
at 
org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:172)
at 
org.apache.cassandra.tools.SSTableMetadataViewer.main(SSTableMetadataViewer.java:52)
{noformat}


was (Author: rskvazh):
Same problem here on 2.2.5 after upgrading from 2.1.13 on all nodes.
{noformat}
Exception in thread "main" java.lang.NullPointerException
at 
org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:271)
at 
org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:172)
at 
org.apache.cassandra.tools.SSTableMetadataViewer.main(SSTableMetadataViewer.java:52)
{noformat}

> sstablemetadata NPE on cassandra 2.2
> 
>
> Key: CASSANDRA-10587
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10587
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Tiago Batista
>Assignee: Paulo Motta
> Fix For: 2.2.x, 3.x
>
>
> I have recently upgraded my cassandra cluster to 2.2, currently running 
> 2.2.3. After running the first repair, cassandra renames the sstables to the 
> new naming schema that does not contain the keyspace name.
>  This causes sstablemetadata to fail with the following stack trace:
> {noformat}
> Exception in thread "main" java.lang.NullPointerException
> at 
> org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:275)
> at 
> org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:172)
> at 
> org.apache.cassandra.tools.SSTableMetadataViewer.main(SSTableMetadataViewer.java:52)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9325) cassandra-stress requires keystore for SSL but provides no way to configure it

2016-02-16 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148880#comment-15148880
 ] 

Stefan Podkowinski commented on CASSANDRA-9325:
---

Can be reproduced using  e.g. the following stress tool options:
{{./bin/cassandra-stress "write n=100k cl=ONE no-warmup" -transport 
truststore=$HOME/truststore.jks truststore-password=cassandra}}
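
For comparison, once the proposed {{keystore}}/{{keystore-password}} parameters exist, a 
client-auth invocation would presumably look something like this (hypothetical: these two 
options are exactly what the ticket asks to add and are not in the current stress tool):

{noformat}
./bin/cassandra-stress "write n=100k cl=ONE no-warmup" -transport \
    truststore=$HOME/truststore.jks truststore-password=cassandra \
    keystore=$HOME/keystore.jks keystore-password=cassandra
{noformat}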


> cassandra-stress requires keystore for SSL but provides no way to configure it
> --
>
> Key: CASSANDRA-9325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: J.B. Langston
>Assignee: Stefan Podkowinski
>  Labels: lhf, stress
> Fix For: 2.2.x
>
>
> Even though it shouldn't be required unless client certificate authentication 
> is enabled, the stress tool is looking for a keystore in the default location 
> of conf/.keystore with the default password of cassandra. There is no command 
> line option to override these defaults so you have to provide a keystore that 
> satisfies the default. It looks for conf/.keystore in the working directory, 
> so you need to create this in the directory you are running cassandra-stress 
> from. It doesn't really matter what's in the keystore; it just needs to exist 
> in the expected location and have a password of cassandra.
> Since the keystore might be required if client certificate authentication is 
> enabled, we need to add -transport parameters for keystore and 
> keystore-password.  Ideally, these should be optional and stress shouldn't 
> require the keystore unless client certificate authentication is enabled on 
> the server.
> In case it wasn't apparent, this is for Cassandra 2.1 and later's stress 
> tool.  I actually had even more problems getting Cassandra 2.0's stress tool 
> working with SSL and gave up on it.  We probably don't need to fix 2.0; we 
> can just document that it doesn't support SSL and recommend using 2.1 instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10587) sstablemetadata NPE on cassandra 2.2

2016-02-16 Thread Roman Skvazh (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148877#comment-15148877
 ] 

Roman Skvazh commented on CASSANDRA-10587:
--

Same problem here on 2.2.5 after upgrading from 2.1.13 on all nodes.
{noformat}
Exception in thread "main" java.lang.NullPointerException
at 
org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:271)
at 
org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:172)
at 
org.apache.cassandra.tools.SSTableMetadataViewer.main(SSTableMetadataViewer.java:52)
{noformat}

> sstablemetadata NPE on cassandra 2.2
> 
>
> Key: CASSANDRA-10587
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10587
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Tiago Batista
>Assignee: Paulo Motta
> Fix For: 2.2.x, 3.x
>
>
> I have recently upgraded my cassandra cluster to 2.2, currently running 
> 2.2.3. After running the first repair, cassandra renames the sstables to the 
> new naming schema that does not contain the keyspace name.
>  This causes sstablemetadata to fail with the following stack trace:
> {noformat}
> Exception in thread "main" java.lang.NullPointerException
> at 
> org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:275)
> at 
> org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:172)
> at 
> org.apache.cassandra.tools.SSTableMetadataViewer.main(SSTableMetadataViewer.java:52)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10371) Decommissioned nodes can remain in gossip

2016-02-16 Thread pavel (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148886#comment-15148886
 ] 

pavel commented on CASSANDRA-10371:
---

Cluster info

{noformat}
bash-4.3# nodetool status
Datacenter: cassandra
=
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address Load   Tokens  OwnsHost ID  
 Rack
UN  10.0.2.128  1.89 GB256 ?   f019fed3-9881-4ca6-b033-3af6390c74aa 
 RAC1
UN  10.0.1.129  1.79 GB256 ?   40c12e86-759d-495e-8aab-f388142fe424 
 RAC1
UN  10.0.2.117  1.63 GB256 ?   9b28cac9-27a0-45fa-acd3-c509d528a09b 
 RAC1
UN  10.0.1.88   2.05 GB256 ?   d141da3e-8bc9-4c7a-8f03-2bc728df71b8 
 RAC1
UN  10.0.2.168  1.39 GB256 ?   5243d4ba-47c2-4cce-91da-246b63d029a7 
 RAC1
UN  10.0.1.12   1.55 GB256 ?   2622b27e-a730-440d-b8fd-dc88f0a48487 
 RAC1
{noformat}

> Decommissioned nodes can remain in gossip
> -
>
> Key: CASSANDRA-10371
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10371
> Project: Cassandra
>  Issue Type: Bug
>  Components: Distributed Metadata
>Reporter: Brandon Williams
>Assignee: Stefania
>Priority: Minor
>
> This may apply to other dead states as well.  Dead states should be expired 
> after 3 days.  In the case of decom we attach a timestamp to let the other 
> nodes know when it should be expired.  It has been observed that sometimes a 
> subset of nodes in the cluster never expire the state, and through heap 
> analysis of these nodes it is revealed that the epstate.isAlive check returns 
> true when it should return false, which would allow the state to be evicted.  
> This may have been affected by CASSANDRA-8336.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (CASSANDRA-9325) cassandra-stress requires keystore for SSL but provides no way to configure it

2016-02-16 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski reassigned CASSANDRA-9325:
-

Assignee: Stefan Podkowinski

> cassandra-stress requires keystore for SSL but provides no way to configure it
> --
>
> Key: CASSANDRA-9325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: J.B. Langston
>Assignee: Stefan Podkowinski
>  Labels: lhf, stress
> Fix For: 2.1.x
>
>
> Even though it shouldn't be required unless client certificate authentication 
> is enabled, the stress tool is looking for a keystore in the default location 
> of conf/.keystore with the default password of cassandra. There is no command 
> line option to override these defaults so you have to provide a keystore that 
> satisfies the default. It looks for conf/.keystore in the working directory, 
> so you need to create this in the directory you are running cassandra-stress 
> from. It doesn't really matter what's in the keystore; it just needs to exist 
> in the expected location and have a password of cassandra.
> Since the keystore might be required if client certificate authentication is 
> enabled, we need to add -transport parameters for keystore and 
> keystore-password.  Ideally, these should be optional and stress shouldn't 
> require the keystore unless client certificate authentication is enabled on 
> the server.
> In case it wasn't apparent, this is for Cassandra 2.1 and later's stress 
> tool.  I actually had even more problems getting Cassandra 2.0's stress tool 
> working with SSL and gave up on it.  We probably don't need to fix 2.0; we 
> can just document that it doesn't support SSL and recommend using 2.1 instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-02-16 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148842#comment-15148842
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

bq. Do we intend to reuse the lock table for other maintenance tasks as well? 
If so, we must add a generic "holder" column to the lock table so we can reuse 
to identify resources other than the parent repair session in the future. We 
could also add an "attributes" map in the lock table to store additional 
attributes such as status, or have a separate table to maintain status to keep 
the lock table simple.

I think it could be reused, so it's probably better to make it generic from the 
start. I think that as long as we don't put too much data in the attributes 
map, it could be stored in the lock table. Another thing is that it's tightly 
bound to the lock itself, since we will use it to clean up repairs without a 
lock, which means keeping it in a single table is probably the easiest solution.
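
To make the idea concrete, a generic lock table along those lines could look roughly like 
the following (keyspace, table and column names are placeholders, not a settled schema):

{code}
CREATE TABLE system_distributed.resource_lock (
    resource text,                 -- generic resource id, e.g. 'repair:<keyspace>:<range>'
    holder uuid,                   -- node (or session) currently holding the lock
    attributes map<text, text>,    -- small amount of metadata, e.g. status
    PRIMARY KEY (resource)
) WITH default_time_to_live = 600;
{code}

Acquisition would then presumably go through a lightweight transaction 
({{INSERT ... IF NOT EXISTS}}), with the TTL making sure a lock held by a node that dies 
is eventually released.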

Another thing we should probably consider is whether or not multiple types of 
maintenance work should run simultaneously. If we need to add this constraint, 
should they use the same lock resources?

bq. Ideally all repairs would go through this interface, but this would 
probably add complexity at this stage. So we should probably just add a 
"lockResource" attribute to each repair session object, and each node would go 
through all repairs currently running checking if it still holds the lock in 
case the "lockResource" field is set.

Sounds good, let's start with the lockResource field in the repair session and 
move to scheduled repairs all together later on (maybe optionally scheduled via 
JMX at first?).

{quote}
It would probably be safe to abort ongoing validation and stream background 
tasks and cleanup repair state on all involved nodes before starting a new 
repair session in the same ranges. This doesn't seem to be done currently. As 
far as I understood, if there are nodes A, B, C running repair, A is the 
coordinator. If validation or streaming fails on node B, the coordinator (A) is 
notified and fails the repair session, but node C will remain doing validation 
and/or streaming, what could cause problems (or increased load) if we start 
another repair session on the same range. 

We will probably need to extend the repair protocol to perform this 
cleanup/abort step on failure. We already have a legacy cleanup message that 
doesn't seem to be used in the current protocol that we could maybe reuse to 
cleanup repair state after a failure. This repair abortion will probably have 
intersection with CASSANDRA-3486. In any case, this is a separate (but related) 
issue and we should address it in an independent ticket, and make this ticket 
dependent on that.
{quote}

Right now it seems that the cleanup message is only used to remove the parent 
repair session from the ActiveRepairService's map. I guess that if we were to 
use it we would have to rewrite it to stop validation and streaming as well. 
But as you said, it should be done in a separate ticket.

bq. Another unrelated option that we should probably include in the future is a 
timeout, and abort repair sessions running longer than that.

Agreed. Do we have any timeout scenarios that we could foresee before they 
occur? Would it be possible for a node to "drop" a validation/streaming task without 
notifying the repair coordinator? If we could detect that, it would be good to 
abort the repair as early as possible, assuming that the timeout would be set 
rather high.

> Automatic repair scheduling
> ---
>
> Key: CASSANDRA-10070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10070
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Olsson
>Assignee: Marcus Olsson
>Priority: Minor
> Fix For: 3.x
>
> Attachments: Distributed Repair Scheduling.doc
>
>
> Scheduling and running repairs in a Cassandra cluster is most often a 
> required task, but this can both be hard for new users and it also requires a 
> bit of manual configuration. There are good tools out there that can be used 
> to simplify things, but wouldn't this be a good feature to have inside of 
> Cassandra? To automatically schedule and run repairs, so that when you start 
> up your cluster it basically maintains itself in terms of normal 
> anti-entropy, with the possibility for manual configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (CASSANDRA-11170) Uneven load can be created by cross DC mutation propagations, as remote coordinator is not randomly picked

2016-02-16 Thread Wei Deng (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Deng reassigned CASSANDRA-11170:


Assignee: Wei Deng

> Uneven load can be created by cross DC mutation propagations, as remote 
> coordinator is not randomly picked
> --
>
> Key: CASSANDRA-11170
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11170
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Wei Deng
>Assignee: Wei Deng
>
> I was looking at the o.a.c.service.StorageProxy code and realized that it 
> seems to be always picking the first IP in the remote DC target list as the 
> destination, whenever it needs to send the mutation to a remote DC. See these 
> lines in the code:
> https://github.com/apache/cassandra/blob/1944bf507d66b5c103c136319caeb4a9e3767a69/src/java/org/apache/cassandra/service/StorageProxy.java#L1280-L1301
> This could cause one node in the remote DC receiving more mutation messages 
> than the other nodes, and hence uneven workload distribution.
> A trivial test (with TRACE logging level enabled) on a 3+3 node cluster 
> proved the problem, see the system.log entries below:
> {code}
> INFO  [RMI TCP Connection(18)-54.173.227.52] 2016-02-13 09:54:55,948  
> StorageService.java:3353 - set log level to TRACE for classes under 
> 'org.apache.cassandra.service.StorageProxy' (if the level doesn't look like 
> 'TRACE' then the logger couldn't parse 'TRACE')
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:15,148  StorageProxy.java:1284 - 
> Adding FWD message to 8996@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:15,149  StorageProxy.java:1284 - 
> Adding FWD message to 8997@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:15,149  StorageProxy.java:1289 - 
> Sending message to 8998@/54.183.209.219
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:22,939  StorageProxy.java:1284 - 
> Adding FWD message to 9032@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:22,940  StorageProxy.java:1284 - 
> Adding FWD message to 9033@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:22,941  StorageProxy.java:1289 - 
> Sending message to 9034@/54.183.209.219
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:28,975  StorageProxy.java:1284 - 
> Adding FWD message to 9064@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:28,976  StorageProxy.java:1284 - 
> Adding FWD message to 9065@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:28,977  StorageProxy.java:1289 - 
> Sending message to 9066@/54.183.209.219
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:33,464  StorageProxy.java:1284 - 
> Adding FWD message to 9094@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:33,465  StorageProxy.java:1284 - 
> Adding FWD message to 9095@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:33,478  StorageProxy.java:1289 - 
> Sending message to 9096@/54.183.209.219
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:39,243  StorageProxy.java:1284 - 
> Adding FWD message to 9121@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:39,244  StorageProxy.java:1284 - 
> Adding FWD message to 9122@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:39,244  StorageProxy.java:1289 - 
> Sending message to 9123@/54.183.209.219
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:44,248  StorageProxy.java:1284 - 
> Adding FWD message to 9145@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:44,249  StorageProxy.java:1284 - 
> Adding FWD message to 9146@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:44,249  StorageProxy.java:1289 - 
> Sending message to 9147@/54.183.209.219
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:49,731  StorageProxy.java:1284 - 
> Adding FWD message to 9170@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:49,734  StorageProxy.java:1284 - 
> Adding FWD message to 9171@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:49,735  StorageProxy.java:1289 - 
> Sending message to 9172@/54.183.209.219
> INFO  [RMI TCP Connection(22)-54.173.227.52] 2016-02-13 09:56:19,545  
> StorageService.java:3353 - set log level to INFO for classes under 
> 'org.apache.cassandra.service.StorageProxy' (if the level doesn't look like 
> 'INFO' then the logger couldn't parse 'INFO')
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11065) null pointer exception in CassandraDaemon.java:195

2016-02-16 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148836#comment-15148836
 ] 

Yuki Morishita commented on CASSANDRA-11065:


[~pauloricardomg] Checking whether the schema (ks/table) exists is something we 
need to do whenever we encounter {{Keyspace.open...}} throughout the project, 
and I think what we do when we find the schema is missing varies from case to case.
I prefer fixing {{RepairVerbHandler}} only in this case.

Now, your patch throws {{KeyspaceNotDefinedException}}, which is a 
RuntimeException, so it can hang the repair process. And for the first case, we can 
fail when accessing the pair of Strings obtained from {{Pair kscf = 
Schema.instance.getCF(cfId);}}, not from {{Keyspace.open}}.

For each case, we should log the error and send back a message that indicates 
"failure".

- PREPARE_* and SNAPSHOT: send back INTERNAL_RESPONSE with failure flag on (as 
done in SNAPSHOT)
- VALIDATION_REQUEST: send back ValidationResponse with empty tree (see 
Validator)
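
As a toy model of that behaviour (the class, enum and helper names below are made up; this 
is not the actual {{RepairMessageVerbHandler}} code, just the shape of replying with an 
explicit failure instead of throwing when the keyspace/table no longer exists):

{code:java}
// Toy model only: reply with an explicit failure rather than throwing,
// so the repair coordinator is not left waiting forever.
public class MissingSchemaHandlingSketch
{
    enum MessageType { PREPARE_MESSAGE, SNAPSHOT, VALIDATION_REQUEST }

    interface Responder
    {
        void replyFailure();     // e.g. INTERNAL_RESPONSE with the failure flag set
        void replyEmptyTree();   // e.g. a validation response carrying no tree
    }

    static void handle(MessageType type, boolean schemaExists, Responder responder)
    {
        if (schemaExists)
            return; // normal processing, omitted here

        System.err.println("Table for repair message no longer exists, replying with failure");
        switch (type)
        {
            case PREPARE_MESSAGE:
            case SNAPSHOT:
                responder.replyFailure();
                break;
            case VALIDATION_REQUEST:
                responder.replyEmptyTree();
                break;
        }
    }

    public static void main(String[] args)
    {
        handle(MessageType.VALIDATION_REQUEST, false, new Responder()
        {
            public void replyFailure()   { System.out.println("sent failure response"); }
            public void replyEmptyTree() { System.out.println("sent empty validation tree"); }
        });
    }
}
{code}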



> null pointer exception in CassandraDaemon.java:195
> --
>
> Key: CASSANDRA-11065
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11065
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Vassil Lunchev
>Assignee: Paulo Motta
>Priority: Minor
>
> Running Cassandra 3.0.1 installed from apt-get on debian.
> I had a keyspace called 'tests'. I dropped it. Then I checked some nodes and 
> one of them still had that keyspace 'tests'. On a node that still has the 
> dropped keyspace I ran:
> nodetools repair tests;
> In the system logs of another node that did not have keyspace 'tests' I am 
> seeing a null pointer exception:
> {code:java}
> ERROR [AntiEntropyStage:2] 2016-01-25 15:02:46,323 
> RepairMessageVerbHandler.java:161 - Got error, removing parent repair session
> ERROR [AntiEntropyStage:2] 2016-01-25 15:02:46,324 CassandraDaemon.java:195 - 
> Exception in thread Thread[AntiEntropyStage:2,5,main]
> java.lang.RuntimeException: java.lang.NullPointerException
>   at 
> org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:164)
>  ~[apache-cassandra-3.0.1.jar:3.0.1]
>   at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) 
> ~[apache-cassandra-3.0.1.jar:3.0.1]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_66-internal]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  ~[na:1.8.0_66-internal]
>   at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_66-internal]
> Caused by: java.lang.NullPointerException: null
>   at 
> org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:69)
>  ~[apache-cassandra-3.0.1.jar:3.0.1]
>   ... 4 common frames omitted
> {code}
> The error appears every time I run:
> nodetools repair tests;
> I can see it in the logs of all nodes, including the node on which I run the 
> repair.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11170) Uneven load can be created by cross DC mutation propagations, as remote coordinator is not randomly picked

2016-02-16 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148820#comment-15148820
 ] 

Aleksey Yeschenko commented on CASSANDRA-11170:
---

I agree the list should be randomised per request (though I'm going to go ahead 
and relabel the ticket as Improvement).
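
For illustration, a minimal self-contained sketch of per-request randomisation 
(placeholder names, not the actual StorageProxy code):

{code:java}
import java.net.InetAddress;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Sketch only: shuffle the remote-DC replica list per mutation so the endpoint that
// receives the message (and forwards it to the rest via FWD headers) is picked at random.
final class RemoteDcForwardSketch
{
    static InetAddress pickForwardCoordinator(List<InetAddress> remoteDcReplicas)
    {
        List<InetAddress> shuffled = new ArrayList<>(remoteDcReplicas);
        Collections.shuffle(shuffled, ThreadLocalRandom.current());
        return shuffled.get(0); // the rest of 'shuffled' would go into the FWD_TO header
    }
}
{code}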

> Uneven load can be created by cross DC mutation propagations, as remote 
> coordinator is not randomly picked
> --
>
> Key: CASSANDRA-11170
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11170
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Wei Deng
>
> I was looking at the o.a.c.service.StorageProxy code and realized that it 
> seems to be always picking the first IP in the remote DC target list as the 
> destination, whenever it needs to send the mutation to a remote DC. See these 
> lines in the code:
> https://github.com/apache/cassandra/blob/1944bf507d66b5c103c136319caeb4a9e3767a69/src/java/org/apache/cassandra/service/StorageProxy.java#L1280-L1301
> This could cause one node in the remote DC receiving more mutation messages 
> than the other nodes, and hence uneven workload distribution.
> A trivial test (with TRACE logging level enabled) on a 3+3 node cluster 
> proved the problem, see the system.log entries below:
> {code}
> INFO  [RMI TCP Connection(18)-54.173.227.52] 2016-02-13 09:54:55,948  
> StorageService.java:3353 - set log level to TRACE for classes under 
> 'org.apache.cassandra.service.StorageProxy' (if the level doesn't look like 
> 'TRACE' then the logger couldn't parse 'TRACE')
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:15,148  StorageProxy.java:1284 - 
> Adding FWD message to 8996@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:15,149  StorageProxy.java:1284 - 
> Adding FWD message to 8997@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:15,149  StorageProxy.java:1289 - 
> Sending message to 8998@/54.183.209.219
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:22,939  StorageProxy.java:1284 - 
> Adding FWD message to 9032@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:22,940  StorageProxy.java:1284 - 
> Adding FWD message to 9033@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:22,941  StorageProxy.java:1289 - 
> Sending message to 9034@/54.183.209.219
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:28,975  StorageProxy.java:1284 - 
> Adding FWD message to 9064@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:28,976  StorageProxy.java:1284 - 
> Adding FWD message to 9065@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:28,977  StorageProxy.java:1289 - 
> Sending message to 9066@/54.183.209.219
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:33,464  StorageProxy.java:1284 - 
> Adding FWD message to 9094@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:33,465  StorageProxy.java:1284 - 
> Adding FWD message to 9095@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:33,478  StorageProxy.java:1289 - 
> Sending message to 9096@/54.183.209.219
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:39,243  StorageProxy.java:1284 - 
> Adding FWD message to 9121@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:39,244  StorageProxy.java:1284 - 
> Adding FWD message to 9122@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:39,244  StorageProxy.java:1289 - 
> Sending message to 9123@/54.183.209.219
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:44,248  StorageProxy.java:1284 - 
> Adding FWD message to 9145@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:44,249  StorageProxy.java:1284 - 
> Adding FWD message to 9146@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:44,249  StorageProxy.java:1289 - 
> Sending message to 9147@/54.183.209.219
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:49,731  StorageProxy.java:1284 - 
> Adding FWD message to 9170@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:49,734  StorageProxy.java:1284 - 
> Adding FWD message to 9171@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:49,735  StorageProxy.java:1289 - 
> Sending message to 9172@/54.183.209.219
> INFO  [RMI TCP Connection(22)-54.173.227.52] 2016-02-13 09:56:19,545  
> StorageService.java:3353 - set log level to INFO for classes under 
> 'org.apache.cassandra.service.StorageProxy' (if the level doesn't look like 
> 'INFO' then the logger couldn't parse 'INFO')
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10430) "Load" report from "nodetool status" is inaccurate

2016-02-16 Thread clint martin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148816#comment-15148816
 ] 

clint martin commented on CASSANDRA-10430:
--

Thank you!

> "Load" report from "nodetool status" is inaccurate
> --
>
> Key: CASSANDRA-10430
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10430
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Cassandra v2.1.9 running on 6 node Amazon AWS, vnodes 
> enabled. 
>Reporter: julia zhang
> Attachments: system.log.2.zip, system.log.3.zip, system.log.4.zip
>
>
> After running an incremental repair, nodetool status report unbalanced load 
> among cluster. 
> $ nodetool status mykeyspace
> ==
> ||Status|| Address ||Load   ||Tokens  ||Owns (effective)  
> ||Host ID ||  Rack || 
> |UN  |10.1.1.1  |1.13 TB   |256|48.5%
> |a4477534-a5c6-4e3e-9108-17a69aebcfc0|  RAC1|
> |UN  |10.1.1.2  |2.58 TB   |256 |50.5% 
> |1a7c3864-879f-48c5-8dde-bc00cf4b23e6  |RAC2|
> |UN  |10.1.1.3  |1.49 TB   |256 |51.5% 
> |27df5b30-a5fc-44a5-9a2c-1cd65e1ba3f7  |RAC1|
> |UN  |10.1.1.4  |250.97 GB  |256 |51.9% 
> |9898a278-2fe6-4da2-b6dc-392e5fda51e6  |RAC3|
> |UN  |10.1.1.5 |1.88 TB  |256 |49.5% 
> |04aa9ce1-c1c3-4886-8d72-270b024b49b9  |RAC2|
> |UN  |10.1.1.6 |1.3 TB|256 |48.1% 
> |6d5d48e6-d188-4f88-808d-dcdbb39fdca5  |RAC3|
> It seems that only 10.1.1.4 reports correct "Load". There is no hints in the 
> cluster and report remains the same after running "nodetool cleanup" on each 
> node. "nodetool cfstats" shows number of keys are evenly distributed and 
> Cassandra data physical disk on each node report about the same usage. 
> "nodetool status" report these inaccurate large storage load until we restart 
> each node, after the restart, "Load" report match what we've seen from disk.  
> We did not see this behavior until upgrade to v2.1.9



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11170) Uneven load can be created by cross DC mutation propagations, as remote coordinator is not randomly picked

2016-02-16 Thread Ryan Svihla (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148798#comment-15148798
 ] 

Ryan Svihla commented on CASSANDRA-11170:
-

Fat partitions and bad data models exist; there is no need to make them worse by 
pinning all write load to one unlucky node until it dies. Going to a single node 
just lowers the bar for the data model falling apart. I get that RF writes will 
happen anyway, but I'm assuming coordinator work is non-trivial (especially at 
higher consistency levels), and I know from observation that hint handling and 
replay is non-trivial at certain points (certainly improved with file-based 
hints, but certainly not free). Final point: if you think of the stereotypical 
time-series bucket data model, the "stick to the primary token owner" approach 
will generate more hints than any strategy that balances the load.

> Uneven load can be created by cross DC mutation propagations, as remote 
> coordinator is not randomly picked
> --
>
> Key: CASSANDRA-11170
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11170
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Wei Deng
>
> I was looking at the o.a.c.service.StorageProxy code and realized that it 
> seems to be always picking the first IP in the remote DC target list as the 
> destination, whenever it needs to send the mutation to a remote DC. See these 
> lines in the code:
> https://github.com/apache/cassandra/blob/1944bf507d66b5c103c136319caeb4a9e3767a69/src/java/org/apache/cassandra/service/StorageProxy.java#L1280-L1301
> This could cause one node in the remote DC receiving more mutation messages 
> than the other nodes, and hence uneven workload distribution.
> A trivial test (with TRACE logging level enabled) on a 3+3 node cluster 
> proved the problem, see the system.log entries below:
> {code}
> INFO  [RMI TCP Connection(18)-54.173.227.52] 2016-02-13 09:54:55,948  
> StorageService.java:3353 - set log level to TRACE for classes under 
> 'org.apache.cassandra.service.StorageProxy' (if the level doesn't look like 
> 'TRACE' then the logger couldn't parse 'TRACE')
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:15,148  StorageProxy.java:1284 - 
> Adding FWD message to 8996@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:15,149  StorageProxy.java:1284 - 
> Adding FWD message to 8997@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:15,149  StorageProxy.java:1289 - 
> Sending message to 8998@/54.183.209.219
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:22,939  StorageProxy.java:1284 - 
> Adding FWD message to 9032@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:22,940  StorageProxy.java:1284 - 
> Adding FWD message to 9033@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:22,941  StorageProxy.java:1289 - 
> Sending message to 9034@/54.183.209.219
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:28,975  StorageProxy.java:1284 - 
> Adding FWD message to 9064@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:28,976  StorageProxy.java:1284 - 
> Adding FWD message to 9065@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:28,977  StorageProxy.java:1289 - 
> Sending message to 9066@/54.183.209.219
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:33,464  StorageProxy.java:1284 - 
> Adding FWD message to 9094@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:33,465  StorageProxy.java:1284 - 
> Adding FWD message to 9095@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:33,478  StorageProxy.java:1289 - 
> Sending message to 9096@/54.183.209.219
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:39,243  StorageProxy.java:1284 - 
> Adding FWD message to 9121@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:39,244  StorageProxy.java:1284 - 
> Adding FWD message to 9122@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:39,244  StorageProxy.java:1289 - 
> Sending message to 9123@/54.183.209.219
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:44,248  StorageProxy.java:1284 - 
> Adding FWD message to 9145@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:44,249  StorageProxy.java:1284 - 
> Adding FWD message to 9146@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:44,249  StorageProxy.java:1289 - 
> Sending message to 9147@/54.183.209.219
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:49,731  StorageProxy.java:1284 - 
> Adding FWD message to 9170@/52.53.215.74
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:49,734  StorageProxy.java:1284 - 
> Adding FWD message to 9171@/54.183.23.201
> TRACE [SharedPool-Worker-1] 2016-02-13 09:55:49,735  StorageProxy.java:1289 - 
> Sending message to 

[jira] [Commented] (CASSANDRA-10458) cqlshrc: add option to always use ssl

2016-02-16 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148780#comment-15148780
 ] 

Stefan Podkowinski commented on CASSANDRA-10458:


Would you mind reviewing this very simple patch, [~pauloricardomg]?
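
For context, here is a cqlshrc sketch of what such an option could look like 
(the option name and section are assumptions for illustration, not necessarily 
what the patch adds):

{noformat}
[connection]
hostname = 127.0.0.1
port = 9042
; hypothetical flag: with this set, plain 'cqlsh' would behave like 'cqlsh --ssl'
ssl = true

[ssl]
certfile = ~/keys/node.cer
validate = true
{noformat}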

> cqlshrc: add option to always use ssl
> -
>
> Key: CASSANDRA-10458
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10458
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Matt Wringe
>Assignee: Stefan Podkowinski
>  Labels: lhf
>
> I am currently running on a system in which my cassandra cluster is only 
> accessible over tls.
> The cqlshrc file is used to specify the host, the certificates and other 
> configurations, but one option its missing is to always connect over ssl.
> I would like to be able to call 'cqlsh' instead of always having to specify 
> 'cqlsh --ssl'



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (CASSANDRA-10458) cqlshrc: add option to always use ssl

2016-02-16 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski reassigned CASSANDRA-10458:
--

Assignee: Stefan Podkowinski

> cqlshrc: add option to always use ssl
> -
>
> Key: CASSANDRA-10458
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10458
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Matt Wringe
>Assignee: Stefan Podkowinski
>  Labels: lhf
>
> I am currently running on a system in which my cassandra cluster is only 
> accessible over tls.
> The cqlshrc file is used to specify the host, the certificates and other 
> configurations, but one option its missing is to always connect over ssl.
> I would like to be able to call 'cqlsh' instead of always having to specify 
> 'cqlsh --ssl'



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11035) Use cardinality estimation to pick better compaction candidates for STCS (SizeTieredCompactionStrategy)

2016-02-16 Thread DOAN DuyHai (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148761#comment-15148761
 ] 

DOAN DuyHai commented on CASSANDRA-11035:
-

Ahhh nice, I'm going to look into it. I'm in touch with a student also doing 
his thesis on a heuristic to optimize the overlap for any given set of keys.

> Use cardinality estimation to pick better compaction candidates for STCS 
> (SizeTieredCompactionStrategy)
> ---
>
> Key: CASSANDRA-11035
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11035
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Wei Deng
>Assignee: Marcus Eriksson
>
> This was initially mentioned in this blog post 
> http://www.datastax.com/dev/blog/improving-compaction-in-cassandra-with-cardinality-estimation
>  but I couldn't find any existing JIRA for it. As stated by [~jbellis], 
> "Potentially even more useful would be using cardinality estimation to pick 
> better compaction candidates. Instead of blindly merging sstables of a 
> similar size a la SizeTieredCompactionStrategy." The L0 STCS in LCS should 
> benefit as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9692) Print sensible units for all log messages

2016-02-16 Thread Giampaolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148672#comment-15148672
 ] 

Giampaolo commented on CASSANDRA-9692:
--

I would like to work on this as my first contribution to the project. I 
understand it's a long but simple task that would help me get familiar with the 
whole codebase. Is my evaluation correct?

> Print sensible units for all log messages
> -
>
> Key: CASSANDRA-9692
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9692
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Priority: Minor
>  Labels: lhf
> Fix For: 3.x
>
>
> Like CASSANDRA-9691, this has bugged me too long. it also adversely impacts 
> log analysis. I've introduced some improvements to the bits I touched for 
> CASSANDRA-9681, but we should do this across the codebase. It's a small 
> investment for a lot of long term clarity in the logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10724) Allow option to only encrypt username/password transfer, not data

2016-02-16 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148668#comment-15148668
 ] 

Stefan Podkowinski commented on CASSANDRA-10724:


Username/password authentication only takes place for client-to-node 
communication at the beginning of _each_ connection, using SASL over either an 
unencrypted or a TLS-secured connection. In the TLS case, all further data is 
sent encrypted afterwards. I'm not aware of any way to downgrade the TLS 
connection to plaintext after authentication, if that's what you're suggesting. 
Can you elaborate on why you need to protect the user credentials, but would be 
fine with sending all actual data unencrypted?

> Allow option to only encrypt username/password transfer, not data
> -
>
> Key: CASSANDRA-10724
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10724
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Thom Valley
>Priority: Minor
>
> Turning on SSL for both client->node and node->node connections is a resource 
> intensive (expensive) operation.
> Being able to only encrypt the username/password when passed (or looked up) 
> as an option would greatly reduce the encryption / decryption overhead 
> created by turning on SSL for all traffic.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-10430) "Load" report from "nodetool status" is inaccurate

2016-02-16 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson resolved CASSANDRA-10430.
-
   Resolution: Duplicate
Fix Version/s: (was: 2.1.x)

this was fixed in CASSANDRA-10831

> "Load" report from "nodetool status" is inaccurate
> --
>
> Key: CASSANDRA-10430
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10430
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Cassandra v2.1.9 running on 6 node Amazon AWS, vnodes 
> enabled. 
>Reporter: julia zhang
> Attachments: system.log.2.zip, system.log.3.zip, system.log.4.zip
>
>
> After running an incremental repair, nodetool status report unbalanced load 
> among cluster. 
> $ nodetool status mykeyspace
> ==
> ||Status|| Address ||Load   ||Tokens  ||Owns (effective)  
> ||Host ID ||  Rack || 
> |UN  |10.1.1.1  |1.13 TB   |256|48.5%
> |a4477534-a5c6-4e3e-9108-17a69aebcfc0|  RAC1|
> |UN  |10.1.1.2  |2.58 TB   |256 |50.5% 
> |1a7c3864-879f-48c5-8dde-bc00cf4b23e6  |RAC2|
> |UN  |10.1.1.3  |1.49 TB   |256 |51.5% 
> |27df5b30-a5fc-44a5-9a2c-1cd65e1ba3f7  |RAC1|
> |UN  |10.1.1.4  |250.97 GB  |256 |51.9% 
> |9898a278-2fe6-4da2-b6dc-392e5fda51e6  |RAC3|
> |UN  |10.1.1.5 |1.88 TB  |256 |49.5% 
> |04aa9ce1-c1c3-4886-8d72-270b024b49b9  |RAC2|
> |UN  |10.1.1.6 |1.3 TB|256 |48.1% 
> |6d5d48e6-d188-4f88-808d-dcdbb39fdca5  |RAC3|
> It seems that only 10.1.1.4 reports correct "Load". There is no hints in the 
> cluster and report remains the same after running "nodetool cleanup" on each 
> node. "nodetool cfstats" shows number of keys are evenly distributed and 
> Cassandra data physical disk on each node report about the same usage. 
> "nodetool status" report these inaccurate large storage load until we restart 
> each node, after the restart, "Load" report match what we've seen from disk.  
> We did not see this behavior until upgrade to v2.1.9



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10430) "Load" report from "nodetool status" is inaccurate

2016-02-16 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148657#comment-15148657
 ] 

Marcus Eriksson commented on CASSANDRA-10430:
-

the incremental repair issue was fixed in CASSANDRA-10831

> "Load" report from "nodetool status" is inaccurate
> --
>
> Key: CASSANDRA-10430
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10430
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Cassandra v2.1.9 running on 6 node Amazon AWS, vnodes 
> enabled. 
>Reporter: julia zhang
> Fix For: 2.1.x
>
> Attachments: system.log.2.zip, system.log.3.zip, system.log.4.zip
>
>
> After running an incremental repair, nodetool status report unbalanced load 
> among cluster. 
> $ nodetool status mykeyspace
> ==
> ||Status|| Address ||Load   ||Tokens  ||Owns (effective)  
> ||Host ID ||  Rack || 
> |UN  |10.1.1.1  |1.13 TB   |256|48.5%
> |a4477534-a5c6-4e3e-9108-17a69aebcfc0|  RAC1|
> |UN  |10.1.1.2  |2.58 TB   |256 |50.5% 
> |1a7c3864-879f-48c5-8dde-bc00cf4b23e6  |RAC2|
> |UN  |10.1.1.3  |1.49 TB   |256 |51.5% 
> |27df5b30-a5fc-44a5-9a2c-1cd65e1ba3f7  |RAC1|
> |UN  |10.1.1.4  |250.97 GB  |256 |51.9% 
> |9898a278-2fe6-4da2-b6dc-392e5fda51e6  |RAC3|
> |UN  |10.1.1.5 |1.88 TB  |256 |49.5% 
> |04aa9ce1-c1c3-4886-8d72-270b024b49b9  |RAC2|
> |UN  |10.1.1.6 |1.3 TB|256 |48.1% 
> |6d5d48e6-d188-4f88-808d-dcdbb39fdca5  |RAC3|
> It seems that only 10.1.1.4 reports correct "Load". There is no hints in the 
> cluster and report remains the same after running "nodetool cleanup" on each 
> node. "nodetool cfstats" shows number of keys are evenly distributed and 
> Cassandra data physical disk on each node report about the same usage. 
> "nodetool status" report these inaccurate large storage load until we restart 
> each node, after the restart, "Load" report match what we've seen from disk.  
> We did not see this behavior until upgrade to v2.1.9



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10430) "Load" report from "nodetool status" is inaccurate

2016-02-16 Thread clint martin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148655#comment-15148655
 ] 

clint martin commented on CASSANDRA-10430:
--

I am also experiencing this issue, using DSE 4.7.3 (cassandra 2.1.8.689).  Load 
was reported correctly until I switched my cluster to use Incremental Repair.

{noformat}
# nodetool status
Datacenter: DC1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load     Tokens  Owns  Host ID                               Rack
UN  172.16.10.250  1.76 TB  1       ?     88280120-c7d6-401e-8a75-5726cbb081e8  RAC1
UN  172.16.10.251  2.28 TB  1       ?     3812bbd5-d63d-4bf1-a22b-6c31ce279018  RAC1
UN  172.16.10.252  2.05 TB  1       ?     59028151-892a-4896-89b7-a368cceaddd6  RAC1
{noformat}


I only have 1.3TB of raw space on each of these nodes, and am only actually 
using approximately 385G to 468G of raw space on each node. 



> "Load" report from "nodetool status" is inaccurate
> --
>
> Key: CASSANDRA-10430
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10430
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Cassandra v2.1.9 running on 6 node Amazon AWS, vnodes 
> enabled. 
>Reporter: julia zhang
> Fix For: 2.1.x
>
> Attachments: system.log.2.zip, system.log.3.zip, system.log.4.zip
>
>
> After running an incremental repair, nodetool status report unbalanced load 
> among cluster. 
> $ nodetool status mykeyspace
> ==
> ||Status|| Address ||Load   ||Tokens  ||Owns (effective)  
> ||Host ID ||  Rack || 
> |UN  |10.1.1.1  |1.13 TB   |256|48.5%
> |a4477534-a5c6-4e3e-9108-17a69aebcfc0|  RAC1|
> |UN  |10.1.1.2  |2.58 TB   |256 |50.5% 
> |1a7c3864-879f-48c5-8dde-bc00cf4b23e6  |RAC2|
> |UN  |10.1.1.3  |1.49 TB   |256 |51.5% 
> |27df5b30-a5fc-44a5-9a2c-1cd65e1ba3f7  |RAC1|
> |UN  |10.1.1.4  |250.97 GB  |256 |51.9% 
> |9898a278-2fe6-4da2-b6dc-392e5fda51e6  |RAC3|
> |UN  |10.1.1.5 |1.88 TB  |256 |49.5% 
> |04aa9ce1-c1c3-4886-8d72-270b024b49b9  |RAC2|
> |UN  |10.1.1.6 |1.3 TB|256 |48.1% 
> |6d5d48e6-d188-4f88-808d-dcdbb39fdca5  |RAC3|
> It seems that only 10.1.1.4 reports correct "Load". There is no hints in the 
> cluster and report remains the same after running "nodetool cleanup" on each 
> node. "nodetool cfstats" shows number of keys are evenly distributed and 
> Cassandra data physical disk on each node report about the same usage. 
> "nodetool status" report these inaccurate large storage load until we restart 
> each node, after the restart, "Load" report match what we've seen from disk.  
> We did not see this behavior until upgrade to v2.1.9



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11159) SASI indexes don't switch memtable on flush

2016-02-16 Thread Sam Tunnicliffe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148650#comment-15148650
 ] 

Sam Tunnicliffe commented on CASSANDRA-11159:
-

bq. ColumnIndex needs to keep a list of memtables pending flush
I had suspected that might well be the case; the changes & test on your branch 
LGTM.

I guess I might be tempted to mark {{getCurrentMemtable}} and 
{{getPendingMemtables}} in {{ColumnIndex}} with {{@VisibleForTesting}}, but 
that's pretty minor tbh.

> SASI indexes don't switch memtable on flush
> ---
>
> Key: CASSANDRA-11159
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11159
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Critical
> Fix For: 3.4
>
>
> SASI maintains its own in-memory structures for indexing the contents of a 
> base Memtable. On flush, these are simply discarded & replaced with an new 
> instance, whilst the on disk index is built as the base memtable is flushed 
> to SSTables. 
> SASIIndex implements INotificationHandler and this switching of the index 
> memtable is triggered by receipt of a MemtableRenewedNotification. In the 
> original SASI implementation, one of the necessary modifications to C* was to 
> emit this notification from DataTracker::switchMemtable, but this was 
> overlooked when porting to 3.0. The net result is that the index memtable is 
> never switched out, which eventually leads to OOME. 
> Simply applying the original modification isn't entirely appropriate though, 
> as it creates a window where it's possible for the index memtable to have 
> been switched, but the flushwriter is yet to finish writing the new index 
> sstables. During this window, index entries will be missing and query results 
> inaccurate. 
> I propose leaving Tracker::switchMemtable as is, so that 
> INotificationConsumers are only notified from there when truncating, but 
> adding similar notifications in Tracker::replaceFlushed, to fire after the 
> View is updated. 
> I'm leaning toward re-using MemtableRenewedNotification for this as 
> semantically I don't believe there's any meaningful difference between the 
> flush and truncation cases here. If anyone has a compelling argument for a 
> new notification type though to distinguish the two events, I'm open to hear 
> it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11035) Use cardinality estimation to pick better compaction candidates for STCS (SizeTieredCompactionStrategy)

2016-02-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148636#comment-15148636
 ] 

Jonathan Ellis commented on CASSANDRA-11035:


Very cool!  I know [~doanduyhai] was interested in this approach, he might have 
time to put together a more realistic test.

> Use cardinality estimation to pick better compaction candidates for STCS 
> (SizeTieredCompactionStrategy)
> ---
>
> Key: CASSANDRA-11035
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11035
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Wei Deng
>Assignee: Marcus Eriksson
>
> This was initially mentioned in this blog post 
> http://www.datastax.com/dev/blog/improving-compaction-in-cassandra-with-cardinality-estimation
>  but I couldn't find any existing JIRA for it. As stated by [~jbellis], 
> "Potentially even more useful would be using cardinality estimation to pick 
> better compaction candidates. Instead of blindly merging sstables of a 
> similar size a la SizeTieredCompactionStrategy." The L0 STCS in LCS should 
> benefit as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-11035) Use cardinality estimation to pick better compaction candidates for STCS (SizeTieredCompactionStrategy)

2016-02-16 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148630#comment-15148630
 ] 

Marcus Eriksson edited comment on CASSANDRA-11035 at 2/16/16 2:07 PM:
--

I've been working on this a bit and I have something that works pretty well 
now: https://github.com/krummas/cassandra/commits/marcuse/11035

It does the following
* tracks row cardinality using HyperLogLogPlus to be able to better estimate 
how much we will gain by compacting the given sstables together
* uses an off-heap version of stream-lib: 
https://github.com/krummas/stream-lib/commits/marcuse/offheapregisterset
* picks sstable candidates using the following heuristic;
** use the standard STCS bucketing for size (reasoning being that we should 
never compact together a tiny sstable with a huge one, even if the big one 
overlaps the tiny one 100%)
** in each bucket, pick the oldest (by modification date) sstable and add it to 
the compaction set. Pick the oldest one to avoid having sstables getting starved
** * find the sstable in the bucket that gives the best compaction gain if it 
was added to the compaction set
** add that sstable to the compaction set if it improves the total compaction 
gain for the compaction set or if the number of sstables in the compaction set 
is < {{min_threshold}}, otherwise we are done
** if size of the compaction set is smaller than {{max_threshold}} goto *

Results in my small tests are promising;
* compaction gain estimation is very accurate, always within 1% of the actual 
result
* I created 10G of highly overlapping data in 100 sstables using a stress 
profile and with autocompaction disabled
** Old STCS takes this down to about 6G
** New STCS takes it down to 3.1G
** Major compaction takes it to 2.8G

I should note though that we have a 'perfect' view of the data we are about to 
compact here; this won't be the case when we run with autocompaction enabled 
etc., so we definitely need real-world tests.

Todo:
* Submit a pull request for the stream-lib changes - it currently uses a 
finalizer to clear the off heap memory to avoid changing all the ICardinality 
APIs etc, but we could use a version without a finalizer if we think it could 
be a problem
* Find a way to gracefully degrade if the number of sstables is huge - the 
heuristic above runs in O(n^2) where n = number of sstables (or, I guess it 
runs in O(n * max_threshold))
* find an incremental 64bit murmur hash for hashing the clustering values - I 
currently use the old hash value as the seed for the next and it seems to be 
working ok, but I have no idea if it is correct.


was (Author: krummas):
I've been working on this a bit and I have something that works pretty well 
now: https://github.com/krummas/cassandra/commits/marcuse/11035

It does the following
* tracks row cardinality using HyperLogLogPlus to be able to better estimate 
how much we will gain by compacting the given sstables together
* uses an off-heap version of stream-lib: 
https://github.com/krummas/stream-lib/commits/marcuse/offheapregisterset
* picks sstable candidates using the following heuristic;
** use the standard STCS bucketing for size (reasoning being that we should 
never compact together a tiny sstable with a huge one, even if the big one 
overlaps the tiny one 100%)
** in each bucket, pick the oldest (by modification date) sstable and add it to 
the compaction set. Pick the oldest one to avoid having sstables getting starved
** * find the sstable in the bucket that gives the best compaction gain if it 
was added to the compaction set
** add that sstable to the compaction set if it improves the total compaction 
gain for the compaction set or if the number of sstables in the compaction set 
is < min_threshold, otherwise we are done
** if size of the compaction set is smaller than {{max_threshold}} goto *

Results in my small tests are promising;
* compaction gain estimation is very accurate, always within 1% of the actual 
result
* I created 10G of highly overlapping data in 100 sstables using a stress 
profile and with autocompaction disabled
** Old STCS takes this down to about 6G
** New STCS takes it down to 3.1G
** Major compaction takes it to 2.8G

I should note though that we have a 'perfect' view of the data we are about to 
compact here; this won't be the case when we run with autocompaction enabled 
etc., so we definitely need real-world tests.

Todo:
* Submit a pull request for the stream-lib changes - it currently uses a 
finalizer to clear the off heap memory to avoid changing all the ICardinality 
APIs etc, but we could use a version without a finalizer if we think it could 
be a problem
* Find a way to gracefully degrade if the number of sstables is huge - the 
heuristic above runs in O(n^2) where n = number of sstables (or, I guess it 
runs in O(n * max_threshold))
* find an incremental 64bit murmur hash for hashing the 

[jira] [Commented] (CASSANDRA-11035) Use cardinality estimation to pick better compaction candidates for STCS (SizeTieredCompactionStrategy)

2016-02-16 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148630#comment-15148630
 ] 

Marcus Eriksson commented on CASSANDRA-11035:
-

I've been working on this a bit and I have something that works pretty well 
now: https://github.com/krummas/cassandra/commits/marcuse/11035

It does the following
* tracks row cardinality using HyperLogLogPlus to be able to better estimate 
how much we will gain by compacting the given sstables together
* uses an off-heap version of stream-lib: 
https://github.com/krummas/stream-lib/commits/marcuse/offheapregisterset
* picks sstable candidates using the following heuristic;
** use the standard STCS bucketing for size (reasoning being that we should 
never compact together a tiny sstable with a huge one, even if the big one 
overlaps the tiny one 100%)
** in each bucket, pick the oldest (by modification date) sstable and add it to 
the compaction set. Pick the oldest one to avoid having sstables getting starved
** * find the sstable in the bucket that gives the best compaction gain if it 
was added to the compaction set
** add that sstable to the compaction set if it improves the total compaction 
gain for the compaction set or if the number of sstables in the compaction set 
is < min_threshold, otherwise we are done
** if size of the compaction set is smaller than {{max_threshold}} goto *

Results in my small tests are promising;
* compaction gain estimation is very accurate, always within 1% of the actual 
result
* I created 10G of highly overlapping data in 100 sstables using a stress 
profile and with autocompaction disabled
** Old STCS takes this down to about 6G
** New STCS takes it down to 3.1G
** Major compaction takes it to 2.8G

I should note though that we have a 'perfect' view of the data we are about to 
compact here; this won't be the case when we run with autocompaction enabled 
etc., so we definitely need real-world tests.

Todo:
* Submit a pull request for the stream-lib changes - it currently uses a 
finalizer to clear the off heap memory to avoid changing all the ICardinality 
APIs etc, but we could use a version without a finalizer if we think it could 
be a problem
* Find a way to gracefully degrade if the number of sstables is huge - the 
heuristic above runs in O(n^2) where n = number of sstables (or, I guess it 
runs in O(n * max_threshold))
* find an incremental 64bit murmur hash for hashing the clustering values - I 
currently use the old hash value as the seed for the next and it seems to be 
working ok, but I have no idea if it is correct.
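
A rough, self-contained sketch of the greedy selection loop described above 
(illustrative only; {{gainOf}} stands in for the HyperLogLog-based gain 
estimate and is not the actual patch code):

{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Greedy bucket-local selection: start from the oldest sstable, then repeatedly add the
// sstable that maximises the estimated compaction gain, stopping when the gain no longer
// improves (once min_threshold is reached) or max_threshold sstables have been picked.
final class StcsCandidateSketch
{
    static <T> List<T> pick(List<T> bucket, Comparator<T> oldestFirst,
                            ToDoubleFunction<List<T>> gainOf, int minThreshold, int maxThreshold)
    {
        List<T> chosen = new ArrayList<>();
        if (bucket.isEmpty())
            return chosen;
        List<T> remaining = new ArrayList<>(bucket);
        remaining.sort(oldestFirst);
        chosen.add(remaining.remove(0)); // oldest first, to avoid starving old sstables

        while (chosen.size() < maxThreshold && !remaining.isEmpty())
        {
            T best = null;
            double bestGain = Double.NEGATIVE_INFINITY;
            for (T candidate : remaining) // O(n) scan per added sstable, O(n^2) overall
            {
                chosen.add(candidate);
                double gain = gainOf.applyAsDouble(chosen);
                chosen.remove(chosen.size() - 1);
                if (gain > bestGain)
                {
                    best = candidate;
                    bestGain = gain;
                }
            }
            boolean improves = bestGain > gainOf.applyAsDouble(chosen);
            if (!improves && chosen.size() >= minThreshold)
                break; // no further gain and we already have enough sstables
            chosen.add(best);
            remaining.remove(best);
        }
        return chosen;
    }
}
{code}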

> Use cardinality estimation to pick better compaction candidates for STCS 
> (SizeTieredCompactionStrategy)
> ---
>
> Key: CASSANDRA-11035
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11035
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Wei Deng
>Assignee: Marcus Eriksson
>
> This was initially mentioned in this blog post 
> http://www.datastax.com/dev/blog/improving-compaction-in-cassandra-with-cardinality-estimation
>  but I couldn't find any existing JIRA for it. As stated by [~jbellis], 
> "Potentially even more useful would be using cardinality estimation to pick 
> better compaction candidates. Instead of blindly merging sstables of a 
> similar size a la SizeTieredCompactionStrategy." The L0 STCS in LCS should 
> benefit as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7276) Include keyspace and table names in logs where possible

2016-02-16 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148616#comment-15148616
 ] 

Paulo Motta commented on CASSANDRA-7276:


I have some other things on my TODO before this but should give you feedback 
later this week.

> Include keyspace and table names in logs where possible
> ---
>
> Key: CASSANDRA-7276
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7276
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Tyler Hobbs
>Priority: Minor
>  Labels: bootcamp, lhf
> Fix For: 2.1.x
>
> Attachments: 0001-Logging-for-Keyspace-and-Tables.patch, 
> 2.1-CASSANDRA-7276-v1.txt, cassandra-2.1-7276-compaction.txt, 
> cassandra-2.1-7276.txt, cassandra-2.1.9-7276-v2.txt, cassandra-2.1.9-7276.txt
>
>
> Most error messages and stacktraces give you no clue as to what keyspace or 
> table was causing the problem.  For example:
> {noformat}
> ERROR [MutationStage:61648] 2014-05-20 12:05:45,145 CassandraDaemon.java 
> (line 198) Exception in thread Thread[MutationStage:61648,5,main]
> java.lang.IllegalArgumentException
> at java.nio.Buffer.limit(Unknown Source)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:63)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:72)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:98)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:35)
> at 
> edu.stanford.ppl.concurrent.SnapTreeMap$1.compareTo(SnapTreeMap.java:538)
> at 
> edu.stanford.ppl.concurrent.SnapTreeMap.attemptUpdate(SnapTreeMap.java:1108)
> at 
> edu.stanford.ppl.concurrent.SnapTreeMap.updateUnderRoot(SnapTreeMap.java:1059)
> at edu.stanford.ppl.concurrent.SnapTreeMap.update(SnapTreeMap.java:1023)
> at 
> edu.stanford.ppl.concurrent.SnapTreeMap.putIfAbsent(SnapTreeMap.java:985)
> at 
> org.apache.cassandra.db.AtomicSortedColumns$Holder.addColumn(AtomicSortedColumns.java:328)
> at 
> org.apache.cassandra.db.AtomicSortedColumns.addAllWithSizeDelta(AtomicSortedColumns.java:200)
> at org.apache.cassandra.db.Memtable.resolve(Memtable.java:226)
> at org.apache.cassandra.db.Memtable.put(Memtable.java:173)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:893)
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:368)
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:333)
> at org.apache.cassandra.db.RowMutation.apply(RowMutation.java:206)
> at 
> org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:56)
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:60)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> at java.lang.Thread.run(Unknown Source)
> {noformat}
> We should try to include info on the keyspace and column family in the error 
> messages or logs whenever possible.  This includes reads, writes, 
> compactions, flushes, repairs, and probably more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11053) COPY FROM on large datasets: fix progress report and debug performance

2016-02-16 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-11053:

Reviewer: Paulo Motta

> COPY FROM on large datasets: fix progress report and debug performance
> --
>
> Key: CASSANDRA-11053
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11053
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Stefania
>Assignee: Stefania
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
> Attachments: copy_from_large_benchmark.txt, 
> copy_from_large_benchmark_2.txt, parent_profile.txt, parent_profile_2.txt, 
> worker_profiles.txt, worker_profiles_2.txt
>
>
> Running COPY from on a large dataset (20G divided in 20M records) revealed 
> two issues:
> * The progress report is incorrect, it is very slow until almost the end of 
> the test at which point it catches up extremely quickly.
> * The performance in rows per second is similar to running smaller tests with 
> a smaller cluster locally (approx 35,000 rows per second). As a comparison, 
> cassandra-stress manages 50,000 rows per second under the same set-up, 
> therefore resulting 1.5 times faster. 
> See attached file _copy_from_large_benchmark.txt_ for the benchmark details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9220) Hostname verification for node-to-node encryption

2016-02-16 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-9220:
--
Attachment: (was: sslhostverification-2.0.patch)

> Hostname verification for node-to-node encryption
> -
>
> Key: CASSANDRA-9220
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9220
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
> Fix For: 3.x
>
>
> This patch will will introduce a new ssl server option: 
> {{require_endpoint_verification}}. 
> Setting it will enable hostname verification for inter-node SSL 
> communication. This is necessary to prevent man-in-the-middle attacks when 
> building a trust chain against a common CA. See 
> [here|https://tersesystems.com/2014/03/23/fixing-hostname-verification/] for 
> background details. 
> Clusters that solely rely on importing all node certificates into each trust 
> store (as described 
> [here|http://docs.datastax.com/en/cassandra/2.0/cassandra/security/secureSSLCertificates_t.html])
>  are not effected. 
> Clusters that use the same common CA to sign node certificates are 
> potentially affected. In case the CA signing process will allow other parties 
> to generate certs for different purposes, those certificates could in turn be 
> used for MITM attacks. The provided patch will allow to enable hostname 
> verification to make sure not only to check if the cert is valid but also if 
> it has been created for the host that we're about to connect.
> Corresponding dtest: [Test for 
> CASSANDRA-9220|https://github.com/riptano/cassandra-dtest/pull/237]
> Related patches from the client perspective: 
> [Java|https://datastax-oss.atlassian.net/browse/JAVA-716], 
> [Python|https://datastax-oss.atlassian.net/browse/PYTHON-296]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9220) Hostname verification for node-to-node encryption

2016-02-16 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148541#comment-15148541
 ] 

Stefan Podkowinski commented on CASSANDRA-9220:
---

I've now rebased and fixed the dtest and it is working fine now for me. Please 
go ahead if you want to continue review.

> Hostname verification for node-to-node encryption
> -
>
> Key: CASSANDRA-9220
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9220
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
> Fix For: 3.x
>
> Attachments: sslhostverification-2.0.patch
>
>
> This patch will will introduce a new ssl server option: 
> {{require_endpoint_verification}}. 
> Setting it will enable hostname verification for inter-node SSL 
> communication. This is necessary to prevent man-in-the-middle attacks when 
> building a trust chain against a common CA. See 
> [here|https://tersesystems.com/2014/03/23/fixing-hostname-verification/] for 
> background details. 
> Clusters that solely rely on importing all node certificates into each trust 
> store (as described 
> [here|http://docs.datastax.com/en/cassandra/2.0/cassandra/security/secureSSLCertificates_t.html])
>  are not effected. 
> Clusters that use the same common CA to sign node certificates are 
> potentially affected. In case the CA signing process will allow other parties 
> to generate certs for different purposes, those certificates could in turn be 
> used for MITM attacks. The provided patch will allow to enable hostname 
> verification to make sure not only to check if the cert is valid but also if 
> it has been created for the host that we're about to connect.
> Corresponding dtest: [Test for 
> CASSANDRA-9220|https://github.com/riptano/cassandra-dtest/pull/237]
> Related patches from the client perspective: 
> [Java|https://datastax-oss.atlassian.net/browse/JAVA-716], 
> [Python|https://datastax-oss.atlassian.net/browse/PYTHON-296]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9220) Hostname verification for node-to-node encryption

2016-02-16 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-9220:
--
Description: 
This patch will will introduce a new ssl server option: 
{{require_endpoint_verification}}. 

Setting it will enable hostname verification for inter-node SSL communication. 
This is necessary to prevent man-in-the-middle attacks when building a trust 
chain against a common CA. See 
[here|https://tersesystems.com/2014/03/23/fixing-hostname-verification/] for 
background details. 

Clusters that solely rely on importing all node certificates into each trust 
store (as described 
[here|http://docs.datastax.com/en/cassandra/2.0/cassandra/security/secureSSLCertificates_t.html])
 are not effected. 

Clusters that use the same common CA to sign node certificates are potentially 
affected. In case the CA signing process will allow other parties to generate 
certs for different purposes, those certificates could in turn be used for MITM 
attacks. The provided patch will allow to enable hostname verification to make 
sure not only to check if the cert is valid but also if it has been created for 
the host that we're about to connect.

Corresponding dtest: [Test for 
CASSANDRA-9220|https://github.com/riptano/cassandra-dtest/pull/237]

Related patches from the client perspective: 
[Java|https://datastax-oss.atlassian.net/browse/JAVA-716], 
[Python|https://datastax-oss.atlassian.net/browse/PYTHON-296]

  was:
This patch will will introduce a new ssl server option: 
{{require_endpoint_verification}}. 

Setting it will enable hostname verification for inter-node SSL communication. 
This is necessary to prevent man-in-the-middle attacks when building a trust 
chain against a common CA. See 
[here|https://tersesystems.com/2014/03/23/fixing-hostname-verification/] for 
background details. 

Clusters that solely rely on importing all node certificates into each trust 
store (as described 
[here|http://docs.datastax.com/en/cassandra/2.0/cassandra/security/secureSSLCertificates_t.html])
 are not effected. 

Clusters that use the same common CA to sign node certificates are potentially 
affected. In case the CA signing process will allow other parties to generate 
certs for different purposes, those certificates could in turn be used for MITM 
attacks. The provided patch will allow to enable hostname verification to make 
sure not only to check if the cert is valid but also if it has been created for 
the host that we're about to connect.

Corresponding dtest: [Test for 
CASSANDRA-9220|https://github.com/riptano/cassandra-dtest/pull/237]

Github: 
2.0 -> 
[diff|https://github.com/apache/cassandra/compare/cassandra-2.0...spodkowinski:feat/sslhostverification],
 
[patch|https://github.com/apache/cassandra/compare/cassandra-2.0...spodkowinski:feat/sslhostverification.patch],
Trunk -> 
[diff|https://github.com/apache/cassandra/compare/trunk...spodkowinski:feat/sslhostverification],
 
[patch|https://github.com/apache/cassandra/compare/trunk...spodkowinski:feat/sslhostverification.patch]

Related patches from the client perspective: 
[Java|https://datastax-oss.atlassian.net/browse/JAVA-716], 
[Python|https://datastax-oss.atlassian.net/browse/PYTHON-296]


> Hostname verification for node-to-node encryption
> -
>
> Key: CASSANDRA-9220
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9220
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
> Fix For: 3.x
>
> Attachments: sslhostverification-2.0.patch
>
>
> This patch will will introduce a new ssl server option: 
> {{require_endpoint_verification}}. 
> Setting it will enable hostname verification for inter-node SSL 
> communication. This is necessary to prevent man-in-the-middle attacks when 
> building a trust chain against a common CA. See 
> [here|https://tersesystems.com/2014/03/23/fixing-hostname-verification/] for 
> background details. 
> Clusters that solely rely on importing all node certificates into each trust 
> store (as described 
> [here|http://docs.datastax.com/en/cassandra/2.0/cassandra/security/secureSSLCertificates_t.html])
>  are not effected. 
> Clusters that use the same common CA to sign node certificates are 
> potentially affected. In case the CA signing process will allow other parties 
> to generate certs for different purposes, those certificates could in turn be 
> used for MITM attacks. The provided patch will allow to enable hostname 
> verification to make sure not only to check if the cert is valid but also if 
> it has been created for the host that we're about to connect.
> Corresponding dtest: [Test for 
> CASSANDRA-9220|https://github.com/riptano/cassandra-dtest/pull/237]
> Related patches from the client 

[jira] [Updated] (CASSANDRA-10508) Remove hard-coded SSL cipher suites and protocols

2016-02-16 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-10508:
---
Issue Type: Bug  (was: Improvement)

> Remove hard-coded SSL cipher suites and protocols
> -
>
> Key: CASSANDRA-10508
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10508
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>  Labels: lhf
> Fix For: 3.x
>
>
> Currently each SSL connections will be initialized using a hard-coded list of 
> protocols ("SSLv2Hello", "TLSv1", "TLSv1.1", "TLSv1.2") and cipher suites. We 
> now require Java 8 which comes with solid defaults for these kind of SSL 
> settings and I'm wondering if the current behavior shouldn't be re-evaluated. 
> In my impression the way cipher suites are currently whitelisted is 
> problematic, as this will prevent the JVM from using more recent and more 
> secure suites that haven't been added to the hard-coded list. JVM updates may 
> also cause issues in case the limited number of ciphers cannot be used, e.g. 
> see CASSANDRA-6613.
> Looking at the source I've also stumbled upon a bug in the 
> {{filterCipherSuites()}} method that would return the filtered list of 
> ciphers in undetermined order where the result is passed to 
> {{setEnabledCipherSuites()}}. However, the list of ciphers will reflect the 
> order of preference 
> ([source|https://bugs.openjdk.java.net/browse/JDK-8087311]) and therefore you 
> may end up with weaker algorithms on the top. Currently it's not that 
> critical, as we only whitelist a couple of ciphers anyway. But it adds to the 
> question if it still really makes sense to work with the cipher list at all 
> in the Cassandra code base.
> Another way to effect used ciphers is by changing the security properties. 
> This is a more versatile way to work with cipher lists instead of relying on 
> hard-coded values, see 
> [here|https://docs.oracle.com/javase/8/docs/technotes/guides/security/jsse/JSSERefGuide.html#DisabledAlgorithms]
>  for details.
> The same applies to the protocols. Introduced in CASSANDRA-8265 to prevent 
> SSLv3 attacks, this is not necessary anymore as SSLv3 is now blacklisted 
> anyway and will stop using safer protocol sets on new JVM releases or user 
> request. Again, we should stick with the JVM defaults. Using the 
> {{jdk.tls.client.protocols}} systems property will always allow to restrict 
> the set of protocols in case another emergency fix is needed. 
> You can find a patch with where I ripped out the mentioned options here:
> [Diff 
> trunk|https://github.com/apache/cassandra/compare/trunk...spodkowinski:fix/ssloptions]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10508) Remove hard-coded SSL cipher suites and protocols

2016-02-16 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148517#comment-15148517
 ] 

Stefan Podkowinski commented on CASSANDRA-10508:


I've now created a new patch that uses the JVM default ciphers instead of the 
hard-coded list, as described in the description, while still preserving the 
option to specify a custom cipher list. The branch can be found at 
[CASSANDRA-10508-2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...spodkowinski:CASSANDRA-10508-2.2]
 (merges up cleanly).

I'm also changing the ticket type to "Bug", as {{filterCipherSuites()}} will 
randomly use weaker {{AES_128}} ciphers (even with strong crypto extensions 
installed) because it was not preserving the sequence of preferred ciphers. 
This has been fixed and covered by a unit test. Do you mind having a look, 
[~snazy]?
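
To make the ordering point concrete, a small self-contained sketch of an 
order-preserving filter (illustration only, not the patch itself):

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Keep the caller's preference order and drop anything the socket does not support,
// so setEnabledCipherSuites() never ends up preferring a weaker suite by accident.
final class CipherFilterSketch
{
    static String[] filter(String[] desiredInPreferenceOrder, String[] supportedBySocket)
    {
        Set<String> supported = new HashSet<>(Arrays.asList(supportedBySocket));
        List<String> kept = new ArrayList<>();
        for (String suite : desiredInPreferenceOrder)
            if (supported.contains(suite))
                kept.add(suite);
        return kept.toArray(new String[0]);
    }
}
{code}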

> Remove hard-coded SSL cipher suites and protocols
> -
>
> Key: CASSANDRA-10508
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10508
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>  Labels: lhf
> Fix For: 3.x
>
>
> Currently each SSL connections will be initialized using a hard-coded list of 
> protocols ("SSLv2Hello", "TLSv1", "TLSv1.1", "TLSv1.2") and cipher suites. We 
> now require Java 8 which comes with solid defaults for these kind of SSL 
> settings and I'm wondering if the current behavior shouldn't be re-evaluated. 
> In my impression the way cipher suites are currently whitelisted is 
> problematic, as this will prevent the JVM from using more recent and more 
> secure suites that haven't been added to the hard-coded list. JVM updates may 
> also cause issues in case the limited number of ciphers cannot be used, e.g. 
> see CASSANDRA-6613.
> Looking at the source I've also stumbled upon a bug in the 
> {{filterCipherSuites()}} method that would return the filtered list of 
> ciphers in undetermined order where the result is passed to 
> {{setEnabledCipherSuites()}}. However, the list of ciphers will reflect the 
> order of preference 
> ([source|https://bugs.openjdk.java.net/browse/JDK-8087311]) and therefore you 
> may end up with weaker algorithms on the top. Currently it's not that 
> critical, as we only whitelist a couple of ciphers anyway. But it adds to the 
> question if it still really makes sense to work with the cipher list at all 
> in the Cassandra code base.
> Another way to affect the ciphers in use is by changing the security 
> properties. This is a more versatile way to work with cipher lists than 
> relying on hard-coded values; see 
> [here|https://docs.oracle.com/javase/8/docs/technotes/guides/security/jsse/JSSERefGuide.html#DisabledAlgorithms]
>  for details.
> The same applies to the protocols. The hard-coded list was introduced in 
> CASSANDRA-8265 to prevent SSLv3 attacks, but that is no longer necessary as 
> SSLv3 is blacklisted anyway, and keeping the list will prevent us from picking 
> up safer protocol sets shipped with new JVM releases or requested by the user. 
> Again, we should stick with the JVM defaults. The 
> {{jdk.tls.client.protocols}} system property still allows restricting the set 
> of protocols in case another emergency fix is needed.
> You can find a patch where I ripped out the mentioned options here:
> [Diff 
> trunk|https://github.com/apache/cassandra/compare/trunk...spodkowinski:fix/ssloptions]
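
A throwaway snippet (not part of any patch) to see what "sticking with the JVM 
defaults" means in practice: it just prints the protocols and cipher suites the 
default {{SSLContext}} enables, which is what would be inherited if the 
hard-coded overrides were removed. On a typical JDK 8, running it with 
{{-Djdk.tls.client.protocols=TLSv1.2}} should narrow the reported protocol 
list, though the exact effect depends on the JSSE provider in use.

{code}
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

// Inspection-only sketch: print the JVM's default TLS protocols and ciphers.
public class ShowJvmTlsDefaults
{
    public static void main(String[] args) throws Exception
    {
        SSLEngine engine = SSLContext.getDefault().createSSLEngine();
        engine.setUseClientMode(true);
        System.out.println("Enabled protocols:     " + String.join(", ", engine.getEnabledProtocols()));
        System.out.println("Enabled cipher suites: " + String.join(", ", engine.getEnabledCipherSuites()));
    }
}
{code}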



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11171) conditional update without paxos

2016-02-16 Thread stuart (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148445#comment-15148445
 ] 

stuart commented on CASSANDRA-11171:


Sure, but I'll still incur the cost of reading before writing in the 
application. It would be nice if update scripts could be executed on Cassandra 
nodes (performing a read before write, but without LWT) without the extra hop 
to the application layer. Not a big deal, but more of a nice-to-have.

Cheers,
Stuart

> conditional update without paxos
> 
>
> Key: CASSANDRA-11171
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11171
> Project: Cassandra
>  Issue Type: Wish
>Reporter: stuart
>Priority: Minor
>
> I realise that conditional updates currently use lightweight transactions to 
> provide an atomic check-and-set operation, but this comes at a non-trivial 
> performance cost. I have a solution where synchronised access is ensured by 
> an external mechanism, so I don't think Paxos would be required. It would be 
> nice to be able to run an update command or script that could conditionally 
> update without the performance hit. Currently I'd have to retrieve each row 
> first at the application level before deciding whether or not to perform the 
> update. Would it be possible to add a switch for conditional updates to turn 
> Paxos on or off? Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11171) conditional update without paxos

2016-02-16 Thread DOAN DuyHai (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148412#comment-15148412
 ] 

DOAN DuyHai commented on CASSANDRA-11171:
-

bq. So having guarantees over actor isolation and single threaded execution 
already in place I'd like to be able to perform conditional updates for certain 
use cases

In this case, just don't use LWT and do a read-before-write to perform the 
conditional update. Of course it's less user-friendly, but you won't pay the 
price of the four network round trips that LWT requires.
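
A hedged sketch of that read-before-write pattern using the DataStax Java 
driver (the keyspace, table and values are made up for illustration; it is only 
safe because access is serialised outside Cassandra, as described above):

{code}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

// Hypothetical schema assumed here:
//   CREATE TABLE demo_ks.accounts (id text PRIMARY KEY, balance bigint);
public class ReadBeforeWrite
{
    public static void main(String[] args)
    {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("demo_ks"))
        {
            // Read the current value at the application level.
            Row row = session.execute("SELECT balance FROM accounts WHERE id = ?", "acct-1").one();

            if (row != null && row.getLong("balance") >= 100)
            {
                // Plain UPDATE, no IF clause: no Paxos round trips.
                session.execute("UPDATE accounts SET balance = ? WHERE id = ?",
                                row.getLong("balance") - 100, "acct-1");
            }
        }
    }
}
{code}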

> conditional update without paxos
> 
>
> Key: CASSANDRA-11171
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11171
> Project: Cassandra
>  Issue Type: Wish
>Reporter: stuart
>Priority: Minor
>
> I realise that conditional updates currently use lightweight transactions to 
> provide an atomic check-and-set operation, but this comes at a non-trivial 
> performance cost. I have a solution where synchronised access is ensured by 
> an external mechanism, so I don't think Paxos would be required. It would be 
> nice to be able to run an update command or script that could conditionally 
> update without the performance hit. Currently I'd have to retrieve each row 
> first at the application level before deciding whether or not to perform the 
> update. Would it be possible to add a switch for conditional updates to turn 
> Paxos on or off? Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11053) COPY FROM on large datasets: fix progress report and debug performance

2016-02-16 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148394#comment-15148394
 ] 

Stefania commented on CASSANDRA-11053:
--

I have experimented with CPU affinity by pinning each worker process to a core, 
but whilst this gave slightly better results locally, on AWS it actually made 
things worse.

I've also examined the {{strace}} output locally; the most frequent system 
calls are {{futex, read, write and poll}}. To reduce contention I've replaced 
the Python queue with multiple point-to-point pipes (the queue was implemented 
over a single pipe with interprocess locks). I didn't see much improvement 
locally, but perhaps on AWS it matters more, since locally I can only run 2 
worker processes before maxing out the cluster, which also runs locally. By 
removing the Python queue I was also able to remove one thread, which in Python 
is a good thing due to the GIL (Global Interpreter Lock).

I plan to test this implementation on AWS, together with an additional 
suggestion to increase time slicing ({{schedtool -B}}), then if everything 
works as expected I will move the ticket to patch available.

It's worth noting that the driver doesn't coalesce messages on the socket at 
present. This could be detrimental in virtualized environments like AWS, 
especially if [enhanced 
networking|https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html#other-linux-enhanced-networking-instance-store]
 is not available. However, we probably only need to worry about this once our 
encoding functions are faster; at the moment the bottleneck is still encoding, 
so I would leave this for a future ticket.
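
The actual change is in cqlsh's Python code, but the contention argument can be 
illustrated with a small Java analogue (threads standing in for processes): 
give every worker its own point-to-point channel instead of letting all workers 
take items from one shared, lock-protected queue, so consumers never contend on 
the same lock.

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Conceptual sketch only: one dedicated channel per worker.
public class PerWorkerChannels
{
    public static void main(String[] args) throws InterruptedException
    {
        int workers = 4;
        List<BlockingQueue<String>> channels = new ArrayList<>();

        for (int i = 0; i < workers; i++)
        {
            BlockingQueue<String> channel = new ArrayBlockingQueue<>(128); // one channel per worker
            channels.add(channel);
            new Thread(() -> {
                try
                {
                    String batch;
                    while (!(batch = channel.take()).equals("EOF"))
                        System.out.println(Thread.currentThread().getName() + " processed " + batch);
                }
                catch (InterruptedException e)
                {
                    Thread.currentThread().interrupt();
                }
            }).start();
        }

        // The parent distributes work round-robin, then signals end of input.
        for (int batch = 0; batch < 12; batch++)
            channels.get(batch % workers).put("batch-" + batch);
        for (BlockingQueue<String> channel : channels)
            channel.put("EOF");
    }
}
{code}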

> COPY FROM on large datasets: fix progress report and debug performance
> --
>
> Key: CASSANDRA-11053
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11053
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Stefania
>Assignee: Stefania
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
> Attachments: copy_from_large_benchmark.txt, 
> copy_from_large_benchmark_2.txt, parent_profile.txt, parent_profile_2.txt, 
> worker_profiles.txt, worker_profiles_2.txt
>
>
> Running COPY FROM on a large dataset (20G divided into 20M records) revealed 
> two issues:
> * The progress report is incorrect: it advances very slowly until almost the 
> end of the test, at which point it catches up extremely quickly.
> * The performance in rows per second is similar to running smaller tests with 
> a smaller cluster locally (approx. 35,000 rows per second). As a comparison, 
> cassandra-stress manages 50,000 rows per second under the same set-up, making 
> it roughly 1.5 times faster.
> See attached file _copy_from_large_benchmark.txt_ for the benchmark details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10371) Decommissioned nodes can remain in gossip

2016-02-16 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148332#comment-15148332
 ] 

Stefania commented on CASSANDRA-10371:
--

Ideally we would need the logs at TRACE level. I'm afraid the lines above don't 
really help much.

You can try these commands to avoid restarting:

{code}
nodetool setlogginglevel org.apache.cassandra.gms.Gossiper TRACE
nodetool setlogginglevel org.apache.cassandra.gms.GossipShutdownVerbHandler TRACE
nodetool setlogginglevel org.apache.cassandra.gms.GossipDigestAckVerbHandler TRACE
nodetool setlogginglevel org.apache.cassandra.gms.GossipDigestAck2VerbHandler TRACE
nodetool setlogginglevel org.apache.cassandra.gms.FailureDetector TRACE
nodetool setlogginglevel org.apache.cassandra.service.StorageService TRACE
{code}

to reset:

{code}
nodetool setlogginglevel
{code}

Use {{-h}} to specify a host if required. Obviously if this is a production 
cluster you may want to wait.

If you see a digest message for the node causing problems, it may arrive from a 
single host; see [this 
comment|https://issues.apache.org/jira/browse/CASSANDRA-10371?focusedCommentId=15068186=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15068186]
 above for a technique for finding that host. Then, before draining and 
restarting the host causing the problem, we would need its logs at TRACE level.

> Decommissioned nodes can remain in gossip
> -
>
> Key: CASSANDRA-10371
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10371
> Project: Cassandra
>  Issue Type: Bug
>  Components: Distributed Metadata
>Reporter: Brandon Williams
>Assignee: Stefania
>Priority: Minor
>
> This may apply to other dead states as well.  Dead states should be expired 
> after 3 days.  In the case of decom we attach a timestamp to let the other 
> nodes know when it should be expired.  It has been observed that sometimes a 
> subset of nodes in the cluster never expire the state, and through heap 
> analysis of these nodes it is revealed that the epstate.isAlive check returns 
> true when it should return false, which would allow the state to be evicted.  
> This may have been affected by CASSANDRA-8336.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)