[jira] [Updated] (CASSANDRA-12877) SASI index throwing AssertionError on creation/flush

2016-11-04 Thread Voytek Jarnot (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Voytek Jarnot updated CASSANDRA-12877:
--
Description: 
Possibly a 3.10 regression?  The exact test shown below does not error in 3.9.

I built and installed a 3.10 snapshot (built 04-Nov-2016) to get around 
CASSANDRA-11670, CASSANDRA-12689, and CASSANDRA-12223 which are holding me back 
when using 3.9.

Now I'm able to make nodetool flush (or a scheduled flush) produce an unhandled 
error easily with a SASI:

{code}
CREATE KEYSPACE vjtest WITH replication = {'class': 'SimpleStrategy', 
'replication_factor': '1'};
use vjtest ;
create table tester(id1 text, id2 text, id3 text, val1 text, primary key((id1, 
id2), id3));
create custom index tester_idx_val1 on tester(val1) using 
'org.apache.cassandra.index.sasi.SASIIndex';
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','1-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','2-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','3-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','4-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','5-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','6-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','7-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','8-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','9-3','asdf');
{code}
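My guess only, not verified against the SASI internals: all nine inserts share the partition key ('1-1','1-2'), so they all map to a single partition token. If 3.10 now tracks individual rows per token (CASSANDRA-11990), that one token carries nine entries, which would line up with the "more than 8 overflow collisions per leaf" assertion below. A quick check from cqlsh, assuming the schema and inserts above:

{code}
-- All nine rows should report the same partition token, so the SASI token
-- tree has to hold nine entries under a single token value.
SELECT token(id1, id2), id3, val1 FROM vjtest.tester;
{code}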

Not enough going on here to trigger a flush, so following a manual {{nodetool 
flush vjtest}} I get the following in {{system.log}}:

{code}
INFO  [MemtableFlushWriter:3] 2016-11-04 22:19:35,412 
PerSSTableIndexWriter.java:284 - Scheduling index flush to 
/mydir/apache-cassandra-3.10-SNAPSHOT/data/data/vjtest/tester-6f1fdff0a30611e692c087673c5ef8d4/mc-1-big-SI_tester_idx_val1.db
INFO  [SASI-Memtable:1] 2016-11-04 22:19:35,447 PerSSTableIndexWriter.java:335 
- Index flush to 
/mydir/apache-cassandra-3.10-SNAPSHOT/data/data/vjtest/tester-6f1fdff0a30611e692c087673c5ef8d4/mc-1-big-SI_tester_idx_val1.db
 took 16 ms.
ERROR [SASI-Memtable:1] 2016-11-04 22:19:35,449 CassandraDaemon.java:229 - 
Exception in thread Thread[SASI-Memtable:1,5,RMI Runtime]
java.lang.AssertionError: cannot have more than 8 overflow collisions per leaf, 
but had: 9
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createOverflowEntry(AbstractTokenTreeBuilder.java:357)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createEntry(AbstractTokenTreeBuilder.java:346)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.DynamicTokenTreeBuilder$DynamicLeaf.serializeData(DynamicTokenTreeBuilder.java:180)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.serialize(AbstractTokenTreeBuilder.java:306)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder.write(AbstractTokenTreeBuilder.java:90)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableDataBlock.flushAndClear(OnDiskIndexBuilder.java:629)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.flush(OnDiskIndexBuilder.java:446)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.finalFlush(OnDiskIndexBuilder.java:451)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:296)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:258)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:241)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.PerSSTableIndexWriter$Index.lambda$scheduleSegmentFlush$0(PerSSTableIndexWriter.java:267)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.PerSSTableIndexWriter$Index.lambda$complete$1(PerSSTableIndexWriter.java:296)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_101]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_101]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_101]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 

[jira] [Updated] (CASSANDRA-12877) SASI index throwing AssertionError on creation/flush

2016-11-04 Thread Voytek Jarnot (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Voytek Jarnot updated CASSANDRA-12877:
--
Description: 
Possibly a 3.10 regression?  The exact test shown below does not error in 3.9.

I built and installed a 3.10 snapshot (built 04-Nov-2016) to get around 
CASSANDRA-11670, CASSANDRA-12689, and CASSANDRA-12223 which are holding me back 
when using 3.9.

Now I'm able to make nodetool flush (or a scheduled flush) produce an 
unhandled error easily with a SASI:

{code}
CREATE KEYSPACE vjtest WITH replication = {'class': 'SimpleStrategy', 
'replication_factor': '1'};
use vjtest ;
create table tester(id1 text, id2 text, id3 text, val1 text, primary key((id1, 
id2), id3));
create custom index tester_idx_val1 on tester(val1) using 
'org.apache.cassandra.index.sasi.SASIIndex';
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','1-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','2-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','3-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','4-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','5-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','6-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','7-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','8-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','9-3','asdf');
{code}

Not enough going on here to trigger a flush, so following a manual {{nodetool 
flush vjtest}} I get the following in {{system.log}}:

{code}
INFO  [MemtableFlushWriter:3] 2016-11-04 22:19:35,412 
PerSSTableIndexWriter.java:284 - Scheduling index flush to 
/mydir/apache-cassandra-3.10-SNAPSHOT/data/data/vjtest/tester-6f1fdff0a30611e692c087673c5ef8d4/mc-1-big-SI_tester_idx_val1.db
INFO  [SASI-Memtable:1] 2016-11-04 22:19:35,447 PerSSTableIndexWriter.java:335 
- Index flush to 
/mydir/apache-cassandra-3.10-SNAPSHOT/data/data/vjtest/tester-6f1fdff0a30611e692c087673c5ef8d4/mc-1-big-SI_tester_idx_val1.db
 took 16 ms.
ERROR [SASI-Memtable:1] 2016-11-04 22:19:35,449 CassandraDaemon.java:229 - 
Exception in thread Thread[SASI-Memtable:1,5,RMI Runtime]
java.lang.AssertionError: cannot have more than 8 overflow collisions per leaf, 
but had: 9
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createOverflowEntry(AbstractTokenTreeBuilder.java:357)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createEntry(AbstractTokenTreeBuilder.java:346)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.DynamicTokenTreeBuilder$DynamicLeaf.serializeData(DynamicTokenTreeBuilder.java:180)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.serialize(AbstractTokenTreeBuilder.java:306)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder.write(AbstractTokenTreeBuilder.java:90)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableDataBlock.flushAndClear(OnDiskIndexBuilder.java:629)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.flush(OnDiskIndexBuilder.java:446)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.finalFlush(OnDiskIndexBuilder.java:451)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:296)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:258)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:241)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.PerSSTableIndexWriter$Index.lambda$scheduleSegmentFlush$0(PerSSTableIndexWriter.java:267)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.PerSSTableIndexWriter$Index.lambda$complete$1(PerSSTableIndexWriter.java:296)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_101]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_101]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_101]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 

[jira] [Updated] (CASSANDRA-12877) SASI index throwing AssertionError on creation/flush

2016-11-04 Thread Voytek Jarnot (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Voytek Jarnot updated CASSANDRA-12877:
--
Summary: SASI index throwing AssertionError on creation/flush  (was: SASI 
index throwing AssertionError on index creation/flush)

> SASI index throwing AssertionError on creation/flush
> 
>
> Key: CASSANDRA-12877
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12877
> Project: Cassandra
>  Issue Type: Bug
>  Components: sasi
> Environment: 3.9 and 3.10 tested on both linux and osx
>Reporter: Voytek Jarnot
>
> Possibly a 3.10 regression?  The exact test shown below does not error in 3.9.
> I built and installed a 3.10 snapshot (built 04-Nov-2016) to get around 
> CASSANDRA-11670, CASSANDRA-12689, and CASSANDRA-12223 which are holding me 
> back when using 3.9.
> Now I'm able to make nodetool flush (or a scheduled flush) crash easily with 
> a SASI:
> {code}
> CREATE KEYSPACE vjtest WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': '1'};
> use vjtest ;
> create table tester(id1 text, id2 text, id3 text, val1 text, primary 
> key((id1, id2), id3));
> create custom index tester_idx_val1 on tester(val1) using 
> 'org.apache.cassandra.index.sasi.SASIIndex';
> insert into tester(id1,id2,id3, val1) values ('1-1','1-2','1-3','asdf');
> insert into tester(id1,id2,id3, val1) values ('1-1','1-2','2-3','asdf');
> insert into tester(id1,id2,id3, val1) values ('1-1','1-2','3-3','asdf');
> insert into tester(id1,id2,id3, val1) values ('1-1','1-2','4-3','asdf');
> insert into tester(id1,id2,id3, val1) values ('1-1','1-2','5-3','asdf');
> insert into tester(id1,id2,id3, val1) values ('1-1','1-2','6-3','asdf');
> insert into tester(id1,id2,id3, val1) values ('1-1','1-2','7-3','asdf');
> insert into tester(id1,id2,id3, val1) values ('1-1','1-2','8-3','asdf');
> insert into tester(id1,id2,id3, val1) values ('1-1','1-2','9-3','asdf');
> {code}
> Not enough going on here to trigger a flush, so following a manual {{nodetool 
> flush vjtest}} I get the following in {{system.log}}:
> {code}
> INFO  [MemtableFlushWriter:3] 2016-11-04 22:19:35,412 
> PerSSTableIndexWriter.java:284 - Scheduling index flush to 
> /mydir/apache-cassandra-3.10-SNAPSHOT/data/data/vjtest/tester-6f1fdff0a30611e692c087673c5ef8d4/mc-1-big-SI_tester_idx_val1.db
> INFO  [SASI-Memtable:1] 2016-11-04 22:19:35,447 
> PerSSTableIndexWriter.java:335 - Index flush to 
> /mydir/apache-cassandra-3.10-SNAPSHOT/data/data/vjtest/tester-6f1fdff0a30611e692c087673c5ef8d4/mc-1-big-SI_tester_idx_val1.db
>  took 16 ms.
> ERROR [SASI-Memtable:1] 2016-11-04 22:19:35,449 CassandraDaemon.java:229 - 
> Exception in thread Thread[SASI-Memtable:1,5,RMI Runtime]
> java.lang.AssertionError: cannot have more than 8 overflow collisions per 
> leaf, but had: 9
> at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createOverflowEntry(AbstractTokenTreeBuilder.java:357)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createEntry(AbstractTokenTreeBuilder.java:346)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.DynamicTokenTreeBuilder$DynamicLeaf.serializeData(DynamicTokenTreeBuilder.java:180)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.serialize(AbstractTokenTreeBuilder.java:306)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder.write(AbstractTokenTreeBuilder.java:90)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableDataBlock.flushAndClear(OnDiskIndexBuilder.java:629)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.flush(OnDiskIndexBuilder.java:446)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.finalFlush(OnDiskIndexBuilder.java:451)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:296)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:258)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:241)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> 

[jira] [Updated] (CASSANDRA-12877) SASI index throwing AssertionError on index creation/flush

2016-11-04 Thread Voytek Jarnot (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Voytek Jarnot updated CASSANDRA-12877:
--
Summary: SASI index throwing AssertionError on index creation/flush  (was: 
SASI index throwing AssertionError on index creation)

> SASI index throwing AssertionError on index creation/flush
> --
>
> Key: CASSANDRA-12877
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12877
> Project: Cassandra
>  Issue Type: Bug
>  Components: sasi
> Environment: 3.9 and 3.10 tested on both linux and osx
>Reporter: Voytek Jarnot
>
> Possibly a 3.10 regression?  The exact test shown below does not error in 3.9.
> I built and installed a 3.10 snapshot (built 04-Nov-2016) to get around 
> CASSANDRA-11670, CASSANDRA-12689, and CASSANDRA-12223 which are holding me 
> back when using 3.9.
> Now I'm able to make nodetool flush (or a scheduled flush) crash easily with 
> a SASI:
> {code}
> CREATE KEYSPACE vjtest WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': '1'};
> use vjtest ;
> create table tester(id1 text, id2 text, id3 text, val1 text, primary 
> key((id1, id2), id3));
> create custom index tester_idx_val1 on tester(val1) using 
> 'org.apache.cassandra.index.sasi.SASIIndex';
> insert into tester(id1,id2,id3, val1) values ('1-1','1-2','1-3','asdf');
> insert into tester(id1,id2,id3, val1) values ('1-1','1-2','2-3','asdf');
> insert into tester(id1,id2,id3, val1) values ('1-1','1-2','3-3','asdf');
> insert into tester(id1,id2,id3, val1) values ('1-1','1-2','4-3','asdf');
> insert into tester(id1,id2,id3, val1) values ('1-1','1-2','5-3','asdf');
> insert into tester(id1,id2,id3, val1) values ('1-1','1-2','6-3','asdf');
> insert into tester(id1,id2,id3, val1) values ('1-1','1-2','7-3','asdf');
> insert into tester(id1,id2,id3, val1) values ('1-1','1-2','8-3','asdf');
> insert into tester(id1,id2,id3, val1) values ('1-1','1-2','9-3','asdf');
> {code}
> Not enough going on here to trigger a flush, so following a manual {{nodetool 
> flush vjtest}} I get the following in {{system.log}}:
> {code}
> INFO  [MemtableFlushWriter:3] 2016-11-04 22:19:35,412 
> PerSSTableIndexWriter.java:284 - Scheduling index flush to 
> /mydir/apache-cassandra-3.10-SNAPSHOT/data/data/vjtest/tester-6f1fdff0a30611e692c087673c5ef8d4/mc-1-big-SI_tester_idx_val1.db
> INFO  [SASI-Memtable:1] 2016-11-04 22:19:35,447 
> PerSSTableIndexWriter.java:335 - Index flush to 
> /mydir/apache-cassandra-3.10-SNAPSHOT/data/data/vjtest/tester-6f1fdff0a30611e692c087673c5ef8d4/mc-1-big-SI_tester_idx_val1.db
>  took 16 ms.
> ERROR [SASI-Memtable:1] 2016-11-04 22:19:35,449 CassandraDaemon.java:229 - 
> Exception in thread Thread[SASI-Memtable:1,5,RMI Runtime]
> java.lang.AssertionError: cannot have more than 8 overflow collisions per 
> leaf, but had: 9
> at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createOverflowEntry(AbstractTokenTreeBuilder.java:357)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createEntry(AbstractTokenTreeBuilder.java:346)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.DynamicTokenTreeBuilder$DynamicLeaf.serializeData(DynamicTokenTreeBuilder.java:180)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.serialize(AbstractTokenTreeBuilder.java:306)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder.write(AbstractTokenTreeBuilder.java:90)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableDataBlock.flushAndClear(OnDiskIndexBuilder.java:629)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.flush(OnDiskIndexBuilder.java:446)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.finalFlush(OnDiskIndexBuilder.java:451)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:296)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:258)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:241)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at 
> 

[jira] [Commented] (CASSANDRA-11990) Address rows rather than partitions in SASI

2016-11-04 Thread Voytek Jarnot (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15638499#comment-15638499
 ] 

Voytek Jarnot commented on CASSANDRA-11990:
---

Too much of a dilettante to know, but my relatively-uneducated guess is that 
the fix for this is causing CASSANDRA-12877.

> Address rows rather than partitions in SASI
> ---
>
> Key: CASSANDRA-11990
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11990
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL, sasi
>Reporter: Alex Petrov
>Assignee: Alex Petrov
> Fix For: 3.10
>
> Attachments: perf.pdf, size_comparison.png
>
>
> Currently, the lookup in SASI index would return the key position of the 
> partition. After the partition lookup, the rows are iterated and the 
> operators are applied in order to filter out ones that do not match.
> bq. TokenTree which accepts variable size keys (such would enable different 
> partitioners, collections support, primary key indexing etc.), 





[jira] [Updated] (CASSANDRA-12877) SASI index throwing AssertionError on index creation

2016-11-04 Thread Voytek Jarnot (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Voytek Jarnot updated CASSANDRA-12877:
--
Description: 
Possibly a 3.10 regression?  The exact test shown below does not error in 3.9.

I built and installed a 3.10 snapshot (built 04-Nov-2016) to get around 
CASSANDRA-11670, CASSANDRA-12689, and CASSANDRA-12223 which are holding me back 
when using 3.9.

Now I'm able to make nodetool flush (or a scheduled flush) crash easily with a 
SASI:

{code}
CREATE KEYSPACE vjtest WITH replication = {'class': 'SimpleStrategy', 
'replication_factor': '1'};
use vjtest ;
create table tester(id1 text, id2 text, id3 text, val1 text, primary key((id1, 
id2), id3));
create custom index tester_idx_val1 on tester(val1) using 
'org.apache.cassandra.index.sasi.SASIIndex';
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','1-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','2-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','3-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','4-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','5-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','6-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','7-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','8-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','9-3','asdf');
{code}

Not enough going on here to trigger a flush, so following a manual {{nodetool 
flush vjtest}} I get the following in {{system.log}}:

{code}
INFO  [MemtableFlushWriter:3] 2016-11-04 22:19:35,412 
PerSSTableIndexWriter.java:284 - Scheduling index flush to 
/mydir/apache-cassandra-3.10-SNAPSHOT/data/data/vjtest/tester-6f1fdff0a30611e692c087673c5ef8d4/mc-1-big-SI_tester_idx_val1.db
INFO  [SASI-Memtable:1] 2016-11-04 22:19:35,447 PerSSTableIndexWriter.java:335 
- Index flush to 
/mydir/apache-cassandra-3.10-SNAPSHOT/data/data/vjtest/tester-6f1fdff0a30611e692c087673c5ef8d4/mc-1-big-SI_tester_idx_val1.db
 took 16 ms.
ERROR [SASI-Memtable:1] 2016-11-04 22:19:35,449 CassandraDaemon.java:229 - 
Exception in thread Thread[SASI-Memtable:1,5,RMI Runtime]
java.lang.AssertionError: cannot have more than 8 overflow collisions per leaf, 
but had: 9
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createOverflowEntry(AbstractTokenTreeBuilder.java:357)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createEntry(AbstractTokenTreeBuilder.java:346)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.DynamicTokenTreeBuilder$DynamicLeaf.serializeData(DynamicTokenTreeBuilder.java:180)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.serialize(AbstractTokenTreeBuilder.java:306)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder.write(AbstractTokenTreeBuilder.java:90)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableDataBlock.flushAndClear(OnDiskIndexBuilder.java:629)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.flush(OnDiskIndexBuilder.java:446)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.finalFlush(OnDiskIndexBuilder.java:451)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:296)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:258)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:241)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.PerSSTableIndexWriter$Index.lambda$scheduleSegmentFlush$0(PerSSTableIndexWriter.java:267)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.PerSSTableIndexWriter$Index.lambda$complete$1(PerSSTableIndexWriter.java:296)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_101]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_101]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_101]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_101]
at 

[jira] [Updated] (CASSANDRA-12877) SASI index throwing AssertionError on index creation

2016-11-04 Thread Voytek Jarnot (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Voytek Jarnot updated CASSANDRA-12877:
--
Description: 
Possibly a 3.10 regression?  The exact test shown below does not error in 3.9.

I built and installed a 3.10 snapshot (built 04-Nov-2016) to get around 
CASSANDRA-11670, CASSANDRA-12689, and CASSANDRA-12223 which are holding me back 
when using 3.9.

Now I'm able to make nodetool flush (or a scheduled flush) crash easily with a 
SASI:

{code}
CREATE KEYSPACE vjtest WITH replication = {'class': 'SimpleStrategy', 
'replication_factor': '1'};
use vjtest ;
create table tester(id1 text, id2 text, id3 text, val1 text, primary key((id1, 
id2), id3));
create custom index tester_idx_val1 on tester(val1) using 
'org.apache.cassandra.index.sasi.SASIIndex';
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','1-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','2-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','3-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','4-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','5-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','6-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','7-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','8-3','asdf');
insert into tester(id1,id2,id3, val1) values ('1-1','1-2','9-3','asdf');
{code}

Not enough going on here to trigger a flush, so following a manual {{nodetool 
flush vjtest}} I get the following in {{system.log}}:

{code}
INFO  [MemtableFlushWriter:3] 2016-11-04 22:19:35,412 
PerSSTableIndexWriter.java:284 - Scheduling index flush to 
/mydir/apache-cassandra-3.10-SNAPSHOT/data/data/vjtest/tester-6f1fdff0a30611e692c087673c5ef8d4/mc-1-big-SI_tester_idx_val1.db
INFO  [SASI-Memtable:1] 2016-11-04 22:19:35,447 PerSSTableIndexWriter.java:335 
- Index flush to 
/mydir/apache-cassandra-3.10-SNAPSHOT/data/data/vjtest/tester-6f1fdff0a30611e692c087673c5ef8d4/mc-1-big-SI_tester_idx_val1.db
 took 16 ms.
ERROR [SASI-Memtable:1] 2016-11-04 22:19:35,449 CassandraDaemon.java:229 - 
Exception in thread Thread[SASI-Memtable:1,5,RMI Runtime]
java.lang.AssertionError: cannot have more than 8 overflow collisions per leaf, 
but had: 9
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createOverflowEntry(AbstractTokenTreeBuilder.java:357)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createEntry(AbstractTokenTreeBuilder.java:346)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.DynamicTokenTreeBuilder$DynamicLeaf.serializeData(DynamicTokenTreeBuilder.java:180)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.serialize(AbstractTokenTreeBuilder.java:306)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder.write(AbstractTokenTreeBuilder.java:90)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableDataBlock.flushAndClear(OnDiskIndexBuilder.java:629)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.flush(OnDiskIndexBuilder.java:446)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.finalFlush(OnDiskIndexBuilder.java:451)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:296)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:258)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:241)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.PerSSTableIndexWriter$Index.lambda$scheduleSegmentFlush$0(PerSSTableIndexWriter.java:267)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.PerSSTableIndexWriter$Index.lambda$complete$1(PerSSTableIndexWriter.java:296)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_101]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_101]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_101]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_101]
at 

[jira] [Issue Comment Deleted] (CASSANDRA-12877) SASI index throwing AssertionError on index creation

2016-11-04 Thread Voytek Jarnot (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Voytek Jarnot updated CASSANDRA-12877:
--
Comment: was deleted

(was: Attached full log output.  Fresh build of cassandra-3.X; fresh install, 
fresh keyspace (SimpleStrategy, RF 1).

1) built/installed 3.10-SNAPSHOT from git branch cassandra-3.X
2) created keyspace (SimpleStrategy, RF 1)
3) created table: (simplified version below, many more valX columns present)
{quote}
CREATE TABLE mytable (
id1 text,
id2 text,
id3 date,
id4 timestamp,
id5 text,
val1 text,
val2 text,
val3 text,
task_id text,
document_nbr text,
val5 text,
PRIMARY KEY ((id1, id2), id3, id4, id5)
) WITH CLUSTERING ORDER BY (id3 DESC, id4 DESC, id5 ASC)
{quote}

4) created materialized view:
{quote}
CREATE MATERIALIZED VIEW mytable_by_task_id AS
SELECT *
FROM mytable
WHERE id1 IS NOT NULL AND id2 IS NOT NULL AND id3 IS NOT NULL AND id4 IS 
NOT NULL AND id5 IS NOT NULL AND task_id IS NOT NULL
PRIMARY KEY (task_id, id3, id4, id1, id2, id5)
WITH CLUSTERING ORDER BY (id3 DESC, id4 DESC, id1 ASC, id2 ASC, id5 ASC)
{quote}
5) inserted 27 million "rows" (i.e., unique values for id5)
6) create index attempt
{quote}
create custom index idx_ar_document_nbr on test_table(document_nbr) using 
'org.apache.cassandra.index.sasi.SASIIndex';
{quote}
7) no error in cqlsh, logged errors attached.

Beginning to suspect CASSANDRA-11990 ... but don't have enough 
internals-knowledge to do much more than guess.)

> SASI index throwing AssertionError on index creation
> 
>
> Key: CASSANDRA-12877
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12877
> Project: Cassandra
>  Issue Type: Bug
>  Components: sasi
> Environment: 3.9 and 3.10 tested on both linux and osx
>Reporter: Voytek Jarnot
>
> Possibly a 3.10 regression?
> I built and installed a 3.10 snapshot (built 03-Nov-2016) to get around 
> CASSANDRA-11670, CASSANDRA-12689, and CASSANDRA-12223 which are holding me 
> back when using 3.9.
> Would like to state up front that I can't duplicate this with a lightweight 
> throwaway test, which is frustrating, but it keeps hitting me on our dev 
> cluster.  It may require a certain amount of data present (or perhaps a high 
> number of nulls in the indexed column) - never had any luck duplicating with 
> the table shown below.
> Table roughly resembles the following, with many more 'valx' columns:
> CREATE TABLE idx_test_table (
> id1 text,
> id2 text,
> id3 text,
> id4 text,
> val1 text,
> val2 text,
> PRIMARY KEY ((id1, id2), id3, id4)
> ) WITH CLUSTERING ORDER BY (id3 DESC, id4 ASC);
> CREATE CUSTOM INDEX idx_test_index ON idx_test_table (val2) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex';
> The error below occurs in 3.10, but not in 3.9; it occurs whether I insert a 
> bunch of dev data and then create the index, or whether I create the index 
> and then insert a bunch of test data.
> {quote}
> INFO  [MemtableFlushWriter:5] 2016-11-03 21:00:19,416 
> PerSSTableIndexWriter.java:284 - Scheduling index flush to 
> /u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db
> INFO  [SASI-Memtable:1] 2016-11-03 21:00:19,450 
> PerSSTableIndexWriter.java:335 - Index flush to 
> /u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db
>  took 33 ms.
> ERROR [SASI-Memtable:1] 2016-11-03 21:00:19,454 CassandraDaemon.java:229 - 
> Exception in thread Thread[SASI-Memtable:1,5,main]
> java.lang.AssertionError: cannot have more than 8 overflow collisions per 
> leaf, but had: 25
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createOverflowEntry(AbstractTokenTreeBuilder.java:357)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createEntry(AbstractTokenTreeBuilder.java:346)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.DynamicTokenTreeBuilder$DynamicLeaf.serializeData(DynamicTokenTreeBuilder.java:180)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.serialize(AbstractTokenTreeBuilder.java:306)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder.write(AbstractTokenTreeBuilder.java:90)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableDataBlock.flushAndClear(OnDiskIndexBuilder.java:629)
>  

[jira] [Issue Comment Deleted] (CASSANDRA-12877) SASI index throwing AssertionError on index creation

2016-11-04 Thread Voytek Jarnot (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Voytek Jarnot updated CASSANDRA-12877:
--
Comment: was deleted

(was: Attached (slightly sanitized) result of a failed attempt to create a SASI 
index as described but on my localhost 1-machine cluster.  Full series of 
stacktraces as well as the "Update table ..." output, giving the details of my 
setup.

Perhaps worth mentioning: the table has ~27 million values for the final 
primary key column.)

> SASI index throwing AssertionError on index creation
> 
>
> Key: CASSANDRA-12877
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12877
> Project: Cassandra
>  Issue Type: Bug
>  Components: sasi
> Environment: 3.9 and 3.10 tested on both linux and osx
>Reporter: Voytek Jarnot
>
> Possibly a 3.10 regression?
> I built and installed a 3.10 snapshot (built 03-Nov-2016) to get around 
> CASSANDRA-11670, CASSANDRA-12689, and CASSANDRA-12223 which are holding me 
> back when using 3.9.
> Would like to state up front that I can't duplicate this with a lightweight 
> throwaway test, which is frustrating, but it keeps hitting me on our dev 
> cluster.  It may require a certain amount of data present (or perhaps a high 
> number of nulls in the indexed column) - never had any luck duplicating with 
> the table shown below.
> Table roughly resembles the following, with many more 'valx' columns:
> CREATE TABLE idx_test_table (
> id1 text,
> id2 text,
> id3 text,
> id4 text,
> val1 text,
> val2 text,
> PRIMARY KEY ((id1, id2), id3, id4)
> ) WITH CLUSTERING ORDER BY (id3 DESC, id4 ASC);
> CREATE CUSTOM INDEX idx_test_index ON idx_test_table (val2) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex';
> The error below occurs in 3.10, but not in 3.9; it occurs whether I insert a 
> bunch of dev data and then create the index, or whether I create the index 
> and then insert a bunch of test data.
> {quote}
> INFO  [MemtableFlushWriter:5] 2016-11-03 21:00:19,416 
> PerSSTableIndexWriter.java:284 - Scheduling index flush to 
> /u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db
> INFO  [SASI-Memtable:1] 2016-11-03 21:00:19,450 
> PerSSTableIndexWriter.java:335 - Index flush to 
> /u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db
>  took 33 ms.
> ERROR [SASI-Memtable:1] 2016-11-03 21:00:19,454 CassandraDaemon.java:229 - 
> Exception in thread Thread[SASI-Memtable:1,5,main]
> java.lang.AssertionError: cannot have more than 8 overflow collisions per 
> leaf, but had: 25
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createOverflowEntry(AbstractTokenTreeBuilder.java:357)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createEntry(AbstractTokenTreeBuilder.java:346)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.DynamicTokenTreeBuilder$DynamicLeaf.serializeData(DynamicTokenTreeBuilder.java:180)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.serialize(AbstractTokenTreeBuilder.java:306)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder.write(AbstractTokenTreeBuilder.java:90)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableDataBlock.flushAndClear(OnDiskIndexBuilder.java:629)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.flush(OnDiskIndexBuilder.java:446)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.add(OnDiskIndexBuilder.java:433)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.addTerm(OnDiskIndexBuilder.java:207)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:293)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:258)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:241)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> 

[jira] [Updated] (CASSANDRA-12877) SASI index throwing AssertionError on index creation

2016-11-04 Thread Voytek Jarnot (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Voytek Jarnot updated CASSANDRA-12877:
--
Attachment: (was: idx-stacktrace-04-nov-2016.txt)

> SASI index throwing AssertionError on index creation
> 
>
> Key: CASSANDRA-12877
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12877
> Project: Cassandra
>  Issue Type: Bug
>  Components: sasi
> Environment: 3.9 and 3.10 tested on both linux and osx
>Reporter: Voytek Jarnot
>
> Possibly a 3.10 regression?
> I built and installed a 3.10 snapshot (built 03-Nov-2016) to get around 
> CASSANDRA-11670, CASSANDRA-12689, and CASSANDRA-12223 which are holding me 
> back when using 3.9.
> Would like to state up front that I can't duplicate this with a lightweight 
> throwaway test, which is frustrating, but it keeps hitting me on our dev 
> cluster.  It may require a certain amount of data present (or perhaps a high 
> number of nulls in the indexed column) - never had any luck duplicating with 
> the table shown below.
> Table roughly resembles the following, with many more 'valx' columns:
> CREATE TABLE idx_test_table (
> id1 text,
> id2 text,
> id3 text,
> id4 text,
> val1 text,
> val2 text,
> PRIMARY KEY ((id1, id2), id3, id4)
> ) WITH CLUSTERING ORDER BY (id3 DESC, id4 ASC);
> CREATE CUSTOM INDEX idx_test_index ON idx_test_table (val2) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex';
> The error below occurs in 3.10, but not in 3.9; it occurs whether I insert a 
> bunch of dev data and then create the index, or whether I create the index 
> and then insert a bunch of test data.
> {quote}
> INFO  [MemtableFlushWriter:5] 2016-11-03 21:00:19,416 
> PerSSTableIndexWriter.java:284 - Scheduling index flush to 
> /u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db
> INFO  [SASI-Memtable:1] 2016-11-03 21:00:19,450 
> PerSSTableIndexWriter.java:335 - Index flush to 
> /u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db
>  took 33 ms.
> ERROR [SASI-Memtable:1] 2016-11-03 21:00:19,454 CassandraDaemon.java:229 - 
> Exception in thread Thread[SASI-Memtable:1,5,main]
> java.lang.AssertionError: cannot have more than 8 overflow collisions per 
> leaf, but had: 25
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createOverflowEntry(AbstractTokenTreeBuilder.java:357)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createEntry(AbstractTokenTreeBuilder.java:346)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.DynamicTokenTreeBuilder$DynamicLeaf.serializeData(DynamicTokenTreeBuilder.java:180)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.serialize(AbstractTokenTreeBuilder.java:306)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder.write(AbstractTokenTreeBuilder.java:90)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableDataBlock.flushAndClear(OnDiskIndexBuilder.java:629)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.flush(OnDiskIndexBuilder.java:446)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.add(OnDiskIndexBuilder.java:433)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.addTerm(OnDiskIndexBuilder.java:207)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:293)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:258)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:241)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.PerSSTableIndexWriter$Index.lambda$scheduleSegmentFlush$0(PerSSTableIndexWriter.java:267)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.PerSSTableIndexWriter$Index.lambda$complete$1(PerSSTableIndexWriter.java:296)
>  

[jira] [Updated] (CASSANDRA-12877) SASI index throwing AssertionError on index creation

2016-11-04 Thread Voytek Jarnot (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Voytek Jarnot updated CASSANDRA-12877:
--
Description: 
Possibly a 3.10 regression?

I built and installed a 3.10 snapshot (built 03-Nov-2016) to get around 
CASSANDRA-11670, CASSANDRA-12689, and CASSANDRA-12223 which are holding me back 
when using 3.9.

Would like to state up front that I can't duplicate this with a lightweight 
throwaway test, which is frustrating, but it keeps hitting me on our dev 
cluster.  It may require a certain amount of data present (or perhaps a high 
number of nulls in the indexed column) - never had any luck duplicating with 
the table shown below.

Table roughly resembles the following, with many more 'valx' columns:

CREATE TABLE idx_test_table (
id1 text,
id2 text,
id3 text,
id4 text,
val1 text,
val2 text,
PRIMARY KEY ((id1, id2), id3, id4)
) WITH CLUSTERING ORDER BY (id3 DESC, id4 ASC);
CREATE CUSTOM INDEX idx_test_index ON idx_test_table (val2) USING 
'org.apache.cassandra.index.sasi.SASIIndex';


The error below occurs in 3.10, but not in 3.9; it occurs whether I insert a 
bunch of dev data and then create the index, or whether I create the index and 
then insert a bunch of test data.

{quote}
INFO  [MemtableFlushWriter:5] 2016-11-03 21:00:19,416 
PerSSTableIndexWriter.java:284 - Scheduling index flush to 
/u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db
INFO  [SASI-Memtable:1] 2016-11-03 21:00:19,450 PerSSTableIndexWriter.java:335 
- Index flush to 
/u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db
 took 33 ms.
ERROR [SASI-Memtable:1] 2016-11-03 21:00:19,454 CassandraDaemon.java:229 - 
Exception in thread Thread[SASI-Memtable:1,5,main]
java.lang.AssertionError: cannot have more than 8 overflow collisions per leaf, 
but had: 25
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createOverflowEntry(AbstractTokenTreeBuilder.java:357)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createEntry(AbstractTokenTreeBuilder.java:346)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.DynamicTokenTreeBuilder$DynamicLeaf.serializeData(DynamicTokenTreeBuilder.java:180)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.serialize(AbstractTokenTreeBuilder.java:306)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder.write(AbstractTokenTreeBuilder.java:90)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableDataBlock.flushAndClear(OnDiskIndexBuilder.java:629)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.flush(OnDiskIndexBuilder.java:446)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.add(OnDiskIndexBuilder.java:433)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.addTerm(OnDiskIndexBuilder.java:207)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:293)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:258)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:241)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.PerSSTableIndexWriter$Index.lambda$scheduleSegmentFlush$0(PerSSTableIndexWriter.java:267)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
org.apache.cassandra.index.sasi.disk.PerSSTableIndexWriter$Index.lambda$complete$1(PerSSTableIndexWriter.java:296)
 ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_91]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[na:1.8.0_91]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_91]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
ERROR [MemtableFlushWriter:5] 2016-11-03 21:00:19,459 DataTracker.java:168 - 
Can't open index 

[jira] [Updated] (CASSANDRA-12877) SASI index throwing AssertionError on index creation

2016-11-04 Thread Voytek Jarnot (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Voytek Jarnot updated CASSANDRA-12877:
--
Attachment: (was: idx-stacktrace-03-nov-2016.txt)

> SASI index throwing AssertionError on index creation
> 
>
> Key: CASSANDRA-12877
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12877
> Project: Cassandra
>  Issue Type: Bug
>  Components: sasi
> Environment: 3.9 and 3.10 tested on both linux and osx
>Reporter: Voytek Jarnot
>
> Possibly a 3.10 regression?
> I built and installed a 3.10 snapshot (built 03-Nov-2016) to get around 
> CASSANDRA-11670, CASSANDRA-12689, and CASSANDRA-12223 which are holding me 
> back when using 3.9.
> Would like to state up front that I can't duplicate this with a lightweight 
> throwaway test, which is frustrating, but it keeps hitting me on our dev 
> cluster.  It may require a certain amount of data present (or perhaps a high 
> number of nulls in the indexed column) - never had any luck duplicating with 
> the table shown below.
> Table roughly resembles the following, with many more 'valx' columns:
> CREATE TABLE idx_test_table (
> id1 text,
> id2 text,
> id3 text,
> id4 text,
> val1 text,
> val2 text,
> PRIMARY KEY ((id1, id2), id3, id4)
> ) WITH CLUSTERING ORDER BY (id3 DESC, id4 ASC);
> CREATE CUSTOM INDEX idx_test_index ON idx_test_table (val2) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex';
> The error below occurs in 3.10, but not in 3.9; it occurs whether I insert a 
> bunch of dev data and then create the index, or whether I create the index 
> and then insert a bunch of test data.
> {quote}
> INFO  [MemtableFlushWriter:5] 2016-11-03 21:00:19,416 
> PerSSTableIndexWriter.java:284 - Scheduling index flush to 
> /u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db
> INFO  [SASI-Memtable:1] 2016-11-03 21:00:19,450 
> PerSSTableIndexWriter.java:335 - Index flush to 
> /u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db
>  took 33 ms.
> ERROR [SASI-Memtable:1] 2016-11-03 21:00:19,454 CassandraDaemon.java:229 - 
> Exception in thread Thread[SASI-Memtable:1,5,main]
> java.lang.AssertionError: cannot have more than 8 overflow collisions per 
> leaf, but had: 25
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createOverflowEntry(AbstractTokenTreeBuilder.java:357)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createEntry(AbstractTokenTreeBuilder.java:346)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.DynamicTokenTreeBuilder$DynamicLeaf.serializeData(DynamicTokenTreeBuilder.java:180)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.serialize(AbstractTokenTreeBuilder.java:306)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder.write(AbstractTokenTreeBuilder.java:90)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableDataBlock.flushAndClear(OnDiskIndexBuilder.java:629)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.flush(OnDiskIndexBuilder.java:446)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.add(OnDiskIndexBuilder.java:433)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.addTerm(OnDiskIndexBuilder.java:207)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:293)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:258)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:241)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.PerSSTableIndexWriter$Index.lambda$scheduleSegmentFlush$0(PerSSTableIndexWriter.java:267)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.PerSSTableIndexWriter$Index.lambda$complete$1(PerSSTableIndexWriter.java:296)
>  

[jira] [Comment Edited] (CASSANDRA-12627) Provide new seed providers

2016-11-04 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15638193#comment-15638193
 ] 

Edward Capriolo edited comment on CASSANDRA-12627 at 11/5/16 12:36 AM:
---

I am a bit confused. I am looking at the code you are sending me here:
{quote}
 String priamSeeds = 
DataFetcher.fetchData("http://127.0.0.1:8080/Priam/REST/v1/cassconfig/get_seeds");
for (String seed : priamSeeds.split(","))
seeds.add(InetAddress.getByName(seed));
{quote}

With the feature I am proposing you can replace that with:

CASSANDRA_SEED_LIST=`wget 
http://127.0.0.1:8080/Priam/REST/v1/cassconfig/get_seeds` bin/cassandra

You could also do

CASSANDRA_SEED_LIST=`dig -t txt seeds.mydomain.com` bin/cassandra

Doesn't this seem extremely useful and straightforward?


was (Author: appodictic):
I am a bit confused. I am looking at the code you are sending me here:
{quote}
 String priamSeeds = 
DataFetcher.fetchData("http://127.0.0.1:8080/Priam/REST/v1/cassconfig/get_seeds");
for (String seed : priamSeeds.split(","))
seeds.add(InetAddress.getByName(seed));
{quote}

With the feature I am proposing you can replace that with:

CASSANDRA_SEED_LIST=`wget 
http://127.0.0.1:8080/Priam/REST/v1/cassconfig/get_seeds` bin/cassandra

You could also do

CASSANDRA_SEED_LIST=`dig -t txt mydomain.com` bin/cassandra

Doesn't this seem extremely useful and straightforward?

> Provide new seed providers
> --
>
> Key: CASSANDRA-12627
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12627
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>
> SeedProvider is plugable, however only one implementation exists.
> Changes:
> * Create a SeedProvider that reads properties from System properties or env
> * Provide a SeedProvider that scans ranges of IP addresses to find peers.
> * Refactor interface to abstract class because all seed providers must 
> provide a constructor that accepts Map 
> * correct error messages
> * Do not catch Exception use MultiCatch and catch typed exceptions



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12627) Provide new seed providers

2016-11-04 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15638193#comment-15638193
 ] 

Edward Capriolo commented on CASSANDRA-12627:
-

I am a bit confused. I am looking at the code you are sending me here:
{quote}
 String priamSeeds = 
DataFetcher.fetchData("http://127.0.0.1:8080/Priam/REST/v1/cassconfig/get_seeds");
for (String seed : priamSeeds.split(","))
seeds.add(InetAddress.getByName(seed));
{quote}

With the feature I am proposing you can replace that with:

CASSANDRA_SEED_LIST=`wget 
http://127.0.0.1:8080/Priam/REST/v1/cassconfig/get_seeds` bin/cassandra

You could also do

CASSANDRA_SEED_LIST=`dig -t txt mydomain.com` bin/cassandra

Doesn't this seem extremely useful and straight forward?

> Provide new seed providers
> --
>
> Key: CASSANDRA-12627
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12627
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>
> SeedProvider is plugable, however only one implementation exists.
> Changes:
> * Create a SeedProvider that reads properties from System properties or env
> * Provide a SeedProvider that scans ranges of IP addresses to find peers.
> * Refactor interface to abstract class because all seed providers must 
> provide a constructor that accepts Map 
> * correct error messages
> * Do not catch Exception use MultiCatch and catch typed exceptions



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12627) Provide new seed providers

2016-11-04 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15638108#comment-15638108
 ] 

Jason Brown commented on CASSANDRA-12627:
-

bq. Operators should not actually need to muck around with the yaml file anyway

Well, you already need to edit the yaml in order to define the 
{{SeedProvider}}, let alone setting addresses, ports, and the like. For better 
or worse, operators will need to deal with the yaml. Thus, I'm -1 on 
{{PropertyOrEnvironmentSeedProvider}}.

{{NeighborSeedProvider}} gets into equally unexpected territory. By 
naively setting the {{scan.distance}} too wide, we'd include nodes in the list 
that do not exist or are not c* nodes. In {{Gossiper}}, we will try to connect 
and gossip with those nodes as seeds ({{Gossiper#maybeGossipToSeed()}}). That 
means extra unix sockets (really not a problem), but also extra threads due to 
the existing internode messaging service implementation (see 
{{OutboundTcpConnectionPool}}). Those extra threads are expensive in large 
clusters. I'm not convinced that setting explicit addresses as the seed(s) is 
more difficult or requires more config understanding than defining a 
{{scan.distance}}. Again, I'm -1 on {{NeighborSeedProvider}}

bq. why is the configuration so obtuse and plugable if only one implementation 
exists?

For the record, there are other provider implementations, some public (like 
[Priam|https://github.com/Netflix/Priam/blob/master/priam-cass-extensions/src/main/java/com/netflix/priam/cassandra/extensions/NFSeedProvider.java]),
 some private. In those cases, special functionality is needed, and the space 
to plug it in is provided. The trade off, of course, is an increased config 
complexity that affects all users, in some small way.

I agree that the yaml config is non-trivial, but even that is pluggable. If we 
want to entertain a "config-lite" for new users/operators to get something stood 
up quickly, that might be a great thing - but outside the scope of this ticket.


> Provide new seed providers
> --
>
> Key: CASSANDRA-12627
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12627
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>
> SeedProvider is plugable, however only one implementation exists.
> Changes:
> * Create a SeedProvider that reads properties from System properties or env
> * Provide a SeedProvider that scans ranges of IP addresses to find peers.
> * Refactor interface to abstract class because all seed providers must 
> provide a constructor that accepts Map 
> * correct error messages
> * Do not catch Exception use MultiCatch and catch typed exceptions



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12882) Deadlock in MemtableAllocator

2016-11-04 Thread Nimi Wariboko Jr. (JIRA)
Nimi Wariboko Jr. created CASSANDRA-12882:
-

 Summary: Deadlock in MemtableAllocator
 Key: CASSANDRA-12882
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12882
 Project: Cassandra
  Issue Type: Bug
 Environment: Ubuntu 14.40
Cassandra 3.5
Reporter: Nimi Wariboko Jr.
 Fix For: 3.5
 Attachments: cassandra.yaml, threaddump.txt

I'm seeing an issue where a node will eventually lock up, along with its thread 
pools. I looked into jstack, and a lot of threads are stuck in the MemtableAllocator:

{code}
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
at 
org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal.awaitUninterruptibly(WaitQueue.java:279)
at 
org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator.allocate(MemtableAllocator.java:198)
at 
org.apache.cassandra.utils.memory.SlabAllocator.allocate(SlabAllocator.java:89)
at 
org.apache.cassandra.utils.memory.ContextAllocator.allocate(ContextAllocator.java:57)
at 
org.apache.cassandra.utils.memory.ContextAllocator.clone(ContextAllocator.java:47)
at 
org.apache.cassandra.utils.memory.MemtableBufferAllocator.clone(MemtableBufferAllocator.java:41)
{code}

I looked into the code, and it's not immediately apparent to me what thread 
might hold the relevant lock.
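
As a conceptual sketch only (not Cassandra's actual {{MemtableAllocator}}): writers park in {{allocate()}} until capacity is returned to the pool, so if whatever is responsible for releasing memory (e.g. a flush) is itself stuck, every writer stays parked and the node appears hung, which matches the parked threads in the dump above.

{code}
// Conceptual illustration of the blocking pattern, not the real allocator.
import java.util.concurrent.Semaphore;

public class BlockingPoolSketch
{
    private final Semaphore capacity;

    public BlockingPoolSketch(int bytes)
    {
        this.capacity = new Semaphore(bytes);
    }

    public void allocate(int bytes) throws InterruptedException
    {
        capacity.acquire(bytes);   // parks the writer thread when the pool is exhausted
    }

    public void release(int bytes)
    {
        capacity.release(bytes);   // normally called once a flush frees memtable memory
    }
}
{code}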



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-12877) SASI index throwing AssertionError on index creation

2016-11-04 Thread Voytek Jarnot (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637691#comment-15637691
 ] 

Voytek Jarnot edited comment on CASSANDRA-12877 at 11/4/16 9:12 PM:


Attached full log output.  Fresh build of cassandra-3.X; fresh install, fresh 
keyspace (SimpleStrategy, RF 1).

1) built/installed 3.10-SNAPSHOT from git branch cassandra-3.X
2) created keyspace (SimpleStrategy, RF 1)
3) created table: (simplified version below, many more valX columns present)
{quote}
CREATE TABLE mytable (
id1 text,
id2 text,
id3 date,
id4 timestamp,
id5 text,
val1 text,
val2 text,
val3 text,
task_id text,
document_nbr text,
val5 text,
PRIMARY KEY ((id1, id2), id3, id4, id5)
) WITH CLUSTERING ORDER BY (id3 DESC, id4 DESC, id5 ASC)
{quote}

4) created materialized view:
{quote}
CREATE MATERIALIZED VIEW mytable_by_task_id AS
SELECT *
FROM mytable
WHERE id1 IS NOT NULL AND id2 IS NOT NULL AND id3 IS NOT NULL AND id4 IS 
NOT NULL AND id5 IS NOT NULL AND task_id IS NOT NULL
PRIMARY KEY (task_id, id3, id4, id1, id2, id5)
WITH CLUSTERING ORDER BY (id3 DESC, id4 DESC, id1 ASC, id2 ASC, id5 ASC)
{quote}
5) inserted 27 million "rows" (i.e., unique values for id5)
6) create index attempt
{quote}
create custom index idx_ar_document_nbr on test_table(document_nbr) using 
'org.apache.cassandra.index.sasi.SASIIndex';
{quote}
7) no error in cqlsh, logged errors attached.

Beginning to suspect CASSANDRA-11990 ... but don't have enough 
internals-knowledge to do much more than guess.


was (Author: voytek.jarnot):
Attached full log output.  Fresh build of cassandra-3.X; fresh install, fresh 
keyspace (SimpleStrategy, RF 1).

1) built/installed 3.10-SNAPSHOT from git branch cassandra-3.X
2) created keyspace (SimpleStrategy, RF 1)
3) created table: (simplified version below, many more valX columns present)
CREATE TABLE mytable (
id1 text,
id2 text,
id3 date,
id4 timestamp,
id5 text,
val1 text,
val2 text,
val3 text,
task_id text,
document_nbr text,
val5 text,
PRIMARY KEY ((id1, id2), id3, id4, id5)
) WITH CLUSTERING ORDER BY (id3 DESC, id4 DESC, id5 ASC)
4) created materialized view:
CREATE MATERIALIZED VIEW mytable_by_task_id AS
SELECT *
FROM mytable
WHERE id1 IS NOT NULL AND id2 IS NOT NULL AND id3 IS NOT NULL AND id4 IS 
NOT NULL AND id5 IS NOT NULL AND task_id IS NOT NULL
PRIMARY KEY (task_id, id3, id4, id1, id2, id5)
WITH CLUSTERING ORDER BY (id3 DESC, id4 DESC, id1 ASC, id2 ASC, id5 ASC)
5) inserted 27 million "rows" (i.e., unique values for id5)
6) create index attempt
create custom index idx_ar_document_nbr on test_table(document_nbr) using 
'org.apache.cassandra.index.sasi.SASIIndex';

7) no error in cqlsh, logged errors attached.

Beginning to suspect CASSANDRA-11990 ... but don't have enough 
internals-knowledge to do much more than guess.

> SASI index throwing AssertionError on index creation
> 
>
> Key: CASSANDRA-12877
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12877
> Project: Cassandra
>  Issue Type: Bug
>  Components: sasi
> Environment: 3.9 and 3.10 tested on both linux and osx
>Reporter: Voytek Jarnot
> Attachments: idx-stacktrace-03-nov-2016.txt, 
> idx-stacktrace-04-nov-2016.txt
>
>
> Possibly a 3.10 regression?
> I built and installed a 3.10 snapshot (built 03-Nov-2016) to get around 
> CASSANDRA-11670, CASSANDRA-12689, and CASSANDRA-12223 which are holding me 
> back when using 3.9. Edit to add: 3 node cluster, replication factor of 2.
> Would like to state up front that I can't duplicate this with a lightweight 
> throwaway test, which is frustrating, but it keeps hitting me on our dev 
> cluster.  It may require a certain amount of data present (or perhaps a high 
> number of nulls in the indexed column) - never had any luck duplicating with 
> the table shown below.
> Table roughly resembles the following, with many more 'valx' columns:
> CREATE TABLE idx_test_table (
> id1 text,
> id2 text,
> id3 text,
> id4 text,
> val1 text,
> val2 text,
> PRIMARY KEY ((id1, id2), id3, id4)
> ) WITH CLUSTERING ORDER BY (id3 DESC, id4 ASC);
> CREATE CUSTOM INDEX idx_test_index ON idx_test_table (val2) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex';
> The error below occurs in 3.10, but not in 3.9; it occurs whether I insert a 
> bunch of dev data and then create the index, or whether I create the index 
> and then insert a bunch of test data.
> {quote}
> INFO  [MemtableFlushWriter:5] 2016-11-03 21:00:19,416 
> PerSSTableIndexWriter.java:284 - Scheduling index flush to 
> /u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db

[jira] [Commented] (CASSANDRA-11218) Prioritize Secondary Index rebuild

2016-11-04 Thread Sam Tunnicliffe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637696#comment-15637696
 ] 

Sam Tunnicliffe commented on CASSANDRA-11218:
-

I took a step back to re-evaluate exactly what is prioritized and when, and it 
turns out that adding priorities to {{CompactionTask}} itself is redundant, and 
actually, in every case it's ignored. Really, it's the 
{{Runnable/WrappedRunnable/Callable}} being submitted to {{CompactionExecutor}} 
that needs to be prioritized and even in cases where these wrap a 
{{CompactionTask}}, these have their own priority explicitly set and do not 
usually inherit it from the task. That is to say, it's the context of the 
submission which determines the priority, rather than the task itself. Having 
{{CompactionTask}} implement {{Prioritized}} actually makes some stuff pretty 
hard to follow.

Actually, it turns out that this is somewhat academic, as most of the 
priorities are not actually being observed anyway, due to the fact that 
{{CompactionExecutor::submitIfRunning}} wraps everything in a 
{{ListenableFutureTask}} before submission. This effectively hides any priority 
that might be set so {{newTask}} always ends up creating another wrapper layer 
with default priorities. 
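
As a generic illustration of why the wrapping hides the priority (these classes are illustrative, not the actual Cassandra ones): a priority comparator only sees the outermost {{Runnable}} it is handed, so once a prioritized task is wrapped in a plain {{FutureTask}} (or a {{ListenableFutureTask}}, as the submitIfRunning path does) the priority information is unreachable and the default applies. Whatever object is actually submitted to the queue has to carry the priority itself.

{code}
// Illustrative only: wrapping a Prioritized runnable hides its priority.
import java.util.concurrent.FutureTask;

interface Prioritized { int priority(); }

class PrioritizedRunnable implements Runnable, Prioritized
{
    public int priority() { return 10; }
    public void run() { /* e.g. index rebuild work */ }
}

public class PriorityWrappingDemo
{
    // What a priority comparator can see on the submitted task:
    static int priorityOf(Runnable task)
    {
        return (task instanceof Prioritized) ? ((Prioritized) task).priority() : 0; // 0 = default
    }

    public static void main(String[] args)
    {
        Runnable task = new PrioritizedRunnable();
        System.out.println(priorityOf(task));                         // 10
        System.out.println(priorityOf(new FutureTask<>(task, null))); // 0 -- hidden by the wrapper
    }
}
{code}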

I've pushed a fix for this to my branch, and also refactored so that 
{{CompactionTask}} no longer implements {{Prioritized}}, which I think 
simplifies things quite a bit. 

Another thing that was bugging me was that the relationship between 
{{OperationType}} and {{Priority}} is somewhat muddy, as for some operations 
prioritization is a meaningless concept (as indicated by the multiple special 
cases). It feels like we're overloading the concepts somewhat which, for me, 
again makes it much harder to grok. So I propose removing all the priority 
stuff from {{OperationType}} and adding a new enum solely to represent the task 
priorities (also done in my branch).

bq. I've removed the remaining uses of subtype prioritization...I've left the 
logic in place for later

I think we should just rip it out if it's not being used. Like you say, if we 
do come up with some good way to prioritise within types later, then we can 
easily add it back, so leaving half of it hanging around in the meantime is 
only going to be confusing.

The other thing that's obviously missing here is tests, 
{{CompactionPriorityTest}} alone clearly isn't sufficient, seeing as it didn't 
catch the fact that no prioritization was actually being done :) 


> Prioritize Secondary Index rebuild
> --
>
> Key: CASSANDRA-11218
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11218
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: sankalp kohli
>Assignee: Jeff Jirsa
>Priority: Minor
>
> We have seen that secondary index rebuild get stuck behind other compaction 
> during a bootstrap and other operations. This causes things to not finish. We 
> should prioritize index rebuild via a separate thread pool or using a 
> priority queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12877) SASI index throwing AssertionError on index creation

2016-11-04 Thread Voytek Jarnot (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Voytek Jarnot updated CASSANDRA-12877:
--
Attachment: idx-stacktrace-04-nov-2016.txt

Attached full log output.  Fresh build of cassandra-3.X; fresh install, fresh 
keyspace (SimpleStrategy, RF 1).

1) built/installed 3.10-SNAPSHOT from git branch cassandra-3.X
2) created keyspace (SimpleStrategy, RF 1)
3) created table: (simplified version below, many more valX columns present)
CREATE TABLE mytable (
id1 text,
id2 text,
id3 date,
id4 timestamp,
id5 text,
val1 text,
val2 text,
val3 text,
task_id text,
document_nbr text,
val5 text,
PRIMARY KEY ((id1, id2), id3, id4, id5)
) WITH CLUSTERING ORDER BY (id3 DESC, id4 DESC, id5 ASC)
4) created materialized view:
CREATE MATERIALIZED VIEW mytable_by_task_id AS
SELECT *
FROM mytable
WHERE id1 IS NOT NULL AND id2 IS NOT NULL AND id3 IS NOT NULL AND id4 IS 
NOT NULL AND id5 IS NOT NULL AND task_id IS NOT NULL
PRIMARY KEY (task_id, id3, id4, id1, id2, id5)
WITH CLUSTERING ORDER BY (id3 DESC, id4 DESC, id1 ASC, id2 ASC, id5 ASC)
5) inserted 27 million "rows" (i.e., unique values for id5)
6) create index attempt
create custom index idx_ar_document_nbr on test_table(document_nbr) using 
'org.apache.cassandra.index.sasi.SASIIndex';

7) no error in cqlsh, logged errors attached.

Beginning to suspect CASSANDRA-11990 ... but don't have enough 
internals-knowledge to do much more than guess.

> SASI index throwing AssertionError on index creation
> 
>
> Key: CASSANDRA-12877
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12877
> Project: Cassandra
>  Issue Type: Bug
>  Components: sasi
> Environment: 3.9 and 3.10 tested on both linux and osx
>Reporter: Voytek Jarnot
> Attachments: idx-stacktrace-03-nov-2016.txt, 
> idx-stacktrace-04-nov-2016.txt
>
>
> Possibly a 3.10 regression?
> I built and installed a 3.10 snapshot (built 03-Nov-2016) to get around 
> CASSANDRA-11670, CASSANDRA-12689, and CASSANDRA-12223 which are holding me 
> back when using 3.9. Edit to add: 3 node cluster, replication factor of 2.
> Would like to state up front that I can't duplicate this with a lightweight 
> throwaway test, which is frustrating, but it keeps hitting me on our dev 
> cluster.  It may require a certain amount of data present (or perhaps a high 
> number of nulls in the indexed column) - never had any luck duplicating with 
> the table shown below.
> Table roughly resembles the following, with many more 'valx' columns:
> CREATE TABLE idx_test_table (
> id1 text,
> id2 text,
> id3 text,
> id4 text,
> val1 text,
> val2 text,
> PRIMARY KEY ((id1, id2), id3, id4)
> ) WITH CLUSTERING ORDER BY (id3 DESC, id4 ASC);
> CREATE CUSTOM INDEX idx_test_index ON idx_test_table (val2) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex';
> The error below occurs in 3.10, but not in 3.9; it occurs whether I insert a 
> bunch of dev data and then create the index, or whether I create the index 
> and then insert a bunch of test data.
> {quote}
> INFO  [MemtableFlushWriter:5] 2016-11-03 21:00:19,416 
> PerSSTableIndexWriter.java:284 - Scheduling index flush to 
> /u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db
> INFO  [SASI-Memtable:1] 2016-11-03 21:00:19,450 
> PerSSTableIndexWriter.java:335 - Index flush to 
> /u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db
>  took 33 ms.
> ERROR [SASI-Memtable:1] 2016-11-03 21:00:19,454 CassandraDaemon.java:229 - 
> Exception in thread Thread[SASI-Memtable:1,5,main]
> java.lang.AssertionError: cannot have more than 8 overflow collisions per 
> leaf, but had: 25
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createOverflowEntry(AbstractTokenTreeBuilder.java:357)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createEntry(AbstractTokenTreeBuilder.java:346)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.DynamicTokenTreeBuilder$DynamicLeaf.serializeData(DynamicTokenTreeBuilder.java:180)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.serialize(AbstractTokenTreeBuilder.java:306)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder.write(AbstractTokenTreeBuilder.java:90)
>  ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at 
> 

[jira] [Updated] (CASSANDRA-12796) Heap exhaustion when rebuilding secondary index over a table with wide partitions

2016-11-04 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-12796:

Reproduced In: 3.0.8, 2.2.7, 2.1.9  (was: 2.1.9, 2.2.7, 3.0.8)
 Reviewer: Sam Tunnicliffe

> Heap exhaustion when rebuilding secondary index over a table with wide 
> partitions
> -
>
> Key: CASSANDRA-12796
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12796
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Milan Majercik
>Priority: Critical
>
> We have a table with rather wide partition and a secondary index defined over 
> it. As soon as we try to rebuild the index we observed exhaustion of Java 
> heap and eventual OOM error. After a lengthy investigation we have managed to 
> find a culprit which appears to be a wrong granule of barrier issuances in 
> method {{org.apache.cassandra.db.Keyspace.indexRow}}:
> {code}
> try (OpOrder.Group opGroup = cfs.keyspace.writeOrder.start())
> {
> Set indexes = 
> cfs.indexManager.getIndexesByNames(idxNames);
> Iterator pager = QueryPagers.pageRowLocally(cfs, 
> key.getKey(), DEFAULT_PAGE_SIZE);
> while (pager.hasNext())
> {
> ColumnFamily cf = pager.next();
> ColumnFamily cf2 = cf.cloneMeShallow();
> for (Cell cell : cf)
> {
> if (cfs.indexManager.indexes(cell.name(), indexes))
> cf2.addColumn(cell);
> }
> cfs.indexManager.indexRow(key.getKey(), cf2, opGroup);
> }
> }
> {code}
> Please note the operation group granule is a partition of the source table 
> which poses a problem for wide partition tables as flush runnable 
> ({{org.apache.cassandra.db.ColumnFamilyStore.Flush.run()}}) won't proceed 
> with flushing the secondary index memtable before completing operations started 
> prior to the most recent issue of the barrier. In our situation the flush runnable waits until 
> whole wide partition gets indexed into the secondary index memtable before 
> flushing it. This causes an exhaustion of the heap and eventual OOM error.
> After we changed granule of barrier issue in method 
> {{org.apache.cassandra.db.Keyspace.indexRow}} to query page as opposed to 
> table partition secondary index (see 
> [https://github.com/mmajercik/cassandra/commit/7e10e5aa97f1de483c2a5faf867315ecbf65f3d6?diff=unified]),
>  rebuild started to work without heap exhaustion. 
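
For readers following along, a rough sketch of the reshaping the description implies, reusing the identifiers from the snippet above (illustrative, not the actual linked commit): the write-order group is opened per query page rather than once per partition, so the flush barrier only ever waits on the current page.

{code}
// Fragment of the loop above, with the OpOrder group taken per page.
while (pager.hasNext())
{
    try (OpOrder.Group opGroup = cfs.keyspace.writeOrder.start())
    {
        ColumnFamily cf = pager.next();
        ColumnFamily cf2 = cf.cloneMeShallow();
        for (Cell cell : cf)
        {
            if (cfs.indexManager.indexes(cell.name(), indexes))
                cf2.addColumn(cell);
        }
        cfs.indexManager.indexRow(key.getKey(), cf2, opGroup);
    }
}
{code}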



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-12627) Provide new seed providers

2016-11-04 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637420#comment-15637420
 ] 

Edward Capriolo edited comment on CASSANDRA-12627 at 11/4/16 8:19 PM:
--

PropertyOrEnvironmentSeedProvider 

{quote}
muck about with the yaml anyways
{quote}

Operators should not actually need to muck around with the yaml file anyway. In 
the old cassandra days we ONLY had the configuration file to find seeds. Now 
the known hosts are stored in system tables anyway. Thus the seeds defined in 
the yaml file are fairly useless after initialization just like the tokens. We 
should break free of maintaining this. 

NeighborSeedProvider you are correct to say that this implementation is not 
useful in a cloud environment with random IP allocation. However in non-cloud 
environments machines are typically given IP addresses in a subnet, and for 
large networks an administrator will pre-allocate a subnet.

For example, an administrator will say we are going to use the network 
10.1.1.0/255.255.255.0 as a class C network for Cassandra servers. Given our 
own IP address we can know where the others will be. 
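
As a rough illustration of that idea only (assuming IPv4 and an illustrative {{scanDistance}} parameter; this is not the actual {{NeighborSeedProvider}} code), a node could enumerate the addresses around its own last octet and treat them as candidate seeds:

{code}
// Illustrative neighbor scan within the node's own /24.
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class NeighborScanSketch
{
    public static List<InetAddress> neighbors(InetAddress self, int scanDistance) throws UnknownHostException
    {
        byte[] addr = self.getAddress();        // assumes an IPv4 address (4 bytes)
        int last = addr[3] & 0xFF;
        List<InetAddress> candidates = new ArrayList<>();
        for (int i = Math.max(1, last - scanDistance); i <= Math.min(254, last + scanDistance); i++)
        {
            byte[] candidate = addr.clone();
            candidate[3] = (byte) i;
            candidates.add(InetAddress.getByAddress(candidate)); // includes the node's own address
        }
        return candidates;
    }
}
{code}

This also makes the earlier concern concrete: a distance that is too wide happily yields addresses where no Cassandra node lives.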

In general you can argue the merits of each provider. The next logical question 
is why is the configuration so obtuse and plugable if only one implementation 
exists? That is to say: who uses this, how, and why? If they are mocking/adjusting 
the configuration in other places, why does this piece need pluggability?


was (Author: appodictic):
PropertyOrEnvironmentSeedProvider 

{quote}
muck about with the yaml anyways
{quote}

Operators should not actually need to muck around with the yaml file anyway. In 
the old cassandra days we ONLY had the configuration file to find seeds. Now 
the known hosts are stored in system tables anyway. Thus the seeds defined in 
the yaml file are fairly useless after initialization just like the tokens. We 
should break free of maintaining this. 

NeighborSeedProvider you are correct to say that this implementation is not 
useful in a cloud environment with random IP allocation. However in non-cloud 
environments machines are typically given IP addresses in a subnet and for 
large networks and administrator will pre-allocate a subnet.

For example, an administrator will say we are going to use the network 
10.1.1.0/255.255.255.0 as a class C network for Cassandra servers. Given our 
own IP address we can know where the others will be. 

In general you can argue the merits of each provider. The next logical question 
is why is the configuration so obtuse and plugable if only one implementation 
exists?

> Provide new seed providers
> --
>
> Key: CASSANDRA-12627
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12627
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>
> SeedProvider is plugable, however only one implementation exists.
> Changes:
> * Create a SeedProvider that reads properties from System properties or env
> * Provide a SeedProvider that scans ranges of IP addresses to find peers.
> * Refactor interface to abstract class because all seed providers must 
> provide a constructor that accepts Map 
> * correct error messages
> * Do not catch Exception use MultiCatch and catch typed exceptions



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12653) In-flight shadow round requests

2016-11-04 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637549#comment-15637549
 ] 

Joel Knighton commented on CASSANDRA-12653:
---

[~spo...@gmail.com] - yes! I sincerely apologize for the delay here. If anyone 
else is interested in reviewing this, they're welcome to pick it up, but it's 
near the top of my list and I hope to get to this soon.

> In-flight shadow round requests
> ---
>
> Key: CASSANDRA-12653
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12653
> Project: Cassandra
>  Issue Type: Bug
>  Components: Distributed Metadata
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Minor
> Attachments: 12653-2.2.patch, 12653-3.0.patch, 12653-trunk.patch
>
>
> Bootstrapping or replacing a node in the cluster requires gathering and checking 
> some host IDs or tokens by doing a gossip "shadow round" once before joining 
> the cluster. This is done by sending a gossip SYN to all seeds until we 
> receive a response with the cluster state, from where we can move on in the 
> bootstrap process. Receiving a response marks the shadow round as done and 
> calls {{Gossiper.resetEndpointStateMap}} to clean up the received state 
> again.
> The issue here is that at this point there might be other in-flight requests 
> and it's very likely that shadow round responses from other seeds will be 
> received afterwards, while the current state of the bootstrap process doesn't 
> expect this to happen (e.g. gossiper may or may not be enabled). 
> One side effect will be that MigrationTasks are spawned for each shadow round 
> reply except the first. Tasks might or might not execute based on whether at 
> execution time {{Gossiper.resetEndpointStateMap}} had been called, which 
> effects the outcome of {{FailureDetector.instance.isAlive(endpoint))}} at 
> start of the task. You'll see error log messages such as follows when this 
> happend:
> {noformat}
> INFO  [SharedPool-Worker-1] 2016-09-08 08:36:39,255 Gossiper.java:993 - 
> InetAddress /xx.xx.xx.xx is now UP
> ERROR [MigrationStage:1]2016-09-08 08:36:39,255 FailureDetector.java:223 
> - unknown endpoint /xx.xx.xx.xx
> {noformat}
> Although it isn't pretty, I currently don't see any serious harm from this, 
> but it would be good to get a second opinion (feel free to close as "wont 
> fix").
> /cc [~Stefania] [~thobbs]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12796) Heap exhaustion when rebuilding secondary index over a table with wide partitions

2016-11-04 Thread anmols (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anmols updated CASSANDRA-12796:
---
Reproduced In: 3.0.8, 2.2.7, 2.1.9  (was: 2.2.7)
   Status: Awaiting Feedback  (was: Open)

> Heap exhaustion when rebuilding secondary index over a table with wide 
> partitions
> -
>
> Key: CASSANDRA-12796
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12796
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Milan Majercik
>Priority: Critical
>
> We have a table with rather wide partition and a secondary index defined over 
> it. As soon as we try to rebuild the index we observed exhaustion of Java 
> heap and eventual OOM error. After a lengthy investigation we have managed to 
> find a culprit which appears to be a wrong granule of barrier issuances in 
> method {{org.apache.cassandra.db.Keyspace.indexRow}}:
> {code}
> try (OpOrder.Group opGroup = cfs.keyspace.writeOrder.start())
> {
> Set indexes = 
> cfs.indexManager.getIndexesByNames(idxNames);
> Iterator pager = QueryPagers.pageRowLocally(cfs, 
> key.getKey(), DEFAULT_PAGE_SIZE);
> while (pager.hasNext())
> {
> ColumnFamily cf = pager.next();
> ColumnFamily cf2 = cf.cloneMeShallow();
> for (Cell cell : cf)
> {
> if (cfs.indexManager.indexes(cell.name(), indexes))
> cf2.addColumn(cell);
> }
> cfs.indexManager.indexRow(key.getKey(), cf2, opGroup);
> }
> }
> {code}
> Please note the operation group granule is a partition of the source table 
> which poses a problem for wide partition tables as flush runnable 
> ({{org.apache.cassandra.db.ColumnFamilyStore.Flush.run()}}) won't proceed 
> with flushing the secondary index memtable before completing operations started 
> prior to the most recent issue of the barrier. In our situation the flush runnable waits until 
> whole wide partition gets indexed into the secondary index memtable before 
> flushing it. This causes an exhaustion of the heap and eventual OOM error.
> After we changed granule of barrier issue in method 
> {{org.apache.cassandra.db.Keyspace.indexRow}} to query page as opposed to 
> table partition secondary index (see 
> [https://github.com/mmajercik/cassandra/commit/7e10e5aa97f1de483c2a5faf867315ecbf65f3d6?diff=unified]),
>  rebuild started to work without heap exhaustion. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12796) Heap exhaustion when rebuilding secondary index over a table with wide partitions

2016-11-04 Thread anmols (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637535#comment-15637535
 ] 

anmols commented on CASSANDRA-12796:


I am able to reproduce this issue in Apache Cassandra 3.0.8 with a wide 
partition and a secondary index defined over it.

The code has changed significantly between the version reported here and 3.0.8; 
however, the characteristics of the failure are fairly similar, i.e. when a 
secondary index is rebuilt there is a build-up of a large number of pending 
memtable flush runnables, and the node gets overwhelmed and crashes due to an 
OOM.

Adjusting the granule on which the write barrier applies (taking a pass with 
the suggested patch's logic on the 3.0.8 code) does seem to alleviate the 
problem and I do not see the memtable flush runnables queue up; however, I am 
not sure whether there are other unintended consequences of tweaking this write 
barrier granule that need to be considered.

I would like to know if this patch can be brought into Cassandra 3.0.x or are 
there other solutions to deal with large partitions with secondary indexes?

> Heap exhaustion when rebuilding secondary index over a table with wide 
> partitions
> -
>
> Key: CASSANDRA-12796
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12796
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Milan Majercik
>Priority: Critical
>
> We have a table with rather wide partition and a secondary index defined over 
> it. As soon as we try to rebuild the index we observed exhaustion of Java 
> heap and eventual OOM error. After a lengthy investigation we have managed to 
> find a culprit which appears to be a wrong granule of barrier issuances in 
> method {{org.apache.cassandra.db.Keyspace.indexRow}}:
> {code}
> try (OpOrder.Group opGroup = cfs.keyspace.writeOrder.start())
> {
> Set indexes = 
> cfs.indexManager.getIndexesByNames(idxNames);
> Iterator pager = QueryPagers.pageRowLocally(cfs, 
> key.getKey(), DEFAULT_PAGE_SIZE);
> while (pager.hasNext())
> {
> ColumnFamily cf = pager.next();
> ColumnFamily cf2 = cf.cloneMeShallow();
> for (Cell cell : cf)
> {
> if (cfs.indexManager.indexes(cell.name(), indexes))
> cf2.addColumn(cell);
> }
> cfs.indexManager.indexRow(key.getKey(), cf2, opGroup);
> }
> }
> {code}
> Please note the operation group granule is a partition of the source table 
> which poses a problem for wide partition tables as flush runnable 
> ({{org.apache.cassandra.db.ColumnFamilyStore.Flush.run()}}) won't proceed 
> with flushing the secondary index memtable before completing operations started 
> prior to the most recent issue of the barrier. In our situation the flush runnable waits until 
> whole wide partition gets indexed into the secondary index memtable before 
> flushing it. This causes an exhaustion of the heap and eventual OOM error.
> After we changed granule of barrier issue in method 
> {{org.apache.cassandra.db.Keyspace.indexRow}} to query page as opposed to 
> table partition secondary index (see 
> [https://github.com/mmajercik/cassandra/commit/7e10e5aa97f1de483c2a5faf867315ecbf65f3d6?diff=unified]),
>  rebuild started to work without heap exhaustion. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12281) Gossip blocks on startup when another node is bootstrapping

2016-11-04 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637528#comment-15637528
 ] 

Joel Knighton commented on CASSANDRA-12281:
---

Thanks for the patch and your patience as I get to this for review! I've been 
quite busy lately.

The approach overall seems sound. While calculating pending ranges can be a 
little slow, I don't think we risk falling too far behind, because the huge 
delays here appear to be a result of cascading delays to other tasks. The 
PendingRangeCalculatorService's restriction on one queued task that will 
reflect cluster state at time of execution helps with this.

A few small questions/nits:
- Is there a reason that the test is excluded from the 2.2 branch? Byteman is 
available for tests on the 2.2 branch since [CASSANDRA-12377], and I don't see 
anything else that stops the test from being useful there.
- Generally, the tests are organized as a top-level class for some entity or 
fundamental operation in the codebase and then specific test methods for unit 
tests/regression tests. I think it would make sense to establish a 
{{PendingRangeCalculatorServiceTest}} and introduce the specific test for 
[CASSANDRA-12281] inside that class.
- In the {{PendingRangeCalculatorService}}, I'm not sure we need to move the 
"Finished calculation for ..." log message to trace. Most Gossip/TokenMetadata 
state changes are logged at debug, especially when they reflect some detail 
about the aggregate state of an operation.
- A few minor spelling fixes in the test "aquire" -> "acquire", "fist" -> 
"first". (Note that I normally wouldn't bother with these, but since the test 
could likely use a few other changes, I think it is worthwhile to fix these.)
- In the test's setUp, the call to {{Keyspace.setInitialized}} is redundant. 
The call to {{SchemaLoader.prepareServer}} will already perform this.
- CI looks good overall. The 3.0-dtest run has a few materialized view dtest 
failures that are likely unrelated, but it would be good if you could retrigger 
CI for at least this branch.
- There's no CI/branch posted for the 3.X series. While this has barely 
diverged from trunk at this point, it'd be nice if you could run CI for this 
branch.

Thanks again.

> Gossip blocks on startup when another node is bootstrapping
> ---
>
> Key: CASSANDRA-12281
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12281
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Eric Evans
>Assignee: Stefan Podkowinski
> Attachments: 12281-2.2.patch, 12281-3.0.patch, 12281-trunk.patch, 
> restbase1015-a_jstack.txt
>
>
> In our cluster, normal node startup times (after a drain on shutdown) are 
> less than 1 minute.  However, when another node in the cluster is 
> bootstrapping, the same node startup takes nearly 30 minutes to complete, the 
> apparent result of gossip blocking on pending range calculations.
> {noformat}
> $ nodetool-a tpstats
> Pool NameActive   Pending  Completed   Blocked  All 
> time blocked
> MutationStage 0 0   1840 0
>  0
> ReadStage 0 0   2350 0
>  0
> RequestResponseStage  0 0 53 0
>  0
> ReadRepairStage   0 0  1 0
>  0
> CounterMutationStage  0 0  0 0
>  0
> HintedHandoff 0 0 44 0
>  0
> MiscStage 0 0  0 0
>  0
> CompactionExecutor3 3395 0
>  0
> MemtableReclaimMemory 0 0 30 0
>  0
> PendingRangeCalculator1 2 29 0
>  0
> GossipStage   1  5602164 0
>  0
> MigrationStage0 0  0 0
>  0
> MemtablePostFlush 0 0111 0
>  0
> ValidationExecutor0 0  0 0
>  0
> Sampler   0 0  0 0
>  0
> MemtableFlushWriter   0 0 30 0
>  0
> InternalResponseStage 0 0  0 0
>  0
> AntiEntropyStage  0 0  0 0
>  0
> CacheCleanupExecutor  0 0 

[jira] [Comment Edited] (CASSANDRA-12627) Provide new seed providers

2016-11-04 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637420#comment-15637420
 ] 

Edward Capriolo edited comment on CASSANDRA-12627 at 11/4/16 8:02 PM:
--

PropertyOrEnvironmentSeedProvider 

{quote}
muck about with the yaml anyways
{quote}

Operators should not actually need to muck around with the yaml file anyway. In 
the old cassandra days we ONLY had the configuration file to find seeds. Now 
the known hosts are stored in system tables anyway. Thus the seeds defined in 
the yaml file are fairly useless after initialization just like the tokens. We 
should break free of maintaining this. 

NeighborSeedProvider you are correct to say that this implementation is not 
useful in a cloud environment with random IP allocation. However in non-cloud 
environments machines are typically given IP addresses in a subnet and for 
large networks and administrator will pre-allocate a subnet.

For example, an administrator will say we are going to use the network 
10.1.1.0/255.255.255.0 as a class C network for Cassandra servers. Given our 
own IP address we can know where the others will be. 

In general you can argue the merits of each provider. The next logical question 
is why is the configuration so obtuse and plugable if only one implementation 
exists?


was (Author: appodictic):
PropertyOrEnvironmentSeedProvider 

{quote}
muck about with the yaml anyways
{quote}

Operators should not actually need to muck around with the yaml file anyway. In 
the old cassandra days we ONLY had the configuration file to find seeds. Now 
the known hosts are stored in system tables anyway. Thus the seeds defined in 
the yaml file are fairly useless after initialization just like the tokens. We 
should break free of maintaining this. 

NeighborSeedProvider you are correct to say that this implementation is not 
useful in a cloud environment with random IP allocation. However in non-cloud 
environments machines are typically given IP addresses in a subnet and they for 
large networks and administrator will pre-allocate a subnet.

For example, an administrator will say we are going to use the network 
10.1.1.0/255.255.255.0 as a class C network for Cassandra servers. Given our 
own IP address we can know where the others will be. 

In general you can argue the merits of each provider. The next logical question 
is why is the configuration so obtuse and plugable if only one implementation 
exists?

> Provide new seed providers
> --
>
> Key: CASSANDRA-12627
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12627
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>
> SeedProvider is plugable, however only one implementation exists.
> Changes:
> * Create a SeedProvider that reads properties from System properties or env
> * Provide a SeedProvider that scans ranges of IP addresses to find peers.
> * Refactor interface to abstract class because all seed providers must 
> provide a constructor that accepts Map 
> * correct error messages
> * Do not catch Exception use MultiCatch and catch typed exceptions



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12627) Provide new seed providers

2016-11-04 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637420#comment-15637420
 ] 

Edward Capriolo commented on CASSANDRA-12627:
-

PropertyOrEnvironmentSeedProvider 

{quote}
muck about with the yaml anyways
{quote}

Operators should not actually need to muck around with the yaml file anyway. In 
the old cassandra days we ONLY had the configuration file to find seeds. Now 
the known hosts are stored in system tables anyway. Thus the seeds defined in 
the yaml file are fairly useless after initialization just like the tokens. We 
should break free of maintaining this. 

NeighborSeedProvider you are correct to say that this implementation is not 
useful in a cloud environment with random IP allocation. However in non-cloud 
environments machines are typically given IP addresses in a subnet and they for 
large networks and administrator will pre-allocate a subnet.

For example, an administrator will say we are going to use the network 
10.1.1.0/255.255.255.0 as a class C network for Cassandra servers. Given our 
own IP address we can know where the others will be. 

In general you can argue the merits of each provider. The next logical question 
is why is the configuration so obtuse and plugable if only one implementation 
exists?

> Provide new seed providers
> --
>
> Key: CASSANDRA-12627
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12627
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>
> SeedProvider is plugable, however only one implementation exists.
> Changes:
> * Create a SeedProvider that reads properties from System properties or env
> * Provide a SeedProvider that scans ranges of IP addresses to find peers.
> * Refactor interface to abstract class because all seed providers must 
> provide a constructor that accepts Map 
> * correct error messages
> * Do not catch Exception use MultiCatch and catch typed exceptions



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12627) Provide new seed providers

2016-11-04 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637148#comment-15637148
 ] 

Jason Brown commented on CASSANDRA-12627:
-

Thanks for the patch, [~appodictic]. Here are my initial thoughts:

- {{SeedProvider}}, I agree that anyone wanting to implement their own provider 
would need to know (and probably trip up on) that we require a constructor that 
takes a {{Map}} as an argument. That change makes sense.

- {{PropertyOrEnvironmentSeedProvider}}, I'm trying to understand the upside 
value of this. Sure, it's hypothetically "simpler" to pass in a {{-D}} prop or 
set an env variable, but won't operators already need to muck about with the 
yaml anyways, for other values? By introducing this new seed provider, do we 
just spread out the configuration burden to more places?

- {{NeighborSeedProvider}}, can you explain what the value is of this class? It 
appears to naively add a range of {{InetAddress}} centered around the current 
node's address - and then use those as seeds. That seed list would only be 
visible on that node as we don't have a shared, distributed notion of what 
nodes are "seeds". I'm trying to imagine where {{NeighborSeedProvider}} would 
be useful, especially in a "cloud environment", where IP address assignment, at 
least from my previous EC2 experience, is largely random.

> Provide new seed providers
> --
>
> Key: CASSANDRA-12627
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12627
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>
> SeedProvider is plugable, however only one implementation exists.
> Changes:
> * Create a SeedProvider that reads properties from System properties or env
> * Provide a SeedProvider that scans ranges of IP addresses to find peers.
> * Refactor interface to abstract class because all seed providers must 
> provide a constructor that accepts Map 
> * correct error messages
> * Do not catch Exception use MultiCatch and catch typed exceptions



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12867) Batch with multiple conditional updates for the same partition causes AssertionError

2016-11-04 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636942#comment-15636942
 ] 

Sylvain Lebresne commented on CASSANDRA-12867:
--

Pushed an update to the tests. This merges cleanly to 3.X, but I pushed a branch for 
the sake of testing, and re-ran it all:
| [12867-3.0|https://github.com/pcmanus/cassandra/commits/12867-3.0] | 
[utests|http://cassci.datastax.com/job/pcmanus-12867-3.0-testall] | 
[dtests|http://cassci.datastax.com/job/pcmanus-12867-3.0-dtest] |
| [12867-3.X|https://github.com/pcmanus/cassandra/commits/12867-3.X] | 
[utests|http://cassci.datastax.com/job/pcmanus-12867-3.X-testall] | 
[dtests|http://cassci.datastax.com/job/pcmanus-12867-3.X-dtest] |

bq. it would make sense to add the same tests on a table with one clustering 
columns

Worth noting that I created a similar but different test for this. We could 
probably somehow save some lines by merging both tests with some conditional 
string concatenation but 1) I don't think that would be more readable/better 
and 2) I was lazy.

> Batch with multiple conditional updates for the same partition causes 
> AssertionError
> 
>
> Key: CASSANDRA-12867
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12867
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Kurt Greaves
>Assignee: Sylvain Lebresne
>Priority: Critical
> Fix For: 3.0.10, 3.10
>
> Attachments: 12867-3.0.patch
>
>
> Reproduced in 3.0.10 and 3.10. Used to work in 3.0.9 and earlier. Bug was 
> introduced in CASSANDRA-12060.
> The following causes an AssertionError:
> {code}
> CREATE KEYSPACE test WITH replication = { 'class' : 'SimpleStrategy', 
> 'replication_factor' : 1 };
> create table test.test (id int PRIMARY KEY, val text);
> BEGIN BATCH INSERT INTO test.test (id, val) VALUES (999, 'aaa') IF NOT 
> EXISTS; INSERT INTO test.test (id, val) VALUES (999, 'ccc') IF NOT EXISTS; 
> APPLY BATCH ;
> {code}
> Stack trace is as follows:
> {code}
> ERROR [Native-Transport-Requests-2] 2016-10-31 04:16:44,231 Message.java:622 
> - Unexpected exception during request; channel = [id: 0x176e1c04, 
> L:/127.0.0.1:9042 - R:/127.0.0.1:59743]
> java.lang.AssertionError: null
> at 
> org.apache.cassandra.cql3.statements.CQL3CasRequest.setConditionsForRow(CQL3CasRequest.java:138)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.CQL3CasRequest.addExistsCondition(CQL3CasRequest.java:104)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.CQL3CasRequest.addNotExist(CQL3CasRequest.java:84)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.IfNotExistsCondition.addConditionsTo(IfNotExistsCondition.java:28)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.ModificationStatement.addConditions(ModificationStatement.java:482)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.makeCasRequest(BatchStatement.java:434)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.executeWithConditions(BatchStatement.java:379)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:358)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:346)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:341)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:218)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:249) 
> ~[main/:na]
> at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:234) 
> ~[main/:na]
> at 
> org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:115)
>  ~[main/:na]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:516)
>  [main/:na]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:409)
>  [main/:na]
> at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>  [netty-all-4.0.39.Final.jar:4.0.39.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
>  [netty-all-4.0.39.Final.jar:4.0.39.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35)
>  [netty-all-4.0.39.Final.jar:4.0.39.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:357)
>  [netty-all-4.0.39.Final.jar:4.0.39.Final]
> 

[jira] [Updated] (CASSANDRA-12627) Provide new seed providers

2016-11-04 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-12627:

Reviewer: Jason Brown

> Provide new seed providers
> --
>
> Key: CASSANDRA-12627
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12627
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>
> SeedProvider is plugable, however only one implementation exists.
> Changes:
> * Create a SeedProvider that reads properties from System properties or env
> * Provide a SeedProvider that scans ranges of IP addresses to find peers.
> * Refactor interface to abstract class because all seed providers must 
> provide a constructor that accepts Map 
> * correct error messages
> * Do not catch Exception use MultiCatch and catch typed exceptions



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12792) delete with timestamp long.MAX_VALUE for the whole key creates tombstone that cannot be removed.

2016-11-04 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636940#comment-15636940
 ] 

Joel Knighton commented on CASSANDRA-12792:
---

Ah, I totally misread that. Thanks for the clarification!

> delete with timestamp long.MAX_VALUE for the whole key creates tombstone that 
> cannot be removed. 
> -
>
> Key: CASSANDRA-12792
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12792
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Ian Ilsley
>Assignee: Joel Knighton
>
> In db/compaction/LazilyCompactedRow.java 
> we only check for  <  MaxPurgeableTimeStamp  
> eg:
> (this.maxRowTombstone.markedForDeleteAt < getMaxPurgeableTimestamp())
> this should probably be <= 
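
A two-line illustration of the boundary case being described (values chosen purely for illustration): with a strict {{<}}, a tombstone whose deletion timestamp is {{Long.MAX_VALUE}} can never be considered purgeable, while {{<=}} admits it.

{code}
public class TombstonePurgeBoundary
{
    public static void main(String[] args)
    {
        long markedForDeleteAt = Long.MAX_VALUE;
        long maxPurgeableTimestamp = Long.MAX_VALUE;
        System.out.println(markedForDeleteAt <  maxPurgeableTimestamp);  // false -> tombstone never purged
        System.out.println(markedForDeleteAt <= maxPurgeableTimestamp);  // true  -> tombstone purgeable
    }
}
{code}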



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12861) example/triggers build fail.

2016-11-04 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636919#comment-15636919
 ] 

Sylvain Lebresne commented on CASSANDRA-12861:
--

Thanks for the notice and patch, but as noted on [this 
comment|https://issues.apache.org/jira/browse/CASSANDRA-12236?focusedCommentId=15395897=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15395897]
 of the ticket that made the change (CASSANDRA-12236), we really did want to 
move {{RowUpdateBuilder}} out of the non-test sources (in fact, it's in the 
test only out of laziness and could be cleaned some day). So I'd rather we 
instead replace the use of {{RowUpdateBuilder}} in the trigger example by the 
new {{SimpleBuilder}} introduced in CASSANDRA-12236. I'll provide a patch for 
that when I have a bit of time next week, but if you want to have a shot at it 
first, that would be very much appreciated.

> example/triggers build fail.
> 
>
> Key: CASSANDRA-12861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12861
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Trivial
>
> When I tried to build example/trigger on trunk branch, I found that "ant jar" 
> fails with an error like below.
> (Sorry for my language settings for ant. I couldn't find how to change it. 
> The error indicated here is a "cannot find symbol" error of 
> RowUpdateBuilder).
> {code}
> Buildfile: /Users/yasuharu/git/cassandra/examples/triggers/build.xml
> init:
> [mkdir] Created dir: 
> /Users/yasuharu/git/cassandra/examples/triggers/build/classes
> build:
> [javac] Compiling 1 source file to 
> /Users/yasuharu/git/cassandra/examples/triggers/build/classes
> [javac] warning: Supported source version 'RELEASE_6' from annotation 
> processor 'org.openjdk.jmh.generators.BenchmarkProcessor' less than -source 
> '1.8'
> [javac] 
> /Users/yasuharu/git/cassandra/examples/triggers/src/org/apache/cassandra/triggers/AuditTrigger.java:27:
>  error: cannot find symbol
> [javac] import org.apache.cassandra.db.RowUpdateBuilder;
> [javac]   ^
> [javac]   symbol:   class RowUpdateBuilder
> [javac]   location: package org.apache.cassandra.db
> [javac] 1 error
> [javac] 1 warning
> BUILD FAILED
> /Users/yasuharu/git/cassandra/examples/triggers/build.xml:45: Compile failed; 
> see the compiler error output for details.
> Total time: 1 second
> {code}
> I think the movement of RowUpdateBuilder to test has broken this build.
> https://github.com/apache/cassandra/commit/26838063de6246e3a1e18062114ca92fb81c00cf
> In order to fix this, I moved back RowUpdateBuilder.java to src in my patch.
> https://github.com/apache/cassandra/commit/d133eefe9c5fbebd8d389a9397c3948b8c36bd06
> Could you please review my patch?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12792) delete with timestamp long.MAX_VALUE for the whole key creates tombstone that cannot be removed.

2016-11-04 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636900#comment-15636900
 ] 

Branimir Lambov commented on CASSANDRA-12792:
-

You need the same logic for sstables as well: If you have sstables and they all 
don't have the row, you will go into the 'else' with {{Long.MAX_VALUE}} instead 
of always true.

> delete with timestamp long.MAX_VALUE for the whole key creates tombstone that 
> cannot be removed. 
> -
>
> Key: CASSANDRA-12792
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12792
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Ian Ilsley
>Assignee: Joel Knighton
>
> In db/compaction/LazilyCompactedRow.java 
> we only check for  <  MaxPurgeableTimeStamp  
> eg:
> (this.maxRowTombstone.markedForDeleteAt < getMaxPurgeableTimestamp())
> this should probably be <= 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-12792) delete with timestamp long.MAX_VALUE for the whole key creates tombstone that cannot be removed.

2016-11-04 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636900#comment-15636900
 ] 

Branimir Lambov edited comment on CASSANDRA-12792 at 11/4/16 4:29 PM:
--

I meant that you need the same logic for sstables as well: currently if you 
have sstables and they all don't have the row, you will go into the 'else' with 
{{Long.MAX_VALUE}} instead of always true.


was (Author: blambov):
You need the same logic for sstables as well: If you have sstables and they all 
don't have the row, you will go into the 'else' with {{Long.MAX_VALUE}} instead 
of always true.

> delete with timestamp long.MAX_VALUE for the whole key creates tombstone that 
> cannot be removed. 
> -
>
> Key: CASSANDRA-12792
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12792
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Ian Ilsley
>Assignee: Joel Knighton
>
> In db/compaction/LazilyCompactedRow.java 
> we only check for  <  MaxPurgeableTimeStamp  
> eg:
> (this.maxRowTombstone.markedForDeleteAt < getMaxPurgeableTimestamp())
> this should probably be <= 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12792) delete with timestamp long.MAX_VALUE for the whole key creates tombstone that cannot be removed.

2016-11-04 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636860#comment-15636860
 ] 

Joel Knighton commented on CASSANDRA-12792:
---

Good observation on the {{hasPurgeEvaluator}} flag. I'll remove that in my 
follow up patch. I'll also try the lambda approach on 3.0+; I think you're 
right in valuing clarity on the version going forward here.

I'm not sure I understand the {{hasMemtableCf}}/{{hasTimestamp}} distinction. 
The only place that {{hasMemtableCf}} is used is dictating whether we return 
the {{AlwaysTruePurgeEvaluator}} or a {{TimestampedPurgeEvaluator}}. In the 
case that the value of {{hasMemtableCF}} matters, we already know that 
{{filteredSSTables}} is empty. This means that the current {{minTimestamp}} 
when we start inspecting memtables is {{Long.MAX_VALUE}}. We would need to set 
{{hasTimestamp}} whenever the partition's mintimestamp is less than or equal to 
this {{minTimestamp}}, since we need a {{TimestampedPurgeEvaluator}} in the 
case where a value in the partition is timestamped with {{Long.MAX_VALUE}}. 
Since any long value will be less than or equal to {{Long.MAX_VALUE}}, it seems 
like setting a flag whenever we take the minTimestamp from a partition in a 
memtable reduces to setting a flag whenever we have a partition in a memtable. 
I'm probably misunderstanding something here.
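
The reduction described above can be seen in a small sketch (hypothetical 
input, not the actual patch):

{code}
// With no sstables, minTimestamp starts at Long.MAX_VALUE, and any long is
// <= Long.MAX_VALUE, so the proposed check passes for the first memtable
// partition encountered; the flag therefore ends up set whenever at least one
// memtable contains the partition, i.e. what hasMemtableCf already records.
long[] memtablePartitionMins = { Long.MAX_VALUE };  // hypothetical example value

long minTimestamp = Long.MAX_VALUE;
boolean hasTimestamp = false;
for (long partitionMin : memtablePartitionMins)
{
    if (partitionMin <= minTimestamp)   // always true while minTimestamp == Long.MAX_VALUE
    {
        minTimestamp = Math.min(minTimestamp, partitionMin);
        hasTimestamp = true;
    }
}
{code}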

> delete with timestamp long.MAX_VALUE for the whole key creates tombstone that 
> cannot be removed. 
> -
>
> Key: CASSANDRA-12792
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12792
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Ian Ilsley
>Assignee: Joel Knighton
>
> In db/compaction/LazilyCompactedRow.java 
> we only check for  <  MaxPurgeableTimeStamp  
> eg:
> (this.maxRowTombstone.markedForDeleteAt < getMaxPurgeableTimestamp())
> this should probably be <= 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12792) delete with timestamp long.MAX_VALUE for the whole key creates tombstone that cannot be removed.

2016-11-04 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636859#comment-15636859
 ] 

Joel Knighton commented on CASSANDRA-12792:
---

Good observation on the {{hasPurgeEvaluator}} flag. I'll remove that in my 
follow up patch. I'll also try the lambda approach on 3.0+; I think you're 
right in valuing clarity on the version going forward here.

I'm not sure I understand the {{hasMemtableCf}}/{{hasTimestamp}} distinction. 
The only place that {{hasMemtableCf}} is used is dictating whether we return 
the {{AlwaysTruePurgeEvaluator}} or a {{TimestampedPurgeEvaluator}}. In the 
case that the value of {{hasMemtableCF}} matters, we already know that 
{{filteredSSTables}} is empty. This means that the current {{minTimestamp}} 
when we start inspecting memtables is {{Long.MAX_VALUE}}. We would need to set 
{{hasTimestamp}} whenever the partition's mintimestamp is less than or equal to 
this {{minTimestamp}}, since we need a {{TimestampedPurgeEvaluator}} in the 
case where a value in the partition is timestamped with {{Long.MAX_VALUE}}. 
Since any long value will be less than or equal to {{Long.MAX_VALUE}}, it seems 
like setting a flag whenever we take the minTimestamp from a partition in a 
memtable reduces to setting a flag whenever we have a partition in a memtable. 
I'm probably misunderstanding something here.

> delete with timestamp long.MAX_VALUE for the whole key creates tombstone that 
> cannot be removed. 
> -
>
> Key: CASSANDRA-12792
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12792
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Ian Ilsley
>Assignee: Joel Knighton
>
> In db/compaction/LazilyCompactedRow.java 
> we only check for  <  MaxPurgeableTimeStamp  
> eg:
> (this.maxRowTombstone.markedForDeleteAt < getMaxPurgeableTimestamp())
> this should probably be <= 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12804) CQL docs table of contents links are broken

2016-11-04 Thread Evan Prothro (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Evan Prothro updated CASSANDRA-12804:
-
Description: 
Example: Clicking on a link in the table of contents at 
https://cassandra.apache.org/doc/cql3/CQL-3.0.html results in a 404 to 
https://cassandra.apache.org/doc/cql3/CQL.html#Preamble

This ticket proposes changing the paths of legacy CQL.html files so they work, 
removing the textile source for this legacy doc (as it is replaced by the 
in-tree sphinx docs now),  and updating the live docs to a sphinx build from 
3.9.

  was:
Example: Clicking on a link in the table of contents at 
https://cassandra.apache.org/doc/cql3/CQL-3.0.html results in a 404 to 
https://cassandra.apache.org/doc/cql3/CQL.html#Preamble

This ticket proposes changing the paths of legacy CQL.html files so they work, 
and removing the textile source for this legacy doc, as it is replaced by the 
in-tree sphinx docs now.


> CQL docs table of contents links are broken
> ---
>
> Key: CASSANDRA-12804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12804
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation and Website
>Reporter: Evan Prothro
>Priority: Minor
>  Labels: lhf
>
> Example: Clicking on a link in the table of contents at 
> https://cassandra.apache.org/doc/cql3/CQL-3.0.html results in a 404 to 
> https://cassandra.apache.org/doc/cql3/CQL.html#Preamble
> This ticket proposes changing the paths of legacy CQL.html files so they 
> work, removing the textile source for this legacy doc (as it is replaced by 
> the in-tree sphinx docs now),  and updating the live docs to a sphinx build 
> from 3.9.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (CASSANDRA-12792) delete with timestamp long.MAX_VALUE for the whole key creates tombstone that cannot be removed.

2016-11-04 Thread Joel Knighton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Knighton updated CASSANDRA-12792:
--
Comment: was deleted

(was: Good observation on the {{hasPurgeEvaluator}} flag. I'll remove that in 
my follow up patch. I'll also try the lambda approach on 3.0+; I think you're 
right in valuing clarity on the version going forward here.

I'm not sure I understand the {{hasMemtableCf}}/{{hasTimestamp}} distinction. 
The only place that {{hasMemtableCf}} is used is dictating whether we return 
the {{AlwaysTruePurgeEvaluator}} or a {{TimestampedPurgeEvaluator}}. In the 
case that the value of {{hasMemtableCF}} matters, we already know that 
{{filteredSSTables}} is empty. This means that the current {{minTimestamp}} 
when we start inspecting memtables is {{Long.MAX_VALUE}}. We would need to set 
{{hasTimestamp}} whenever the partition's mintimestamp is less than or equal to 
this {{minTimestamp}}, since we need a {{TimestampedPurgeEvaluator}} in the 
case where a value in the partition is timestamped with {{Long.MAX_VALUE}}. 
Since any long value will be less than or equal to {{Long.MAX_VALUE}}, it seems 
like setting a flag whenever we take the minTimestamp from a partition in a 
memtable reduces to setting a flag whenever we have a partition in a memtable. 
I'm probably misunderstanding something here.)

> delete with timestamp long.MAX_VALUE for the whole key creates tombstone that 
> cannot be removed. 
> -
>
> Key: CASSANDRA-12792
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12792
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Ian Ilsley
>Assignee: Joel Knighton
>
> In db/compaction/LazilyCompactedRow.java 
> we only check for  <  MaxPurgeableTimeStamp  
> eg:
> (this.maxRowTombstone.markedForDeleteAt < getMaxPurgeableTimestamp())
> this should probably be <= 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-12792) delete with timestamp long.MAX_VALUE for the whole key creates tombstone that cannot be removed.

2016-11-04 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636859#comment-15636859
 ] 

Joel Knighton edited comment on CASSANDRA-12792 at 11/4/16 4:15 PM:


Good observation on the {{hasPurgeEvaluator}} flag. I'll remove that in my 
follow up patch. I'll also try the lambda approach on 3.0+; I think you're 
right in valuing clarity on the version going forward here. When doing that, 
I'll look for chances to reduce verbosity. It does feel bigger than it needs to 
be.

I'm not sure I understand the {{hasMemtableCf}}/{{hasTimestamp}} distinction. 
The only place that {{hasMemtableCf}} is used is dictating whether we return 
the {{AlwaysTruePurgeEvaluator}} or a {{TimestampedPurgeEvaluator}}. In the 
case that the value of {{hasMemtableCF}} matters, we already know that 
{{filteredSSTables}} is empty. This means that the current {{minTimestamp}} 
when we start inspecting memtables is {{Long.MAX_VALUE}}. We would need to set 
{{hasTimestamp}} whenever the partition's mintimestamp is less than or equal to 
this {{minTimestamp}}, since we need a {{TimestampedPurgeEvaluator}} in the 
case where a value in the partition is timestamped with {{Long.MAX_VALUE}}. 
Since any long value will be less than or equal to {{Long.MAX_VALUE}}, it seems 
like setting a flag whenever we take the minTimestamp from a partition in a 
memtable reduces to setting a flag whenever we have a partition in a memtable. 
I'm probably misunderstanding something here.


was (Author: jkni):
Good observation on the {{hasPurgeEvaluator}} flag. I'll remove that in my 
follow up patch. I'll also try the lambda approach on 3.0+; I think you're 
right in valuing clarity on the version going forward here.

I'm not sure I understand the {{hasMemtableCf}}/{{hasTimestamp}} distinction. 
The only place that {{hasMemtableCf}} is used is dictating whether we return 
the {{AlwaysTruePurgeEvaluator}} or a {{TimestampedPurgeEvaluator}}. In the 
case that the value of {{hasMemtableCF}} matters, we already know that 
{{filteredSSTables}} is empty. This means that the current {{minTimestamp}} 
when we start inspecting memtables is {{Long.MAX_VALUE}}. We would need to set 
{{hasTimestamp}} whenever the partition's mintimestamp is less than or equal to 
this {{minTimestamp}}, since we need a {{TimestampedPurgeEvaluator}} in the 
case where a value in the partition is timestamped with {{Long.MAX_VALUE}}. 
Since any long value will be less than or equal to {{Long.MAX_VALUE}}, it seems 
like setting a flag whenever we take the minTimestamp from a partition in a 
memtable reduces to setting a flag whenever we have a partition in a memtable. 
I'm probably misunderstanding something here.

> delete with timestamp long.MAX_VALUE for the whole key creates tombstone that 
> cannot be removed. 
> -
>
> Key: CASSANDRA-12792
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12792
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Ian Ilsley
>Assignee: Joel Knighton
>
> In db/compaction/LazilyCompactedRow.java 
> we only check for  <  MaxPurgeableTimeStamp  
> eg:
> (this.maxRowTombstone.markedForDeleteAt < getMaxPurgeableTimestamp())
> this should probably be <= 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12792) delete with timestamp long.MAX_VALUE for the whole key creates tombstone that cannot be removed.

2016-11-04 Thread Joel Knighton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Knighton updated CASSANDRA-12792:
--
Status: Open  (was: Patch Available)

> delete with timestamp long.MAX_VALUE for the whole key creates tombstone that 
> cannot be removed. 
> -
>
> Key: CASSANDRA-12792
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12792
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Ian Ilsley
>Assignee: Joel Knighton
>
> In db/compaction/LazilyCompactedRow.java 
> we only check for  <  MaxPurgeableTimeStamp  
> eg:
> (this.maxRowTombstone.markedForDeleteAt < getMaxPurgeableTimestamp())
> this should probably be <= 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12881) Move SASI docs into sphinx docs

2016-11-04 Thread Evan Prothro (JIRA)
Evan Prothro created CASSANDRA-12881:


 Summary: Move SASI docs into sphinx docs
 Key: CASSANDRA-12881
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12881
 Project: Cassandra
  Issue Type: Task
  Components: Documentation and Website
Reporter: Evan Prothro
Priority: Trivial


Previous TODO in code regarding SASI docs:

TODO: we should probably move the first half of that documentation to the 
general documentation, and the implementation explanation parts into the wiki.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12880) setInt/setShort/setFloat throwing exception from CassandraPreparedStatement

2016-11-04 Thread Raghavendra Pinninti (JIRA)
Raghavendra Pinninti created CASSANDRA-12880:


 Summary: setInt/setShort/setFloat throwing exception from 
CassandraPreparedStatement
 Key: CASSANDRA-12880
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12880
 Project: Cassandra
  Issue Type: Test
  Components: Core
 Environment: Cassandra JDBC
Reporter: Raghavendra Pinninti
Priority: Trivial
 Fix For: 2.1.17


Getting an exception from the ps.setInt/ps.setFloat methods:
Exception in thread "main" java.lang.NoSuchMethodError: 
org.apache.cassandra.cql.jdbc.JdbcInteger.decompose(Ljava/math/BigInteger;)Ljava/nio/ByteBuffer;
at 
org.apache.cassandra.cql.jdbc.CassandraPreparedStatement.setShort(CassandraPreparedStatement.java:381)
at com.jdbc.cassandra.CassandraJdbc.Insert(CassandraJdbc.java:63)
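
For reference, a minimal sketch of the kind of client code that hits this (the 
driver class, URL, keyspace and table names are assumptions; adjust to your 
setup). A {{NoSuchMethodError}} at runtime usually points to mismatched 
cassandra-jdbc/cassandra jars on the classpath rather than to the calling code:

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class CassandraJdbcInsert
{
    public static void main(String[] args) throws Exception
    {
        // assumed cassandra-jdbc driver class and URL form
        Class.forName("org.apache.cassandra.cql.jdbc.CassandraDriver");
        try (Connection conn =
                 DriverManager.getConnection("jdbc:cassandra://127.0.0.1:9160/ks");
             PreparedStatement ps =
                 conn.prepareStatement("INSERT INTO t (id, num) VALUES (?, ?)"))
        {
            ps.setInt(1, 1);            // these setters trigger the NoSuchMethodError above
            ps.setShort(2, (short) 2);
            ps.executeUpdate();
        }
    }
}
{code}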



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12804) CQL docs table of contents links are broken

2016-11-04 Thread Evan Prothro (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Evan Prothro updated CASSANDRA-12804:
-
Description: 
Example: Clicking on a link in the table of contents at 
https://cassandra.apache.org/doc/cql3/CQL-3.0.html results in a 404 to 
https://cassandra.apache.org/doc/cql3/CQL.html#Preamble

This ticket proposes changing the paths of legacy CQL.html files so they work, 
and removing the textile source for this legacy doc, as it is replaced by the 
in-tree sphinx docs now.

  was:
Example: Clicking on a link in the table of contents at 
https://cassandra.apache.org/doc/cql3/CQL-3.0.html results in a 404 to 
https://cassandra.apache.org/doc/cql3/CQL.html#Preamble

Links in the body work.

Table of contents link to the base file name (cql.html), but the html file is 
cql-[version].html)


> CQL docs table of contents links are broken
> ---
>
> Key: CASSANDRA-12804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12804
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation and Website
>Reporter: Evan Prothro
>Priority: Minor
>  Labels: lhf
>
> Example: Clicking on a link in the table of contents at 
> https://cassandra.apache.org/doc/cql3/CQL-3.0.html results in a 404 to 
> https://cassandra.apache.org/doc/cql3/CQL.html#Preamble
> This ticket proposes changing the paths of legacy CQL.html files so they 
> work, and removing the textile source for this legacy doc, as it is replaced 
> by the in-tree sphinx docs now.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12879) sstablesplit incorrectly ignores SStable files

2016-11-04 Thread Cliff Gilmore (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cliff Gilmore updated CASSANDRA-12879:
--
Description: 
When trying to test sstablesplit on a valid sstable I see messages like 

"Skipping non sstable file mc-7-big-Data.db" 

This is clearly an sstable, so it looks like the parser for valid sstables is 
still assuming the old file name format.

StandaloneSplitter appears to call tryComponentFromFilename which calls 
fromFilename in src/java/org/apache/cassandra/io/sstable/Component.java 

Checking in Component.java, I see a comment that suggests the code is looking 
for the old sstable file name format:

* {@code
 * Filename of the form "<ksname>/<cfname>-[tmp-][<version>-]<gen>-<component>",
 * }



  was:
When trying to test sstablesplit on a valid sstable I see messages like 

"Skipping non sstable file mc-7-big-Data.db" 

This is clearly an sstable, so it looks like the parser for valid sstables is 
still assuming the old format.

StandaloneSplitter appears to call tryComponentFromFilename which calls 
fromFilename in src/java/org/apache/cassandra/io/sstable/Component.java 

Checking in Component.java, I see a comment that suggests the code is looking 
for the old sstable file name format:

* {@code
 * Filename of the form "<ksname>/<cfname>-[tmp-][<version>-]<gen>-<component>",
 * }




> sstablesplit incorrectly ignores SStable files
> --
>
> Key: CASSANDRA-12879
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12879
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Linux
>Reporter: Cliff Gilmore
>Priority: Minor
>
> When trying to test sstablesplit on a valid sstable I see messages like 
> "Skipping non sstable file mc-7-big-Data.db" 
> This is clearly an sstable, so it looks like the parser for valid sstables is 
> still assuming the old file name format.
> StandaloneSplitter appears to call tryComponentFromFilename which calls 
> fromFilename in src/java/org/apache/cassandra/io/sstable/Component.java 
> Checking in Component.java, I see a comment that suggests the code is looking 
> for the old sstable file name format:
> * {@code
>  * Filename of the form "<ksname>/<cfname>-[tmp-][<version>-]<gen>-<component>",
>  * }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12879) sstablesplit incorrectly ignores SStable files

2016-11-04 Thread Cliff Gilmore (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cliff Gilmore updated CASSANDRA-12879:
--
Summary: sstablesplit incorrectly ignores SStable files  (was: sstablesplit 
Incorrectly Ignores SStable Files)

> sstablesplit incorrectly ignores SStable files
> --
>
> Key: CASSANDRA-12879
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12879
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Linux
>Reporter: Cliff Gilmore
>Priority: Minor
>
> When trying to test sstablesplit on a valid sstable I see messages like 
> "Skipping non sstable file mc-7-big-Data.db" 
> This is clearly an sstable, so it looks like the parser for valid sstables is 
> still assuming the old format.
> StandaloneSplitter appears to call tryComponentFromFilename which calls 
> fromFilename in src/java/org/apache/cassandra/io/sstable/Component.java 
> Checking in Component.java, I see a comment that suggests the code is looking 
> for the old sstable file name format:
> * {@code
>  * Filename of the form "<ksname>/<cfname>-[tmp-][<version>-]<gen>-<component>",
>  * }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12879) sstablesplit Incorrectly Ignores SStable Files

2016-11-04 Thread Cliff Gilmore (JIRA)
Cliff Gilmore created CASSANDRA-12879:
-

 Summary: sstablesplit Incorrectly Ignores SStable Files
 Key: CASSANDRA-12879
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12879
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: Linux
Reporter: Cliff Gilmore
Priority: Minor


When trying to test sstablesplit on a valid sstable I see messages like 

"Skipping non sstable file mc-7-big-Data.db" 

This is clearly an sstable, so it looks like the parser for valid sstables is 
still assuming the old format.

StandaloneSplitter appears to call tryComponentFromFilename which calls 
fromFilename in src/java/org/apache/cassandra/io/sstable/Component.java 

Checking in Component.java, I see a comment that suggests the code is looking 
for the old sstable file name format:

* {@code
 * Filename of the form "<ksname>/<cfname>-[tmp-][<version>-]<gen>-<component>",
 * }
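
For comparison, a rough illustration (a simplified pattern, not the actual 
Cassandra parser): 3.0-era data files use names like 
{{<version>-<generation>-big-<Component>.db}}, e.g. {{mc-7-big-Data.db}}, with 
the keyspace/table encoded in the directory path rather than in the file name 
the quoted comment describes.

{code}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NewSSTableNameExample
{
    // Simplified pattern for the new-style name; the real parsing lives in
    // Descriptor/Component.fromFilename.
    private static final Pattern NEW_FORMAT =
        Pattern.compile("([a-z]+)-(\\d+)-big-([A-Za-z]+)\\.db");

    public static void main(String[] args)
    {
        Matcher m = NEW_FORMAT.matcher("mc-7-big-Data.db");
        if (m.matches())
            System.out.printf("version=%s generation=%s component=%s%n",
                              m.group(1), m.group(2), m.group(3));
    }
}
{code}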





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12649) Add BATCH metrics

2016-11-04 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636401#comment-15636401
 ] 

Benjamin Lerer commented on CASSANDRA-12649:


I discussed the patch with [~iamaleksey] and we both have some concerns about 
the performance impact of measuring the mutation size. Could you provide a 
benchmark to assess the impact of that measurement?

I am also not fully convinced by the usefulness of the Logged / Unlogged 
Partitions per batch distribution. Could you explain in more detail how it 
will be useful for you?
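
A skeleton for the kind of JMH comparison being requested might look like the 
following sketch; {{applyMutation}} and {{computeSize}} are hypothetical 
stand-ins for the real code paths with and without the new measurement:

{code}
import java.nio.ByteBuffer;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
public class BatchSizeMetricBench
{
    private final ByteBuffer payload = ByteBuffer.allocate(1024);

    @Benchmark
    public long baseline()
    {
        return applyMutation(payload);                        // without the new metric
    }

    @Benchmark
    public long withSizeMeasurement()
    {
        return applyMutation(payload) + computeSize(payload); // with the new metric
    }

    // stand-ins for the real code paths being compared
    private long applyMutation(ByteBuffer b) { return b.remaining(); }
    private long computeSize(ByteBuffer b)   { return b.remaining(); }
}
{code}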

> Add BATCH metrics
> -
>
> Key: CASSANDRA-12649
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12649
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Alwyn Davis
>Priority: Minor
> Fix For: 3.x
>
> Attachments: 12649-3.x.patch, trunk-12649.txt
>
>
> To identify causes of load on a cluster, it would be useful to have some 
> additional metrics:
> * *Mutation size distribution:* I believe this would be relevant when 
> tracking the performance of unlogged batches.
> * *Logged / Unlogged Partitions per batch distribution:* This would also give 
> a count of batch types processed. Multiple distinct tables in batch would 
> just be considered as separate partitions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12695) Truncate ALWAYS not applied

2016-11-04 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636349#comment-15636349
 ] 

T Jake Luciani commented on CASSANDRA-12695:


Why not just make the check ...
{code}
 || (settings.rate.threadCount != -1 && settings.command.truncate == 
SettingsCommand.TruncateWhen.ALWAYS)
{code}

> Truncate ALWAYS not applied
> ---
>
> Key: CASSANDRA-12695
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12695
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alwyn Davis
>Priority: Trivial
>  Labels: stress
> Fix For: 3.x
>
> Attachments: 12695-trunk.patch
>
>
> If truncate is set to ALWAYS and rate sets a specific thread count, the 
> stress table is not actually truncated. E.g.
> {code}
> truncate=always -rate threads=4
> {code}
> This can cause an unexpected number of rows to be left in the table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12867) Batch with multiple conditional updates for the same partition causes AssertionError

2016-11-04 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636215#comment-15636215
 ] 

Benjamin Lerer commented on CASSANDRA-12867:


I just have 2 nits:

* I think it would be safer to also check the error messages in the tests that 
mix different types of conditions.
* As {{ColumnsConditions}} are linked to {{Clustering}}, it would make sense to 
add the same tests on a table with one clustering column.
 

> Batch with multiple conditional updates for the same partition causes 
> AssertionError
> 
>
> Key: CASSANDRA-12867
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12867
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Kurt Greaves
>Assignee: Sylvain Lebresne
>Priority: Critical
> Fix For: 3.0.10, 3.10
>
> Attachments: 12867-3.0.patch
>
>
> Reproduced in 3.0.10 and 3.10. Used to work in 3.0.9 and earlier. Bug was 
> introduced in CASSANDRA-12060.
> The following causes an AssertionError:
> {code}
> CREATE KEYSPACE test WITH replication = { 'class' : 'SimpleStrategy', 
> 'replication_factor' : 1 };
> create table test.test (id int PRIMARY KEY, val text);
> BEGIN BATCH INSERT INTO test.test (id, val) VALUES (999, 'aaa') IF NOT 
> EXISTS; INSERT INTO test.test (id, val) VALUES (999, 'ccc') IF NOT EXISTS; 
> APPLY BATCH ;
> {code}
> Stack trace is as follows:
> {code}
> ERROR [Native-Transport-Requests-2] 2016-10-31 04:16:44,231 Message.java:622 
> - Unexpected exception during request; channel = [id: 0x176e1c04, 
> L:/127.0.0.1:9042 - R:/127.0.0.1:59743]
> java.lang.AssertionError: null
> at 
> org.apache.cassandra.cql3.statements.CQL3CasRequest.setConditionsForRow(CQL3CasRequest.java:138)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.CQL3CasRequest.addExistsCondition(CQL3CasRequest.java:104)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.CQL3CasRequest.addNotExist(CQL3CasRequest.java:84)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.IfNotExistsCondition.addConditionsTo(IfNotExistsCondition.java:28)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.ModificationStatement.addConditions(ModificationStatement.java:482)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.makeCasRequest(BatchStatement.java:434)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.executeWithConditions(BatchStatement.java:379)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:358)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:346)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:341)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:218)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:249) 
> ~[main/:na]
> at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:234) 
> ~[main/:na]
> at 
> org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:115)
>  ~[main/:na]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:516)
>  [main/:na]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:409)
>  [main/:na]
> at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>  [netty-all-4.0.39.Final.jar:4.0.39.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
>  [netty-all-4.0.39.Final.jar:4.0.39.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35)
>  [netty-all-4.0.39.Final.jar:4.0.39.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:357)
>  [netty-all-4.0.39.Final.jar:4.0.39.Final]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_102]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
>  [main/:na]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [main/:na]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_102]
> {code}
> The problem is that previous will receive a value after the first statement 
> in the batch is evaluated in BatchStatement.makeCasRequest. I can't see any 
> reason why we 

[jira] [Updated] (CASSANDRA-12854) CommitLogTest.testDeleteIfNotDirty failed in 3.X

2016-11-04 Thread Branimir Lambov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-12854:

Status: Ready to Commit  (was: Patch Available)

> CommitLogTest.testDeleteIfNotDirty failed in 3.X
> 
>
> Key: CASSANDRA-12854
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12854
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stefania
>Assignee: Stefania
> Fix For: 3.0.x, 3.x
>
>
> Example failure:
> http://cassci.datastax.com/view/cassandra-3.X/job/cassandra-3.X_testall/31/testReport/junit/org.apache.cassandra.db.commitlog/CommitLogTest/testDeleteIfNotDirty_3__compression/
> {code}
> expected:<1> but was:<2>
> Stacktrace
> junit.framework.AssertionFailedError: expected:<1> but was:<2>
>   at 
> org.apache.cassandra.db.commitlog.CommitLogTest.testDeleteIfNotDirty(CommitLogTest.java:305)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12854) CommitLogTest.testDeleteIfNotDirty failed in 3.X

2016-11-04 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15635987#comment-15635987
 ] 

Branimir Lambov commented on CASSANDRA-12854:
-

LGTM

> CommitLogTest.testDeleteIfNotDirty failed in 3.X
> 
>
> Key: CASSANDRA-12854
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12854
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stefania
>Assignee: Stefania
> Fix For: 3.0.x, 3.x
>
>
> Example failure:
> http://cassci.datastax.com/view/cassandra-3.X/job/cassandra-3.X_testall/31/testReport/junit/org.apache.cassandra.db.commitlog/CommitLogTest/testDeleteIfNotDirty_3__compression/
> {code}
> expected:<1> but was:<2>
> Stacktrace
> junit.framework.AssertionFailedError: expected:<1> but was:<2>
>   at 
> org.apache.cassandra.db.commitlog.CommitLogTest.testDeleteIfNotDirty(CommitLogTest.java:305)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12792) delete with timestamp long.MAX_VALUE for the whole key creates tombstone that cannot be removed.

2016-11-04 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15635971#comment-15635971
 ] 

Branimir Lambov commented on CASSANDRA-12792:
-

[The check whether to use 
timestamp|https://github.com/apache/cassandra/compare/trunk...jkni:CASSANDRA-12792-3.X#diff-e8e282423dcbf34d30a3578c8dec15cdR252]
 doesn't seem entirely correct. I believe you should turn the {{hasMemtableCf}} 
flag into a {{hasTimestamp}} to be set whenever you take the minimum.

The rest looks good. I'd prefer it a little less verbose, though:
- We can do without the {{hasPurgeEvaluator}} flag since {{purgeEvaluator != 
null}} can serve the same purpose quite well.
- For 3.0+ it will be a little easier to see what happens if you use lambdas 
instead of the evaluator classes. I know this creates a difference between the 
2.2 and 3+ versions, but IMO adding clarity to the version we are going forward 
with is worth it.
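
A rough sketch of the lambda shape this could take on 3.0+ (hypothetical names, 
not the actual patch), with {{purgeEvaluator != null}} standing in for the 
separate flag:

{code}
import java.util.function.LongPredicate;

public class PurgeEvaluatorSketch
{
    private LongPredicate purgeEvaluator;                  // null until first computed

    LongPredicate getPurgeEvaluator(boolean hasTimestamp, long minTimestamp)
    {
        if (purgeEvaluator == null)                        // replaces hasPurgeEvaluator
        {
            purgeEvaluator = hasTimestamp
                           ? (ts -> ts < minTimestamp)     // timestamped evaluator
                           : (ts -> true);                 // always-true evaluator
        }
        return purgeEvaluator;
    }
}
{code}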

> delete with timestamp long.MAX_VALUE for the whole key creates tombstone that 
> cannot be removed. 
> -
>
> Key: CASSANDRA-12792
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12792
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Ian Ilsley
>Assignee: Joel Knighton
>
> In db/compaction/LazilyCompactedRow.java 
> we only check for  <  MaxPurgeableTimeStamp  
> eg:
> (this.maxRowTombstone.markedForDeleteAt < getMaxPurgeableTimestamp())
> this should probably be <= 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12873) Cassandra can't restart after set NULL to a frozen list

2016-11-04 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15635875#comment-15635875
 ] 

Benjamin Lerer commented on CASSANDRA-12873:


Does the problem exist in {{2.2}} or {{3.0.x}} ?

> Cassandra can't restart after set NULL to a frozen list
> ---
>
> Key: CASSANDRA-12873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12873
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Mikhail Krupitskiy
>Priority: Critical
>
> Cassandra 3.5.
> 1) Create a table with frozen list as one of columns.
> 2) Add a row where the column is NULL.
> 3) Stop Cassandra.
> 4) Run Cassandra.
> Cassandra unable to start with the following exception:
> {noformat}
> ERROR o.a.c.utils.JVMStabilityInspector - Exiting due to error while 
> processing commit log during initialization.
> org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: 
> Unexpected error deserializing mutation; saved to 
> /var/folders/gl/bvj71v5d39339dlr8yf08drcgq/T/mutation5963614818028050337dat.
>   This may be caused by replaying a mutation against a table with the same 
> name but incompatible schema.  Exception follows: 
> org.apache.cassandra.serializers.MarshalException: Not enough bytes to read a 
> list
>   at 
> org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:611)
>  [apache-cassandra-3.5.jar:3.5]
>   at 
> org.apache.cassandra.db.commitlog.CommitLogReplayer.replayMutation(CommitLogReplayer.java:568)
>  [apache-cassandra-3.5.jar:3.5]
>   at 
> org.apache.cassandra.db.commitlog.CommitLogReplayer.replaySyncSection(CommitLogReplayer.java:521)
>  [apache-cassandra-3.5.jar:3.5]
>   at 
> org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:407)
>  [apache-cassandra-3.5.jar:3.5]
>   at 
> org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:236)
>  [apache-cassandra-3.5.jar:3.5]
>   at 
> org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:192) 
> [apache-cassandra-3.5.jar:3.5]
>   at 
> org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:172) 
> [apache-cassandra-3.5.jar:3.5]
>   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:283) 
> [apache-cassandra-3.5.jar:3.5]
>   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:551)
>  [apache-cassandra-3.5.jar:3.5]
>   at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:680) 
> [apache-cassandra-3.5.jar:3.5]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_71]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_71]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_71]
>   at java.lang.reflect.Method.invoke(Method.java:497) ~[na:1.8.0_71]
> {noformat}
> Below is a script for steps #1, #2:
> {code}
> CREATE keyspace if not exists kmv WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor':'1'} ;
> USE kmv;
> CREATE TABLE if not exists kmv (id int, l frozen, PRIMARY 
> KEY(id));
> INSERT into kmv (id, l) values (1, null) ;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-12878) Exception occurs caused by nested expression in where-clause

2016-11-04 Thread Benjamin Lerer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer resolved CASSANDRA-12878.

Resolution: Invalid

> Exception occurs caused by nested expression in where-clause
> 
>
> Key: CASSANDRA-12878
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12878
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
> Environment: version: cassandra 3.7
> client: cqlsh
> os: debian 7
>Reporter: Joson Choo
>  Labels: cql3
> Fix For: 3.7
>
>
> I can't execute a composite expression as below in cqlsh.
> CQL:
> SELECT * FROM bbot_extract.item_basic_info 
> WHERE  PART_ID=19 AND ( ITEM_ID < 1901000 OR ITEM_ID >= 1909000) ;
> OUTPUT:
> SyntaxException: <ErrorMessage code=2000 [Syntax error in CQL query] 
> message="line 1:85 mismatched input 'OR' expecting ')' (...AND ( ITEM_ID < 
> 1901000 [OR] ITEM_ID...)">
> About the table:
> Primary key(PART_ID, ITEM_ID, PROPERTY)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12878) Exception occurs caused by nested expression in where-clause

2016-11-04 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15635864#comment-15635864
 ] 

Benjamin Lerer commented on CASSANDRA-12878:


{{OR}} and parentheses are not supported in the {{WHERE}} clause.

> Exception occurs caused by nested expression in where-clause
> 
>
> Key: CASSANDRA-12878
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12878
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
> Environment: version: cassandra 3.7
> client: cqlsh
> os: debian 7
>Reporter: Joson Choo
>  Labels: cql3
> Fix For: 3.7
>
>
> I can't execute a composite expression as below in cqlsh.
> CQL:
> SELECT * FROM bbot_extract.item_basic_info 
> WHERE  PART_ID=19 AND ( ITEM_ID < 1901000 OR ITEM_ID >= 1909000) ;
> OUTPUT:
> SyntaxException: <ErrorMessage code=2000 [Syntax error in CQL query] 
> message="line 1:85 mismatched input 'OR' expecting ')' (...AND ( ITEM_ID < 
> 1901000 [OR] ITEM_ID...)">
> About the table:
> Primary key(PART_ID, ITEM_ID, PROPERTY)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12878) Exception occurs caused by nested expression in where-clause

2016-11-04 Thread Joson Choo (JIRA)
Joson Choo created CASSANDRA-12878:
--

 Summary: Exception occurs caused by nested expression in 
where-clause
 Key: CASSANDRA-12878
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12878
 Project: Cassandra
  Issue Type: Bug
  Components: CQL
 Environment: version: cassandra 3.7
client: cqlsh
os: debian 7
Reporter: Joson Choo
 Fix For: 3.7


I can't execute a composite expression as below in cqlsh.
CQL:
SELECT * FROM bbot_extract.item_basic_info 
WHERE  PART_ID=19 AND ( ITEM_ID < 1901000 OR ITEM_ID >= 1909000) ;

OUTPUT:
SyntaxException: <ErrorMessage code=2000 [Syntax error in CQL query] 
message="line 1:85 mismatched input 'OR' expecting ')' (...AND ( ITEM_ID < 
1901000 [OR] ITEM_ID...)">

About the table:
Primary key(PART_ID, ITEM_ID, PROPERTY)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)