[jira] [Updated] (CASSANDRA-13787) RangeTombstoneMarker and ParitionDeletion is not properly included in MV

2017-08-22 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-13787:
-
Description: 
Found two problems related to MV tombstones.

1. A RangeTombstoneMarker is ignored after shadowing the first row, so subsequent
base rows are not shadowed in TableViews.

If the range tombstone has not been flushed, it is treated as a deleted row and
shadows each new update in its range; this works correctly.
Once the range tombstone has been flushed, it is read back as a
RangeTombstoneMarker and is skipped after shadowing the first update.

2. The partition tombstone is not applied when there is no existing live data,
which resurrects deleted cells. This was found in CASSANDRA-11500 and included
in that patch.


In order not to make the 11500 patch more complicated, I will try to fix the
range/partition tombstone issue here.


{code:title=Tests to reproduce}
@Test
public void testExistingRangeTombstoneWithFlush() throws Throwable
{
    testExistingRangeTombstone(true);
}

@Test
public void testExistingRangeTombstoneWithoutFlush() throws Throwable
{
    testExistingRangeTombstone(false);
}

public void testExistingRangeTombstone(boolean flush) throws Throwable
{
    createTable("CREATE TABLE %s (k1 int, c1 int, c2 int, v1 int, v2 int, PRIMARY KEY (k1, c1, c2))");

    execute("USE " + keyspace());
    executeNet(protocolVersion, "USE " + keyspace());

    createView("view1",
               "CREATE MATERIALIZED VIEW view1 AS SELECT * FROM %%s WHERE k1 IS NOT NULL AND c1 IS NOT NULL AND c2 IS NOT NULL PRIMARY KEY (k1, c2, c1)");

    updateView("DELETE FROM %s USING TIMESTAMP 10 WHERE k1 = 1 and c1=1");

    if (flush)
        Keyspace.open(keyspace()).getColumnFamilyStore(currentTable()).forceBlockingFlush();

    String table = KEYSPACE + "." + currentTable();
    updateView("BEGIN BATCH " +
               "INSERT INTO " + table + " (k1, c1, c2, v1, v2) VALUES (1, 0, 0, 0, 0) USING TIMESTAMP 5; " +
               "INSERT INTO " + table + " (k1, c1, c2, v1, v2) VALUES (1, 0, 1, 0, 1) USING TIMESTAMP 5; " +
               "INSERT INTO " + table + " (k1, c1, c2, v1, v2) VALUES (1, 1, 0, 1, 0) USING TIMESTAMP 5; " +
               "INSERT INTO " + table + " (k1, c1, c2, v1, v2) VALUES (1, 1, 1, 1, 1) USING TIMESTAMP 5; " +
               "INSERT INTO " + table + " (k1, c1, c2, v1, v2) VALUES (1, 1, 2, 1, 2) USING TIMESTAMP 5; " +
               "INSERT INTO " + table + " (k1, c1, c2, v1, v2) VALUES (1, 1, 3, 1, 3) USING TIMESTAMP 5; " +
               "INSERT INTO " + table + " (k1, c1, c2, v1, v2) VALUES (1, 2, 0, 2, 0) USING TIMESTAMP 5; " +
               "APPLY BATCH");

    assertRowsIgnoringOrder(execute("select * from %s"),
                            row(1, 0, 0, 0, 0),
                            row(1, 0, 1, 0, 1),
                            row(1, 2, 0, 2, 0));
    assertRowsIgnoringOrder(execute("select k1,c1,c2,v1,v2 from view1"),
                            row(1, 0, 0, 0, 0),
                            row(1, 0, 1, 0, 1),
                            row(1, 2, 0, 2, 0));
}

@Test
public void testRangeDeletionWithFlush() throws Throwable
{
    testExistingParitionDeletion(true);
}

@Test
public void testRangeDeletionWithoutFlush() throws Throwable
{
    testExistingParitionDeletion(false);
}

public void testExistingParitionDeletion(boolean flush) throws Throwable
{
    // for partition range deletion, we need to know that the existing row is
    // shadowed rather than non-existent
    createTable("CREATE TABLE %s (a int, b int, c int, d int, PRIMARY KEY (a))");

    execute("USE " + keyspace());
    executeNet(protocolVersion, "USE " + keyspace());

    createView("mv_test1",
               "CREATE MATERIALIZED VIEW %s AS SELECT * FROM %%s WHERE a IS NOT NULL AND b IS NOT NULL PRIMARY KEY (a, b)");

    Keyspace ks = Keyspace.open(keyspace());
    ks.getColumnFamilyStore("mv_test1").disableAutoCompaction();

    execute("INSERT INTO %s (a, b, c, d) VALUES (?, ?, ?, ?) using timestamp 0", 1, 1, 1, 1);
    if (flush)
        FBUtilities.waitOnFutures(ks.flush());

    assertRowsIgnoringOrder(execute("SELECT * FROM mv_test1"), row(1, 1, 1, 1));

    // remove view row
    updateView("UPDATE %s using timestamp 1 set b = null WHERE a=1");
    if (flush)
        FBUtilities.waitOnFutures(ks.flush());

    assertRowsIgnoringOrder(execute("SELECT * FROM mv_test1"));
    // remove base row, no view update generated
    updateView("DELETE FROM %s using timestamp 2 where a=1");
    if (flush)
        FBUtilities.waitOnFutures(ks.flush());

    assertRowsIgnoringOrder(execute("SELECT * FROM mv_test1"));

    // restore view row with b,c columns; d is still a tombstone
    upda

[jira] [Commented] (CASSANDRA-13578) mx4j configuration minor improvement

2017-08-22 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137858#comment-16137858
 ] 

Jay Zhuang commented on CASSANDRA-13578:


[~jjordan] that makes sense. I updated the patch to make it backward 
compatible; would you please review?
I tested it with both the new way {{MX4J_PORT=8081}} and the older way 
{{MX4J_PORT=-Dmx4jport=8081}}; both work fine:
| branch | uTest |
| [13578-trunk|https://github.com/cooldoger/cassandra/tree/13578-trunk] | 
[circleci#80|https://circleci.com/gh/cooldoger/cassandra/80] |
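A minimal sketch of what backward-compatible handling of the two forms could look like (the class and method names here are illustrative, not taken from the patch; the actual change lives in the startup scripts):

```java
public class Mx4jPortCompat
{
    // Accept either the new form "8081" or the legacy form "-Dmx4jport=8081"
    // and return the bare port value in both cases.
    static String normalizePort(String value)
    {
        String legacyPrefix = "-Dmx4jport=";
        return value.startsWith(legacyPrefix)
             ? value.substring(legacyPrefix.length())
             : value;
    }

    public static void main(String[] args)
    {
        System.out.println(normalizePort("8081"));             // prints 8081
        System.out.println(normalizePort("-Dmx4jport=8081"));  // prints 8081
    }
}
```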

> mx4j configuration minor improvement
> 
>
> Key: CASSANDRA-13578
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13578
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
> Fix For: 4.x
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13780) ADD Node streaming throughput performance

2017-08-22 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137849#comment-16137849
 ] 

Kurt Greaves commented on CASSANDRA-13780:
--

Not really sure why you wouldn't get more than 80MB/s for the cluster unless 
there was some network bottleneck. Have you benchmarked network throughput 
between the nodes to confirm it's not something there?

> ADD Node streaming throughput performance
> -
>
> Key: CASSANDRA-13780
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13780
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Linux 2.6.32-696.3.2.el6.x86_64 #1 SMP Mon Jun 19 
> 11:55:55 PDT 2017 x86_64 x86_64 x86_64 GNU/Linux
> Architecture:  x86_64
> CPU op-mode(s):32-bit, 64-bit
> Byte Order:Little Endian
> CPU(s):40
> On-line CPU(s) list:   0-39
> Thread(s) per core:2
> Core(s) per socket:10
> Socket(s): 2
> NUMA node(s):  2
> Vendor ID: GenuineIntel
> CPU family:6
> Model: 79
> Model name:Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
> Stepping:  1
> CPU MHz:   2199.869
> BogoMIPS:  4399.36
> Virtualization:VT-x
> L1d cache: 32K
> L1i cache: 32K
> L2 cache:  256K
> L3 cache:  25600K
> NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
> NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>  total   used   free sharedbuffers cached
> Mem:  252G   217G34G   708K   308M   149G
> -/+ buffers/cache:67G   185G
> Swap:  16G 0B16G
>Reporter: Kevin Rivait
>Priority: Blocker
> Fix For: 3.0.9
>
>
> Problem: Adding a new node to a large cluster runs at least 1000x slower than 
> what the network and node hardware capacity can support, taking several days 
> per new node.  Adjusting stream throughput and other YAML parameters seems to 
> have no effect on performance.  Essentially, it appears that Cassandra has an 
> architecture scalability growth problem when adding new nodes to a moderate 
> to high data ingestion cluster because Cassandra cannot add new node capacity 
> fast enough to keep up with increasing data ingestion volumes and growth.
> Initial Configuration: 
> Running 3.0.9; we have implemented TWCS on one of our largest tables.
> Largest table partitioned on (ID, MM)  using 1 day buckets with a TTL of 
> 60 days.
> Next release will change partitioning to (ID, MMDD) so that partitions 
> are aligned with daily TWCS buckets.
> Each node is currently creating roughly a 30GB SSTable per day.
> TWCS working as expected,  daily SSTables are dropping off daily after 70 
> days ( 60 + 10 day grace)
> Current deployment is a 28 node 2 datacenter cluster, 14 nodes in each DC , 
> replication factor 3
> Data directories are backed by four 2TB SSDs on each node and one 800GB SSD 
> for commit logs.
> Requirement is to double cluster size, capacity, and ingestion volume within 
> a few weeks.
> Observed Behavior:
> 1. Streaming throughput during add node: we observed a maximum of 6 Mb/s 
> streaming from each of the 14 nodes on a 20Gb/s switched network, taking at 
> least 106 hours for each node to join the cluster, and each node is only about 
> 2.2 TB in size.
> 2. Compaction on the newly added node: compaction has fallen behind, with 
> anywhere from 4,000 to 10,000 SSTables at any given time. It took 3 weeks 
> for compaction to finish on each newly added node. Increasing the number of 
> compaction threads to match the number of CPUs (40) and increasing compaction 
> throughput to 32MB/s seemed to be the sweet spot.
> 3. TWCS buckets on the new node: data streamed to this node over 4 1/2 days. 
> Compaction correctly placed the data in daily files, but the problem is that 
> the file dates reflect when compaction created the file and not the date of the 
> last record written in the TWCS bucket, which will cause the files to remain 
> around much longer than necessary.
> Two Questions:
> 1. What can be done to substantially improve the performance of adding a new 
> node?
> 2. Can compaction on TWCS partitions for newly added nodes change the file 
> create date to match the highest date record in the file -or- add another 
> piece of meta-data to the TWCS files that reflect the file drop date so that 
> TWCS partitions can be dropped consistently?




[jira] [Assigned] (CASSANDRA-13787) RangeTombstoneMarker and ParitionDeletion is not properly included in MV

2017-08-22 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang reassigned CASSANDRA-13787:


Assignee: ZhaoYang

> RangeTombstoneMarker and ParitionDeletion is not properly included in MV
> 
>
> Key: CASSANDRA-13787
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13787
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>

[jira] [Created] (CASSANDRA-13787) RangeTombstoneMarker and ParitionDeletion is not properly included in MV

2017-08-22 Thread ZhaoYang (JIRA)
ZhaoYang created CASSANDRA-13787:


 Summary: RangeTombstoneMarker and ParitionDeletion is not properly 
included in MV
 Key: CASSANDRA-13787
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13787
 Project: Cassandra
  Issue Type: Bug
  Components: Materialized Views
Reporter: ZhaoYang





[jira] [Commented] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams

2017-08-22 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137759#comment-16137759
 ] 

ZhaoYang commented on CASSANDRA-13299:
--

[~brstgt] thanks :)

Found one more issue related to RangeTombstoneMarker in MV while writing the dtest.

> Potential OOMs and lock contention in write path streams
> 
>
> Key: CASSANDRA-13299
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13299
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benjamin Roth
>Assignee: ZhaoYang
>
> I see a potential OOM when a stream (e.g. repair) goes through the write 
> path, as it does with MVs.
> StreamReceiveTask gets a bunch of SSTableReaders. These produce row 
> iterators, which in turn produce mutations. So every partition creates a single 
> mutation, which in the case of (very) big partitions can result in (very) big 
> mutations. Those are created on heap and stay there until they have finished 
> processing.
> I don't think it is necessary to create a single mutation for each partition. 
> Why don't we implement a PartitionUpdateGeneratorIterator that takes an 
> UnfilteredRowIterator and a max size, and spits out PartitionUpdates to be 
> used to create and apply mutations?
> The max size should be something like min(reasonable_absolute_max_size, 
> max_mutation_size, commitlog_segment_size / 2); reasonable_absolute_max_size 
> could be something like 16MB.
> A mutation shouldn't be too large, as it also affects MV partition locking. 
> The longer an MV partition is locked during a stream, the higher the chances 
> that WTEs occur during streams.
> I could also imagine that a max number of updates per mutation, regardless of 
> size in bytes, could make sense to avoid lock contention.
> Love to get feedback and suggestions, incl. naming suggestions.
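The proposed size cap can be sketched as follows. This is only an illustration of the min(...) formula from the comment above; the constant value and method names are hypothetical, and PartitionUpdateGeneratorIterator does not exist yet:

```java
public class MutationSizeCap
{
    // Hypothetical absolute ceiling from the proposal; "could be like 16MB".
    static final long REASONABLE_ABSOLUTE_MAX_SIZE = 16L * 1024 * 1024;

    // Cap per the comment: min(reasonable_absolute_max_size,
    //                          max_mutation_size,
    //                          commitlog_segment_size / 2)
    static long maxPartitionUpdateSize(long maxMutationSize, long commitlogSegmentSize)
    {
        return Math.min(REASONABLE_ABSOLUTE_MAX_SIZE,
                        Math.min(maxMutationSize, commitlogSegmentSize / 2));
    }

    public static void main(String[] args)
    {
        // With a 32MB commitlog segment and a 16MB max mutation size,
        // all three terms agree on a 16MB cap.
        System.out.println(maxPartitionUpdateSize(16L * 1024 * 1024,
                                                  32L * 1024 * 1024)); // prints 16777216
    }
}
```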






[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137722#comment-16137722
 ] 

mck commented on CASSANDRA-13418:
-

{quote}I can remove TWCSCompactionController.getFullyExpiredSSTables(..) if you 
wish, I don't have any strong opinion about it, just say so{quote}

I think so yes. Not because it's my opinion, but because that's the general 
minimalist style of the C* codebase.

{quote}CompactionController:232 any reason not to return an immutable 
set?{quote}

Leave the code as-was in your 0c4d342 commit. Your original rebuttal justifying 
it to be mutable made sense.

{quote}Do you have in mind any test that should be good to have ?{quote}

I'll get back to you on this [~rgerard]. But the test you added was a good 
start.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chances of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.






[jira] [Commented] (CASSANDRA-13786) Validation compactions can cause orphan sstable warnings

2017-08-22 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137691#comment-16137691
 ] 

Blake Eggleston commented on CASSANDRA-13786:
-

Yeah for sure. It just seems like changing the interface is an oversized 
solution to the minor annoyance of getting periodic warnings in the logs. Maybe 
we don't need the log entry anymore, or it can be demoted to info, or there's a 
solution I haven't thought of. That said, there probably aren't many 3rd party 
compaction strategies written against vanilla Cassandra these days.

> Validation compactions can cause orphan sstable warnings
> 
>
> Key: CASSANDRA-13786
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13786
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Minor
> Fix For: 4.0
>
>
> I've seen LevelledCompactionStrategy occasionally logging: 
> {quote} from level 0 is not on corresponding level in the 
> leveled manifest. This is not a problem per se, but may indicate an orphaned 
> sstable due to a failed compaction not cleaned up properly."{quote} warnings 
> from a ValidationExecutor thread.
> What's happening here is that a compaction running concurrently with the 
> validation is promoting (or demoting) sstables as part of an incremental 
> repair, and an sstable has changed hands by the time the validation 
> compaction gets around to getting scanners for it. The sstable 
> isolation/synchronization done by validation compactions is a lot looser than 
> normal compactions, so seeing this happen isn't very surprising. Given that 
> it's harmless, and not unexpected, I think it would be best to not log these 
> during validation compactions.






[jira] [Updated] (CASSANDRA-9375) setting timeouts to 1ms prevents startup

2017-08-22 Thread Varun Barala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Barala updated CASSANDRA-9375:

Attachment: (was: CASSANDRA-9375_after_review_2.patch)

> setting timeouts to 1ms prevents startup
> 
>
> Key: CASSANDRA-9375
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9375
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Brandon Williams
>Assignee: Varun Barala
>Priority: Trivial
>  Labels: patch
> Fix For: 2.1.x
>
> Attachments: CASSANDRA-9375_after_review, 
> CASSANDRA-9375_after_review_2.patch, CASSANDRA-9375.patch
>
>
> Granted, this is a nonsensical setting, but the error message makes it tough 
> to discern what's wrong:
> {noformat}
> ERROR 17:13:28,726 Exception encountered during startup
> java.lang.ExceptionInInitializerError
>  at 
> org.apache.cassandra.net.MessagingService.instance(MessagingService.java:310)
>  at 
> org.apache.cassandra.service.StorageService.(StorageService.java:233)
>  at 
> org.apache.cassandra.service.StorageService.(StorageService.java:141)
>  at 
> org.apache.cassandra.locator.DynamicEndpointSnitch.(DynamicEndpointSnitch.java:87)
>  at 
> org.apache.cassandra.locator.DynamicEndpointSnitch.(DynamicEndpointSnitch.java:63)
>  at 
> org.apache.cassandra.config.DatabaseDescriptor.createEndpointSnitch(DatabaseDescriptor.java:518)
>  at 
> org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:350)
>  at 
> org.apache.cassandra.config.DatabaseDescriptor.(DatabaseDescriptor.java:112)
>  at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:213)
>  at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567)
>  at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:656)
> Caused by: java.lang.IllegalArgumentException
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor.scheduleWithFixedDelay(ScheduledThreadPoolExecutor.java:586)
>  at 
> org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor.scheduleWithFixedDelay(DebuggableScheduledThreadPoolExecutor.java:64)
>  at org.apache.cassandra.utils.ExpiringMap.(ExpiringMap.java:103)
>  at 
> org.apache.cassandra.net.MessagingService.(MessagingService.java:360)
>  at org.apache.cassandra.net.MessagingService.(MessagingService.java:68)
>  at 
> org.apache.cassandra.net.MessagingService$MSHandle.(MessagingService.java:306)
>  ... 11 more
> java.lang.ExceptionInInitializerError
>  at 
> org.apache.cassandra.net.MessagingService.instance(MessagingService.java:310)
>  at 
> org.apache.cassandra.service.StorageService.(StorageService.java:233)
>  at 
> org.apache.cassandra.service.StorageService.(StorageService.java:141)
>  at 
> org.apache.cassandra.locator.DynamicEndpointSnitch.(DynamicEndpointSnitch.java:87)
>  at 
> org.apache.cassandra.locator.DynamicEndpointSnitch.(DynamicEndpointSnitch.java:63)
>  at 
> org.apache.cassandra.config.DatabaseDescriptor.createEndpointSnitch(DatabaseDescriptor.java:518)
>  at 
> org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:350)
>  at 
> org.apache.cassandra.config.DatabaseDescriptor.(DatabaseDescriptor.java:112)
>  at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:213)
>  at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567)
>  at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:656)
> Caused by: java.lang.IllegalArgumentException
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor.scheduleWithFixedDelay(ScheduledThreadPoolExecutor.java:586)
>  at 
> org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor.scheduleWithFixedDelay(DebuggableScheduledThreadPoolExecutor.java:64)
>  at org.apache.cassandra.utils.ExpiringMap.(ExpiringMap.java:103)
>  at 
> org.apache.cassandra.net.MessagingService.(MessagingService.java:360)
>  at org.apache.cassandra.net.MessagingService.(MessagingService.java:68)
>  at 
> org.apache.cassandra.net.MessagingService$MSHandle.(MessagingService.java:306)
>  ... 11 more
> Exception encountered during startup: null
> {noformat}
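The stack trace points at {{ScheduledThreadPoolExecutor.scheduleWithFixedDelay}}, which rejects a non-positive delay with an IllegalArgumentException. A plausible reading (assumption, not verified against the ExpiringMap source) is that a 1ms timeout gets halved by integer division and rounds down to 0 before being passed as the delay. A standalone sketch of the failing JDK call:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class FixedDelayCheck
{
    // Returns true if scheduleWithFixedDelay accepts the given delay.
    static boolean canSchedule(long delayMillis)
    {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        try
        {
            executor.scheduleWithFixedDelay(() -> {}, 0, delayMillis, TimeUnit.MILLISECONDS);
            return true;
        }
        catch (IllegalArgumentException e)
        {
            // delay <= 0 is rejected by the JDK, mirroring the startup failure above
            return false;
        }
        finally
        {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args)
    {
        // 1ms halved by integer division becomes 0ms and is rejected
        System.out.println(canSchedule(1 / 2)); // prints false
        System.out.println(canSchedule(1));     // prints true
    }
}
```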






[jira] [Updated] (CASSANDRA-9375) setting timeouts to 1ms prevents startup

2017-08-22 Thread Varun Barala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Barala updated CASSANDRA-9375:

Attachment: CASSANDRA-9375_after_review_2.patch

> Caused by: java.lang.IllegalArgumentException
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor.scheduleWithFixedDelay(ScheduledThreadPoolExecutor.java:586)
>  at 
> org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor.scheduleWithFixedDelay(DebuggableScheduledThreadPoolExecutor.java:64)
>  at org.apache.cassandra.utils.ExpiringMap.(ExpiringMap.java:103)
>  at 
> org.apache.cassandra.net.MessagingService.(MessagingService.java:360)
>  at org.apache.cassandra.net.MessagingService.(MessagingService.java:68)
>  at 
> org.apache.cassandra.net.MessagingService$MSHandle.(MessagingService.java:306)
>  ... 11 more
> Exception encountered during startup: null
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-9375) setting timeouts to 1ms prevents startup

2017-08-22 Thread Varun Barala (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137643#comment-16137643
 ] 

Varun Barala commented on CASSANDRA-9375:
-

Thanks [~jjirsa] [~jasobrown] for the review. I have updated the patch as per your suggestions:
* Added this check inside {{DatabaseDescriptor}}
* Added a JUnit test case

{{DatabaseDescriptor}} was refactored in 3.11.0 
([https://github.com/apache/cassandra/commit/9797511c56df4e9c7db964a6b83e67642df96c2d#diff-a8a9935b164cd23da473fd45784fd1dd]),
 so I'll provide a separate patch for that branch. Thanks!

One small doubt: do we also need to guard the {{DatabaseDescriptor}} setters, i.e. 
{{#setCasContentionTimeout(Long timeOutInMillis)}}?
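For context, the kind of config-time guard being discussed might look like the sketch below. The 10ms floor and the names ({{TimeoutSanity}}, {{validate}}) are illustrative assumptions, not the actual CASSANDRA-9375 patch.

```java
// Hedged sketch of a startup-time timeout sanity check; the floor value
// and all names here are assumptions for illustration, not Cassandra code.
public class TimeoutSanity {
    // assumed lower bound; a 1ms timeout is what broke startup in this ticket
    static final long LOWEST_ACCEPTED_TIMEOUT_MS = 10L;

    static long validate(String name, long valueMs) {
        if (valueMs < LOWEST_ACCEPTED_TIMEOUT_MS)
            throw new IllegalArgumentException(
                name + " must be at least " + LOWEST_ACCEPTED_TIMEOUT_MS
                     + "ms, but was " + valueMs + "ms");
        return valueMs;
    }

    public static void main(String[] args) {
        // a sane value passes through unchanged
        if (validate("read_request_timeout_in_ms", 5000L) != 5000L)
            throw new AssertionError();
        // a 1ms setting now fails fast with a readable message instead of
        // an opaque ExceptionInInitializerError deep inside MessagingService
        try {
            validate("cas_contention_timeout_in_ms", 1L);
            throw new AssertionError("expected rejection");
        } catch (IllegalArgumentException expected) {
            System.out.println(expected.getMessage());
        }
    }
}
```

Applying the same check inside the setters would close the gap asked about above, since those values can also be changed after startup.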




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---

[jira] [Updated] (CASSANDRA-9375) setting timeouts to 1ms prevents startup

2017-08-22 Thread Varun Barala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Barala updated CASSANDRA-9375:

Attachment: CASSANDRA-9375_after_review_2.patch




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13785) Compaction fails for SSTables with large number of keys

2017-08-22 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-13785:
---
Status: Patch Available  (was: Open)

> Compaction fails for SSTables with large number of keys
> ---
>
> Key: CASSANDRA-13785
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13785
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>
> Every few minutes there are "LEAK DETECTED" messages in the log:
> {noformat}
> ERROR [Reference-Reaper:1] 2017-08-18 17:18:40,357 Ref.java:223 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@3ed22d7) to class 
> org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$Tidy@1022568824:[Memory@[0..159b6ba4),
>  Memory@[0..d8123468)] was not released before the reference was garbage 
> collected
> ERROR [Reference-Reaper:1] 2017-08-18 17:20:49,693 Ref.java:223 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@6470405b) to class 
> org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$Tidy@97898152:[Memory@[0..159b6ba4),
>  Memory@[0..d8123468)] was not released before the reference was garbage 
> collected
> ERROR [Reference-Reaper:1] 2017-08-18 17:22:38,519 Ref.java:223 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@6fc4af5f) to class 
> org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$Tidy@1247404854:[Memory@[0..159b6ba4),
>  Memory@[0..d8123468)] was not released before the reference was garbage 
> collected
> {noformat}
> Debugged the issue and found it's triggered by failed compactions: if the 
> compacted SSTable has more than ~51M ({{Integer.MAX_VALUE / 40}}) keys, it will 
> fail to create the IndexSummary: 
> [IndexSummary:84|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/io/sstable/IndexSummary.java#L84].
> Cassandra retries the compaction every few minutes and it keeps failing.
> The root cause is while [creating 
> SafeMemoryWriter|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java#L112]
>  with {{> Integer.MAX_VALUE}} space, it returns the trailing 
> {{Integer.MAX_VALUE}} space 
> [SafeMemoryWriter.java:83|https://github.com/apache/cassandra/blob/6a1b1f26b7174e8c9bf86a96514ab626ce2a4117/src/java/org/apache/cassandra/io/util/SafeMemoryWriter.java#L83],
>  which makes the first 
> [entries.length()|https://github.com/apache/cassandra/blob/6a1b1f26b7174e8c9bf86a96514ab626ce2a4117/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java#L173]
>  not 0. So the assert fails here: 
> [IndexSummary:84|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/io/sstable/IndexSummary.java#L84]
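The failure mode described above is a plain int overflow. A quick check of the arithmetic; note the 40-byte-per-key figure is taken from this ticket's text, not re-derived from the Cassandra source:

```java
// Sanity check of the overflow threshold cited in the report: the index
// summary budgets a fixed number of bytes per key (40, per the ticket),
// so once keyCount * 40 exceeds Integer.MAX_VALUE the int-sized offsets
// into the summary Memory no longer fit.
public class IndexSummaryOverflowCheck {
    static final long BYTES_PER_KEY = 40L; // figure cited in this ticket

    public static void main(String[] args) {
        long maxKeys = Integer.MAX_VALUE / BYTES_PER_KEY;
        // the threshold itself still fits in an int-sized offset...
        if (maxKeys * BYTES_PER_KEY > Integer.MAX_VALUE)
            throw new AssertionError("threshold must still fit");
        // ...but one more key pushes the required space past Integer.MAX_VALUE
        if ((maxKeys + 1) * BYTES_PER_KEY <= Integer.MAX_VALUE)
            throw new AssertionError("one more key must overflow");
        System.out.println("summary offsets overflow past " + maxKeys + " keys");
    }
}
```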



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13785) Compaction fails for SSTables with large number of keys

2017-08-22 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137614#comment-16137614
 ] 

Jay Zhuang commented on CASSANDRA-13785:


I was able to reproduce the problem with a unit test; here is the patch:
| branch | dTest |
| [13785-3.0|https://github.com/cooldoger/cassandra/tree/13785-3.0] | 
[circleci#76 passed|https://circleci.com/gh/cooldoger/cassandra/76] |
| [13785-3.11|https://github.com/cooldoger/cassandra/tree/13785-3.11] | 
[circleci#77 running|https://circleci.com/gh/cooldoger/cassandra/77] |
| [13785-trunk|https://github.com/cooldoger/cassandra/tree/13785-trunk] | 
[circleci#78 running|https://circleci.com/gh/cooldoger/cassandra/78] |

[~iamaleksey] would you please review it?




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13786) Validation compactions can cause orphan sstable warnings

2017-08-22 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137580#comment-16137580
 ] 

Jeff Jirsa commented on CASSANDRA-13786:


Not a review and haven't read the code, but you're just targeting 4.0? Seems 
like if we're going to break an interface, that's probably the time to do it.


> Validation compactions can cause orphan sstable warnings
> 
>
> Key: CASSANDRA-13786
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13786
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Minor
> Fix For: 4.0
>
>
> I've seen LevelledCompactionStrategy occasionally logging: 
> {quote} from level 0 is not on corresponding level in the 
> leveled manifest. This is not a problem per se, but may indicate an orphaned 
> sstable due to a failed compaction not cleaned up properly."{quote} warnings 
> from a ValidationExecutor thread.
> What's happening here is that a compaction running concurrently with the 
> validation is promoting (or demoting) sstables as part of an incremental 
> repair, and an sstable has changed hands by the time the validation 
> compaction gets around to getting scanners for it. The sstable 
> isolation/synchronization done by validation compactions is a lot looser than 
> normal compactions, so seeing this happen isn't very surprising. Given that 
> it's harmless, and not unexpected, I think it would be best to not log these 
> during validation compactions.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13786) Validation compactions can cause orphan sstable warnings

2017-08-22 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137567#comment-16137567
 ] 

Blake Eggleston commented on CASSANDRA-13786:
-

I have a fix pushed 
[here|https://github.com/bdeggleston/cassandra/commits/orphan-sstables], but I 
don't like that it changes the compaction strategy interface. Does anyone have 
ideas for alternate solutions?

/cc [~krummas] [~jjirsa]





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-13786) Validation compactions can cause orphan sstable warnings

2017-08-22 Thread Blake Eggleston (JIRA)
Blake Eggleston created CASSANDRA-13786:
---

 Summary: Validation compactions can cause orphan sstable warnings
 Key: CASSANDRA-13786
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13786
 Project: Cassandra
  Issue Type: Bug
Reporter: Blake Eggleston
Assignee: Blake Eggleston
Priority: Minor
 Fix For: 4.0


I've seen LevelledCompactionStrategy occasionally logging: 
{quote} from level 0 is not on corresponding level in the leveled 
manifest. This is not a problem per se, but may indicate an orphaned sstable 
due to a failed compaction not cleaned up properly."{quote} warnings from a 
ValidationExecutor thread.

What's happening here is that a compaction running concurrently with the 
validation is promoting (or demoting) sstables as part of an incremental 
repair, and an sstable has changed hands by the time the validation compaction 
gets around to getting scanners for it. The sstable isolation/synchronization 
done by validation compactions is a lot looser than normal compactions, so 
seeing this happen isn't very surprising. Given that it's harmless, and not 
unexpected, I think it would be best to not log these during validation 
compactions.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13635) update dtests to support netty-based internode messaging/streaming

2017-08-22 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-13635:

Resolution: Fixed
Status: Resolved  (was: Ready to Commit)

Committed as sha {{1a0e266038e75930c69842e338c6a6ee196f721c}}.

> update dtests to support netty-based internode messaging/streaming
> --
>
> Key: CASSANDRA-13635
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13635
> Project: Cassandra
>  Issue Type: Task
>  Components: Streaming and Messaging, Testing
>Reporter: Jason Brown
>Assignee: Jason Brown
> Fix For: 4.0
>
>
> some dtests need to be updated to work correctly with CASSANDRA-13628



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-12229) Move streaming to non-blocking IO and netty (streaming 2.1)

2017-08-22 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-12229:

Resolution: Fixed
Status: Resolved  (was: Awaiting Feedback)

Committed as sha {{fc92db2b9b56c143516026ba29cecdec37e286bb}}

> Move streaming to non-blocking IO and netty (streaming 2.1)
> ---
>
> Key: CASSANDRA-12229
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12229
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Streaming and Messaging
>Reporter: Jason Brown
>Assignee: Jason Brown
> Fix For: 4.0
>
>
> As followup work to CASSANDRA-8457, we need to move streaming to use netty.
> Streaming 2.0 (CASSANDRA-5286) brought many good improvements to how files 
> are transferred between nodes in a cluster. However, the low-level details of 
> the current streaming implementation does not line up nicely with a 
> non-blocking model, so I think this is a good time to review some of those 
> details and add in additional goodness. The current implementation assumes a 
> sequential or "single threaded" approach to the sending of stream messages as 
> well as the transfer of files. In short, after several iterative prototypes, 
> I propose the following:
> 1) use a single bi-directional connection (instead of requiring two 
> sockets & two threads)
> 2) send the "non-file" {{StreamMessage}}s (basically anything not 
> {{OutboundFileMessage}}) via the normal internode messaging. This will 
> require slightly more management of the session (the ability to look up a 
> {{StreamSession}} from a static function on {{StreamManager}}), but we 
> have most of the pieces we need for this already.
> 3) switch to a non-blocking IO model (facilitated via netty)
> 4) Allow files to be streamed in parallel (CASSANDRA-4663) - this should just 
> be a thing already
> 5) If the entire sstable is to be streamed, in addition to the DATA component, 
> transfer all the components of the sstable (primary index, bloom filter, 
> stats, and so on). This way we can avoid the CPU and GC pressure from 
> deserializing the stream into objects. File streaming then amounts to a 
> block-level transfer.
> Note: The progress/results of CASSANDRA-11303 will need to be reflected here, 
> as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-8457) nio MessagingService

2017-08-22 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-8457:
---
   Resolution: Fixed
Fix Version/s: (was: 4.x)
   4.0
   Status: Resolved  (was: Patch Available)

CASSANDRA-13630 is well underway with a review happening now, and it should be 
committed ASAP.

Committed as sha {{356dc3c253224751cbf80b32cfce4e3c1640de11}}



> nio MessagingService
> 
>
> Key: CASSANDRA-8457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jonathan Ellis
>Assignee: Jason Brown
>Priority: Minor
>  Labels: netty, performance
> Fix For: 4.0
>
> Attachments: 8457-load.tgz
>
>
> Thread-per-peer (actually two each incoming and outbound) is a big 
> contributor to context switching, especially for larger clusters.  Let's look 
> at switching to nio, possibly via Netty.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



cassandra-dtest git commit: update dtests to support netty-based internode messaging/streaming

2017-08-22 Thread jasobrown
Repository: cassandra-dtest
Updated Branches:
  refs/heads/master b8842b979 -> 1a0e26603


update dtests to support netty-based internode messaging/streaming

patch by jasobrown, reviewed by Marcus Eriksson for CASSANDRA-13635


Project: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/commit/1a0e2660
Tree: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/tree/1a0e2660
Diff: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/diff/1a0e2660

Branch: refs/heads/master
Commit: 1a0e266038e75930c69842e338c6a6ee196f721c
Parents: b8842b9
Author: Jason Brown 
Authored: Fri Jun 16 05:03:36 2017 -0700
Committer: Jason Brown 
Committed: Tue Aug 22 13:56:41 2017 -0700

--
 bootstrap_test.py| 11 ---
 byteman/4.0/decommission_failure_inject.btm  | 17 +
 .../4.0/inject_failure_streaming_to_node2.btm| 17 +
 byteman/4.0/stream_failure.btm   | 17 +
 byteman/decommission_failure_inject.btm  | 17 -
 byteman/inject_failure_streaming_to_node2.btm| 17 -
 byteman/pre4.0/decommission_failure_inject.btm   | 17 +
 .../pre4.0/inject_failure_streaming_to_node2.btm | 17 +
 byteman/pre4.0/stream_failure.btm| 17 +
 byteman/stream_failure.btm   | 17 -
 native_transport_ssl_test.py |  2 +-
 nodetool_test.py |  8 +---
 rebuild_test.py  |  5 -
 replace_address_test.py  | 10 +++---
 secondary_indexes_test.py| 13 +++--
 sslnodetonode_test.py| 19 +--
 topology_test.py |  5 -
 17 files changed, 151 insertions(+), 75 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/1a0e2660/bootstrap_test.py
--
diff --git a/bootstrap_test.py b/bootstrap_test.py
index 1d149e6..54c49c1 100644
--- a/bootstrap_test.py
+++ b/bootstrap_test.py
@@ -148,8 +148,10 @@ class TestBootstrap(BaseBootstrapTest):
 2*streaming_keep_alive_period_in_secs to receive a single sstable
 """
 cluster = self.cluster
-        cluster.set_configuration_options(values={'streaming_socket_timeout_in_ms': 1000,
-                                                  'streaming_keep_alive_period_in_secs': 2})
+        yaml_opts = {'streaming_keep_alive_period_in_secs': 2}
+        if cluster.version() < '4.0':
+            yaml_opts['streaming_socket_timeout_in_ms'] = 1000
+        cluster.set_configuration_options(values=yaml_opts)
 
 # Create a single node cluster
 cluster.populate(1)
@@ -306,7 +308,10 @@ class TestBootstrap(BaseBootstrapTest):
 
 cluster.start(wait_other_notice=True)
 # kill stream to node3 in the middle of streaming to let it fail
-        node1.byteman_submit(['./byteman/stream_failure.btm'])
+        if cluster.version() < '4.0':
+            node1.byteman_submit(['./byteman/pre4.0/stream_failure.btm'])
+        else:
+            node1.byteman_submit(['./byteman/4.0/stream_failure.btm'])
 node1.stress(['write', 'n=1K', 'no-warmup', 'cl=TWO', '-schema', 
'replication(factor=2)', '-rate', 'threads=50'])
 cluster.flush()
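The hunk above gates the byteman rule path on the cluster version: rules for the netty-based streaming (4.0 and later) live under `byteman/4.0/`, the older socket-based ones under `byteman/pre4.0/`. A minimal sketch of that selection logic (the helper name and the tuple version encoding are hypothetical; the dtest itself compares `cluster.version()` against the string `'4.0'`):

```python
# Hypothetical helper mirroring the version gate used in the test above.
def byteman_script(cluster_version, name="stream_failure"):
    # pre-4.0 clusters use the socket-based streaming rules; 4.0+ the netty ones
    subdir = "pre4.0" if cluster_version < (4, 0) else "4.0"
    return "./byteman/%s/%s.btm" % (subdir, name)
```

The returned path would then be handed to `node.byteman_submit([...])` as in the diff.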
 

http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/1a0e2660/byteman/4.0/decommission_failure_inject.btm
--
diff --git a/byteman/4.0/decommission_failure_inject.btm b/byteman/4.0/decommission_failure_inject.btm
new file mode 100644
index 000..a6418fc
--- /dev/null
+++ b/byteman/4.0/decommission_failure_inject.btm
@@ -0,0 +1,17 @@
+#
+# Inject decommission failure to fail streaming from 127.0.0.1
+#
+# Before file streaming starts in the `StreamSession#prepareSynAck()` method,
+# interrupt streaming by throwing a RuntimeException.
+#
+RULE inject decommission failure
+CLASS org.apache.cassandra.streaming.StreamSession
+METHOD prepareSynAck
+AT INVOKE startStreamingFiles
+BIND peer = $0.peer
+# set flag to only run this rule once.
+IF peer.equals(InetAddress.getByName("127.0.0.1")) AND NOT flagged("done")
+DO
+   flag("done");
+   throw new java.lang.RuntimeException("Triggering network failure")
+ENDRULE
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/1a0e2660/byteman/4.0/inject_failure_streaming_to_node2.btm
--
diff --git a/byteman/4.0/inject_failure_streaming_to_node2.btm 
b/byteman/4.0/inject_failure_st

[01/11] cassandra git commit: move streaming to use netty

2017-08-22 Thread jasobrown
Repository: cassandra
Updated Branches:
  refs/heads/trunk 3d4a7e7b6 -> fc92db2b9


http://git-wip-us.apache.org/repos/asf/cassandra/blob/fc92db2b/src/java/org/apache/cassandra/streaming/messages/StreamInitMessage.java
--
diff --git a/src/java/org/apache/cassandra/streaming/messages/StreamInitMessage.java b/src/java/org/apache/cassandra/streaming/messages/StreamInitMessage.java
index ceaa4d1..68c6034 100644
--- a/src/java/org/apache/cassandra/streaming/messages/StreamInitMessage.java
+++ b/src/java/org/apache/cassandra/streaming/messages/StreamInitMessage.java
@@ -19,103 +19,64 @@ package org.apache.cassandra.streaming.messages;
 
 import java.io.IOException;
 import java.net.InetAddress;
-import java.nio.ByteBuffer;
 import java.util.UUID;
 
 import org.apache.cassandra.db.TypeSizes;
-import org.apache.cassandra.io.IVersionedSerializer;
 import org.apache.cassandra.io.util.DataInputPlus;
-import org.apache.cassandra.io.util.DataOutputBuffer;
-import org.apache.cassandra.io.util.DataOutputBufferFixed;
-import org.apache.cassandra.io.util.DataOutputPlus;
+import org.apache.cassandra.io.util.DataOutputStreamPlus;
 import org.apache.cassandra.net.CompactEndpointSerializationHelper;
 import org.apache.cassandra.net.MessagingService;
 import org.apache.cassandra.streaming.StreamOperation;
 import org.apache.cassandra.streaming.PreviewKind;
+import org.apache.cassandra.streaming.StreamSession;
 import org.apache.cassandra.utils.UUIDSerializer;
 
 /**
 * StreamInitMessage is first sent from the node where {@link org.apache.cassandra.streaming.StreamSession} is started,
 * to initiate corresponding {@link org.apache.cassandra.streaming.StreamSession} on the other side.
  */
-public class StreamInitMessage
+public class StreamInitMessage extends StreamMessage
 {
-public static IVersionedSerializer<StreamInitMessage> serializer = new StreamInitMessageSerializer();
+public static Serializer<StreamInitMessage> serializer = new StreamInitMessageSerializer();
 
 public final InetAddress from;
 public final int sessionIndex;
 public final UUID planId;
 public final StreamOperation streamOperation;
 
-// true if this init message is to connect for outgoing message on receiving side
-public final boolean isForOutgoing;
 public final boolean keepSSTableLevel;
 public final UUID pendingRepair;
 public final PreviewKind previewKind;
 
-public StreamInitMessage(InetAddress from, int sessionIndex, UUID planId, StreamOperation streamOperation, boolean isForOutgoing, boolean keepSSTableLevel, UUID pendingRepair, PreviewKind previewKind)
+public StreamInitMessage(InetAddress from, int sessionIndex, UUID planId, StreamOperation streamOperation, boolean keepSSTableLevel, UUID pendingRepair, PreviewKind previewKind)
 {
+super(Type.STREAM_INIT);
 this.from = from;
 this.sessionIndex = sessionIndex;
 this.planId = planId;
 this.streamOperation = streamOperation;
-this.isForOutgoing = isForOutgoing;
 this.keepSSTableLevel = keepSSTableLevel;
 this.pendingRepair = pendingRepair;
 this.previewKind = previewKind;
 }
 
-/**
- * Create serialized message.
- *
- * @param compress true if message is compressed
- * @param version Streaming protocol version
- * @return serialized message in ByteBuffer format
- */
-public ByteBuffer createMessage(boolean compress, int version)
+@Override
+public String toString()
 {
-int header = 0;
-// set compression bit.
-if (compress)
-header |= 4;
-// set streaming bit
-header |= 8;
-// Setting up the version bit
-header |= (version << 8);
-
-byte[] bytes;
-try
-{
-int size = (int)StreamInitMessage.serializer.serializedSize(this, version);
-try (DataOutputBuffer buffer = new DataOutputBufferFixed(size))
-{
-StreamInitMessage.serializer.serialize(this, buffer, version);
-bytes = buffer.getData();
-}
-}
-catch (IOException e)
-{
-throw new RuntimeException(e);
-}
-assert bytes.length > 0;
-
-ByteBuffer buffer = ByteBuffer.allocate(4 + 4 + bytes.length);
-buffer.putInt(MessagingService.PROTOCOL_MAGIC);
-buffer.putInt(header);
-buffer.put(bytes);
-buffer.flip();
-return buffer;
+StringBuilder sb = new StringBuilder(128);
+sb.append("StreamInitMessage: from = ").append(from);
+sb.append(", planId = ").append(planId).append(", session index = ").append(sessionIndex);
+return sb.toString();
 }
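For reference, the removed `createMessage()` above packed a small header in front of the serialized message: bit 2 for compression, bit 3 for streaming, and the protocol version from bit 8 upward. A quick illustrative sketch of that bit layout (Python, not part of the patch):

```python
# Reproduces the header bit layout of the removed createMessage(): bit 2 =
# compression, bit 3 = streaming, protocol version stored from bit 8 up.
def pack_header(version, compress):
    header = 0
    if compress:
        header |= 4          # compression bit
    header |= 8              # streaming bit
    header |= version << 8   # protocol version in the upper bits
    return header

def unpack_version(header):
    return header >> 8
```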
 
-private static class StreamInitMessageSerializer implements IVersionedSerializer<StreamInitMessage>
+private static class StreamInitMessageSerializer implements Serializer<StreamInitMessage>
 {
-public void serialize(St

[11/11] cassandra git commit: switch internode messaging to netty

2017-08-22 Thread jasobrown
switch internode messaging to netty

patch by jasobrown, reviewed by pcmanus for CASSANDRA-8457


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/356dc3c2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/356dc3c2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/356dc3c2

Branch: refs/heads/trunk
Commit: 356dc3c253224751cbf80b32cfce4e3c1640de11
Parents: 3d4a7e7
Author: Jason Brown 
Authored: Mon Feb 8 07:04:00 2016 -0800
Committer: Jason Brown 
Committed: Tue Aug 22 13:54:44 2017 -0700

--
 CHANGES.txt |   1 +
 build.xml   |   2 +-
 conf/cassandra-env.sh   |   1 +
 lib/licenses/netty-4.1.14.txt   | 202 ++
 lib/licenses/netty-all-4.0.44.Final.txt | 202 --
 lib/netty-all-4.0.44.Final.jar  | Bin 2342652 -> 0 bytes
 lib/netty-all-4.1.14.Final.jar  | Bin 0 -> 3690637 bytes
 .../org/apache/cassandra/config/Config.java |   7 +-
 .../cassandra/config/DatabaseDescriptor.java|  15 +
 .../cassandra/config/EncryptionOptions.java |   4 +-
 src/java/org/apache/cassandra/db/TypeSizes.java |   6 +
 .../cassandra/locator/PropertyFileSnitch.java   |   2 +-
 .../locator/ReconnectableSnitchHelper.java  |  10 +-
 .../cassandra/metrics/ConnectionMetrics.java|  27 +-
 .../cassandra/net/IncomingTcpConnection.java| 197 -
 .../org/apache/cassandra/net/MessageIn.java |  35 +-
 .../org/apache/cassandra/net/MessageOut.java| 128 +++-
 .../apache/cassandra/net/MessagingService.java  | 577 +++
 .../cassandra/net/OutboundTcpConnection.java| 693 --
 .../net/OutboundTcpConnectionPool.java  | 229 --
 .../net/async/ByteBufDataInputPlus.java |  31 +
 .../net/async/ByteBufDataOutputPlus.java| 140 
 .../cassandra/net/async/ChannelWriter.java  | 418 +++
 .../cassandra/net/async/ExpiredException.java   |  28 +
 .../cassandra/net/async/HandshakeProtocol.java  | 304 
 .../net/async/InboundHandshakeHandler.java  | 293 
 .../cassandra/net/async/MessageInHandler.java   | 314 
 .../cassandra/net/async/MessageOutHandler.java  | 324 +
 .../cassandra/net/async/MessageResult.java  |  51 ++
 .../cassandra/net/async/NettyFactory.java   | 375 ++
 .../net/async/OutboundConnectionIdentifier.java | 161 +
 .../net/async/OutboundConnectionParams.java | 202 ++
 .../net/async/OutboundHandshakeHandler.java | 255 +++
 .../net/async/OutboundMessagingConnection.java  | 716 +++
 .../net/async/OutboundMessagingPool.java| 173 +
 .../cassandra/net/async/QueuedMessage.java  |  75 ++
 .../apache/cassandra/security/SSLFactory.java   | 222 +++---
 .../streaming/DefaultConnectionFactory.java |  31 +-
 .../org/apache/cassandra/tracing/Tracing.java   |   3 +-
 .../apache/cassandra/tracing/TracingImpl.java   |   3 +-
 .../org/apache/cassandra/transport/Message.java |   4 +-
 .../org/apache/cassandra/transport/Server.java  |  25 +-
 .../cassandra/transport/SimpleClient.java   |  18 +-
 .../cassandra/utils/CoalescingStrategies.java   | 406 ---
 .../org/apache/cassandra/utils/FBUtilities.java |   7 +
 .../apache/cassandra/utils/NativeLibrary.java   |   2 +-
 test/conf/cassandra_ssl_test.keystore   | Bin 0 -> 2281 bytes
 test/conf/cassandra_ssl_test.truststore | Bin 0 -> 992 bytes
 .../apache/cassandra/db/ReadCommandTest.java|  33 +
 .../apache/cassandra/locator/EC2SnitchTest.java |  20 -
 .../cassandra/net/MessagingServiceTest.java | 120 +++-
 .../net/OutboundTcpConnectionTest.java  | 175 -
 .../net/async/ByteBufDataOutputPlusTest.java| 178 +
 .../cassandra/net/async/ChannelWriterTest.java  | 312 
 .../net/async/HandshakeHandlersTest.java| 204 ++
 .../net/async/HandshakeProtocolTest.java|  95 +++
 .../net/async/InboundHandshakeHandlerTest.java  | 289 
 .../net/async/MessageInHandlerTest.java | 242 +++
 .../net/async/MessageOutHandlerTest.java| 289 
 .../cassandra/net/async/NettyFactoryTest.java   | 300 
 .../NonSendingOutboundMessagingConnection.java  |  42 ++
 .../net/async/OutboundConnectionParamsTest.java |  36 +
 .../net/async/OutboundHandshakeHandlerTest.java | 209 ++
 .../async/OutboundMessagingConnectionTest.java  | 519 ++
 .../net/async/OutboundMessagingPoolTest.java| 149 
 .../cassandra/net/async/TestAuthenticator.java  |  42 ++
 .../RepairMessageSerializationsTest.java|   2 +
 .../cassandra/security/SSLFactoryTest.java  | 136 +++-
 .../streaming/StreamingTransferTest.java|  29 +-
 .../utils/CoalescingStrategiesTest.java | 453 ++--
 70 files changed, 8024

[07/11] cassandra git commit: switch internode messaging to netty

2017-08-22 Thread jasobrown
http://git-wip-us.apache.org/repos/asf/cassandra/blob/356dc3c2/src/java/org/apache/cassandra/security/SSLFactory.java
--
diff --git a/src/java/org/apache/cassandra/security/SSLFactory.java b/src/java/org/apache/cassandra/security/SSLFactory.java
index 33c1ad6..3c1293f 100644
--- a/src/java/org/apache/cassandra/security/SSLFactory.java
+++ b/src/java/org/apache/cassandra/security/SSLFactory.java
@@ -17,63 +17,67 @@
  */
 package org.apache.cassandra.security;
 
-import java.nio.file.Files;
-import java.nio.file.Paths;
-import java.io.InputStream;
+
 import java.io.IOException;
+import java.io.InputStream;
 import java.net.InetAddress;
-import java.net.InetSocketAddress;
+import java.nio.file.Files;
+import java.nio.file.Paths;
 import java.security.KeyStore;
 import java.security.cert.X509Certificate;
 import java.util.Arrays;
 import java.util.Date;
 import java.util.Enumeration;
 import java.util.List;
-
+import java.util.concurrent.atomic.AtomicReference;
 import javax.net.ssl.KeyManagerFactory;
 import javax.net.ssl.SSLContext;
 import javax.net.ssl.SSLParameters;
-import javax.net.ssl.SSLServerSocket;
 import javax.net.ssl.SSLSocket;
 import javax.net.ssl.TrustManager;
 import javax.net.ssl.TrustManagerFactory;
 
-import org.apache.cassandra.config.EncryptionOptions;
-import org.apache.cassandra.io.util.FileUtils;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
+import com.google.common.annotations.VisibleForTesting;
 import com.google.common.base.Predicates;
 import com.google.common.collect.ImmutableSet;
 import com.google.common.collect.Iterables;
 import com.google.common.collect.Sets;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import io.netty.handler.ssl.ClientAuth;
+import io.netty.handler.ssl.OpenSsl;
+import io.netty.handler.ssl.SslContext;
+import io.netty.handler.ssl.SslContextBuilder;
+import io.netty.handler.ssl.SslProvider;
+import io.netty.handler.ssl.SupportedCipherSuiteFilter;
+import io.netty.util.ReferenceCountUtil;
+import org.apache.cassandra.config.EncryptionOptions;
 
 /**
- * A Factory for providing and setting up Client and Server SSL wrapped
- * Socket and ServerSocket
+ * A Factory for providing and setting up client {@link SSLSocket}s. Also provides
+ * methods for creating both JSSE {@link SSLContext} instances as well as netty {@link SslContext} instances.
+ *
+ * Netty {@link SslContext} instances are expensive to create (as well as to destroy) and consume a lot of resources
+ * (especially direct memory), but instances can be reused across connections (assuming the SSL params are the same).
+ * Hence we cache created instances in {@link #clientSslContext} and {@link #serverSslContext}.
  */
 public final class SSLFactory
 {
 private static final Logger logger = LoggerFactory.getLogger(SSLFactory.class);
-private static boolean checkedExpiry = false;
 
-public static SSLServerSocket getServerSocket(EncryptionOptions options, InetAddress address, int port) throws IOException
-{
-SSLContext ctx = createSSLContext(options, true);
-SSLServerSocket serverSocket = (SSLServerSocket)ctx.getServerSocketFactory().createServerSocket();
-try
-{
-serverSocket.setReuseAddress(true);
-prepareSocket(serverSocket, options);
-serverSocket.bind(new InetSocketAddress(address, port), 500);
-return serverSocket;
-}
-catch (IllegalArgumentException | SecurityException | IOException e)
-{
-serverSocket.close();
-throw e;
-}
-}
+@VisibleForTesting
+static volatile boolean checkedExpiry = false;
+
+/**
+ * A cached reference of the {@link SslContext} for client-facing connections.
+ */
+private static final AtomicReference<SslContext> clientSslContext = new AtomicReference<>();
+
+/**
+ * A cached reference of the {@link SslContext} for peer-to-peer, internode messaging connections.
+ */
+private static final AtomicReference<SslContext> serverSslContext = new AtomicReference<>();
 
 /** Create a socket and connect */
 public static SSLSocket getSocket(EncryptionOptions options, InetAddress address, int port, InetAddress localAddress, int localPort) throws IOException
@@ -109,37 +113,6 @@ public final class SSLFactory
 }
 }
 
-/** Just create a socket */
-public static SSLSocket getSocket(EncryptionOptions options) throws IOException
-{
-SSLContext ctx = createSSLContext(options, true);
-SSLSocket socket = (SSLSocket) ctx.getSocketFactory().createSocket();
-try
-{
-prepareSocket(socket, options);
-return socket;
-}
-catch (IllegalArgumentException e)
-{
-socket.close();
-throw e;
-}
-}
-
-/** Sets relevant socket options specified in encryption settings */
-private
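The new SSLFactory Javadoc above describes caching expensive `SslContext` instances and reusing them across connections when the SSL params match. The policy can be sketched as follows (Python, purely illustrative; the helper and cache names are hypothetical stand-ins, not the Java API):

```python
# Build one context per distinct set of SSL params and reuse it across
# connections; a stand-in for the clientSslContext/serverSslContext caching.
_context_cache = {}

def get_cached_context(options):
    key = tuple(sorted(options.items()))   # same SSL params -> same cached instance
    ctx = _context_cache.get(key)
    if ctx is None:
        ctx = object()                     # stand-in for the expensive SslContext build
        _context_cache[key] = ctx
    return ctx
```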

[09/11] cassandra git commit: switch internode messaging to netty

2017-08-22 Thread jasobrown
http://git-wip-us.apache.org/repos/asf/cassandra/blob/356dc3c2/src/java/org/apache/cassandra/net/async/ChannelWriter.java
--
diff --git a/src/java/org/apache/cassandra/net/async/ChannelWriter.java b/src/java/org/apache/cassandra/net/async/ChannelWriter.java
new file mode 100644
index 000..e984736
--- /dev/null
+++ b/src/java/org/apache/cassandra/net/async/ChannelWriter.java
@@ -0,0 +1,418 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.net.async;
+
+import java.io.IOException;
+import java.util.Optional;
+import java.util.Queue;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicLong;
+import java.util.function.Consumer;
+
+import com.google.common.annotations.VisibleForTesting;
+
+import io.netty.buffer.Unpooled;
+import io.netty.channel.Channel;
+import io.netty.channel.ChannelFuture;
+import io.netty.channel.ChannelFutureListener;
+import io.netty.channel.ChannelHandlerContext;
+import io.netty.channel.ChannelOutboundBuffer;
+import io.netty.channel.ChannelPromise;
+import io.netty.channel.MessageSizeEstimator;
+import io.netty.handler.timeout.IdleStateEvent;
+import io.netty.util.Attribute;
+import io.netty.util.AttributeKey;
+import io.netty.util.concurrent.Future;
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.utils.CoalescingStrategies;
+import org.apache.cassandra.utils.CoalescingStrategies.CoalescingStrategy;
+
+/**
+ * Represents a ready and post-handshake channel that can send outbound messages. This class groups a netty channel
+ * with any other channel-related information we track and, most importantly, handles the details on when the channel is flushed.
+ *
+ * Flushing
+ *
+ * We don't flush to the socket on every message as it's a bit of a performance drag (making the system call, copying
+ * the buffer, sending out a small packet). Thus, by waiting until we have a decent chunk of data (for some definition
+ * of 'decent'), we can achieve better efficiency and improved performance (yay!).
+ *
+ * When to flush mainly depends on whether we use message coalescing or not (see {@link CoalescingStrategies}).
+ *
+ * Note that the callback functions are invoked on the netty event loop, which is (in almost all cases) different
+ * from the thread that will be invoking {@link #write(QueuedMessage, boolean)}.
+ *
+ * Flushing without coalescing
+ *
+ * When no coalescing is in effect, we want to send a new message "right away". However, as said above, flushing after
+ * every message would be particularly inefficient when there are lots of messages in our sending queue, and so in
+ * practice we want to flush in 2 cases:
+ *  1) After any message if there is no pending message in the send queue.
+ *  2) When we've filled up or exceeded the netty outbound buffer (see {@link ChannelOutboundBuffer})
+ *
+ * The second part is relatively simple and handled generically in {@link MessageOutHandler#write(ChannelHandlerContext, Object, ChannelPromise)} [1].
+ * The first part however is made a little more complicated by how netty's event loop executes. It is woken up by
+ * external callers to the channel invoking a flush, via either {@link Channel#flush} or one of the {@link Channel#writeAndFlush}
+ * methods [2]. So a plain {@link Channel#write} will only queue the message in the channel, and not wake up the event loop.
+ *
+ * This means we don't want to simply call {@link Channel#write} as we want the message processed immediately. But we
+ * also don't want to flush on every message if there is more in the sending queue, so simply calling
+ * {@link Channel#writeAndFlush} isn't completely appropriate either. In practice, we handle this by calling
+ * {@link Channel#writeAndFlush} (so the netty event loop does wake up), but we override the flush behavior so
+ * it actually only flushes if there are no pending messages (see how {@link MessageOutHandler#flush} delegates the flushing
+ * decision back to this class through {@link #onTriggeredFlush}, and how {@link SimpleChannelWriter} makes this a no-
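The flush rule described in the Javadoc above — every write wakes the writer, but a real flush only happens once the send queue drains — can be modeled in a few lines (Python, a toy sketch with hypothetical names, not the actual netty code):

```python
import collections

# Toy model of ChannelWriter's non-coalescing flush rule: writes accumulate,
# and a real flush happens only when no more messages are pending.
class FlushSketch:
    def __init__(self):
        self.pending = collections.deque()  # messages written but not yet flushed
        self.flushes = 0                    # count of actual "socket" flushes

    def write(self, msg, more_pending):
        self.pending.append(msg)
        if not more_pending:                # rule 1: flush when the queue drains
            self.flush()

    def flush(self):
        self.pending.clear()
        self.flushes += 1
```

Three back-to-back writes where only the last reports an empty send queue thus cost a single flush instead of three.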

[06/11] cassandra git commit: switch internode messaging to netty

2017-08-22 Thread jasobrown
http://git-wip-us.apache.org/repos/asf/cassandra/blob/356dc3c2/test/unit/org/apache/cassandra/net/async/ChannelWriterTest.java
--
diff --git a/test/unit/org/apache/cassandra/net/async/ChannelWriterTest.java b/test/unit/org/apache/cassandra/net/async/ChannelWriterTest.java
new file mode 100644
index 000..128fe4b
--- /dev/null
+++ b/test/unit/org/apache/cassandra/net/async/ChannelWriterTest.java
@@ -0,0 +1,312 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.net.async;
+
+import java.io.IOException;
+import java.net.InetSocketAddress;
+import java.util.Optional;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.LinkedBlockingQueue;
+
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import io.netty.buffer.ByteBuf;
+import io.netty.channel.ChannelHandlerContext;
+import io.netty.channel.ChannelOption;
+import io.netty.channel.ChannelOutboundHandlerAdapter;
+import io.netty.channel.ChannelPromise;
+import io.netty.channel.WriteBufferWaterMark;
+import io.netty.channel.embedded.EmbeddedChannel;
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.net.MessageOut;
+import org.apache.cassandra.net.MessagingService;
+import org.apache.cassandra.net.async.ChannelWriter.CoalescingChannelWriter;
+import org.apache.cassandra.utils.CoalescingStrategies;
+import org.apache.cassandra.utils.CoalescingStrategies.CoalescingStrategy;
+
+import static org.apache.cassandra.net.MessagingService.Verb.ECHO;
+
+/**
+ * With the write_Coalescing_* methods, if there's data in the channel.unsafe().outboundBuffer()
+ * it means that there's something in the channel that hasn't yet been flushed to the transport (socket).
+ * Once a flush occurs, there will be an entry in EmbeddedChannel's outboundQueue. Those two facts are leveraged in these tests.
+ */
+public class ChannelWriterTest
+{
+private static final int COALESCE_WINDOW_MS = 10;
+
+private EmbeddedChannel channel;
+private ChannelWriter channelWriter;
+private NonSendingOutboundMessagingConnection omc;
+private Optional<CoalescingStrategy> coalescingStrategy;
+
+@BeforeClass
+public static void before()
+{
+DatabaseDescriptor.daemonInitialization();
+}
+
+@Before
+public void setup()
+{
+OutboundConnectionIdentifier id = OutboundConnectionIdentifier.small(new InetSocketAddress("127.0.0.1", 0),
+                                                                     new InetSocketAddress("127.0.0.2", 0));
+channel = new EmbeddedChannel();
+omc = new NonSendingOutboundMessagingConnection(id, null, Optional.empty());
+channelWriter = ChannelWriter.create(channel, omc::handleMessageResult, Optional.empty());
+channel.pipeline().addFirst(new MessageOutHandler(id, MessagingService.current_version, channelWriter, () -> null));
+coalescingStrategy = CoalescingStrategies.newCoalescingStrategy(CoalescingStrategies.Strategy.FIXED.name(), COALESCE_WINDOW_MS, null, "test");
+}
+
+@Test
+public void create_nonCoalescing()
+{
+Assert.assertSame(ChannelWriter.SimpleChannelWriter.class, ChannelWriter.create(channel, omc::handleMessageResult, Optional.empty()).getClass());
+}
+
+@Test
+public void create_Coalescing()
+{
+Assert.assertSame(CoalescingChannelWriter.class, ChannelWriter.create(channel, omc::handleMessageResult, coalescingStrategy).getClass());
+}
+
+@Test
+public void write_IsWritable()
+{
+Assert.assertTrue(channel.isWritable());
+Assert.assertTrue(channelWriter.write(new QueuedMessage(new MessageOut<>(ECHO), 42), true));
+Assert.assertTrue(channel.isWritable());
+Assert.assertTrue(channel.releaseOutbound());
+}
+
+@Test
+public void write_NotWritable()
+{
+channel.config().setOption(ChannelOption.WRITE_BUFFER_WATER_MARK, new WriteBufferWaterMark(1, 2));
+
+// send one message through, which will trigger the writability check (and turn it off)
+Assert.assertTrue(chan

[03/11] cassandra git commit: move streaming to use netty

2017-08-22 Thread jasobrown
http://git-wip-us.apache.org/repos/asf/cassandra/blob/fc92db2b/src/java/org/apache/cassandra/streaming/StreamReader.java
--
diff --git a/src/java/org/apache/cassandra/streaming/StreamReader.java b/src/java/org/apache/cassandra/streaming/StreamReader.java
index 3a95015..590ba5f 100644
--- a/src/java/org/apache/cassandra/streaming/StreamReader.java
+++ b/src/java/org/apache/cassandra/streaming/StreamReader.java
@@ -18,8 +18,6 @@
 package org.apache.cassandra.streaming;
 
 import java.io.*;
-import java.nio.channels.Channels;
-import java.nio.channels.ReadableByteChannel;
 import java.util.Collection;
 import java.util.UUID;
 
@@ -30,8 +28,7 @@ import com.google.common.collect.UnmodifiableIterator;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-import com.ning.compress.lzf.LZFInputStream;
-
+import org.apache.cassandra.io.util.TrackedDataInputPlus;
 import org.apache.cassandra.schema.TableId;
 import org.apache.cassandra.schema.TableMetadata;
 import org.apache.cassandra.db.*;
@@ -42,9 +39,10 @@ import 
org.apache.cassandra.io.sstable.format.RangeAwareSSTableWriter;
 import org.apache.cassandra.io.sstable.format.SSTableFormat;
 import org.apache.cassandra.io.sstable.format.Version;
 import org.apache.cassandra.io.util.DataInputPlus;
+import org.apache.cassandra.streaming.compress.StreamCompressionInputStream;
 import org.apache.cassandra.streaming.messages.FileMessageHeader;
+import org.apache.cassandra.streaming.messages.StreamMessage;
 import org.apache.cassandra.utils.ByteBufferUtil;
-import org.apache.cassandra.io.util.TrackedInputStream;
 import org.apache.cassandra.utils.FBUtilities;
 import org.apache.cassandra.utils.Pair;
 
@@ -88,12 +86,12 @@ public class StreamReader
 }
 
 /**
- * @param channel where this reads data from
+ * @param inputPlus where this reads data from
  * @return SSTable transferred
 * @throws IOException if reading the remote sstable fails. Will throw an RTE if local write fails.
  */
-@SuppressWarnings("resource") // channel needs to remain open, streams on top of it can't be closed
-public SSTableMultiWriter read(ReadableByteChannel channel) throws IOException
+@SuppressWarnings("resource") // input needs to remain open, streams on top of it can't be closed
+public SSTableMultiWriter read(DataInputPlus inputPlus) throws IOException
 {
 long totalSize = totalSize();
 
@@ -108,7 +106,8 @@ public class StreamReader
              session.planId(), fileSeqNum, session.peer, repairedAt, totalSize, cfs.keyspace.getName(),
              cfs.getTableName(), pendingRepair);
 
-TrackedInputStream in = new TrackedInputStream(new LZFInputStream(Channels.newInputStream(channel)));
+
+TrackedDataInputPlus in = new TrackedDataInputPlus(new StreamCompressionInputStream(inputPlus, StreamMessage.CURRENT_VERSION));
 StreamDeserializer deserializer = new StreamDeserializer(cfs.metadata(), in, inputVersion, getHeader(cfs.metadata()));
 SSTableMultiWriter writer = null;
 try
@@ -179,10 +178,10 @@ public class StreamReader
 private Row staticRow;
 private IOException exception;
 
-public StreamDeserializer(TableMetadata metadata, InputStream in, Version version, SerializationHeader header) throws IOException
+public StreamDeserializer(TableMetadata metadata, DataInputPlus in, Version version, SerializationHeader header) throws IOException
 {
 this.metadata = metadata;
-this.in = new DataInputPlus.DataInputStreamPlus(in);
+this.in = in;
 this.helper = new SerializationHelper(metadata, version.correspondingMessagingVersion(), SerializationHelper.Flag.PRESERVE_SIZE);
 this.header = header;
 }
@@ -256,8 +255,8 @@ public class StreamReader
 // to what we do in hasNext)
 Unfiltered unfiltered = iterator.next();
 return metadata.isCounter() && unfiltered.kind() == Unfiltered.Kind.ROW
- ? maybeMarkLocalToBeCleared((Row) unfiltered)
- : unfiltered;
+   ? maybeMarkLocalToBeCleared((Row) unfiltered)
+   : unfiltered;
 }
 
 private Row maybeMarkLocalToBeCleared(Row row)
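The hunk above replaces the LZF-over-channel input with a `TrackedDataInputPlus` wrapped around a `StreamCompressionInputStream`, so transfer progress is tracked on the decompressed byte stream. The layering can be sketched like this (Python, illustrative; `TrackedInput` is a hypothetical analogue, and `zlib` stands in for the actual stream compression):

```python
import io
import zlib

# A byte-counting wrapper (analogue of TrackedDataInputPlus) layered on top of
# a decompressing input (analogue of StreamCompressionInputStream).
class TrackedInput:
    def __init__(self, inner):
        self.inner = inner
        self.bytes_read = 0

    def read(self, n):
        data = self.inner.read(n)
        self.bytes_read += len(data)   # progress is measured in decompressed bytes
        return data

compressed = zlib.compress(b"sstable bytes")
tracked = TrackedInput(io.BytesIO(zlib.decompress(compressed)))
```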

http://git-wip-us.apache.org/repos/asf/cassandra/blob/fc92db2b/src/java/org/apache/cassandra/streaming/StreamReceiveException.java
--
diff --git a/src/java/org/apache/cassandra/streaming/StreamReceiveException.java b/src/java/org/apache/cassandra/streaming/StreamReceiveException.java
new file mode 100644
index 000..54b365a
--- /dev/null
+++ b/src/java/org/apache/cassandra/streaming/StreamReceiveException.java
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTIC

[02/11] cassandra git commit: move streaming to use netty

2017-08-22 Thread jasobrown
http://git-wip-us.apache.org/repos/asf/cassandra/blob/fc92db2b/src/java/org/apache/cassandra/streaming/async/StreamingInboundHandler.java
--
diff --git a/src/java/org/apache/cassandra/streaming/async/StreamingInboundHandler.java b/src/java/org/apache/cassandra/streaming/async/StreamingInboundHandler.java
new file mode 100644
index 000..cc6f9e0
--- /dev/null
+++ b/src/java/org/apache/cassandra/streaming/async/StreamingInboundHandler.java
@@ -0,0 +1,268 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.streaming.async;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.net.InetAddress;
+import java.net.InetSocketAddress;
+import java.util.UUID;
+import java.util.concurrent.TimeUnit;
+import java.util.function.Function;
+import javax.annotation.Nullable;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.util.concurrent.Uninterruptibles;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import io.netty.buffer.ByteBuf;
+import io.netty.channel.Channel;
+import io.netty.channel.ChannelHandlerContext;
+import io.netty.channel.ChannelInboundHandlerAdapter;
+import io.netty.util.ReferenceCountUtil;
+import io.netty.util.concurrent.FastThreadLocalThread;
+import org.apache.cassandra.net.async.RebufferingByteBufDataInputPlus;
+import org.apache.cassandra.streaming.StreamManager;
+import org.apache.cassandra.streaming.StreamReceiveException;
+import org.apache.cassandra.streaming.StreamResultFuture;
+import org.apache.cassandra.streaming.StreamSession;
+import org.apache.cassandra.streaming.messages.FileMessageHeader;
+import org.apache.cassandra.streaming.messages.IncomingFileMessage;
+import org.apache.cassandra.streaming.messages.KeepAliveMessage;
+import org.apache.cassandra.streaming.messages.StreamInitMessage;
+import org.apache.cassandra.streaming.messages.StreamMessage;
+import org.apache.cassandra.utils.JVMStabilityInspector;
+
+import static org.apache.cassandra.streaming.async.NettyStreamingMessageSender.createLogTag;
+
+/**
+ * Handles the inbound side of streaming messages and sstable data. From the incoming data, we deserialize the message
+ * and potentially reify partitions and rows and write those out to new sstable files. Because deserialization is a blocking affair,
+ * we can't block the netty event loop. Thus we have a background thread perform all the blocking deserialization.
+ */
+public class StreamingInboundHandler extends ChannelInboundHandlerAdapter
+{
+private static final Logger logger = LoggerFactory.getLogger(StreamingInboundHandler.class);
+static final Function DEFAULT_SESSION_PROVIDER = sid -> StreamManager.instance.findSession(sid.from, sid.planId, sid.sessionIndex);
+
+private static final int AUTO_READ_LOW_WATER_MARK = 1 << 15;
+private static final int AUTO_READ_HIGH_WATER_MARK = 1 << 16;
+
+private final InetSocketAddress remoteAddress;
+private final int protocolVersion;
+
+private final StreamSession session;
+
+/**
+ * A collection of {@link ByteBuf}s that are yet to be processed. Incoming buffers are first dropped into this
+ * structure, and then consumed.
+ * 
+ * For thread safety, this structure's resources are released on the consuming thread
+ * (via {@link RebufferingByteBufDataInputPlus#close()}),
+ * but the producing side calls {@link RebufferingByteBufDataInputPlus#markClose()} to notify the input that it should close.
+ */
+private RebufferingByteBufDataInputPlus buffers;
+
+private volatile boolean closed;
+
+public StreamingInboundHandler(InetSocketAddress remoteAddress, int protocolVersion, @Nullable StreamSession session)
+{
+this.remoteAddress = remoteAddress;
+this.protocolVersion = protocolVersion;
+this.session = session;
+}
+
+@Override
+@SuppressWarnings("resource")
+public void handlerAdded(ChannelHandlerContext ctx)
+{
+buffers = new RebufferingByteBufDataInputPlus(AUTO_READ_LOW_WATER_MARK, AUTO_READ_HIGH_WATER_MARK, ctx.channel().config());
+Thread blockingIOThread = ne

[08/11] cassandra git commit: switch internode messaging to netty

2017-08-22 Thread jasobrown
http://git-wip-us.apache.org/repos/asf/cassandra/blob/356dc3c2/src/java/org/apache/cassandra/net/async/NettyFactory.java
--
diff --git a/src/java/org/apache/cassandra/net/async/NettyFactory.java b/src/java/org/apache/cassandra/net/async/NettyFactory.java
new file mode 100644
index 000..13d8810
--- /dev/null
+++ b/src/java/org/apache/cassandra/net/async/NettyFactory.java
@@ -0,0 +1,375 @@
+package org.apache.cassandra.net.async;
+
+import java.net.InetSocketAddress;
+import java.util.zip.Checksum;
+
+import javax.net.ssl.SSLEngine;
+import javax.net.ssl.SSLParameters;
+
+import com.google.common.annotations.VisibleForTesting;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import io.netty.bootstrap.Bootstrap;
+import io.netty.bootstrap.ServerBootstrap;
+import io.netty.channel.Channel;
+import io.netty.channel.ChannelFuture;
+import io.netty.channel.ChannelInitializer;
+import io.netty.channel.ChannelOption;
+import io.netty.channel.ChannelPipeline;
+import io.netty.channel.EventLoopGroup;
+import io.netty.channel.ServerChannel;
+import io.netty.channel.epoll.Epoll;
+import io.netty.channel.epoll.EpollEventLoopGroup;
+import io.netty.channel.epoll.EpollServerSocketChannel;
+import io.netty.channel.epoll.EpollSocketChannel;
+import io.netty.channel.group.ChannelGroup;
+import io.netty.channel.nio.NioEventLoopGroup;
+import io.netty.channel.socket.SocketChannel;
+import io.netty.channel.socket.nio.NioServerSocketChannel;
+import io.netty.channel.socket.nio.NioSocketChannel;
+import io.netty.handler.codec.compression.Lz4FrameDecoder;
+import io.netty.handler.codec.compression.Lz4FrameEncoder;
+import io.netty.handler.logging.LogLevel;
+import io.netty.handler.logging.LoggingHandler;
+import io.netty.handler.ssl.OpenSsl;
+import io.netty.handler.ssl.SslContext;
+import io.netty.handler.ssl.SslHandler;
+import io.netty.util.concurrent.DefaultEventExecutor;
+import io.netty.util.concurrent.DefaultThreadFactory;
+import io.netty.util.concurrent.EventExecutor;
+import io.netty.util.internal.logging.InternalLoggerFactory;
+import io.netty.util.internal.logging.Slf4JLoggerFactory;
+import net.jpountz.lz4.LZ4Factory;
+import net.jpountz.xxhash.XXHashFactory;
+import org.apache.cassandra.auth.IInternodeAuthenticator;
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.config.EncryptionOptions.ServerEncryptionOptions;
+import org.apache.cassandra.config.EncryptionOptions.ServerEncryptionOptions.InternodeEncryption;
+import org.apache.cassandra.exceptions.ConfigurationException;
+import org.apache.cassandra.net.MessagingService;
+import org.apache.cassandra.security.SSLFactory;
+import org.apache.cassandra.service.NativeTransportService;
+import org.apache.cassandra.utils.ChecksumType;
+import org.apache.cassandra.utils.CoalescingStrategies;
+import org.apache.cassandra.utils.FBUtilities;
+
+/**
+ * A factory for building Netty {@link Channel}s. Channels here are set up with a pipeline to participate
+ * in the internode protocol handshake, on either the inbound or outbound side as per the method invoked.
+ */
+public final class NettyFactory
+{
+private static final Logger logger = LoggerFactory.getLogger(NettyFactory.class);
+
+/**
+ * The block size for use with netty's lz4 code.
+ */
+private static final int COMPRESSION_BLOCK_SIZE = 1 << 16;
+
+private static final int LZ4_HASH_SEED = 0x9747b28c;
+
+public enum Mode { MESSAGING, STREAMING }
+
+private static final String SSL_CHANNEL_HANDLER_NAME = "ssl";
+static final String INBOUND_COMPRESSOR_HANDLER_NAME = "inboundCompressor";
+static final String OUTBOUND_COMPRESSOR_HANDLER_NAME = "outboundCompressor";
+private static final String HANDSHAKE_HANDLER_NAME = "handshakeHandler";
+
+/** a useful addition for debugging; simply set to true to get more data in your logs */
+private static final boolean WIRETRACE = false;
+static
+{
+if (WIRETRACE)
+InternalLoggerFactory.setDefaultFactory(Slf4JLoggerFactory.INSTANCE);
+}
+
+private static final boolean DEFAULT_USE_EPOLL = NativeTransportService.useEpoll();
+static
+{
+if (!DEFAULT_USE_EPOLL)
+logger.warn("epoll not available {}", Epoll.unavailabilityCause());
+}
+
+/**
+ * The size of the receive queue for the outbound channels. As outbound channels do not receive data
+ * (outside of the internode messaging protocol's handshake), this value can be relatively small.
+ */
+private static final int OUTBOUND_CHANNEL_RECEIVE_BUFFER_SIZE = 1 << 10;
+
+/**
+ * The size of the send queue for the inbound channels. As inbound channels do not send data
+ * (outside of the internode messaging protocol's handshake), this value can be relatively small.
+ */
+private static final int INBOUND_CHANNEL_SEND_BUFFER_SIZE = 1 << 10;
+
+/**
+ * A factor

[10/11] cassandra git commit: switch internode messaging to netty

2017-08-22 Thread jasobrown
http://git-wip-us.apache.org/repos/asf/cassandra/blob/356dc3c2/src/java/org/apache/cassandra/net/MessagingService.java
--
diff --git a/src/java/org/apache/cassandra/net/MessagingService.java b/src/java/org/apache/cassandra/net/MessagingService.java
index 41771e7..6caada1 100644
--- a/src/java/org/apache/cassandra/net/MessagingService.java
+++ b/src/java/org/apache/cassandra/net/MessagingService.java
@@ -17,45 +17,63 @@
  */
 package org.apache.cassandra.net;
 
-import java.io.*;
+import java.io.IOError;
+import java.io.IOException;
 import java.lang.management.ManagementFactory;
-import java.net.*;
-import java.nio.channels.AsynchronousCloseException;
-import java.nio.channels.ClosedChannelException;
-import java.nio.channels.ServerSocketChannel;
-import java.util.*;
+import java.net.InetAddress;
+import java.net.InetSocketAddress;
+import java.net.UnknownHostException;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.EnumMap;
+import java.util.EnumSet;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
 import java.util.concurrent.ConcurrentMap;
 import java.util.concurrent.CopyOnWriteArraySet;
 import java.util.concurrent.TimeUnit;
 import java.util.concurrent.atomic.AtomicInteger;
-import java.util.stream.Collectors;
-import java.util.stream.StreamSupport;
-
 import javax.management.MBeanServer;
 import javax.management.ObjectName;
-import javax.net.ssl.SSLHandshakeException;
 
 import com.google.common.annotations.VisibleForTesting;
 import com.google.common.base.Function;
 import com.google.common.collect.Lists;
-import com.google.common.collect.Sets;
-
 import org.cliffc.high_scale_lib.NonBlockingHashMap;
-
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
 import com.carrotsearch.hppc.IntObjectMap;
 import com.carrotsearch.hppc.IntObjectOpenHashMap;
+import io.netty.channel.Channel;
+import io.netty.channel.group.ChannelGroup;
+import io.netty.channel.group.DefaultChannelGroup;
+import org.apache.cassandra.auth.IInternodeAuthenticator;
+import org.apache.cassandra.batchlog.Batch;
 import org.apache.cassandra.concurrent.ExecutorLocals;
+import org.apache.cassandra.concurrent.LocalAwareExecutorService;
 import org.apache.cassandra.concurrent.ScheduledExecutors;
 import org.apache.cassandra.concurrent.Stage;
 import org.apache.cassandra.concurrent.StageManager;
-import org.apache.cassandra.concurrent.LocalAwareExecutorService;
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.config.EncryptionOptions.ServerEncryptionOptions;
-import org.apache.cassandra.db.*;
-import org.apache.cassandra.batchlog.Batch;
+import org.apache.cassandra.db.ColumnFamilyStore;
+import org.apache.cassandra.db.ConsistencyLevel;
+import org.apache.cassandra.db.CounterMutation;
+import org.apache.cassandra.db.IMutation;
+import org.apache.cassandra.db.Keyspace;
+import org.apache.cassandra.db.Mutation;
+import org.apache.cassandra.db.ReadCommand;
+import org.apache.cassandra.db.ReadResponse;
+import org.apache.cassandra.db.SnapshotCommand;
+import org.apache.cassandra.db.SystemKeyspace;
+import org.apache.cassandra.db.TruncateResponse;
+import org.apache.cassandra.db.Truncation;
+import org.apache.cassandra.db.WriteResponse;
 import org.apache.cassandra.dht.AbstractBounds;
 import org.apache.cassandra.dht.BootStrapper;
 import org.apache.cassandra.dht.IPartitioner;
@@ -70,22 +88,32 @@ import org.apache.cassandra.hints.HintResponse;
 import org.apache.cassandra.io.IVersionedSerializer;
 import org.apache.cassandra.io.util.DataInputPlus;
 import org.apache.cassandra.io.util.DataOutputPlus;
-import org.apache.cassandra.io.util.FileUtils;
+import org.apache.cassandra.locator.IEndpointSnitch;
 import org.apache.cassandra.locator.ILatencySubscriber;
 import org.apache.cassandra.metrics.CassandraMetricsRegistry;
 import org.apache.cassandra.metrics.ConnectionMetrics;
 import org.apache.cassandra.metrics.DroppedMessageMetrics;
 import org.apache.cassandra.metrics.MessagingMetrics;
+import org.apache.cassandra.net.async.OutboundMessagingPool;
+import org.apache.cassandra.net.async.NettyFactory;
+import org.apache.cassandra.net.async.NettyFactory.InboundInitializer;
 import org.apache.cassandra.repair.messages.RepairMessage;
 import org.apache.cassandra.schema.MigrationManager;
 import org.apache.cassandra.schema.TableId;
-import org.apache.cassandra.security.SSLFactory;
-import org.apache.cassandra.service.*;
+import org.apache.cassandra.service.AbstractWriteResponseHandler;
+import org.apache.cassandra.service.StorageProxy;
+import org.apache.cassandra.service.StorageService;
 import org.apache.cassandra.service.paxos.Commit;
 import org.apache.cassandra.service.paxos.PrepareResponse;
 import org.apache.cassandra.tracing.TraceState;
 import org.apache.cassandra.tracing.Tracing;
-import org.apache.c

[04/11] cassandra git commit: move streaming to use netty

2017-08-22 Thread jasobrown
move streaming to use netty

patch by jasobrown, reviewed by aweisberg for CASSANDRA-12229


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fc92db2b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fc92db2b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fc92db2b

Branch: refs/heads/trunk
Commit: fc92db2b9b56c143516026ba29cecdec37e286bb
Parents: 356dc3c
Author: Jason Brown 
Authored: Mon Apr 11 05:26:18 2016 -0700
Committer: Jason Brown 
Committed: Tue Aug 22 13:54:44 2017 -0700

--
 CHANGES.txt |   1 +
 lib/compress-lzf-0.8.4.jar  | Bin 25490 -> 0 bytes
 .../org/apache/cassandra/config/Config.java |   6 -
 .../cassandra/config/DatabaseDescriptor.java|  15 -
 .../exceptions/ChecksumMismatchException.java   |  34 ++
 .../io/compress/CompressionMetadata.java|   4 +-
 .../cassandra/io/sstable/SSTableLoader.java |   2 +-
 .../io/util/DataIntegrityMetadata.java  |  26 +
 .../net/IncomingStreamingConnection.java| 104 
 .../net/async/ByteBufDataInputPlus.java |  12 +
 .../net/async/ByteBufDataOutputStreamPlus.java  | 191 +++
 .../net/async/InboundHandshakeHandler.java  |  36 +-
 .../cassandra/net/async/NettyFactory.java   |  24 +-
 .../net/async/OutboundConnectionIdentifier.java |  21 +-
 .../net/async/OutboundHandshakeHandler.java |   9 +-
 .../async/RebufferingByteBufDataInputPlus.java  | 250 +
 .../apache/cassandra/security/SSLFactory.java   |  49 --
 .../cassandra/service/StorageService.java   |  11 -
 .../cassandra/service/StorageServiceMBean.java  |   3 -
 .../cassandra/streaming/ConnectionHandler.java  | 428 
 .../streaming/DefaultConnectionFactory.java | 122 +++--
 .../streaming/StreamConnectionFactory.java  |  11 +-
 .../cassandra/streaming/StreamCoordinator.java  |  22 +-
 .../cassandra/streaming/StreamManager.java  |  24 +-
 .../apache/cassandra/streaming/StreamPlan.java  |   2 +-
 .../cassandra/streaming/StreamReader.java   |  25 +-
 .../streaming/StreamReceiveException.java   |  36 ++
 .../cassandra/streaming/StreamReceiveTask.java  |   1 +
 .../cassandra/streaming/StreamResultFuture.java |  32 +-
 .../cassandra/streaming/StreamSession.java  | 396 ---
 .../cassandra/streaming/StreamTransferTask.java |  10 +-
 .../cassandra/streaming/StreamWriter.java   | 115 +++--
 .../streaming/StreamingMessageSender.java   |  34 ++
 .../async/NettyStreamingMessageSender.java  | 508 +++
 .../async/StreamCompressionSerializer.java  | 133 +
 .../async/StreamingInboundHandler.java  | 268 ++
 .../cassandra/streaming/async/package-info.java |  71 +++
 .../ByteBufCompressionDataOutputStreamPlus.java |  76 +++
 .../compress/CompressedInputStream.java | 225 
 .../compress/CompressedStreamReader.java|  17 +-
 .../compress/CompressedStreamWriter.java|  25 +-
 .../compress/StreamCompressionInputStream.java  |  78 +++
 .../streaming/messages/CompleteMessage.java |  10 +-
 .../streaming/messages/FileMessageHeader.java   |  38 +-
 .../streaming/messages/IncomingFileMessage.java |  30 +-
 .../streaming/messages/KeepAliveMessage.java|   9 +-
 .../streaming/messages/OutgoingFileMessage.java |  28 +-
 .../streaming/messages/PrepareAckMessage.java   |  57 +++
 .../streaming/messages/PrepareMessage.java  |  93 
 .../messages/PrepareSynAckMessage.java  |  80 +++
 .../streaming/messages/PrepareSynMessage.java   |  98 
 .../streaming/messages/ReceivedMessage.java |  11 +-
 .../streaming/messages/RetryMessage.java|  71 ---
 .../messages/SessionFailedMessage.java  |  10 +-
 .../streaming/messages/StreamInitMessage.java   |  73 +--
 .../streaming/messages/StreamMessage.java   |  58 +--
 .../tools/BulkLoadConnectionFactory.java|  32 +-
 .../org/apache/cassandra/tools/NodeProbe.java   |   7 -
 .../cassandra/tools/nodetool/GetTimeout.java|   2 +-
 .../org/apache/cassandra/utils/UUIDGen.java |   8 +-
 .../cassandra/streaming/LongStreamingTest.java  |  34 +-
 .../cassandra/cql3/PreparedStatementsTest.java  |   2 +-
 .../util/RewindableDataInputStreamPlusTest.java |   2 +-
 .../net/async/HandshakeHandlersTest.java|   4 +-
 .../net/async/InboundHandshakeHandlerTest.java  |   8 +-
 .../async/OutboundMessagingConnectionTest.java  |  45 --
 .../RebufferingByteBufDataInputPlusTest.java| 126 +
 .../net/async/TestScheduledFuture.java  |  66 +++
 .../apache/cassandra/service/RemoveTest.java|   3 +-
 .../streaming/StreamTransferTaskTest.java   |   7 +-
 .../streaming/StreamingTransferTest.java|  28 -
 .../async/NettyStreamingMessageSenderTest.java  | 202 
 .../async/StreamCompressionSerializerTest.java  | 135 ++

[05/11] cassandra git commit: switch internode messaging to netty

2017-08-22 Thread jasobrown
http://git-wip-us.apache.org/repos/asf/cassandra/blob/356dc3c2/test/unit/org/apache/cassandra/net/async/OutboundMessagingConnectionTest.java
--
diff --git a/test/unit/org/apache/cassandra/net/async/OutboundMessagingConnectionTest.java b/test/unit/org/apache/cassandra/net/async/OutboundMessagingConnectionTest.java
new file mode 100644
index 000..772e47d
--- /dev/null
+++ b/test/unit/org/apache/cassandra/net/async/OutboundMessagingConnectionTest.java
@@ -0,0 +1,519 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.net.async;
+
+import java.io.IOException;
+import java.net.InetAddress;
+import java.net.InetSocketAddress;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+import java.util.concurrent.Delayed;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ScheduledFuture;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicReference;
+import javax.net.ssl.SSLHandshakeException;
+
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import io.netty.channel.ChannelFuture;
+import io.netty.channel.ChannelPromise;
+import io.netty.channel.embedded.EmbeddedChannel;
+import org.apache.cassandra.auth.AllowAllInternodeAuthenticator;
+import org.apache.cassandra.auth.IInternodeAuthenticator;
+import org.apache.cassandra.config.Config;
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.exceptions.ConfigurationException;
+import org.apache.cassandra.locator.AbstractEndpointSnitch;
+import org.apache.cassandra.locator.IEndpointSnitch;
+import org.apache.cassandra.net.MessageOut;
+import org.apache.cassandra.net.MessagingService;
+import org.apache.cassandra.net.MessagingServiceTest;
+import org.apache.cassandra.net.async.OutboundHandshakeHandler.HandshakeResult;
+import org.apache.cassandra.net.async.OutboundMessagingConnection.State;
+
+import static org.apache.cassandra.net.MessagingService.Verb.ECHO;
+import static org.apache.cassandra.net.async.OutboundMessagingConnection.State.CLOSED;
+import static org.apache.cassandra.net.async.OutboundMessagingConnection.State.CREATING_CHANNEL;
+import static org.apache.cassandra.net.async.OutboundMessagingConnection.State.NOT_READY;
+import static org.apache.cassandra.net.async.OutboundMessagingConnection.State.READY;
+
+public class OutboundMessagingConnectionTest
+{
+private static final InetSocketAddress LOCAL_ADDR = new InetSocketAddress("127.0.0.1", 9998);
+private static final InetSocketAddress REMOTE_ADDR = new InetSocketAddress("127.0.0.2", );
+private static final InetSocketAddress RECONNECT_ADDR = new InetSocketAddress("127.0.0.3", );
+private static final int MESSAGING_VERSION = MessagingService.current_version;
+
+private OutboundConnectionIdentifier connectionId;
+private OutboundMessagingConnection omc;
+private EmbeddedChannel channel;
+
+private IEndpointSnitch snitch;
+
+@BeforeClass
+public static void before()
+{
+DatabaseDescriptor.daemonInitialization();
+}
+
+@Before
+public void setup()
+{
+connectionId = OutboundConnectionIdentifier.small(LOCAL_ADDR, REMOTE_ADDR);
+omc = new OutboundMessagingConnection(connectionId, null, Optional.empty(), new AllowAllInternodeAuthenticator());
+channel = new EmbeddedChannel();
+omc.setChannelWriter(ChannelWriter.create(channel, omc::handleMessageResult, Optional.empty()));
+
+snitch = DatabaseDescriptor.getEndpointSnitch();
+}
+
+@After
+public void tearDown()
+{
+DatabaseDescriptor.setEndpointSnitch(snitch);
+channel.finishAndReleaseAll();
+}
+
+@Test
+public void sendMessage_CreatingChannel()
+{
+Assert.assertEquals(0, omc.backlogSize());
+omc.setState(CREATING_CHANNEL);
+Assert.assertTrue(omc.sendMessage(new MessageOut<>(ECHO), 1));
+Assert.assertEquals(1, omc.backlogSize());
+Assert.assertEquals(

[jira] [Commented] (CASSANDRA-13006) Disable automatic heap dumps on OOM error

2017-08-22 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137242#comment-16137242
 ] 

Joshua McKenzie commented on CASSANDRA-13006:
-

bq. Joshua McKenzie you worked on CASSANDRA-7507. Do you have any concern with 
the approach I am suggesting?
No major concerns; maybe we should have a commented out option in the config to 
add {{CrashOnOutOfMemoryError}} for operators that would prefer to use that?

bq. in the News.txt upgrade procedure: request people to use a java version >= 
8u92
We can also enforce that, or warn, during the version check in the startup scripts.
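As a sketch of the version check suggested above (variable names and messages are hypothetical, not the actual startup-script code), the check could warn when the JVM predates 8u92, where the crash-on-OOM flags first appeared:

```shell
#!/bin/sh
# Hypothetical fragment for the startup scripts: warn when the detected JDK
# is older than 8u92 (assumed minimum for CrashOnOutOfMemoryError /
# ExitOnOutOfMemoryError support). Assumes `java` is on the PATH.
jvmver=$(java -version 2>&1 | awk -F '"' '/version/ {print $2; exit}')
case "$jvmver" in
    1.8.0_*)
        update=${jvmver#1.8.0_}
        update=${update%%[!0-9]*}          # strip any -bNN or -internal suffix
        if [ "${update:-0}" -lt 92 ]; then
            echo "Warning: JDK $jvmver is older than 8u92; OOM crash flags unavailable" >&2
        fi
        ;;
esac
```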

> Disable automatic heap dumps on OOM error
> -
>
> Key: CASSANDRA-13006
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13006
> Project: Cassandra
>  Issue Type: Bug
>  Components: Configuration
>Reporter: anmols
>Assignee: Benjamin Lerer
>Priority: Minor
> Fix For: 3.0.9
>
> Attachments: 13006-3.0.9.txt
>
>
> With CASSANDRA-9861, a change was added to enable collecting heap dumps by 
> default if the process encountered an OOM error. These heap dumps are stored 
> in the Apache Cassandra home directory unless configured otherwise (see 
> [Cassandra Support 
> Document|https://support.datastax.com/hc/en-us/articles/204225959-Generating-and-Analyzing-Heap-Dumps]
>  for this feature).
>  
> The creation and storage of heap dumps aids debugging and investigative 
> workflows, but is not desirable for a production environment where these 
> heap dumps may occupy a large amount of disk space and require manual 
> intervention for cleanups. 
>  
> Managing heap dumps on out-of-memory errors and configuring the paths for 
> these heap dumps are available as standard JVM options. The current behavior 
> conflicts with the Boolean JVM flag HeapDumpOnOutOfMemoryError. 
>  
> A patch can be proposed here that would make the heap dump on OOM error honor 
> the HeapDumpOnOutOfMemoryError flag. Users who would want to still generate 
> heap dumps on OOM errors can set the -XX:+HeapDumpOnOutOfMemoryError JVM 
> option.
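A minimal sketch of what opting back in could look like once the patch honors the standard flag (paths and variable names are illustrative, not the shipped cassandra-env.sh defaults):

```shell
# Illustrative jvm.options / cassandra-env.sh fragment.
# Re-enable heap dumps on OOM via the standard HotSpot flag:
JVM_OPTS="$JVM_OPTS -XX:+HeapDumpOnOutOfMemoryError"
JVM_OPTS="$JVM_OPTS -XX:HeapDumpPath=/var/lib/cassandra/heapdumps"

# Alternatively (per the review comment above), crash instead of dumping;
# this flag requires Java 8u92 or newer:
# JVM_OPTS="$JVM_OPTS -XX:+CrashOnOutOfMemoryError"
```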



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-11334) Make compaction operations (scrub, etc) asynchronous

2017-08-22 Thread Varun Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137211#comment-16137211
 ] 

Varun Gupta commented on CASSANDRA-11334:
-

[~michaelsembwever] Can you please review the patch?

> Make compaction operations (scrub, etc) asynchronous
> 
>
> Key: CASSANDRA-11334
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11334
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Tools
>Reporter: Yuki Morishita
>Priority: Minor
> Fix For: 4.x
>
> Attachments: make-compaction-manager-return-Future.patch
>
>
> Before making nodetool use JMX notifications, we need to make scrub, 
> verify, cleanup, etc. asynchronous.
> Right now, those operations block in {{CompactionManager}}; instead we 
> want to be able to get back a {{Future}} object that can be listened on for completion.
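A minimal sketch of the asynchronous shape being proposed (the class and method names here are hypothetical, not the real {{CompactionManager}} API):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: submit a compaction-style operation and hand the caller a future
// it can listen on, instead of blocking until the operation finishes.
public class AsyncScrubSketch
{
    private static final ExecutorService executor = Executors.newSingleThreadExecutor();

    static CompletableFuture<String> submitScrub(String table)
    {
        // The blocking work runs on the executor; the caller immediately
        // gets a future it can attach listeners to (e.g. to fire a JMX
        // notification on completion).
        return CompletableFuture.supplyAsync(() -> "scrubbed " + table, executor);
    }

    public static void main(String[] args) throws Exception
    {
        CompletableFuture<String> f = submitScrub("ks.tbl");
        f.thenAccept(System.out::println); // listener-style completion hook
        f.get();                           // wait only for this demo
        executor.shutdown();
    }
}
```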



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-11334) Make compaction operations (scrub, etc) asynchronous

2017-08-22 Thread Varun Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Gupta updated CASSANDRA-11334:

Attachment: make-compaction-manager-return-Future.patch

> Make compaction operations (scrub, etc) asynchronous
> 
>
> Key: CASSANDRA-11334
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11334
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Tools
>Reporter: Yuki Morishita
>Priority: Minor
> Fix For: 4.x
>
> Attachments: make-compaction-manager-return-Future.patch
>
>
> Before making nodetool use JMX notifications, we need to make scrub, 
> verify, cleanup, etc. asynchronous.
> Right now, those operations block in {{CompactionManager}}; instead we 
> want to be able to get back a {{Future}} object that can be listened on for completion.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-13785) Compaction fails for SSTables with large number of keys

2017-08-22 Thread Jay Zhuang (JIRA)
Jay Zhuang created CASSANDRA-13785:
--

 Summary: Compaction fails for SSTables with large number of keys
 Key: CASSANDRA-13785
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13785
 Project: Cassandra
  Issue Type: Bug
  Components: Compaction
Reporter: Jay Zhuang
Assignee: Jay Zhuang


Every few minutes there are "LEAK DETECTED" messages in the log:
{noformat}
ERROR [Reference-Reaper:1] 2017-08-18 17:18:40,357 Ref.java:223 - LEAK 
DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@3ed22d7) 
to class 
org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$Tidy@1022568824:[Memory@[0..159b6ba4),
 Memory@[0..d8123468)] was not released before the reference was garbage 
collected
ERROR [Reference-Reaper:1] 2017-08-18 17:20:49,693 Ref.java:223 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@6470405b) to class 
org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$Tidy@97898152:[Memory@[0..159b6ba4),
 Memory@[0..d8123468)] was not released before the reference was garbage 
collected
ERROR [Reference-Reaper:1] 2017-08-18 17:22:38,519 Ref.java:223 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@6fc4af5f) to class 
org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$Tidy@1247404854:[Memory@[0..159b6ba4),
 Memory@[0..d8123468)] was not released before the reference was garbage 
collected
{noformat}

Debugging showed the issue is triggered by failed compactions: if the 
compacted SSTable has more than ~51M ({{Integer.MAX_VALUE / 40}}) keys, it 
fails to create the IndexSummary: 
[IndexSummary:84|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/io/sstable/IndexSummary.java#L84].
Compaction retries every few minutes and keeps failing.

The root cause is that when [creating the 
SafeMemoryWriter|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java#L112]
 with {{> Integer.MAX_VALUE}} space, it returns the trailing 
{{Integer.MAX_VALUE}} bytes 
[SafeMemoryWriter.java:83|https://github.com/apache/cassandra/blob/6a1b1f26b7174e8c9bf86a96514ab626ce2a4117/src/java/org/apache/cassandra/io/util/SafeMemoryWriter.java#L83],
 which makes the initial 
[entries.length()|https://github.com/apache/cassandra/blob/6a1b1f26b7174e8c9bf86a96514ab626ce2a4117/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java#L173]
 non-zero. So the assert fails here: 
[IndexSummary:84|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/io/sstable/IndexSummary.java#L84]
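A back-of-the-envelope sketch of the overflow described above (the ~40 bytes/entry figure is an assumption taken from the {{Integer.MAX_VALUE / 40}} threshold in the report, not a measured constant):

```java
// Demonstrates why a key count above Integer.MAX_VALUE / 40 pushes the
// required index-summary capacity past what fits in a signed 32-bit int.
public class IndexSummaryOverflowSketch
{
    static final long BYTES_PER_ENTRY = 40; // assumed per-entry cost

    public static void main(String[] args)
    {
        long maxKeys = Integer.MAX_VALUE / BYTES_PER_ENTRY; // ~53.6M keys
        long required = (maxKeys + 1) * BYTES_PER_ENTRY;    // capacity in bytes

        // Once the required capacity exceeds Integer.MAX_VALUE, a writer capped
        // at an int-sized region can only hand back the trailing
        // Integer.MAX_VALUE bytes, so the buffer no longer starts at offset 0
        // and an "entries.length() == 0" style assertion fails.
        System.out.println(required > Integer.MAX_VALUE); // prints "true"
    }
}
```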



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13433) RPM distribution improvements and known issues

2017-08-22 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137151#comment-16137151
 ] 

Michael Shuler commented on CASSANDRA-13433:


How do you deal with the dependency on python-2.7 on an OS where python-2.7 is 
not available/installable?  At DataStax, the spec just completely ignored the 
version, which is what made Cassandra installable on pre-7 CentOS/RHEL.

> RPM distribution improvements and known issues
> --
>
> Key: CASSANDRA-13433
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13433
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>
> Starting with CASSANDRA-13252, new releases will be provided as both official 
> RPM and Debian packages.  While the Debian packages are already well 
> established with our user base, the RPMs have just been released for the 
> first time and still require some attention. 
> Feel free to discuss RPM-related issues in this ticket and open a sub-task to 
> file a bug report. 
> Please note that native systemd support will be implemented with 
> CASSANDRA-13148 and this is not strictly an RPM-specific issue. We still 
> intend to offer non-systemd support based on the already working init scripts 
> that we ship. Therefore the first step is to make use of systemd backward 
> compatibility for SysV/LSB scripts, so we can provide RPMs for both systemd 
> and non-systemd environments.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13433) RPM distribution improvements and known issues

2017-08-22 Thread Nathaniel Tabernero (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137115#comment-16137115
 ] 

Nathaniel Tabernero commented on CASSANDRA-13433:
-

Hi,

My project is stuck on CentOS 6 (for now).  We came up with a spec file that 
builds an RPM that works for CentOS 6.  Is this something you would be 
interested in?

> RPM distribution improvements and known issues
> --
>
> Key: CASSANDRA-13433
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13433
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>
> Starting with CASSANDRA-13252, new releases will be provided as both official 
> RPM and Debian packages.  While the Debian packages are already well 
> established with our user base, the RPMs have just been released for the 
> first time and still require some attention. 
> Feel free to discuss RPM related issues in this ticket and open a sub-task to 
> file a bug report. 
> Please note that native systemd support will be implemented with 
> CASSANDRA-13148 and this is not strictly an RPM-specific issue. We still 
> intend to offer non-systemd support based on the already working init scripts 
> that we ship. Therefore the first step is to make use of systemd backward 
> compatibility for SysV/LSB scripts, so we can provide RPMs for both systemd 
> and non-systemd environments.






[jira] [Commented] (CASSANDRA-13630) support large internode messages with netty

2017-08-22 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137116#comment-16137116
 ] 

Ariel Weisberg commented on CASSANDRA-13630:


I added some comments on 
https://github.com/jasobrown/cassandra/commit/2d58ad5f0ca5a63cc0fbead0b9234876d2dbd770#diff-55d5a06a8f012c31e11a06fc3f5bb960R265

At a high level the thing that worries me most is fan out message patterns. I 
thought worst case memory amplification from this NIO approach was 2x message 
size which is worse than our current 1x message size, but it's not, it's 
cluster size * message size if a message is fanned out to all nodes in the 
cluster. At the barest of bare minimums we need to detect this condition (large 
message + fanout) and log it. But really I would need to be convinced that we 
don't ever send large messages to the entire cluster. Just by nature of the 
problem serialization is faster than networking so the large message would be 
serialized to all the connections faster than the bytes can be drained out.
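Back-of-envelope, the fan-out amplification described above is just a multiplication. The numbers here are hypothetical (a 64 MiB message and a 100-node cluster), purely to illustrate the scale:

```java
public class FanoutAmplification {
    public static void main(String[] args) {
        // Hypothetical figures, not from this ticket.
        long messageBytes = 64L * 1024 * 1024; // 64 MiB message
        int clusterSize = 100;                 // nodes receiving the fan-out

        // If the message is serialized per connection faster than the sockets
        // drain, worst-case buffered memory scales with cluster size rather
        // than staying at 1x the message size.
        long worstCaseBytes = messageBytes * clusterSize;

        System.out.println(worstCaseBytes / (1024 * 1024) + " MiB buffered worst case");
        // prints "6400 MiB buffered worst case"
    }
}
```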

I think you have the gist of it on the receive side where a thread is forced to 
block for large messages. You are also creating a thread per large message 
channel. I really wonder if that could be a shared pool of threads that we 
size generously. Heck, use the same pool for send and receive.

Looking over the tests now.

> support large internode messages with netty
> ---
>
> Key: CASSANDRA-13630
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13630
> Project: Cassandra
>  Issue Type: Task
>  Components: Streaming and Messaging
>Reporter: Jason Brown
>Assignee: Jason Brown
> Fix For: 4.0
>
>
> As part of CASSANDRA-8457, we decided to punt on large messages to reduce the 
> scope of that ticket. However, we still need that functionality to ship a 
> correctly operating internode messaging subsystem.






[jira] [Commented] (CASSANDRA-12373) 3.0 breaks CQL compatibility with super columns families

2017-08-22 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137056#comment-16137056
 ] 

Jeremiah Jordan commented on CASSANDRA-12373:
-

bq. I'll note at this point that the "why" this is done this way is unimportant 
here, we are way way past changing any of this. This is how things work in 2.x 
however and the only goal here is to make it work the same way in 3.x so users 
can upgrade without problems.


I agree we need to expose these the same way in 3.x, but one thing to remember 
is that in 2.x non SCF tables worked the same way, but in 3.x we started 
exposing the defined columns as "static" and the undefined ones as 
column1/value. Is there a similar way to expose all the data for SCF?

> 3.0 breaks CQL compatibility with super columns families
> 
>
> Key: CASSANDRA-12373
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12373
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Sylvain Lebresne
>Assignee: Alex Petrov
> Fix For: 3.0.x, 3.11.x
>
>
> This is a follow-up to CASSANDRA-12335 to fix the CQL side of super column 
> compatibility.
> The details and a proposed solution can be found in the comments of 
> CASSANDRA-12335 but the crux of the issue is that super column families show 
> up differently in CQL in 3.0.x/3.x compared to 2.x, hence breaking backward 
> compatibility.






[jira] [Commented] (CASSANDRA-12373) 3.0 breaks CQL compatibility with super columns families

2017-08-22 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137023#comment-16137023
 ] 

Sylvain Lebresne commented on CASSANDRA-12373:
--

The general approach lgtm, but I think there are problems around dealing with 
'dense' versus 'non-dense'.

Those things are a bit complicated however, and frankly under-documented, so 
allow me first to remind what 'dense' and 'non dense' mean for super column 
families (SCF in what follows), to make sure we're on the same page.

The first important thing to note is that contrary to other {{COMPACT 
STORAGE}} tables, the value of the {{is_dense}} flag doesn't impact the 
internal layout of a SCF, neither in 2.x nor in 3.x. What it does impact 
however (in 2.x so far at least) is how the SCF is exposed through CQL, and 
that's what we're trying to make work in this ticket (and it must work in the 
same way as in 2.x).

So the definition of being "dense" for a SCF in 2.x is that the user hasn't 
added any column in the thrift {{column_metadata}} of that SCF. Which 
equivalently means that {{is_dense == true}} if a SCF has no {{REGULAR}} 
columns internally.

With that defined, the impact on CQL is the following:
* a "dense" SCF having no {{REGULAR}} column, CQL exposes each "thrift column" 
of the SCF as an individual CQL row. So if you take a dense SCF containing 
something like the following (using an informal representation of a thrift SCF 
here, hopefully it's clear what I mean):
{noformat}
'k' : {
'sc1' : {
'a' : 1,
'b' : 2,
'c' : 2,
},
'sc2': {
'b' : '3'
}
}
{noformat}
then this is exposed in CQL as:
{noformat}
 key | column1 | column2 | value
-----+---------+---------+-------
 'k' |   'sc1' |     'a' |     1
 'k' |   'sc1' |     'b' |     2
 'k' |   'sc1' |     'c' |     2
 'k' |   'sc2' |     'b' |     3
{noformat}
* a "non dense" SCF however only exposes through CQL the values of "thrift 
columns" that belong to a defined thrift {{column_metadata}}. So considering 
the same SCF example, but saying that SCF is now non dense because the user has 
defined {{column_metadata=[{column_name: b, validation_class: UTF8Type}]}}, 
then that SCF will be exposed in CQL as
{noformat}
 key | column1 | b
-----+---------+---
 'k' |   'sc1' | 2
 'k' |   'sc2' | 3
{noformat}
Note in particular that any value not associated with a declared 
{{column_metadata}} is simply not exposed: it's there internally, but not 
accessible through CQL.

I'll note at this point that the "why" this is done this way is unimportant 
here, we are way way past changing any of this. This is how things work in 2.x 
however and the only goal here is to make it work the same way in 3.x so users 
can upgrade without problems.


Anyway, back to the patch, I think there are 2 problems:
# the methods in {{SuperColumnCompatibility}} don't seem to handle non-dense 
super columns properly. I haven't actually tested, but from reading 
the code I believe that my example above (using the non-dense case where 
'b' is the only defined column) would yield something like:
{noformat}
 key | column1 | b
-----+---------+---
 'k' |   'sc1' | 2
 'k' |   'sc1' | 2
{noformat}
which is, well, not what 2.x does.
# I believe 3.x hasn't been setting the {{is_dense}} flag so far (which went 
unnoticed because, as said above, that flag only influences CQL in the case of 
SCF, and that hasn't been working in 3.x so far). More precisely, I believe all SCF 
currently have their {{is_dense}} flag set to {{false}} in 3.x. And while the 
attached patch correctly fixes this issue _for upgrades from 2.x_, it's too 
late for cluster already upgraded to 3.x. And that's unfortunate because if I'm 
not mistaken, the schema migration process from 3.x has unintentionally erased 
the information we need to correct that problem. More precisely, the way to 
recognize a dense SCF in 2.1 is that it has no {{REGULAR}} column definitions 
internally, but dense SCF in 2.1 had a {{COMPACT_VALUE}} column definition, and 
in the 3.x schema migration code, we have converted that to {{REGULAR}}. In 
other words, if on 3.x we have a SCF with a single {{REGULAR}} column 
definition, we cannot really know with certainty if it's a dense SCF whose 
{{COMPACT_VALUE}} has been converted to a {{REGULAR}}, or a genuinely non-dense 
SCF with a single user-declared definition. I can't currently think of a good 
solution to that problem. We can play some guessing game using a few 
assumptions and hope no user will break those assumptions but I wanted to open 
up the discussion on that problem before delving into bad solutions in case 
someone has an actually good solution.


Other than that, I only have a few very minor remarks:
* I believe the last line of the class javadoc ("On write path, ...") of 
{{SuperColumnCompatibility}} has been truncated.
* Hardcoding "column2" and "value" in {{SuperColumnC

[jira] [Updated] (CASSANDRA-13630) support large internode messages with netty

2017-08-22 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-13630:
---
Reviewer: Ariel Weisberg

> support large internode messages with netty
> ---
>
> Key: CASSANDRA-13630
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13630
> Project: Cassandra
>  Issue Type: Task
>  Components: Streaming and Messaging
>Reporter: Jason Brown
>Assignee: Jason Brown
> Fix For: 4.0
>
>
> As part of CASSANDRA-8457, we decided to punt on large messages to reduce the 
> scope of that ticket. However, we still need that functionality to ship a 
> correctly operating internode messaging subsystem.






[jira] [Resolved] (CASSANDRA-13777) Altering an UDT does not invalidate the prepared statements using that type

2017-08-22 Thread Benjamin Lerer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer resolved CASSANDRA-13777.

Resolution: Duplicate

> Altering an UDT does not invalidate the prepared statements using that type
> ---
>
> Key: CASSANDRA-13777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13777
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>
> C* does not invalidate the prepared statements using a specific UDT when that 
> UDT has been altered.
> Due to that the new UDT fields will not be returned by the java driver for 
> example. 
> This also means that the statements cannot be re-prepared as they are still 
> in the prepared statements cache. 






[jira] [Commented] (CASSANDRA-13777) Altering an UDT does not invalidate the prepared statements using that type

2017-08-22 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136957#comment-16136957
 ] 

Benjamin Lerer commented on CASSANDRA-13777:


After looking into the problem, the prepared statements are correctly 
invalidated. The problem is caused by a Java driver issue: 
[JAVA-420|https://datastax-oss.atlassian.net/projects/JAVA/issues/JAVA-420] 
which needs CASSANDRA-10786.
So, I will close the ticket as a duplicate.

> Altering an UDT does not invalidate the prepared statements using that type
> ---
>
> Key: CASSANDRA-13777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13777
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>
> C* does not invalidate the prepared statements using a specific UDT when that 
> UDT has been altered.
> Due to that the new UDT fields will not be returned by the java driver for 
> example. 
> This also means that the statements cannot be re-prepared as they are still 
> in the prepared statements cache. 






[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136964#comment-16136964
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

Do you have in mind any tests that would be good to have?

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chances of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.
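The current workaround mentioned in the description can be sketched as follows (illustrative only: the keyspace/table name and the threshold value are made up, but {{unchecked_tombstone_compaction}} and {{tombstone_threshold}} are existing compaction subproperties):

{noformat}
ALTER TABLE metrics.timeseries
  WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': '1',
    'unchecked_tombstone_compaction': 'true',
    'tombstone_threshold': '0.05'
  };
{noformat}

This trades extra single-SSTable tombstone compactions (CPU and I/O) for purging the data that blocks whole-SSTable expiration, which is exactly the cost the ticket proposes to avoid by ignoring overlaps.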






[jira] [Resolved] (CASSANDRA-13712) DropTable could cause hints

2017-08-22 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko resolved CASSANDRA-13712.
---
Resolution: Not A Problem

> DropTable could cause hints
> ---
>
> Key: CASSANDRA-13712
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13712
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Jay Zhuang
>Priority: Minor
>
> While dropping a table with ongoing write traffic, we saw hints 
> generated on each node. You can find hint dispatch messages in the log. Not 
> sure if it's an issue.
> Here are the steps to reproduce:
> 1. Create a 3-node cluster:
>   {{$ ccm create test13696 -v 3.0.14 && ccm populate -n 3 && ccm start}}
> 2. Send some traffic with cassandra-stress (blogpost.yaml is only in trunk, 
> if you use another yaml file, change the RF=3)
>   {{$ tools/bin/cassandra-stress user profile=test/resources/blogpost.yaml 
> cl=QUORUM truncate=never ops\(insert=1\) duration=30m -rate threads=2 -mode 
> native cql3 -node 127.0.0.1}}
> 3. While the traffic is running, drop table
>   {{$ cqlsh -e "drop table  stresscql.blogposts"}}






[jira] [Commented] (CASSANDRA-13712) DropTable could cause hints

2017-08-22 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136889#comment-16136889
 ] 

Aleksey Yeschenko commented on CASSANDRA-13712:
---

Schema propagation is not instantaneous, and even if it were, there'd still 
possibly be encoded hints in the write buffers. This is not an issue, because 
they'll be safely skipped on replay.

> DropTable could cause hints
> ---
>
> Key: CASSANDRA-13712
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13712
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Jay Zhuang
>Priority: Minor
>
> While dropping a table with ongoing write traffic, we saw hints 
> generated on each node. You can find hint dispatch messages in the log. Not 
> sure if it's an issue.
> Here are the steps to reproduce:
> 1. Create a 3-node cluster:
>   {{$ ccm create test13696 -v 3.0.14 && ccm populate -n 3 && ccm start}}
> 2. Send some traffic with cassandra-stress (blogpost.yaml is only in trunk, 
> if you use another yaml file, change the RF=3)
>   {{$ tools/bin/cassandra-stress user profile=test/resources/blogpost.yaml 
> cl=QUORUM truncate=never ops\(insert=1\) duration=30m -rate threads=2 -mode 
> native cql3 -node 127.0.0.1}}
> 3. While the traffic is running, drop table
>   {{$ cqlsh -e "drop table  stresscql.blogposts"}}






[jira] [Resolved] (CASSANDRA-13681) checkAccess throws exceptions when statement does not exist

2017-08-22 Thread Benjamin Lerer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer resolved CASSANDRA-13681.

Resolution: Invalid

> checkAccess throws exceptions when statement does not exist
> ---
>
> Key: CASSANDRA-13681
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13681
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Hao Zhong
>
> CASSANDRA-6687 fixed a wrong exception. In DropTableStatement, the buggy code 
> is:
> {code}
> public void checkAccess(ClientState state) throws UnauthorizedException, 
> InvalidRequestException
> {
> state.hasColumnFamilyAccess(keyspace(), columnFamily(), 
> Permission.DROP);
> }
> {code}
> The fixed code is:
> {code}
> public void checkAccess(ClientState state) throws UnauthorizedException, 
> InvalidRequestException
> {
> try
> {
> state.hasColumnFamilyAccess(keyspace(), columnFamily(), 
> Permission.DROP);
> }
> catch (InvalidRequestException e)
> {
> if (!ifExists)
> throw e;
> }
> }
> {code}
> I found that ModificationStatement_checkAccess can have the same problem, 
> since it calls  state.hasColumnFamilyAccess. In particular, its code is as 
> follow:
> {code}
>  public void checkAccess(ClientState state) throws InvalidRequestException, 
> UnauthorizedException
> {
> state.hasColumnFamilyAccess(metadata, Permission.MODIFY);
> // CAS updates can be used to simulate a SELECT query, so should 
> require Permission.SELECT as well.
> if (hasConditions())
> state.hasColumnFamilyAccess(metadata, Permission.SELECT);
> // MV updates need to get the current state from the table, and might 
> update the views
> // Require Permission.SELECT on the base table, and Permission.MODIFY 
> on the views
> Iterator<View> views = View.findAll(keyspace(), 
> columnFamily()).iterator();
> if (views.hasNext())
> {
> state.hasColumnFamilyAccess(metadata, Permission.SELECT);
> do
> {
> state.hasColumnFamilyAccess(views.next().metadata, 
> Permission.MODIFY);
> } while (views.hasNext());
> }
> for (Function function : getFunctions())
> state.ensureHasPermission(Permission.EXECUTE, function);
> }
> {code}






[jira] [Commented] (CASSANDRA-13780) ADD Node streaming throughput performance

2017-08-22 Thread Kevin Rivait (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136849#comment-16136849
 ] 

Kevin Rivait commented on CASSANDRA-13780:
--

thank you Kurt.
Regarding 1:
On our DEV environment (1 DC) we did set streamthroughput  to zero  on all 
nodes including the new node with no observed change in throughput.
On our PROD environment ( 2 DCs) we set streamthroughput  to zero on all nodes 
in the DC we were adding the node in. We observed streaming from all nodes in 
the local DC. 
We cannot achieve the default 200Mb/s streamthroughput per node

Our most recent experimental observation is that the total streamthroughput for 
the whole cluster is ~80Mb/s regardless of the number of nodes in the cluster. 
As we add more nodes, the throughput of each node drops such that the sum total 
remains ~80Mb/s.
Are there other parameters or bottlenecks that can reduce streaming throughput 
of each node?

Regarding 2:
You are correct.
We went back and dumped the SSTABLE max/min dates and verified that the buckets 
older than TTL and GCGS are in fact being dropped.

thank you
Kevin
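
For context, the per-node streaming throttle discussed above is read and changed at runtime with nodetool (these are existing commands; a value of 0 disables the throttle entirely):

{noformat}
$ nodetool getstreamthroughput
$ nodetool setstreamthroughput 0
{noformat}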


> ADD Node streaming throughput performance
> -
>
> Key: CASSANDRA-13780
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13780
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Linux 2.6.32-696.3.2.el6.x86_64 #1 SMP Mon Jun 19 
> 11:55:55 PDT 2017 x86_64 x86_64 x86_64 GNU/Linux
> Architecture:  x86_64
> CPU op-mode(s):32-bit, 64-bit
> Byte Order:Little Endian
> CPU(s):40
> On-line CPU(s) list:   0-39
> Thread(s) per core:2
> Core(s) per socket:10
> Socket(s): 2
> NUMA node(s):  2
> Vendor ID: GenuineIntel
> CPU family:6
> Model: 79
> Model name:Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
> Stepping:  1
> CPU MHz:   2199.869
> BogoMIPS:  4399.36
> Virtualization:VT-x
> L1d cache: 32K
> L1i cache: 32K
> L2 cache:  256K
> L3 cache:  25600K
> NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
> NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>  total   used   free sharedbuffers cached
> Mem:  252G   217G34G   708K   308M   149G
> -/+ buffers/cache:67G   185G
> Swap:  16G 0B16G
>Reporter: Kevin Rivait
>Priority: Blocker
> Fix For: 3.0.9
>
>
> Problem: Adding a new node to a large cluster runs at least 1000x slower than 
> what the network and node hardware capacity can support, taking several days 
> per new node.  Adjusting stream throughput and other YAML parameters seems to 
> have no effect on performance.  Essentially, it appears that Cassandra has an 
> architecture scalability growth problem when adding new nodes to a moderate 
> to high data ingestion cluster because Cassandra cannot add new node capacity 
> fast enough to keep up with increasing data ingestion volumes and growth.
> Initial Configuration: 
> Running 3.0.9 and have implemented TWCS on one of our largest table.
> Largest table partitioned on (ID, MM)  using 1 day buckets with a TTL of 
> 60 days.
> Next release will change partitioning to (ID, MMDD) so that partitions 
> are aligned with daily TWCS buckets.
> Each node is currently creating roughly a 30GB SSTable per day.
> TWCS working as expected; daily SSTables are dropping off after 70 
> days (60 + 10-day grace).
> Current deployment is a 28 node 2 datacenter cluster, 14 nodes in each DC , 
> replication factor 3
> Data directories are backed with 4 - 2TB SSDs on each node  and a 1 800GB SSD 
> for commit logs.
> Requirement is to double cluster size, capacity, and ingestion volume within 
> a few weeks.
> Observed Behavior:
> 1. streaming throughput during add node – we observed maximum 6 Mb/s 
> streaming from each of the 14 nodes on a 20Gb/s switched network, taking at 
> least 106 hours for each node to join cluster and each node is only about 2.2 
> TB is size.
> 2. compaction on the newly added node - compaction has fallen behind, with 
> anywhere from 4,000 to 10,000 SSTables at any given time.  It took 3 weeks 
> for compaction to finish on each newly added node.   Increasing number of 
> compaction threads to match number of CPU (40)  and increasing compaction 
> throughput to 32MB/s seemed to be the sweet spot. 
> 3. TWCS buckets on new node, data streamed to this node over 4 1/2 days.  
> Compaction correctly placed the data in daily files, but the problem is the 
> file dates reflect when compaction created the file and not the date of th

[jira] [Updated] (CASSANDRA-13749) add documentation about upgrade process to docs

2017-08-22 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-13749:

Component/s: Documentation and Website

> add documentation about upgrade process to docs
> ---
>
> Key: CASSANDRA-13749
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13749
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: Jon Haddad
>  Labels: documentation
>
> The docs don't have any information on how to upgrade.  This question gets 
> asked constantly on the mailing list.
> Seems like it belongs under the "Operating Cassandra" section.
> https://cassandra.apache.org/doc/latest/operating/index.html






[jira] [Updated] (CASSANDRA-13717) INSERT statement fails when Tuple type is used as clustering column with default DESC order

2017-08-22 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-13717:

Component/s: CQL
 Core

> INSERT statement fails when Tuple type is used as clustering column with 
> default DESC order
> ---
>
> Key: CASSANDRA-13717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13717
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core, CQL
> Environment: Cassandra 3.11
>Reporter: Anastasios Kichidis
>Assignee: Stavros Kontopoulos
>Priority: Critical
> Attachments: example_queries.cql, fix_13717
>
>
> When a column family is created and a Tuple is used on clustering column with 
> default clustering order DESC, then the INSERT statement fails. 
> For example, the following table will make the INSERT statement fail with 
> error message "Invalid tuple type literal for tdemo of type 
> frozen>" , although the INSERT statement is correct 
> (works as expected when the default order is ASC)
> {noformat}
> create table test_table (
>   id int,
>   tdemo tuple,
>   primary key (id, tdemo)
> ) with clustering order by (tdemo desc);
> {noformat}






[jira] [Updated] (CASSANDRA-13712) DropTable could cause hints

2017-08-22 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-13712:

Component/s: (was: Core)
 Coordination

> DropTable could cause hints
> ---
>
> Key: CASSANDRA-13712
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13712
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Jay Zhuang
>Priority: Minor
>
> While dropping a table with ongoing write traffic, we saw hints 
> generated on each node. You can find hint dispatch messages in the log. Not 
> sure if it's an issue.
> Here are the steps to reproduce:
> 1. Create a 3-node cluster:
>   {{$ ccm create test13696 -v 3.0.14 && ccm populate -n 3 && ccm start}}
> 2. Send some traffic with cassandra-stress (blogpost.yaml is only in trunk, 
> if you use another yaml file, change the RF=3)
>   {{$ tools/bin/cassandra-stress user profile=test/resources/blogpost.yaml 
> cl=QUORUM truncate=never ops\(insert=1\) duration=30m -rate threads=2 -mode 
> native cql3 -node 127.0.0.1}}
> 3. While the traffic is running, drop table
>   {{$ cqlsh -e "drop table  stresscql.blogposts"}}






[jira] [Updated] (CASSANDRA-13702) Error on keyspace create/alter if referencing non-existing DC in cluster

2017-08-22 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-13702:

Component/s: CQL

> Error on keyspace create/alter if referencing non-existing DC in cluster
> 
>
> Key: CASSANDRA-13702
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13702
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL, Distributed Metadata
>Reporter: Johnny Miller
>Priority: Minor
>
> It is possible to create/alter a keyspace using NetworkTopologyStrategy and a 
> DC that does not exist. It would be great if this was validated to prevent 
> accidents.






[jira] [Updated] (CASSANDRA-13702) Error on keyspace create/alter if referencing non-existing DC in cluster

2017-08-22 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-13702:

Component/s: Distributed Metadata

> Error on keyspace create/alter if referencing non-existing DC in cluster
> 
>
> Key: CASSANDRA-13702
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13702
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL, Distributed Metadata
>Reporter: Johnny Miller
>Priority: Minor
>
> It is possible to create/alter a keyspace using NetworkTopologyStrategy and a 
> DC that does not exist. It would be great if this was validated to prevent 
> accidents.






[jira] [Updated] (CASSANDRA-13698) Reinstate or get rid of unit tests with multiple compaction strategies

2017-08-22 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-13698:

Component/s: Testing

> Reinstate or get rid of unit tests with multiple compaction strategies
> --
>
> Key: CASSANDRA-13698
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13698
> Project: Cassandra
>  Issue Type: Test
>  Components: Testing
>Reporter: Paulo Motta
>Priority: Minor
>  Labels: lhf
>
> At some point there were (anti-)compaction tests with multiple compaction 
> strategy classes, but now it's only tested with {{STCS}}:
> * 
> [AnticompactionTest|https://github.com/apache/cassandra/blob/8b3a60b9a7dbefeecc06bace617279612ec7092d/test/unit/org/apache/cassandra/db/compaction/AntiCompactionTest.java#L247]
> * 
> [CompactionsTest|https://github.com/apache/cassandra/blob/8b3a60b9a7dbefeecc06bace617279612ec7092d/test/unit/org/apache/cassandra/db/compaction/CompactionsTest.java#L85]
> We should either reinstate these tests or decide they are not important and 
> remove the unused parameter.






[jira] [Updated] (CASSANDRA-13697) CDC and VIEW writeType missing from spec for write_timeout / write_failure

2017-08-22 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-13697:
-
Labels: lhf  (was: )

> CDC and VIEW writeType missing from spec for write_timeout / write_failure
> --
>
> Key: CASSANDRA-13697
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13697
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: Andy Tolbert
>Priority: Minor
>  Labels: lhf
>
> In Cassandra 3.0 a new {{WriteType}}, {{VIEW}}, was added, which appears to be used when raising a {{WriteTimeoutException}} because the local view lock for a key cannot be acquired within the timeout.
> In Cassandra 3.8 the {{CDC}} {{WriteType}} was added for when {{cdc_total_space_in_mb}} is exceeded while doing a write to data tracked by cdc.
> The [v4 spec|https://github.com/apache/cassandra/blob/cassandra-3.11.0/doc/native_protocol_v4.spec#L1051-L1066] currently doesn't cover these two write types. While the protocol allows a free-form string for the write type, it would be nice to document which types are available, since some drivers (java, cpp, python) attempt to deserialize the write type into an enum and may not handle unknown values well.
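On the driver side, the defensive decoding this implies might look like the following sketch (the enum values are illustrative, not any particular driver's actual type): map the free-form write-type string to an enum, falling back to a catch-all value for strings, such as VIEW or CDC, that an older enum definition does not list.

```java
// Hypothetical sketch of defensive write-type decoding in a driver;
// unknown strings map to UNKNOWN instead of throwing during deserialization.
public class WriteTypeParsing
{
    public enum WriteType { SIMPLE, BATCH, UNLOGGED_BATCH, COUNTER, BATCH_LOG, CAS, VIEW, CDC, UNKNOWN }

    public static WriteType parse(String raw)
    {
        try
        {
            return WriteType.valueOf(raw);
        }
        catch (IllegalArgumentException e)
        {
            return WriteType.UNKNOWN; // free-form string not in the enum
        }
    }
}
```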






[jira] [Updated] (CASSANDRA-13697) CDC and VIEW writeType missing from spec for write_timeout / write_failure

2017-08-22 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-13697:

Component/s: Documentation and Website

> CDC and VIEW writeType missing from spec for write_timeout / write_failure
> --
>
> Key: CASSANDRA-13697
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13697
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: Andy Tolbert
>Priority: Minor
>  Labels: lhf
>
> In Cassandra 3.0 a new {{WriteType}}, {{VIEW}}, was added, which appears to be used when raising a {{WriteTimeoutException}} because the local view lock for a key cannot be acquired within the timeout.
> In Cassandra 3.8 the {{CDC}} {{WriteType}} was added for when {{cdc_total_space_in_mb}} is exceeded while doing a write to data tracked by cdc.
> The [v4 spec|https://github.com/apache/cassandra/blob/cassandra-3.11.0/doc/native_protocol_v4.spec#L1051-L1066] currently doesn't cover these two write types. While the protocol allows a free-form string for the write type, it would be nice to document which types are available, since some drivers (java, cpp, python) attempt to deserialize the write type into an enum and may not handle unknown values well.






[jira] [Updated] (CASSANDRA-13695) ReadStage threads have no timeout

2017-08-22 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-13695:

Component/s: Local Write-Read Paths

> ReadStage threads have no timeout
> -
>
> Key: CASSANDRA-13695
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13695
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
>Reporter: Vladimir Yudovin
>
> Following this discussion: [High CPU after read 
> timeout|https://lists.apache.org/thread.html/e22a2a77634f9228bf1d5474cc77ea461262f2e125cd2fa21a17f7a2@%3Cdev.cassandra.apache.org%3E]
> Currently ReadStage threads have no timeout and continue to run without limit after xxx_request_timeout_in_ms has expired. Thus a single bad request like SELECT ... ALLOW FILTERING can paralyze the whole cluster for hours or longer.
> I suggest that a read request should include a kind of *timeout* or *expired_at* parameter, which the handling thread checks so it can stop processing once the expiration time has passed.
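The suggested expired_at idea could be sketched like this (illustrative Java, not Cassandra code): the worker carries the request deadline and checks it between processing steps, so a runaway filtering scan stops shortly after the timeout instead of running for hours.

```java
// Hypothetical sketch of a deadline-aware read task; names are illustrative.
public class DeadlineRead
{
    public static int countEvenWithDeadline(int[] rows, long deadlineNanos)
    {
        int matched = 0;
        for (int row : rows)
        {
            // Abort between rows once the request deadline has passed.
            if (System.nanoTime() > deadlineNanos)
                throw new RuntimeException("read expired before completion");
            if (row % 2 == 0) // stand-in for the per-row filtering work
                matched++;
        }
        return matched;
    }
}
```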






[jira] [Updated] (CASSANDRA-13776) Adding a field to a UDT can corrupt the tables using it

2017-08-22 Thread Benjamin Lerer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-13776:
---
Reviewer: Robert Stupp

> Adding a field to a UDT can corrupt the tables using it
> -
>
> Key: CASSANDRA-13776
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13776
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>Priority: Critical
>
> Adding a field to a UDT which is used as a {{Set}} element or as a {{Map}} 
> element can corrupt the table.
> The problem can be reproduced using the following test case:
> {code}
> @Test
> public void testReadAfterAlteringUserTypeNestedWithinSet() throws Throwable
> {
>     String ut1 = createType("CREATE TYPE %s (a int)");
>     String columnType = KEYSPACE + "." + ut1;
>     try
>     {
>         createTable("CREATE TABLE %s (x int PRIMARY KEY, y set<frozen<" + columnType + ">>)");
>         disableCompaction();
>         execute("INSERT INTO %s (x, y) VALUES(1, ?)", set(userType(1), userType(2)));
>         assertRows(execute("SELECT * FROM %s"), row(1, set(userType(1), userType(2))));
>         flush();
>         assertRows(execute("SELECT * FROM %s WHERE x = 1"),
>                    row(1, set(userType(1), userType(2))));
>         execute("ALTER TYPE " + KEYSPACE + "." + ut1 + " ADD b int");
>         execute("UPDATE %s SET y = y + ? WHERE x = 1",
>                 set(userType(1, 1), userType(1, 2), userType(2, 1)));
>         flush();
>         assertRows(execute("SELECT * FROM %s WHERE x = 1"),
>                    row(1, set(userType(1),
>                               userType(1, 1),
>                               userType(1, 2),
>                               userType(2),
>                               userType(2, 1))));
>         compact();
>         assertRows(execute("SELECT * FROM %s WHERE x = 1"),
>                    row(1, set(userType(1),
>                               userType(1, 1),
>                               userType(1, 2),
>                               userType(2),
>                               userType(2, 1))));
>     }
>     finally
>     {
>         enableCompaction();
>     }
> }
> {code}
> There are in fact 2 problems:
> # When the {{sets}} from the 2 versions are merged, the {{ColumnDefinition}} being picked up can be the older one, in which case sorting the tuples may lead to an {{IndexOutOfBoundsException}}.
> # During compaction, the old column definition can be the one kept for the SSTable metadata. If that happens, the SSTable will not be readable any more and will be marked as {{corrupted}}.
> If one of the tables using the type has a Materialized View attached to it, 
> the MV updates can also fail with {{IndexOutOfBoundsException}}.
> This problem can be reproduced using the following test:
> {code}
> @Test
> public void testAlteringUserTypeNestedWithinSetWithView() throws Throwable
> {
>     String columnType = typeWithKs(createType("CREATE TYPE %s (a int)"));
>     createTable("CREATE TABLE %s (pk int, c int, v int, s set<frozen<" + columnType + ">>, PRIMARY KEY (pk, c))");
>     execute("CREATE MATERIALIZED VIEW " + keyspace() + ".view1 AS SELECT c, pk, v FROM %s WHERE pk IS NOT NULL AND c IS NOT NULL AND v IS NOT NULL PRIMARY KEY (c, pk)");
>     execute("INSERT INTO %s (pk, c, v, s) VALUES(?, ?, ?, ?)", 1, 1, 1, set(userType(1), userType(2)));
>     flush();
>     execute("ALTER TYPE " + columnType + " ADD b int");
>     execute("UPDATE %s SET s = s + ?, v = ? WHERE pk = ? AND c = ?",
>             set(userType(1, 1), userType(1, 2), userType(2, 1)), 2, 1, 1);
>     assertRows(execute("SELECT * FROM %s WHERE pk = ? AND c = ?", 1, 1),
>                row(1, 1, 2, set(userType(1),
>                                 userType(1, 1),
>                                 userType(1, 2),
>                                 userType(2),
>                                 userType(2, 1))));
> }
> {code}






[jira] [Updated] (CASSANDRA-13692) CompactionAwareWriter_getWriteDirectory throws incompatible exceptions

2017-08-22 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-13692:
-
Labels: lhf  (was: )

> CompactionAwareWriter_getWriteDirectory throws incompatible exceptions
> --
>
> Key: CASSANDRA-13692
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13692
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Hao Zhong
>  Labels: lhf
>
> The CompactionAwareWriter_getWriteDirectory throws RuntimeException:
> {code}
> public Directories.DataDirectory getWriteDirectory(Iterable<SSTableReader> sstables, long estimatedWriteSize)
> {
>     File directory = null;
>     for (SSTableReader sstable : sstables)
>     {
>         if (directory == null)
>             directory = sstable.descriptor.directory;
>         if (!directory.equals(sstable.descriptor.directory))
>         {
>             logger.trace("All sstables not from the same disk - putting results in {}", directory);
>             break;
>         }
>     }
>     Directories.DataDirectory d = getDirectories().getDataDirectoryForFile(directory);
>     if (d != null)
>     {
>         long availableSpace = d.getAvailableSpace();
>         if (availableSpace < estimatedWriteSize)
>             throw new RuntimeException(String.format("Not enough space to write %s to %s (%s available)",
>                                                      FBUtilities.prettyPrintMemory(estimatedWriteSize),
>                                                      d.location,
>                                                      FBUtilities.prettyPrintMemory(availableSpace)));
>         logger.trace("putting compaction results in {}", directory);
>         return d;
>     }
>     d = getDirectories().getWriteableLocation(estimatedWriteSize);
>     if (d == null)
>         throw new RuntimeException(String.format("Not enough disk space to store %s",
>                                                  FBUtilities.prettyPrintMemory(estimatedWriteSize)));
>     return d;
> }
> {code}
> However, the thrown exception does not trigger the failure policy. 
> CASSANDRA-11448 fixed a similar problem. The buggy code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new RuntimeException("Insufficient disk space to write " + writeSize + " bytes");
>     return directory;
> }
> {code}
> The fixed code is:
> {code}
> protected Directories.DataDirectory getWriteDirectory(long writeSize)
> {
>     Directories.DataDirectory directory = getDirectories().getWriteableLocation(writeSize);
>     if (directory == null)
>         throw new FSWriteError(new IOException("Insufficient disk space to write " + writeSize + " bytes"), "");
>     return directory;
> }
> {code}
> The fixed code throws {{FSWriteError}} and triggers the failure policy.
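A minimal sketch of why the exception type matters here (the FSWriteError below is a stand-in class for illustration, not Cassandra's real one): a disk failure policy typically reacts only to a dedicated filesystem-error type, so a plain RuntimeException slips past it.

```java
// Illustrative sketch: only the dedicated filesystem-error type is
// recognized by the (hypothetical) failure-policy check.
public class FailurePolicySketch
{
    public static class FSWriteError extends RuntimeException
    {
        public FSWriteError(Throwable cause, String path)
        {
            super("Failed to write " + path, cause);
        }
    }

    /** Returns true when the error is one the failure policy would handle. */
    public static boolean handledByFailurePolicy(Throwable t)
    {
        return t instanceof FSWriteError;
    }
}
```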






[jira] [Updated] (CASSANDRA-13693) A potential problem in the Ec2MultiRegionSnitch_gossiperStarting method

2017-08-22 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-13693:

Component/s: Distributed Metadata

> A potential problem in the Ec2MultiRegionSnitch_gossiperStarting method
> ---
>
> Key: CASSANDRA-13693
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13693
> Project: Cassandra
>  Issue Type: Bug
>  Components: Distributed Metadata
>Reporter: Hao Zhong
>
> The code of Ec2MultiRegionSnitch_gossiperStarting is as follows:
> {code}
> public void gossiperStarting()
> {
>     super.gossiperStarting();
>     Gossiper.instance.addLocalApplicationState(ApplicationState.INTERNAL_IP,
>                                                StorageService.instance.valueFactory.internalIP(localPrivateAddress));
>     Gossiper.instance.register(new ReconnectableSnitchHelper(this, ec2region, true));
> }
> {code}
> I noticed that CASSANDRA-5897 fixed a bug whose buggy code is identical. The 
> fixed code is:
> {code}
> public void gossiperStarting()
> {
>     super.gossiperStarting();
>     Gossiper.instance.addLocalApplicationState(ApplicationState.INTERNAL_IP,
>                                                StorageService.instance.valueFactory.internalIP(FBUtilities.getLocalAddress().getHostAddress()));
>     reloadGossiperState();
>     gossipStarted = true;
> }
>
> private void reloadGossiperState()
> {
>     if (Gossiper.instance != null)
>     {
>         ReconnectableSnitchHelper pendingHelper = new ReconnectableSnitchHelper(this, myDC, preferLocal);
>         Gossiper.instance.register(pendingHelper);
>
>         pendingHelper = snitchHelperReference.getAndSet(pendingHelper);
>         if (pendingHelper != null)
>             Gossiper.instance.unregister(pendingHelper);
>     }
>     // else this will eventually rerun at gossiperStarting()
> }
> {code}
> If Ec2MultiRegionSnitch is supposed to auto-reload, the above fix should be 
> applied to its code as well.






[jira] [Updated] (CASSANDRA-13687) Abnormal heap growth and CPU usage during repair.

2017-08-22 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-13687:

Component/s: Streaming and Messaging

> Abnormal heap growth and CPU usage during repair.
> -
>
> Key: CASSANDRA-13687
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13687
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Stanislav Vishnevskiy
> Attachments: 3.0.14cpu.png, 3.0.14heap.png, 3.0.14.png, 
> 3.0.9heap.png, 3.0.9.png
>
>
> We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004
> Sadly, 3 out of the last 7 nights we have had to wake up due to Cassandra dying 
> on us. We currently don't have any data to help reproduce this, but since 
> there aren't many commits between the two versions the cause might be obvious.
> Basically we trigger a parallel incremental repair from a single node every 
> night at 1AM. That node will sometimes start allocating a lot and keeping the 
> heap maxed and triggering GC. Some of these GC can last up to 2 minutes. This 
> effectively destroys the whole cluster due to timeouts to this node.
> The only solution we currently have is to drain the node and restart the 
> repair, it has worked fine the second time every time.
> I attached heap charts from 3.0.9 and 3.0.14 during repair.






[jira] [Updated] (CASSANDRA-13685) PartitionColumns.java:161: java.lang.AssertionError: null

2017-08-22 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-13685:

Component/s: Local Write-Read Paths

> PartitionColumns.java:161: java.lang.AssertionError: null
> -
>
> Key: CASSANDRA-13685
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13685
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core, Local Write-Read Paths
>Reporter: Jay Zhuang
>Priority: Minor
>  Labels: lhf
>
> Similar to CASSANDRA-8192, I guess the SSTable is corrupted:
> {noformat}
> ERROR [SSTableBatchOpen:1] 2017-07-10 21:28:09,325 CassandraDaemon.java:207 - 
> Exception in thread Thread[SSTableBatchOpen:1,5,main]
> java.lang.AssertionError: null
> at 
> org.apache.cassandra.db.PartitionColumns$Builder.add(PartitionColumns.java:161)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.db.SerializationHeader$Component.toHeader(SerializationHeader.java:339)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:486)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:375)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader$4.run(SSTableReader.java:534)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_121]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_121]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_121]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_121]
> at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
>  [apache-cassandra-3.0.14.jar:3.0.14]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> {noformat}
> It would be better to report a {{CorruptSSTableException}} that includes the SSTable path.
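The suggested improvement could be sketched as follows (class and method names are illustrative, not the actual SSTableReader API): catch the low-level AssertionError thrown while deserializing the header and rethrow it wrapped in an exception that names the sstable, so the operator knows which file to scrub or remove.

```java
// Hypothetical sketch of wrapping a deserialization failure with the file path.
public class CorruptReport
{
    public static class CorruptSSTableException extends RuntimeException
    {
        public CorruptSSTableException(Throwable cause, String path)
        {
            super("Corrupt sstable: " + path, cause);
        }
    }

    public static void open(String path, Runnable deserializeHeader)
    {
        try
        {
            deserializeHeader.run();
        }
        catch (AssertionError e)
        {
            // Rethrow with the sstable path so the log points at the bad file.
            throw new CorruptSSTableException(e, path);
        }
    }
}
```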






[jira] [Updated] (CASSANDRA-13681) checkAccess throws exceptions when statement does not exist

2017-08-22 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-13681:

Component/s: CQL

> checkAccess throws exceptions when statement does not exist
> ---
>
> Key: CASSANDRA-13681
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13681
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Hao Zhong
>
> CASSANDRA-6687 fixed a wrong exception. In DropTableStatement, the buggy code 
> is:
> {code}
> public void checkAccess(ClientState state) throws UnauthorizedException, InvalidRequestException
> {
>     state.hasColumnFamilyAccess(keyspace(), columnFamily(), Permission.DROP);
> }
> {code}
> The fixed code is:
> {code}
> public void checkAccess(ClientState state) throws UnauthorizedException, InvalidRequestException
> {
>     try
>     {
>         state.hasColumnFamilyAccess(keyspace(), columnFamily(), Permission.DROP);
>     }
>     catch (InvalidRequestException e)
>     {
>         if (!ifExists)
>             throw e;
>     }
> }
> {code}
> I found that ModificationStatement_checkAccess can have the same problem, 
> since it also calls state.hasColumnFamilyAccess. In particular, its code is as 
> follows:
> {code}
> public void checkAccess(ClientState state) throws InvalidRequestException, UnauthorizedException
> {
>     state.hasColumnFamilyAccess(metadata, Permission.MODIFY);
>
>     // CAS updates can be used to simulate a SELECT query, so should require Permission.SELECT as well.
>     if (hasConditions())
>         state.hasColumnFamilyAccess(metadata, Permission.SELECT);
>
>     // MV updates need to get the current state from the table, and might update the views.
>     // Require Permission.SELECT on the base table, and Permission.MODIFY on the views.
>     Iterator<ViewDefinition> views = View.findAll(keyspace(), columnFamily()).iterator();
>     if (views.hasNext())
>     {
>         state.hasColumnFamilyAccess(metadata, Permission.SELECT);
>         do
>         {
>             state.hasColumnFamilyAccess(views.next().metadata, Permission.MODIFY);
>         } while (views.hasNext());
>     }
>
>     for (Function function : getFunctions())
>         state.ensureHasPermission(Permission.EXECUTE, function);
> }
> {code}
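The DropTableStatement-style guard being suggested for ModificationStatement#checkAccess boils down to the pattern below (the exception type is a stand-in for InvalidRequestException; this is a sketch, not the real statement class): the "unknown table" error is swallowed only when the statement carries IF EXISTS.

```java
// Hypothetical sketch of the IF EXISTS guard around an access check.
public class IfExistsGuard
{
    public static void checkAccess(boolean tableExists, boolean ifExists)
    {
        try
        {
            if (!tableExists)
                throw new IllegalStateException("unconfigured table");
        }
        catch (IllegalStateException e)
        {
            if (!ifExists)
                throw e; // without IF EXISTS the error must propagate
        }
    }
}
```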






[jira] [Commented] (CASSANDRA-13776) Adding a field to a UDT can corrupt the tables using it

2017-08-22 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136801#comment-16136801
 ] 

Benjamin Lerer commented on CASSANDRA-13776:


I pushed the patches for 
[3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...blerer:13776-3.0],
 
[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...blerer:13776-3.11]
 and 
[trunk|https://github.com/apache/cassandra/compare/trunk...blerer:13776-trunk].
I ran the tests on our internal CI and the failing tests look unrelated to the 
patches.

> Adding a field to a UDT can corrupt the tables using it
> -
>
> Key: CASSANDRA-13776
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13776
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>Priority: Critical
>
> Adding a field to a UDT which is used as a {{Set}} element or as a {{Map}} 
> element can corrupt the table.
> The problem can be reproduced using the following test case:
> {code}
> @Test
> public void testReadAfterAlteringUserTypeNestedWithinSet() throws Throwable
> {
>     String ut1 = createType("CREATE TYPE %s (a int)");
>     String columnType = KEYSPACE + "." + ut1;
>     try
>     {
>         createTable("CREATE TABLE %s (x int PRIMARY KEY, y set<frozen<" + columnType + ">>)");
>         disableCompaction();
>         execute("INSERT INTO %s (x, y) VALUES(1, ?)", set(userType(1), userType(2)));
>         assertRows(execute("SELECT * FROM %s"), row(1, set(userType(1), userType(2))));
>         flush();
>         assertRows(execute("SELECT * FROM %s WHERE x = 1"),
>                    row(1, set(userType(1), userType(2))));
>         execute("ALTER TYPE " + KEYSPACE + "." + ut1 + " ADD b int");
>         execute("UPDATE %s SET y = y + ? WHERE x = 1",
>                 set(userType(1, 1), userType(1, 2), userType(2, 1)));
>         flush();
>         assertRows(execute("SELECT * FROM %s WHERE x = 1"),
>                    row(1, set(userType(1),
>                               userType(1, 1),
>                               userType(1, 2),
>                               userType(2),
>                               userType(2, 1))));
>         compact();
>         assertRows(execute("SELECT * FROM %s WHERE x = 1"),
>                    row(1, set(userType(1),
>                               userType(1, 1),
>                               userType(1, 2),
>                               userType(2),
>                               userType(2, 1))));
>     }
>     finally
>     {
>         enableCompaction();
>     }
> }
> {code}
> There are in fact 2 problems:
> # When the {{sets}} from the 2 versions are merged, the {{ColumnDefinition}} being picked up can be the older one, in which case sorting the tuples may lead to an {{IndexOutOfBoundsException}}.
> # During compaction, the old column definition can be the one kept for the SSTable metadata. If that happens, the SSTable will not be readable any more and will be marked as {{corrupted}}.
> If one of the tables using the type has a Materialized View attached to it, 
> the MV updates can also fail with {{IndexOutOfBoundsException}}.
> This problem can be reproduced using the following test:
> {code}
> @Test
> public void testAlteringUserTypeNestedWithinSetWithView() throws Throwable
> {
>     String columnType = typeWithKs(createType("CREATE TYPE %s (a int)"));
>     createTable("CREATE TABLE %s (pk int, c int, v int, s set<frozen<" + columnType + ">>, PRIMARY KEY (pk, c))");
>     execute("CREATE MATERIALIZED VIEW " + keyspace() + ".view1 AS SELECT c, pk, v FROM %s WHERE pk IS NOT NULL AND c IS NOT NULL AND v IS NOT NULL PRIMARY KEY (c, pk)");
>     execute("INSERT INTO %s (pk, c, v, s) VALUES(?, ?, ?, ?)", 1, 1, 1, set(userType(1), userType(2)));
>     flush();
>     execute("ALTER TYPE " + columnType + " ADD b int");
>     execute("UPDATE %s SET s = s + ?, v = ? WHERE pk = ? AND c = ?",
>             set(userType(1, 1), userType(1, 2), userType(2, 1)), 2, 1, 1);
>     assertRows(execute("SELECT * FROM %s WHERE pk = ? AND c = ?", 1, 1),
>                row(1, 1, 2, set(userType(1),
>                                 userType(1, 1),
>                                 userType(1, 2),
>                                 userType(2),
>                                 userType(2, 1))));
> }
> {code}




[jira] [Updated] (CASSANDRA-13776) Adding a field to a UDT can corrupt the tables using it

2017-08-22 Thread Benjamin Lerer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-13776:
---
Status: Patch Available  (was: Open)

> Adding a field to a UDT can corrupt the tables using it
> -
>
> Key: CASSANDRA-13776
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13776
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>Priority: Critical
>
> Adding a field to a UDT which is used as a {{Set}} element or as a {{Map}} 
> element can corrupt the table.
> The problem can be reproduced using the following test case:
> {code}
> @Test
> public void testReadAfterAlteringUserTypeNestedWithinSet() throws Throwable
> {
>     String ut1 = createType("CREATE TYPE %s (a int)");
>     String columnType = KEYSPACE + "." + ut1;
>     try
>     {
>         createTable("CREATE TABLE %s (x int PRIMARY KEY, y set<frozen<" + columnType + ">>)");
>         disableCompaction();
>         execute("INSERT INTO %s (x, y) VALUES(1, ?)", set(userType(1), userType(2)));
>         assertRows(execute("SELECT * FROM %s"), row(1, set(userType(1), userType(2))));
>         flush();
>         assertRows(execute("SELECT * FROM %s WHERE x = 1"),
>                    row(1, set(userType(1), userType(2))));
>         execute("ALTER TYPE " + KEYSPACE + "." + ut1 + " ADD b int");
>         execute("UPDATE %s SET y = y + ? WHERE x = 1",
>                 set(userType(1, 1), userType(1, 2), userType(2, 1)));
>         flush();
>         assertRows(execute("SELECT * FROM %s WHERE x = 1"),
>                    row(1, set(userType(1),
>                               userType(1, 1),
>                               userType(1, 2),
>                               userType(2),
>                               userType(2, 1))));
>         compact();
>         assertRows(execute("SELECT * FROM %s WHERE x = 1"),
>                    row(1, set(userType(1),
>                               userType(1, 1),
>                               userType(1, 2),
>                               userType(2),
>                               userType(2, 1))));
>     }
>     finally
>     {
>         enableCompaction();
>     }
> }
> {code}
> There are in fact 2 problems:
> # When the {{sets}} from the 2 versions are merged, the {{ColumnDefinition}} being picked up can be the older one, in which case sorting the tuples may lead to an {{IndexOutOfBoundsException}}.
> # During compaction, the old column definition can be the one kept for the SSTable metadata. If that happens, the SSTable will not be readable any more and will be marked as {{corrupted}}.
> If one of the tables using the type has a Materialized View attached to it, 
> the MV updates can also fail with {{IndexOutOfBoundsException}}.
> This problem can be reproduced using the following test:
> {code}
> @Test
> public void testAlteringUserTypeNestedWithinSetWithView() throws Throwable
> {
>     String columnType = typeWithKs(createType("CREATE TYPE %s (a int)"));
>     createTable("CREATE TABLE %s (pk int, c int, v int, s set<frozen<" + columnType + ">>, PRIMARY KEY (pk, c))");
>     execute("CREATE MATERIALIZED VIEW " + keyspace() + ".view1 AS SELECT c, pk, v FROM %s WHERE pk IS NOT NULL AND c IS NOT NULL AND v IS NOT NULL PRIMARY KEY (c, pk)");
>     execute("INSERT INTO %s (pk, c, v, s) VALUES(?, ?, ?, ?)", 1, 1, 1, set(userType(1), userType(2)));
>     flush();
>     execute("ALTER TYPE " + columnType + " ADD b int");
>     execute("UPDATE %s SET s = s + ?, v = ? WHERE pk = ? AND c = ?",
>             set(userType(1, 1), userType(1, 2), userType(2, 1)), 2, 1, 1);
>     assertRows(execute("SELECT * FROM %s WHERE pk = ? AND c = ?", 1, 1),
>                row(1, 1, 2, set(userType(1),
>                                 userType(1, 1),
>                                 userType(1, 2),
>                                 userType(2),
>                                 userType(2, 1))));
> }
> {code}






[jira] [Comment Edited] (CASSANDRA-13752) Corrupted SSTables created in 3.11

2017-08-22 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136624#comment-16136624
 ] 

Hannu Kröger edited comment on CASSANDRA-13752 at 8/22/17 1:29 PM:
---

I created a branch with potential fix for this particular problem

||Branch||utest||dtest||
|[3.11|https://github.com/hkroger/cassandra/tree/cassandra-3.11-13752]|[3.11 
circle|https://circleci.com/gh/hkroger/cassandra/tree/cassandra-3.11-13752]|???|



was (Author: hkroger):
I created a branch with potential fix for this particular problem

||Branch||utest||dtest||
|[3.11|https://circleci.com/gh/hkroger/cassandra/tree/cassandra-3.11-13752]|[3.11
 
circle|https://circleci.com/gh/hkroger/cassandra/tree/cassandra-3.11-13752]|???|


> Corrupted SSTables created in 3.11
> --
>
> Key: CASSANDRA-13752
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13752
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Hannu Kröger
>Priority: Blocker
>
> We have discovered issues with corrupted SSTables. 
> {code}
> ERROR [SSTableBatchOpen:22] 2017-08-03 20:19:53,195 SSTableReader.java:577 - 
> Cannot read sstable 
> /cassandra/data/mykeyspace/mytable-7a4992800d5611e7b782cb90016f2d17/mc-35556-big=[Data.db,
>  Statistics.db, Summary.db, Digest.crc32, CompressionInfo.db, TOC.txt, 
> Index.db, Filter.db]; other IO error, skipping table
> java.io.EOFException: EOF after 1898 bytes out of 21093
> at 
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) 
> ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:377)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.metadata.StatsMetadata$StatsMetadataSerializer.deserialize(StatsMetadata.java:325)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.metadata.StatsMetadata$StatsMetadataSerializer.deserialize(StatsMetadata.java:231)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:122)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:93)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:488)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:396)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader$5.run(SSTableReader.java:561)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_111]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> [na:1.8.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_111]
> at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
>  [apache-cassandra-3.11.0.jar:3.11.0]
> {code}
> Files look like this:
> {code}
> -rw-r--r--. 1 cassandra cassandra 3899251 Aug  7 08:37 
> mc-6166-big-CompressionInfo.db
> -rw-r--r--. 1 cassandra cassandra 16874421686 Aug  7 08:37 mc-6166-big-Data.db
> -rw-r--r--. 1 cassandra cassandra  10 Aug  7 08:37 
> mc-6166-big-Digest.crc32
> -rw-r--r--. 1 cassandra cassandra 2930904 Aug  7 08:37 
> mc-6166-big-Filter.db
> -rw-r--r--. 1 cassandra cassandra   75880 Aug  7 08:37 
> mc-6166-big-Index.db
> -rw-r--r--. 1 cassandra cassandra   13762 Aug  7 08:37 
> mc-6166-big-Statistics.db
> -rw-r--r--. 1 cassandra cassandra  882008 Aug  7 08:37 
> mc-6166-big-Summary.db
> -rw-r--r--. 1 cassandra cassandra  92 Aug  7 08:37 mc-6166-big-TOC.txt
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-13363) java.lang.ArrayIndexOutOfBoundsException: null

2017-08-22 Thread zhaoyan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136740#comment-16136740
 ] 

zhaoyan edited comment on CASSANDRA-13363 at 8/22/17 12:50 PM:
---

Use the Command.Index as the "USING INDEX" (for CASSANDRA-10214), and make 
sure the field never changes and is only set once at construction time, 
without a protocol change.

I think it is a good idea.

And could we add another cache field (one that does not participate in 
serialization and is only used as a local cache) to save the Cassandra index 
lookup?
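The idea above — an index field that is set once at construction and never changes, plus a node-local cache that does not participate in serialization — can be sketched in Python (all names here are hypothetical stand-ins, not Cassandra's actual classes):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ReadCommand:
    """Hypothetical command: the index hint is fixed at construction
    and is the only index-related field that is serialized."""
    table: str
    index_hint: str  # travels over the wire, like "USING INDEX"

    def serialize(self) -> dict:
        # Only the stable, construction-time fields are serialized.
        return {"table": self.table, "index_hint": self.index_hint}

class IndexRegistry:
    """Node-local resolver: caches the expensive index lookup so it
    runs once per (table, hint); the cache never leaves the node."""
    def __init__(self):
        self._cache = {}
        self.lookups = 0  # counts the expensive lookups actually done

    def resolve(self, command: ReadCommand) -> str:
        key = (command.table, command.index_hint)
        if key not in self._cache:
            self.lookups += 1
            self._cache[key] = "index:" + command.index_hint
        return self._cache[key]
```

Repeated reads of the same command then hit the local cache instead of re-running the lookup, while the serialized form stays unchanged.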


was (Author: zhaoyan):
Use the Command.Index as the "USING INDEX" (for CASSANDRA-10214), and make 
sure the field never changes and is only set once at construction time.

I think it is a good idea.

And could we add another cache field (one that does not participate in 
serialization and is only used as a local cache) to save the Cassandra index 
lookup?

> java.lang.ArrayIndexOutOfBoundsException: null
> --
>
> Key: CASSANDRA-13363
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13363
> Project: Cassandra
>  Issue Type: Bug
> Environment: CentOS 6, Cassandra 3.10
>Reporter: Artem Rokhin
>Assignee: Aleksey Yeschenko
>Priority: Critical
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> Constantly see this error in the log without any additional information or a 
> stack trace.
> {code}
> Exception in thread Thread[MessagingService-Incoming-/10.0.1.26,5,main]
> {code}
> {code}
> java.lang.ArrayIndexOutOfBoundsException: null
> {code}
> Logger: org.apache.cassandra.service.CassandraDaemon
> Thread: MessagingService-Incoming-/10.0.1.12
> Method: uncaughtException
> File: CassandraDaemon.java
> Line: 229





[jira] [Commented] (CASSANDRA-13363) java.lang.ArrayIndexOutOfBoundsException: null

2017-08-22 Thread zhaoyan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136740#comment-16136740
 ] 

zhaoyan commented on CASSANDRA-13363:
-

Use the Command.Index as the "USING INDEX" (for CASSANDRA-10214), and make 
sure the field never changes and is only set once at construction time.

I think it is a good idea.

And could we add another cache field (one that does not participate in 
serialization and is only used as a local cache) to save the Cassandra index 
lookup?

> java.lang.ArrayIndexOutOfBoundsException: null
> --
>
> Key: CASSANDRA-13363
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13363
> Project: Cassandra
>  Issue Type: Bug
> Environment: CentOS 6, Cassandra 3.10
>Reporter: Artem Rokhin
>Assignee: Aleksey Yeschenko
>Priority: Critical
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> Constantly see this error in the log without any additional information or a 
> stack trace.
> {code}
> Exception in thread Thread[MessagingService-Incoming-/10.0.1.26,5,main]
> {code}
> {code}
> java.lang.ArrayIndexOutOfBoundsException: null
> {code}
> Logger: org.apache.cassandra.service.CassandraDaemon
> Thread: MessagingService-Incoming-/10.0.1.12
> Method: uncaughtException
> File: CassandraDaemon.java
> Line: 229





[jira] [Commented] (CASSANDRA-13780) ADD Node streaming throughput performance

2017-08-22 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136696#comment-16136696
 ] 

Kurt Greaves commented on CASSANDRA-13780:
--

1. This is one of the things vnodes are meant to help with, but they are not 
great for large clusters for a variety of reasons. One question, did you 
increase the stream throughput on all the nodes, or just on the joining node?

2. This is already recorded in the SSTable metadata. Once the minimum timestamp 
reported by {{sstablemetadata}} has expired and GCGS has elapsed on top of 
that, the SSTable will be dropped.
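The drop condition described above can be sketched as a toy model (epoch-second timestamps and a single table-wide TTL are simplifying assumptions here; real Cassandra tracks per-cell write timestamps and expiration in the SSTable metadata):

```python
def fully_expired(max_timestamp: int, ttl: int, gc_grace: int, now: int) -> bool:
    """True once even the newest cell in the SSTable has expired and
    gc_grace_seconds (GCGS) has elapsed on top of that."""
    return now >= max_timestamp + ttl + gc_grace
```

With a 60-day TTL and 10-day grace, an SSTable whose newest write is 70 days old satisfies this condition, which matches the 70-day drop-off observed in the report below.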

> ADD Node streaming throughput performance
> -
>
> Key: CASSANDRA-13780
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13780
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Linux 2.6.32-696.3.2.el6.x86_64 #1 SMP Mon Jun 19 
> 11:55:55 PDT 2017 x86_64 x86_64 x86_64 GNU/Linux
> Architecture:  x86_64
> CPU op-mode(s):32-bit, 64-bit
> Byte Order:Little Endian
> CPU(s):40
> On-line CPU(s) list:   0-39
> Thread(s) per core:2
> Core(s) per socket:10
> Socket(s): 2
> NUMA node(s):  2
> Vendor ID: GenuineIntel
> CPU family:6
> Model: 79
> Model name:Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
> Stepping:  1
> CPU MHz:   2199.869
> BogoMIPS:  4399.36
> Virtualization:VT-x
> L1d cache: 32K
> L1i cache: 32K
> L2 cache:  256K
> L3 cache:  25600K
> NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
> NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>  total   used   free sharedbuffers cached
> Mem:  252G   217G34G   708K   308M   149G
> -/+ buffers/cache:67G   185G
> Swap:  16G 0B16G
>Reporter: Kevin Rivait
>Priority: Blocker
> Fix For: 3.0.9
>
>
> Problem: Adding a new node to a large cluster runs at least 1000x slower than 
> what the network and node hardware capacity can support, taking several days 
> per new node.  Adjusting stream throughput and other YAML parameters seems to 
> have no effect on performance.  Essentially, it appears that Cassandra has an 
> architecture scalability growth problem when adding new nodes to a moderate 
> to high data ingestion cluster because Cassandra cannot add new node capacity 
> fast enough to keep up with increasing data ingestion volumes and growth.
> Initial Configuration: 
> Running 3.0.9 and have implemented TWCS on one of our largest table.
> Largest table partitioned on (ID, YYYYMM) using 1 day buckets with a TTL of 
> 60 days.
> Next release will change partitioning to (ID, YYYYMMDD) so that partitions 
> are aligned with daily TWCS buckets.
> Each node is currently creating roughly a 30GB SSTable per day.
> TWCS working as expected: daily SSTables are dropping off after 70 days (60 
> + 10 day grace).
> Current deployment is a 28 node 2 datacenter cluster, 14 nodes in each DC , 
> replication factor 3
> Data directories are backed by 4 x 2 TB SSDs on each node, and 1 x 800 GB SSD 
> for commit logs.
> Requirement is to double cluster size, capacity, and ingestion volume within 
> a few weeks.
> Observed Behavior:
> 1. streaming throughput during add node – we observed maximum 6 Mb/s 
> streaming from each of the 14 nodes on a 20Gb/s switched network, taking at 
> least 106 hours for each node to join the cluster; each node is only about 2.2 
> TB in size.
> 2. compaction on the newly added node - compaction has fallen behind, with 
> anywhere from 4,000 to 10,000 SSTables at any given time.  It took 3 weeks 
> for compaction to finish on each newly added node.   Increasing number of 
> compaction threads to match number of CPU (40)  and increasing compaction 
> throughput to 32MB/s seemed to be the sweet spot. 
> 3. TWCS buckets on new node, data streamed to this node over 4 1/2 days.  
> Compaction correctly placed the data in daily files, but the problem is the 
> file dates reflect when compaction created the file and not the date of the 
> last record written in the TWCS bucket, which will cause the files to remain 
> around much longer than necessary.  
> Two Questions:
> 1. What can be done to substantially improve the performance of adding a new 
> node?
> 2. Can compaction on TWCS partitions for newly added nodes change the file 
> create date to match the highest date record in the file -or- add another 
> piece of meta-data to the TWCS files that reflect the file drop date so that 
> TWCS partitions can be dropped consistently?
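On question 2 above: TWCS groups SSTables into time windows by their maximum write timestamp, not by the file's creation date, so the file mtime is not what governs expiry. A hedged sketch of that window assignment (hypothetical helper, seconds granularity):

```python
def twcs_window(max_write_ts: int, window_seconds: int = 86400) -> int:
    """Start of the daily window an SSTable belongs to, derived from
    its newest write timestamp (seconds), not from file mtime."""
    return (max_write_ts // window_seconds) * window_seconds
```

Two SSTables compacted on different days still land in the same window as long as their newest records fall in the same day.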




[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:50 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/95c7bb758478a86abf3506fd6e3ddb5d06413bce

{{---}}


I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish; I don't have a strong opinion about it, just say so.

{{---}}

{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring them when looking for sstables to compact, as in 
both cases it would result in not doing the job, due to checking globally 
instead of just locally to the sstable. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated, then look locally 
instead of globally}}.

{{---}}

{quote} CompactionController:232 any reason not to return an immutable 
set?{quote}
I tried to change everything to an ImmutableSet but it breaks a lot of tests.

{{---}}

N.B.: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}}, 
but it changes a lot of things. Do you know if it is up to date?
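The local-versus-global distinction discussed above can be sketched as follows (a simplified model, not the actual patch: each SSTable is represented as a dict of min/max write timestamps and a max deletion time):

```python
def get_fully_expired(sstables, now, ignore_overlaps=False):
    """Candidates whose data has entirely expired. Without
    ignore_overlaps, an expired SSTable is held back when any live
    SSTable overlaps it with older-or-equal timestamps, because its
    tombstones might still be needed to shadow that data."""
    expired = [s for s in sstables if s["max_deletion_time"] < now]
    if ignore_overlaps:
        return expired          # purely local check per SSTable
    live = [s for s in sstables if s["max_deletion_time"] >= now]
    min_live_ts = min((s["min_ts"] for s in live), default=float("inf"))
    # Global check: drop only SSTables that cannot shadow live data.
    return [s for s in expired if s["max_ts"] < min_live_ts]
```

This shows the blocking behavior from the ticket description: one long-lived overlapping SSTable prevents an otherwise fully expired one from being dropped unless overlaps are ignored.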


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b


{{---}}


I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish; I don't have a strong opinion about it, just say so.

{{---}}

{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for the overlap 
checks. I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring them when looking for sstables to compact, as in 
both cases it would result in not doing the job, due to checking globally 
instead of just locally to the sstable. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated, then look locally 
instead of globally}}.

{{---}}

{quote} CompactionController:232 any reason not to return an immutable 
set?{quote}
I tried to change everything to an ImmutableSet but it breaks a lot of tests.

{{---}}

N.B.: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}}, 
but it changes a lot of things. Do you know if it is up to date?

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.

[jira] [Commented] (CASSANDRA-9375) setting timeouts to 1ms prevents startup

2017-08-22 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136673#comment-16136673
 ] 

Jason Brown commented on CASSANDRA-9375:


Taking Jeff's opinion one step further, I think 
{{checkForLowestAcceptedTimeouts}} should be moved into {{DatabaseDescriptor}} 
and can then be invoked from {{#applyAll()}}. This would make it similar to how 
and where we already do many of the config checks. 

While the code is rather straightforward, it would be great to add a test (to 
{{DatabaseDescriptorTest}} if that's where we put this logic) to future-proof 
this functionality.
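A sketch of what such a check could look like once it lives with the other config validation (a Python stand-in for the Java; the 10 ms floor and the exact set of keys are assumptions for illustration, not values from the patch):

```python
LOWEST_ACCEPTED_TIMEOUT_MS = 10  # hypothetical floor

class ConfigurationException(Exception):
    """Raised at startup for nonsensical settings, instead of the
    opaque ExceptionInInitializerError seen in this ticket."""

def check_for_lowest_accepted_timeouts(config: dict) -> None:
    # Fail fast during config application, before MessagingService
    # ever tries to schedule with a too-small delay.
    for name in ("read_request_timeout_in_ms",
                 "write_request_timeout_in_ms",
                 "request_timeout_in_ms"):
        value = config.get(name)
        if value is not None and value < LOWEST_ACCEPTED_TIMEOUT_MS:
            raise ConfigurationException(
                "%s must be at least %d ms (got %d)"
                % (name, LOWEST_ACCEPTED_TIMEOUT_MS, value))
```

A 1 ms timeout then produces a clear configuration error naming the offending setting, rather than the stack trace quoted below.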

> setting timeouts to 1ms prevents startup
> 
>
> Key: CASSANDRA-9375
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9375
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Brandon Williams
>Assignee: Varun Barala
>Priority: Trivial
>  Labels: patch
> Fix For: 2.1.x
>
> Attachments: CASSANDRA-9375_after_review, CASSANDRA-9375.patch
>
>
> Granted, this is a nonsensical setting, but the error message makes it tough 
> to discern what's wrong:
> {noformat}
> ERROR 17:13:28,726 Exception encountered during startup
> java.lang.ExceptionInInitializerError
>  at org.apache.cassandra.net.MessagingService.instance(MessagingService.java:310)
>  at org.apache.cassandra.service.StorageService.<init>(StorageService.java:233)
>  at org.apache.cassandra.service.StorageService.<clinit>(StorageService.java:141)
>  at org.apache.cassandra.locator.DynamicEndpointSnitch.<init>(DynamicEndpointSnitch.java:87)
>  at org.apache.cassandra.locator.DynamicEndpointSnitch.<init>(DynamicEndpointSnitch.java:63)
>  at org.apache.cassandra.config.DatabaseDescriptor.createEndpointSnitch(DatabaseDescriptor.java:518)
>  at org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:350)
>  at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:112)
>  at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:213)
>  at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567)
>  at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:656)
> Caused by: java.lang.IllegalArgumentException
>  at java.util.concurrent.ScheduledThreadPoolExecutor.scheduleWithFixedDelay(ScheduledThreadPoolExecutor.java:586)
>  at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor.scheduleWithFixedDelay(DebuggableScheduledThreadPoolExecutor.java:64)
>  at org.apache.cassandra.utils.ExpiringMap.<init>(ExpiringMap.java:103)
>  at org.apache.cassandra.net.MessagingService.<init>(MessagingService.java:360)
>  at org.apache.cassandra.net.MessagingService.<init>(MessagingService.java:68)
>  at org.apache.cassandra.net.MessagingService$MSHandle.<clinit>(MessagingService.java:306)
>  ... 11 more
> java.lang.ExceptionInInitializerError
>  at org.apache.cassandra.net.MessagingService.instance(MessagingService.java:310)
>  at org.apache.cassandra.service.StorageService.<init>(StorageService.java:233)
>  at org.apache.cassandra.service.StorageService.<clinit>(StorageService.java:141)
>  at org.apache.cassandra.locator.DynamicEndpointSnitch.<init>(DynamicEndpointSnitch.java:87)
>  at org.apache.cassandra.locator.DynamicEndpointSnitch.<init>(DynamicEndpointSnitch.java:63)
>  at org.apache.cassandra.config.DatabaseDescriptor.createEndpointSnitch(DatabaseDescriptor.java:518)
>  at org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:350)
>  at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:112)
>  at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:213)
>  at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567)
>  at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:656)
> Caused by: java.lang.IllegalArgumentException
>  at java.util.concurrent.ScheduledThreadPoolExecutor.scheduleWithFixedDelay(ScheduledThreadPoolExecutor.java:586)
>  at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor.scheduleWithFixedDelay(DebuggableScheduledThreadPoolExecutor.java:64)
>  at org.apache.cassandra.utils.ExpiringMap.<init>(ExpiringMap.java:103)
>  at org.apache.cassandra.net.MessagingService.<init>(MessagingService.java:360)
>  at org.apache.cassandra.net.MessagingService.<init>(MessagingService.java:68)
>  at org.apache.cassandra.net.MessagingService$MSHandle.<clinit>(MessagingService.java:306)
>  ... 11 more
> Exception encountered during startup: null
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-ma

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:45 AM:
-

New version here: 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish; I don't have a strong opinion about it, just say so.


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting 
{{tombstone_threshold}} high enough should perform better. My intention in 
activating the option is to guarantee consistent behavior for overlap checks. 
I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring them when looking for sstables to compact, since in 
both cases the check fails to do its job by looking globally instead of just 
locally at the sstable. I wanted to enforce the rule: {{if you want to drop 
stuff and ignoreOverlaps is activated, then look locally instead of globally}}.


{quote} CompactionController:232 any reason not to return an immutable 
set?{quote}
I tried to change everything to an ImmutableSet but it breaks a lot of tests.

N.B.: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}}, 
but it changes a lot of things. Do you know if it is up to date?
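To make the "look locally instead of globally" idea concrete, here is a minimal sketch (the {{SSTable}} record and {{getFullyExpired}} method are illustrative stand-ins, not the real {{CompactionController}} API): with {{ignoreOverlaps}} set, an sstable counts as fully expired based on its own metadata alone, without consulting overlapping sstables.

```java
import java.util.*;

public class ExpiredSSTables {
    // Minimal stand-in for an sstable: just a name and its max local deletion time.
    record SSTable(String name, int maxLocalDeletionTime) {}

    // Sketch only: when overlaps are ignored, an sstable is fully expired as
    // soon as all of its data has passed its deletion time, regardless of
    // what other sstables cover the same token/time range.
    static Set<SSTable> getFullyExpired(Collection<SSTable> candidates, int gcBefore, boolean ignoreOverlaps) {
        if (!ignoreOverlaps)
            throw new UnsupportedOperationException("global overlap check elided in this sketch");
        Set<SSTable> expired = new HashSet<>();
        for (SSTable s : candidates)
            if (s.maxLocalDeletionTime() < gcBefore) // purely local check
                expired.add(s);
        return expired;
    }

    public static void main(String[] args) {
        List<SSTable> tables = List.of(new SSTable("a", 100), new SSTable("b", 500));
        System.out.println(getFullyExpired(tables, 200, true)); // only "a" qualifies
    }
}
```

The design question in the thread is whether the compact-candidate path should use the same local check, which is what enabling unchecked tombstone compaction alongside {{ignoreOverlaps}} effectively does.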



> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> 

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:45 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so

{{}}

{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThresold high enough should be better performance wise. My intention 
when activating the option is to guarantee a consistent behavior for 
overlapping checks. I wasn't confortable to ignore overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact, as in both case it will result in not doing the job due to checking 
globally instead of just locally to the sstable. I was willing to enforce the 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}

{{}}

{quote} CompactionController:232 any reason not to return an immutable 
set?{quote}
I tried to change everything to an ImmutableSet but it breaks a lot of tests.

N.B: I tried to apply the syle guide found in {{.idea/codeStyleSettings.xml}} 
but it is changing me a lot of things. Do you know if it is up to date ?


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThresold high enough should be better performance wise. My intention 
when activating the option is to guarantee a consistent behavior for 
overlapping checks. I wasn't confortable to ignore overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact, as in both case it will result in not doing the job due to checking 
globally instead of just locally to the sstable. I was willing to enforce the 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}


{quote} CompactionController:232 any reason not to return an immutable 
set?{quote}
I tried to change everything to an ImmutableSet but it breaks a lot of tests.

N.B: I tried to apply the syle guide found in {{.idea/codeStyleSettings.xml}} 
but it is changing me a lot of things. Do you know if it is up to date ?

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the quest

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:44 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThresold high enough should be better performance wise. My intention 
when activating the option is to guarantee a consistent behavior for 
overlapping checks. I wasn't confortable to ignore overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact, as in both case it will result in not doing the job due to checking 
globally instead of just locally to the sstable. I was willing to enforce the 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}


* CompactionController:232 any reason not to return an immutable set?
I tried to change everything to an ImmutableSet but it breaks a lot of tests.

N.B: I tried to apply the syle guide found in {{.idea/codeStyleSettings.xml}} 
but it is changing me a lot of things. Do you know if it is up to date ?


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThresold high enough should be better performance wise. My intention 
when activating the option is to guarantee a consistent behavior for 
overlapping checks. I wasn't confortable to ignore overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact, as in both case it will result in not doing the job due to checking 
globally instead of just locally to the sstable. I was willing to enforce the 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}

N.B: I tried to apply the syle guide found in {{.idea/codeStyleSettings.xml}} 
but it is changing me a lot of things. Do you know if it is up to date ?

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chances of doin

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:44 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThresold high enough should be better performance wise. My intention 
when activating the option is to guarantee a consistent behavior for 
overlapping checks. I wasn't confortable to ignore overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact, as in both case it will result in not doing the job due to checking 
globally instead of just locally to the sstable. I was willing to enforce the 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}


{quote} CompactionController:232 any reason not to return an immutable 
set?{quote}
I tried to change everything to an ImmutableSet but it breaks a lot of tests.

N.B: I tried to apply the syle guide found in {{.idea/codeStyleSettings.xml}} 
but it is changing me a lot of things. Do you know if it is up to date ?


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThresold high enough should be better performance wise. My intention 
when activating the option is to guarantee a consistent behavior for 
overlapping checks. I wasn't confortable to ignore overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact, as in both case it will result in not doing the job due to checking 
globally instead of just locally to the sstable. I was willing to enforce the 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}


* CompactionController:232 any reason not to return an immutable set?
I tried to change everything to an ImmutableSet but it breaks a lot of tests.

N.B: I tried to apply the syle guide found in {{.idea/codeStyleSettings.xml}} 
but it is changing me a lot of things. Do you know if it is up to date ?

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need 

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:39 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThresold high enough should be better performance wise. My intention 
when activating the option is to guarantee a consistent behavior for 
overlapping checks. I wasn't confortable to ignore overlaps when checking for 
fully expired sstables but not ignoring them when looking for sstables to 
compact, as in both case it will result in not doing the job due to checking 
globally instead of just locally to the sstable. I was willing to enforce the 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}

N.B: I tried to apply the syle guide found in {{.idea/codeStyleSettings.xml}} 
but it is changing me a lot of things. Do you know if it is up to date ?


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThresold high enough should be better performance wise. My intention 
when activating the option is to guarantee a consistent behavior for 
overlapping checks. I wasn't confortable to ignore overlaps when checking for 
fully expired sstables but not ignoring it when looking for sstables to 
compact, as in both case it will result in not doing the job due to checking 
globally instead of just locally to the sstable. I was willing to enforce the 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}

N.B: I tried to apply the syle guide found in {{.idea/codeStyleSettings.xml}} 
but it is changing me a lot of things. Do you know if it is up to date ?

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chances of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to h

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:36 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it, just say so


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThresold high enough should be better performance wise. My intention 
when activating the option is to guarantee a consistent behavior for 
overlapping checks. I wasn't confortable to ignore overlaps when checking for 
fully expired sstables but not ignoring it when looking for sstables to 
compact, as in both case it will result in not doing the job due to checking 
globally instead of just locally to the sstable. I was willing to enforce the 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}

N.B: I tried to apply the syle guide found in {{.idea/codeStyleSettings.xml}} 
but it is changing me a lot of things. Do you know if it is up to date ?


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right, activating disableTombstoneCompaction or setting the 
tombstoneThresold high enough should be better performance wise. My intention 
when activating the option is to guarantee a consistent behavior for 
overlapping checks. I wasn't confortable to ignore overlaps when checking for 
fully expired sstables but not ignoring it when looking for sstables to 
compact, as in both case it will result in not doing the job due to checking 
globally instead of just locally to the sstable. I was willing to enforce the 
{{if you want to drop stuff and ignoreOverlaps is activated then look locally 
instead of globally}}

N.B: I tried to apply the syle guide found in {{.idea/codeStyleSettings.xml}} 
but it is changing me a lot of things. Do you know if it is up to date ?

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chances of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:34 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for the 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not when looking for sstables to compact, as in 
both cases it would result in not doing the job, because of checking globally 
instead of just locally to the sstable. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated, then look locally 
instead of globally}}
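
For readers following the thread, the "consistent behavior" rule above could surface as a single table-level switch. A purely hypothetical sketch: the {{ignore_overlaps}} option name is an assumption drawn from this discussion and not a released Cassandra API, while {{unchecked_tombstone_compaction}} is an existing compaction subproperty that, with overlaps ignored, would also make single-sstable tombstone checks purely local:

{code:sql}
-- Hypothetical sketch only; 'ignore_overlaps' is NOT a released option.
ALTER TABLE ks.metrics WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'ignore_overlaps': 'true',
    'unchecked_tombstone_compaction': 'true'
};
{code}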

N.B.: I tried to apply the style guide found in {{.idea/codeStyleSettings.xml}}, 
but it changes a lot of things for me. Do you know if it is up to date?


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for the 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not when looking for sstables to compact, as in 
both cases it would result in not doing the job, because of checking globally 
instead of just locally to the sstable. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated, then look locally 
instead of globally}}

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chances of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patc

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:32 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for the 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not when looking for sstables to compact, as in 
both cases it would result in not doing the job, because of checking globally 
instead of just locally to the sstable. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated, then look locally 
instead of globally}}


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for the 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not when looking for sstables to compact, as in 
both cases it would result in not doing the job, because of checking globally 
instead of just locally to the sstable. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated, then look locally 
instead of globally}}

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chances of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues al

[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:31 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for the 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not when looking for sstables to compact, as in 
both cases it would result in not doing the job, because of checking globally 
instead of just locally to the sstable. I wanted to enforce the rule 
{{if you want to drop stuff and ignoreOverlaps is activated, then look locally 
instead of globally}}


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for the 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not when looking for sstables to compact, as in 
both cases it would result in not doing the job, because of checking globally 
instead of just locally to the sstable.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chances of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.
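
The stop-gap described in the quoted report maps onto ordinary table-level compaction subproperties. A sketch only; the keyspace/table names and the threshold value are illustrative, and this is the CPU-intensive approach the ticket is trying to avoid:

{code:sql}
-- Aggressively purge tombstone "blockers" so overlaps disappear and
-- whole sstables can expire (workaround, not the proposed feature).
ALTER TABLE ks.metrics WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'unchecked_tombstone_compaction': 'true',
    'tombstone_threshold': '0.05'
};
{code}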



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-

[jira] [Created] (CASSANDRA-13784) mismatched input 'default' expecting for CREATE TABLE

2017-08-22 Thread timur (JIRA)
timur created CASSANDRA-13784:
-

 Summary: mismatched input 'default' expecting for CREATE TABLE
 Key: CASSANDRA-13784
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13784
 Project: Cassandra
  Issue Type: Bug
  Components: CQL
 Environment: 
{code}
[root@localhost config]# cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

[root@localhost config]# rpm -qa | grep cassandra
cassandra-3.11.0-1.noarch

[root@localhost config]# cassandra -v
3.11.0
{code}

Reporter: timur


Got the following error when creating a table with a column named "default" on 
3.11. If I change the column name, for example to "defaults", the table is 
created successfully. On version 3.9, a table with a "default" column is 
created successfully.

{code:sql}
cqlsh:api_production> CREATE TABLE tasks (id uuid PRIMARY KEY, default boolean, 
name text, tenant_id uuid) ;
SyntaxException: line 1:41 mismatched input 'default' expecting ')' (...(id 
uuid PRIMARY KEY, [default]...)
{code}
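
If {{default}} has indeed become a reserved word in the 3.11 grammar, the usual CQL escape hatch is to double-quote the identifier. A sketch, with the caveat that quoted identifiers are case-sensitive and must then be quoted in every subsequent statement:

{code:sql}
-- Double quotes let a reserved word be used as a column name.
CREATE TABLE tasks (
    id uuid PRIMARY KEY,
    "default" boolean,
    name text,
    tenant_id uuid
);
{code}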




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:26 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for the 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not when looking for sstables to compact, as in 
both cases it would result in not doing the job, because of checking globally 
instead of just locally to the sstable.


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for the 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not when looking for sstables to compact. 







[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:23 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}
I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
---
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for the 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not when looking for sstables to compact. 


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}Do we want this? It feels like if we expect to be able to drop entire 
sstables due to being expired, it would be pretty wasteful to run a single 
sstable tombstone compaction when there are 20% tombstones in the sstable? We 
would probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for the 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not when looking for sstables to compact. 







[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:22 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}Do we want this? It feels like if we expect to be able to drop entire 
sstables due to being expired, it would be pretty wasteful to run a single 
sstable tombstone compaction when there are 20% tombstones in the sstable? We 
would probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for the 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not when looking for sstables to compact. 


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}Do we want this? It feels like if we expect to be able to drop entire 
sstables due to being expired, it would be pretty wasteful to run a single 
sstable tombstone compaction when there are 20% tombstones in the sstable? We 
would probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for the 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not when looking for sstables to compact. 







[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:22 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove {{TWCSCompactionController.getFullyExpiredSSTables(..)}} if you 
wish, I don't have any strong opinion about it


{quote}Do we want this? It feels like if we expect to be able to drop entire 
sstables due to being expired, it would be pretty wasteful to run a single 
sstable tombstone compaction when there are 20% tombstones in the sstable? We 
would probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should be better performance-wise. My intention 
in activating the option is to guarantee consistent behavior for the 
overlapping checks. I wasn't comfortable ignoring overlaps when checking for 
fully expired sstables but not when looking for sstables to compact. 


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove `TWCSCompactionController.getFullyExpiredSSTables(..)` if you 
wish, I don't have any strong opinion about it


{quote}Do we want this? It feels like if we expect to be able to drop entire 
sstables due to being expired, it would be pretty wasteful to run a single 
sstable tombstone compaction when there are 20% tombstones in the sstable? We 
would probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should perform better. My intention in 
activating the option is to guarantee consistent behavior for overlap checks. 
I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring them when looking for sstables to compact.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.






[jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136652#comment-16136652
 ] 

Romain GERARD edited comment on CASSANDRA-13418 at 8/22/17 11:22 AM:
-

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove `TWCSCompactionController.getFullyExpiredSSTables(..)` if you 
wish, I don't have any strong opinion about it


{quote}Do we want this? It feels like if we expect to be able to drop entire 
sstables due to being expired, it would be pretty wasteful to run a single 
sstable tombstone compaction when there are 20% tombstones in the sstable? We 
would probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should perform better. My intention in 
activating the option is to guarantee consistent behavior for overlap checks. 
I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring them when looking for sstables to compact.


was (Author: rgerard):
New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove `TWCSCompactionController.getFullyExpiredSSTables(..)` if you 
wish, I don't have any strong opinion about it


{quote}Do we want this? It feels like if we expect to be able to drop entire 
sstables due to being expired, it would be pretty wasteful to run a single 
sstable tombstone compaction when there are 20% tombstones in the sstable? We 
would probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should perform better. My intention in 
activating the option is to guarantee consistent behavior for overlap checks. 
I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring them when looking for sstables to compact.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.






[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Romain GERARD (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136652#comment-16136652
 ] 

Romain GERARD commented on CASSANDRA-13418:
---

New version here 
https://github.com/criteo-forks/cassandra/commit/cfabb2ddd31f16ae127d4b22e0c02a1676ba336b
* I can remove `TWCSCompactionController.getFullyExpiredSSTables(..)` if you 
wish, I don't have any strong opinion about it


{quote}Do we want this? It feels like if we expect to be able to drop entire 
sstables due to being expired, it would be pretty wasteful to run a single 
sstable tombstone compaction when there are 20% tombstones in the sstable? We 
would probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?{quote}

In my case you are right: activating disableTombstoneCompaction or setting the 
tombstoneThreshold high enough should perform better. My intention in 
activating the option is to guarantee consistent behavior for overlap checks. 
I wasn't comfortable ignoring overlaps when checking for fully expired 
sstables but not ignoring them when looking for sstables to compact.

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.






[jira] [Commented] (CASSANDRA-13756) StreamingHistogram is not thread safe

2017-08-22 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136625#comment-16136625
 ] 

Hannu Kröger commented on CASSANDRA-13756:
--

Created a branch in https://issues.apache.org/jira/browse/CASSANDRA-13752 for 
serialization fix.

> StreamingHistogram is not thread safe
> -
>
> Key: CASSANDRA-13756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13756
> Project: Cassandra
>  Issue Type: Bug
>Reporter: xiangzhou xia
>Assignee: Jeff Jirsa
> Fix For: 3.0.x, 3.11.x
>
>
> When we tested C* 3 in a shadow cluster, we noticed that after a period of time 
> several data nodes suddenly ran into 100% CPU and stopped processing queries.
> After investigation, we found that threads were stuck in sum() in the 
> StreamingHistogram class. Those are JMX threads exposing the 
> getTombStoneRatio metric (since JMX polls every 3 seconds, there is 
> a chance that multiple JMX threads access the StreamingHistogram at the same 
> time).
> After further investigation, we found that the optimization in CASSANDRA-13038 
> introduced a spool flush every time sum() is called. Since TreeMap is not 
> thread safe, threads get stuck when multiple threads call sum() at the 
> same time.
> There are two approaches to solve this issue.
> The first is to add a lock around the flush in sum(), which introduces 
> some extra overhead to StreamingHistogram.
> The second is to prevent the StreamingHistogram from being accessed by 
> multiple threads; for our specific case, that means removing the metrics we added.
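The first approach (locking the flush in sum()) can be sketched as follows. This is a minimal, hypothetical illustration, not Cassandra's actual StreamingHistogram; the class and method names are ours:

```java
import java.util.Map;
import java.util.TreeMap;

// Toy histogram: updates go to a spool, and sum() flushes the spool into the
// bins before reading (mirroring the CASSANDRA-13038 optimization). The lock
// makes the flush safe when several JMX threads call sum() concurrently.
public class LockedHistogram
{
    private final TreeMap<Double, Long> bins = new TreeMap<>();
    private final TreeMap<Double, Long> spool = new TreeMap<>();
    private final Object lock = new Object();

    public void update(double point)
    {
        synchronized (lock)
        {
            spool.merge(point, 1L, Long::sum);
        }
    }

    // Sum of counts for all points <= upTo, flushing the spool first.
    public long sum(double upTo)
    {
        synchronized (lock)
        {
            for (Map.Entry<Double, Long> e : spool.entrySet())
                bins.merge(e.getKey(), e.getValue(), Long::sum);
            spool.clear();
            long total = 0;
            for (long count : bins.headMap(upTo, true).values())
                total += count;
            return total;
        }
    }
}
```

Without the synchronized blocks, two concurrent sum() calls would mutate the same TreeMap during the flush, which matches the hang observed here.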






[jira] [Commented] (CASSANDRA-13752) Corrupted SSTables created in 3.11

2017-08-22 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136624#comment-16136624
 ] 

Hannu Kröger commented on CASSANDRA-13752:
--

I created a branch with a potential fix for this particular problem.

||Branch||utest||dtest||
|[3.11|https://circleci.com/gh/hkroger/cassandra/tree/cassandra-3.11-13752]|[3.11 circle|https://circleci.com/gh/hkroger/cassandra/tree/cassandra-3.11-13752]|???|


> Corrupted SSTables created in 3.11
> --
>
> Key: CASSANDRA-13752
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13752
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Hannu Kröger
>Priority: Blocker
>
> We have discovered issues with corrupted SSTables. 
> {code}
> ERROR [SSTableBatchOpen:22] 2017-08-03 20:19:53,195 SSTableReader.java:577 - 
> Cannot read sstable 
> /cassandra/data/mykeyspace/mytable-7a4992800d5611e7b782cb90016f2d17/mc-35556-big=[Data.db,
>  Statistics.db, Summary.db, Digest.crc32, CompressionInfo.db, TOC.txt, 
> Index.db, Filter.db]; other IO error, skipping table
> java.io.EOFException: EOF after 1898 bytes out of 21093
> at 
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) 
> ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:377)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.metadata.StatsMetadata$StatsMetadataSerializer.deserialize(StatsMetadata.java:325)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.metadata.StatsMetadata$StatsMetadataSerializer.deserialize(StatsMetadata.java:231)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:122)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:93)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:488)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:396)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader$5.run(SSTableReader.java:561)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_111]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> [na:1.8.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_111]
> at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
>  [apache-cassandra-3.11.0.jar:3.11.0]
> {code}
> Files look like this:
> {code}
> -rw-r--r--. 1 cassandra cassandra 3899251 Aug  7 08:37 
> mc-6166-big-CompressionInfo.db
> -rw-r--r--. 1 cassandra cassandra 16874421686 Aug  7 08:37 mc-6166-big-Data.db
> -rw-r--r--. 1 cassandra cassandra  10 Aug  7 08:37 
> mc-6166-big-Digest.crc32
> -rw-r--r--. 1 cassandra cassandra 2930904 Aug  7 08:37 
> mc-6166-big-Filter.db
> -rw-r--r--. 1 cassandra cassandra   75880 Aug  7 08:37 
> mc-6166-big-Index.db
> -rw-r--r--. 1 cassandra cassandra   13762 Aug  7 08:37 
> mc-6166-big-Statistics.db
> -rw-r--r--. 1 cassandra cassandra  882008 Aug  7 08:37 
> mc-6166-big-Summary.db
> -rw-r--r--. 1 cassandra cassandra  92 Aug  7 08:37 mc-6166-big-TOC.txt
> {code}






[jira] [Commented] (CASSANDRA-13719) Potential AssertionError during ReadRepair of range tombstone and partition deletions

2017-08-22 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136596#comment-16136596
 ] 

Branimir Lambov commented on CASSANDRA-13719:
-

Is it not possible to have both a partition deletion repair and a condition 
suitable for the [{{markerToRepair == null}} 
branch|https://github.com/apache/cassandra/commit/8900a8dd9c2a66dfb601031f0905edffc427557f#diff-8781f9483cca1cfc87145c767295cc79R360]?
 E.g. partition deletion with time 10, range tombstone with time 11 between 1 
and 10, with the other source having only a range tombstone with time 11 
between 2 and 3? Moreover, if the second source includes a range tombstone with 
time 10 between 4 and 5 we may get trouble from [the other side of that 
branch|https://github.com/apache/cassandra/commit/8900a8dd9c2a66dfb601031f0905edffc427557f#diff-8781f9483cca1cfc87145c767295cc79R375].

 I'm not sure this can actually happen in practice, but it does not look too 
hard to fix: I think we need to check that the current deletion is not the same 
as the partition level deletion repair in all cases that generate tombstones.

> Potential AssertionError during ReadRepair of range tombstone and partition 
> deletions
> -
>
> Key: CASSANDRA-13719
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13719
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 3.0.x, 3.11.x
>
>
> When reconciling range tombstones for read repair in 
> {{DataResolver.RepairMergeListener.MergeListener}}, when we check whether there is 
> an ongoing deletion repair for a source, we don't look for partition-level 
> deletions, which throws off the logic and can trigger an AssertionError.






[jira] [Updated] (CASSANDRA-13640) CQLSH error when using 'login' to switch users

2017-08-22 Thread Sergio Bossa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Bossa updated CASSANDRA-13640:
-
Reviewer: ZhaoYang

> CQLSH error when using 'login' to switch users
> --
>
> Key: CASSANDRA-13640
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13640
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Andrés de la Peña
>Assignee: Andrés de la Peña
>Priority: Minor
> Fix For: 3.0.x
>
>
> Using {{PasswordAuthenticator}} and {{CassandraAuthorizer}}:
> {code}
> bin/cqlsh -u cassandra -p cassandra
> Connected to Test Cluster at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.0.14-SNAPSHOT | CQL spec 3.4.0 | Native protocol 
> v4]
> Use HELP for help.
> cassandra@cqlsh> create role super with superuser = true and password = 'p' 
> and login = true;
> cassandra@cqlsh> login super;
> Password:
> super@cqlsh> list roles;
> 'Row' object has no attribute 'values'
> {code}
> When we initialize the Shell, we configure certain settings on the session 
> object such as
> {code}
> self.session.default_timeout = request_timeout
> self.session.row_factory = ordered_dict_factory
> self.session.default_consistency_level = cassandra.ConsistencyLevel.ONE
> {code}
> However, once we perform a LOGIN command, which calls do_login(..), we create 
> a new cluster/session object but never set those settings on the new 
> session.
> It isn't failing on 3.x. 
> As a workaround, it is possible to logout and log back in and things work 
> correctly.






[jira] [Updated] (CASSANDRA-13464) Failed to create Materialized view with a specific token range

2017-08-22 Thread Sergio Bossa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Bossa updated CASSANDRA-13464:
-
Reviewer: ZhaoYang

> Failed to create Materialized view with a specific token range
> --
>
> Key: CASSANDRA-13464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13464
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Natsumi Kojima
>Assignee: Krishna Dattu Koneru
>Priority: Minor
>  Labels: materializedviews
>
> Failed to create Materialized view with a specific token range.
> Example :
> {code:java}
> $ ccm create "MaterializedView" -v 3.0.13
> $ ccm populate  -n 3
> $ ccm start
> $ ccm status
> Cluster: 'MaterializedView'
> ---
> node1: UP
> node3: UP
> node2: UP
> $ccm node1 cqlsh
> Connected to MaterializedView at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.0.13 | CQL spec 3.4.0 | Native protocol v4]
> Use HELP for help.
> cqlsh> CREATE KEYSPACE test WITH replication = {'class':'SimpleStrategy', 
> 'replication_factor':3};
> cqlsh> CREATE TABLE test.test ( id text PRIMARY KEY , value1 text , value2 
> text, value3 text);
> $ccm node1 ring test 
> Datacenter: datacenter1
> ==
> AddressRackStatus State   LoadOwns
> Token
>   
> 3074457345618258602
> 127.0.0.1  rack1   Up Normal  64.86 KB100.00% 
> -9223372036854775808
> 127.0.0.2  rack1   Up Normal  86.49 KB100.00% 
> -3074457345618258603
> 127.0.0.3  rack1   Up Normal  89.04 KB100.00% 
> 3074457345618258602
> $ ccm node1 cqlsh
> cqlsh> INSERT INTO test.test (id, value1 , value2, value3 ) VALUES ('aaa', 
> 'aaa', 'aaa' ,'aaa');
> cqlsh> INSERT INTO test.test (id, value1 , value2, value3 ) VALUES ('bbb', 
> 'bbb', 'bbb' ,'bbb');
> cqlsh> SELECT token(id),id,value1 FROM test.test;
>  system.token(id) | id  | value1
> --+-+
>  -4737872923231490581 | aaa |aaa
>  -3071845237020185195 | bbb |bbb
> (2 rows)
> cqlsh> CREATE MATERIALIZED VIEW test.test_view AS SELECT value1, id FROM 
> test.test WHERE id IS NOT NULL AND value1 IS NOT NULL AND TOKEN(id) > 
> -9223372036854775808 AND TOKEN(id) < -3074457345618258603 PRIMARY KEY(value1, 
> id) WITH CLUSTERING ORDER BY (id ASC);
> ServerError: java.lang.ClassCastException: 
> org.apache.cassandra.cql3.TokenRelation cannot be cast to 
> org.apache.cassandra.cql3.SingleColumnRelation
> {code}
> Stacktrace :
> {code:java}
> INFO  [MigrationStage:1] 2017-04-19 18:32:48,131 ColumnFamilyStore.java:389 - 
> Initializing test.test
> WARN  [SharedPool-Worker-1] 2017-04-19 18:44:07,263 FBUtilities.java:337 - 
> Trigger directory doesn't exist, please create it and try again.
> ERROR [SharedPool-Worker-1] 2017-04-19 18:46:10,072 QueryMessage.java:128 - 
> Unexpected error during query
> java.lang.ClassCastException: org.apache.cassandra.cql3.TokenRelation cannot 
> be cast to org.apache.cassandra.cql3.SingleColumnRelation
>   at 
> org.apache.cassandra.db.view.View.relationsToWhereClause(View.java:275) 
> ~[apache-cassandra-3.0.13.jar:3.0.13]
>   at 
> org.apache.cassandra.cql3.statements.CreateViewStatement.announceMigration(CreateViewStatement.java:219)
>  ~[apache-cassandra-3.0.13.jar:3.0.13]
>   at 
> org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:93)
>  ~[apache-cassandra-3.0.13.jar:3.0.13]
>   at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:206)
>  ~[apache-cassandra-3.0.13.jar:3.0.13]
>   at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:237) 
> ~[apache-cassandra-3.0.13.jar:3.0.13]
>   at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:222) 
> ~[apache-cassandra-3.0.13.jar:3.0.13]
>   at 
> org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:115)
>  ~[apache-cassandra-3.0.13.jar:3.0.13]
>   at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513)
>  [apache-cassandra-3.0.13.jar:3.0.13]
>   at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407)
>  [apache-cassandra-3.0.13.jar:3.0.13]
>   at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>  [netty-all-4.0.44.Final.jar:4.0.44.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
>  [netty-all-4.0.44.Final.jar:4.0.44.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35)
>  [netty

[jira] [Commented] (CASSANDRA-13418) Allow TWCS to ignore overlaps when dropping fully expired sstables

2017-08-22 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136544#comment-16136544
 ] 

Marcus Eriksson commented on CASSANDRA-13418:
-

bq. I enable uncheckedTombstoneCompaction when ignoreOverlaps is activated
Do we want this? It feels like if we expect to be able to drop entire sstables 
due to being expired, it would be pretty wasteful to run a single sstable 
tombstone compaction when there are 20% tombstones in the sstable? We would 
probably be better off waiting until 100% is expired and drop the entire 
sstable without compaction?
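To make the ignoreOverlaps semantics under discussion concrete, here is a hedged toy sketch. It is not Cassandra's real API; SSTable here is a stand-in with only the fields needed, and the method names are ours:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the ignore-overlaps option: an sstable is fully expired once
// all its data's deletion time has passed; by default it is still kept if a
// non-expired sstable overlaps its token range, since that sstable may hold
// older data that the expired one's tombstones still shadow.
public class ExpiredSelector
{
    static class SSTable
    {
        final long maxDeletionTime; // second at which the last cell expires
        final long minToken, maxToken;

        SSTable(long maxDeletionTime, long minToken, long maxToken)
        {
            this.maxDeletionTime = maxDeletionTime;
            this.minToken = minToken;
            this.maxToken = maxToken;
        }

        boolean overlaps(SSTable o)
        {
            return o.maxToken >= minToken && o.minToken <= maxToken;
        }
    }

    static List<SSTable> fullyExpired(List<SSTable> all, long nowSec, boolean ignoreOverlaps)
    {
        List<SSTable> expired = new ArrayList<>();
        for (SSTable s : all)
        {
            if (s.maxDeletionTime >= nowSec)
                continue; // still holds live data
            boolean blocked = false;
            if (!ignoreOverlaps)
                for (SSTable other : all)
                    if (other != s && other.maxDeletionTime >= nowSec && other.overlaps(s))
                        blocked = true; // a live overlapping sstable keeps s around
            if (!blocked)
                expired.add(s);
        }
        return expired;
    }
}
```

With ignoreOverlaps, the expired-but-overlapped sstable is dropped immediately, which is the data-reappearance trade-off the ticket description accepts.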

> Allow TWCS to ignore overlaps when dropping fully expired sstables
> --
>
> Key: CASSANDRA-13418
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Corentin Chary
>  Labels: twcs
> Attachments: twcs-cleanup.png
>
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If 
> you really want read-repairs you're going to have sstables blocking the 
> expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a 
> very low value and that will purge the blockers of old data that should 
> already have expired, thus removing the overlaps and allowing the other 
> SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have 
> time series, you might not care if all your data doesn't exactly expire at 
> the right time, or if data re-appears for some time, as long as it gets 
> deleted as soon as it can. And in this situation I believe it would be really 
> beneficial to allow users to simply ignore overlapping SSTables when looking 
> for fully expired ones.
> To the question: why would you need read-repairs ?
> - Full repairs basically take longer than the TTL of the data on my dataset, 
> so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be 
> enough to greatly reduce entropy of the most used data (and if you have 
> timeseries, you're likely to have a dashboard doing the same important 
> queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on 
> our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.






[jira] [Commented] (CASSANDRA-8457) nio MessagingService

2017-08-22 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136520#comment-16136520
 ] 

Sylvain Lebresne commented on CASSANDRA-8457:
-

Yes, I'm +1 on the patch on this specific ticket, and great job on it btw! 
But I do want to clarify that, imo, CASSANDRA-13630 should be finished as well 
before we call all of this complete/committable (I've already mentioned that 
to Jason offline and we seem to be on the same page, just mentioning it for 
the record).

> nio MessagingService
> 
>
> Key: CASSANDRA-8457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jonathan Ellis
>Assignee: Jason Brown
>Priority: Minor
>  Labels: netty, performance
> Fix For: 4.x
>
> Attachments: 8457-load.tgz
>
>
> Thread-per-peer (actually two each incoming and outbound) is a big 
> contributor to context switching, especially for larger clusters.  Let's look 
> at switching to nio, possibly via Netty.






[jira] [Commented] (CASSANDRA-13741) Replace Cassandra's lz4-1.3.0.jar with lz4-java-1.4.0.jar

2017-08-22 Thread Amitkumar Ghatwal (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136517#comment-16136517
 ] 

Amitkumar Ghatwal commented on CASSANDRA-13741:
---

Any updates on the tests, [~mkjellman]?

> Replace Cassandra's lz4-1.3.0.jar with lz4-java-1.4.0.jar
> -
>
> Key: CASSANDRA-13741
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13741
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Libraries
>Reporter: Amitkumar Ghatwal
>Assignee: Michael Kjellman
> Fix For: 4.x
>
>
> Hi All,
> The latest lz4-java library has been released 
> (https://github.com/lz4/lz4-java/releases) and uploaded to maven central . 
> Please replace the current version (1.3.0) in mainline with the latest one 
> (1.4.0) from here - http://repo1.maven.org/maven2/org/lz4/lz4-java/1.4.0/
> Adding : [~ReiOdaira].
> Regards,
> Amit


