[jira] [Updated] (CASSANDRA-15463) Fix in-jvm dtest java 11 compatibility

2019-12-20 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15463:

Reviewers: Alex Petrov

> Fix in-jvm dtest java 11 compatibility
> --
>
> Key: CASSANDRA-15463
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15463
> Project: Cassandra
> Issue Type: Bug
> Components: Test/dtest
> Reporter: Blake Eggleston
> Assignee: Blake Eggleston
> Priority: Normal
> Fix For: 4.0-alpha
>
>
> The URL classloader used by the in-JVM dtests is not accessible by default in 
> Java 11.
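
For context, here is a minimal sketch of the underlying behaviour, not the patch itself. From Java 9 onward the application class loader is no longer a `URLClassLoader`, so code that casts it in order to read the classpath URLs stops working. The class name `ClasspathUrls` and both method names are hypothetical, and rebuilding the URLs from the `java.class.path` system property is only one possible workaround; the ticket does not say that the fix works this way.

```java
import java.io.File;
import java.io.UncheckedIOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Cassandra.
class ClasspathUrls {
    // True on Java 8, false on Java 9+ where the system loader is an internal type.
    static boolean systemLoaderIsUrlClassLoader() {
        return ClassLoader.getSystemClassLoader() instanceof URLClassLoader;
    }

    // Rebuilds classpath URLs from a classpath string instead of casting the loader.
    static List<URL> classpathUrls(String classpath) {
        List<URL> urls = new ArrayList<>();
        for (String entry : classpath.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue; // skip empty entries such as a trailing separator
            }
            try {
                urls.add(new File(entry).toURI().toURL());
            } catch (MalformedURLException e) {
                throw new UncheckedIOException(e);
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        System.out.println("System loader is a URLClassLoader: " + systemLoaderIsUrlClassLoader());
        System.out.println(classpathUrls(System.getProperty("java.class.path", "")));
    }
}
```

A `URLClassLoader` built from these URLs can then be used as an isolated per-instance loader without touching the JDK-internal application loader.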



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15463) Fix in-jvm dtest java 11 compatibility

2019-12-20 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15463:

Test and Documentation Plan: circle
 Status: Patch Available  (was: Open)

[4.0|https://github.com/bdeggleston/cassandra/tree/15463-4.0]
[circle|https://circleci.com/workflow-run/f1845b60-c8be-41d1-8cea-b03f8a8bc064]

Added a flag to the ant build and reworked a test. The failing repair test 
also fails on trunk.

> Fix in-jvm dtest java 11 compatibility
> --
>
> Key: CASSANDRA-15463
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15463
> Project: Cassandra
> Issue Type: Bug
> Components: Test/dtest
> Reporter: Blake Eggleston
> Assignee: Blake Eggleston
> Priority: Normal
> Fix For: 4.0-alpha
>
>
> The URL classloader used by the in-JVM dtests is not accessible by default in 
> Java 11.






[jira] [Assigned] (CASSANDRA-15443) Add data modeling documentation to docs

2019-12-20 Thread Jon Haddad (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Haddad reassigned CASSANDRA-15443:
--

Reviewers: Jon Haddad
 Assignee: Jeffrey Carpenter

> Add data modeling documentation to docs
> ---
>
> Key: CASSANDRA-15443
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15443
> Project: Cassandra
> Issue Type: Task
> Components: Documentation/Blog
> Reporter: Nate McCall
> Assignee: Jeffrey Carpenter
> Priority: Normal
> Attachments: 0001-data-modeling-documentation.patch
>
>
> Jeff Carpenter and O'Reilly have offered to contribute a complete chapter on 
> data modeling from Cassandra, The Definitive Guide as a patch and thus under 
> ASFv2.0 license. 
> We've had this stubbed out for some time on our site so this will be a 
> fantastic addition:
> http://cassandra.apache.org/doc/latest/data_modeling/index.html
> This issue will be for converting the text to our site format. 
> For some background, see LEGAL-486. The consensus on a follow-up thread on 
> legal-discuss was that Jeff C. signing an ICLA and accepting this as a patch 
> was good enough: 
> https://lists.apache.org/thread.html/86485fd59bdb8d6b7932447c7cd6e1d50d23bb91aaf2680153855597%40%3Clegal-discuss.apache.org%3E






[jira] [Commented] (CASSANDRA-15443) Add data modeling documentation to docs

2019-12-20 Thread Jeffrey Carpenter (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17001192#comment-17001192
 ] 

Jeffrey Carpenter commented on CASSANDRA-15443:
---

I've attached a patch with the text of the chapter converted to .rst format and 
split into multiple files. Images are included. I've removed references to 
other parts of the book and, where applicable, replaced them with links to the 
corresponding parts of the documentation site.

> Add data modeling documentation to docs
> ---
>
> Key: CASSANDRA-15443
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15443
> Project: Cassandra
> Issue Type: Task
> Components: Documentation/Blog
> Reporter: Nate McCall
> Priority: Normal
> Attachments: 0001-data-modeling-documentation.patch
>
>
> Jeff Carpenter and O'Reilly have offered to contribute a complete chapter on 
> data modeling from Cassandra, The Definitive Guide as a patch and thus under 
> ASFv2.0 license. 
> We've had this stubbed out for some time on our site so this will be a 
> fantastic addition:
> http://cassandra.apache.org/doc/latest/data_modeling/index.html
> This issue will be for converting the text to our site format. 
> For some background, see LEGAL-486. The consensus on a follow-up thread on 
> legal-discuss was that Jeff C. signing an ICLA and accepting this as a patch 
> was good enough: 
> https://lists.apache.org/thread.html/86485fd59bdb8d6b7932447c7cd6e1d50d23bb91aaf2680153855597%40%3Clegal-discuss.apache.org%3E






[jira] [Updated] (CASSANDRA-15443) Add data modeling documentation to docs

2019-12-20 Thread Jeffrey Carpenter (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Carpenter updated CASSANDRA-15443:
--
Attachment: 0001-data-modeling-documentation.patch

> Add data modeling documentation to docs
> ---
>
> Key: CASSANDRA-15443
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15443
> Project: Cassandra
> Issue Type: Task
> Components: Documentation/Blog
> Reporter: Nate McCall
> Priority: Normal
> Attachments: 0001-data-modeling-documentation.patch
>
>
> Jeff Carpenter and O'Reilly have offered to contribute a complete chapter on 
> data modeling from Cassandra, The Definitive Guide as a patch and thus under 
> ASFv2.0 license. 
> We've had this stubbed out for some time on our site so this will be a 
> fantastic addition:
> http://cassandra.apache.org/doc/latest/data_modeling/index.html
> This issue will be for converting the text to our site format. 
> For some background, see LEGAL-486. The consensus on a follow-up thread on 
> legal-discuss was that Jeff C. signing an ICLA and accepting this as a patch 
> was good enough: 
> https://lists.apache.org/thread.html/86485fd59bdb8d6b7932447c7cd6e1d50d23bb91aaf2680153855597%40%3Clegal-discuss.apache.org%3E






[jira] [Comment Edited] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.

2019-12-20 Thread Chandrasekhar Thumuluru (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17001037#comment-17001037
 ] 

Chandrasekhar Thumuluru edited comment on CASSANDRA-15397 at 12/20/19 5:50 PM:
---

[~benedict] — Thanks for your inputs. 
 * I'll rename the class. I intentionally didn't do it in the first version of 
PR so it looks less distracting. 
 * I'll definitely do the performance comparison with million+ SSTables. Please 
note, my previous tests were not produced from read SSTables. The SSTable 
metadata was generated with random distributions. You can refer to the test 
files attached and let me know if you have any suggestions. I guess not using 
the real SSTables is fair to compare the performance of IntervalTree?
 *  I definitely share your concern on potential slowness due to linear scan, 
but I shared some code references in this 
[doc|https://docs.google.com/document/d/1vwo9ArZbtgWUwJcvZGes_69YVh4yiP9c7NQFgI0iynQ/edit?usp=sharing]
  which makes me believe we are still good. Let me know your thought on that 
too. 
* I'm willing to try the improvement proposed to the algorithm. I'll talk to my 
team to gather context around what you are suggesting and get back to you if 
I've any questions. 
* I'm definitely willing to try the proposed changes and don't mind even it the 
assumption turns out to be wrong. 


was (Author: cthumuluru):
[~benedict] — Thanks for your inputs. 
 * I'll rename the class. I intentionally didn't do it in the first version of 
PR so it looks less distracting. 
 * I'll definitely do the performance comparison with million+ SSTables. Please 
note, my previous tests were not produced from read SSTables. The SSTable 
metadata was generated with random distributions. You can refer to the test 
files attached and let me know if you have any suggestions. I guess not using 
the real SSTables is fair to compare the performance of IntervalTree?
 *  I definitely share your concern on potential slowness due to linear scan, 
but I shared some code reference in this 
[doc|https://docs.google.com/document/d/1vwo9ArZbtgWUwJcvZGes_69YVh4yiP9c7NQFgI0iynQ/edit?usp=sharing]
  which makes me believe we are still good. Let me know your thought on that 
too. 
* I'm willing to try the improvement proposed to the algorithm. I'll talk to my 
team to gather context around what you are suggesting and get back to you if 
I've any questions. 
* I'm definitely willing to try the proposed changes and don't mind even it the 
assumption turns out to be wrong. 

> IntervalTree performance comparison with Linear Walk and Binary Search based 
> Elimination. 
> --
>
> Key: CASSANDRA-15397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15397
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/SSTable
> Reporter: Chandrasekhar Thumuluru
> Assignee: Chandrasekhar Thumuluru
> Priority: Low
> Labels: pull-request-available
> Attachments: 95p_1_SSTable_with_5000_Searches.png, 
> 95p_15000_SSTable_with_5000_Searches.png, 
> 95p_2_SSTable_with_5000_Searches.png, 
> 95p_25000_SSTable_with_5000_Searches.png, 
> 95p_3_SSTable_with_5000_Searches.png, 
> 95p_5000_SSTable_with_5000_Searches.png, 
> 99p_1_SSTable_with_5000_Searches.png, 
> 99p_15000_SSTable_with_5000_Searches.png, 
> 99p_2_SSTable_with_5000_Searches.png, 
> 99p_25000_SSTable_with_5000_Searches.png, 
> 99p_3_SSTable_with_5000_Searches.png, 
> 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, 
> IntervalListWithElimination.java, IntervalTreeSimplified.java, 
> Mean_1_SSTable_with_5000_Searches.png, 
> Mean_15000_SSTable_with_5000_Searches.png, 
> Mean_2_SSTable_with_5000_Searches.png, 
> Mean_25000_SSTable_with_5000_Searches.png, 
> Mean_3_SSTable_with_5000_Searches.png, 
> Mean_5000_SSTable_with_5000_Searches.png, TESTS-TestSuites.xml.lz4, 
> replace_intervaltree_with_intervallist.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Cassandra uses IntervalTrees to identify the SSTables that overlap with 
> search interval. In Cassandra, IntervalTrees are not mutated. They are 
> recreated each time a mutation is required. This can be an issue during 
> repairs. In fact we noticed such issues during repair. 
> Since lists are cache friendly compared to linked lists and trees, I decided 
> to compare the search performance with:
> * Linear Walk.
> * Elimination using Binary Search (idea is to eliminate intervals using start 
> and end points of search interval). 
> Based on the tests I ran, binary-search-based elimination almost always 
> performs similarly to, or outperforms, IntervalTree-based search. The cost 
> of IntervalTree 

[jira] [Comment Edited] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.

2019-12-20 Thread Chandrasekhar Thumuluru (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17001037#comment-17001037
 ] 

Chandrasekhar Thumuluru edited comment on CASSANDRA-15397 at 12/20/19 5:50 PM:
---

[~benedict], thanks for your input. 
 * I'll rename the class. I intentionally didn't do it in the first version of 
the PR so that the change would be less distracting. 
 * I'll definitely do the performance comparison with a million+ SSTables. 
Please note that my previous tests were not produced from real SSTables; the 
SSTable metadata was generated with random distributions. You can refer to the 
attached test files and let me know if you have any suggestions. I assume that 
not using real SSTables is still a fair way to compare the performance of 
IntervalTree? 
 * I definitely share your concern about potential slowness due to the linear 
scan, but I shared some code references in this 
[doc|https://docs.google.com/document/d/1vwo9ArZbtgWUwJcvZGes_69YVh4yiP9c7NQFgI0iynQ/edit?usp=sharing]
which make me believe we are still good. Let me know your thoughts on that 
too. 
 * I'm willing to try the improvement proposed to the algorithm. I'll talk to my 
team to gather context around what you are suggesting and get back to you if 
I have any questions. 
 * I'm definitely willing to try the proposed changes, and I don't mind even if 
the assumption turns out to be wrong. 


was (Author: cthumuluru):
[~benedict] — Thanks for your inputs. 
 * I'll rename the class. I intentionally didn't do it in the first version of 
PR so it looks less distracting. 
 * I'll definitely do the performance comparison with million+ SSTables. Please 
note, my previous tests were not produced from read SSTables. The SSTable 
metadata was generated with random distributions. You can refer to the test 
files attached and let me know if you have any suggestions. I guess not using 
the real SSTables is fair to compare the performance of IntervalTree?
 *  I definitely share your concern on potential slowness due to linear scan, 
but I shared some code references in this 
[doc|https://docs.google.com/document/d/1vwo9ArZbtgWUwJcvZGes_69YVh4yiP9c7NQFgI0iynQ/edit?usp=sharing]
  which makes me believe we are still good. Let me know your thought on that 
too. 
* I'm willing to try the improvement proposed to the algorithm. I'll talk to my 
team to gather context around what you are suggesting and get back to you if 
I've any questions. 
* I'm definitely willing to try the proposed changes and don't mind even it the 
assumption turns out to be wrong. 

> IntervalTree performance comparison with Linear Walk and Binary Search based 
> Elimination. 
> --
>
> Key: CASSANDRA-15397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15397
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/SSTable
> Reporter: Chandrasekhar Thumuluru
> Assignee: Chandrasekhar Thumuluru
> Priority: Low
> Labels: pull-request-available
> Attachments: 95p_1_SSTable_with_5000_Searches.png, 
> 95p_15000_SSTable_with_5000_Searches.png, 
> 95p_2_SSTable_with_5000_Searches.png, 
> 95p_25000_SSTable_with_5000_Searches.png, 
> 95p_3_SSTable_with_5000_Searches.png, 
> 95p_5000_SSTable_with_5000_Searches.png, 
> 99p_1_SSTable_with_5000_Searches.png, 
> 99p_15000_SSTable_with_5000_Searches.png, 
> 99p_2_SSTable_with_5000_Searches.png, 
> 99p_25000_SSTable_with_5000_Searches.png, 
> 99p_3_SSTable_with_5000_Searches.png, 
> 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, 
> IntervalListWithElimination.java, IntervalTreeSimplified.java, 
> Mean_1_SSTable_with_5000_Searches.png, 
> Mean_15000_SSTable_with_5000_Searches.png, 
> Mean_2_SSTable_with_5000_Searches.png, 
> Mean_25000_SSTable_with_5000_Searches.png, 
> Mean_3_SSTable_with_5000_Searches.png, 
> Mean_5000_SSTable_with_5000_Searches.png, TESTS-TestSuites.xml.lz4, 
> replace_intervaltree_with_intervallist.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Cassandra uses IntervalTrees to identify the SSTables that overlap with 
> search interval. In Cassandra, IntervalTrees are not mutated. They are 
> recreated each time a mutation is required. This can be an issue during 
> repairs. In fact we noticed such issues during repair. 
> Since lists are cache friendly compared to linked lists and trees, I decided 
> to compare the search performance with:
> * Linear Walk.
> * Elimination using Binary Search (idea is to eliminate intervals using start 
> and end points of search interval). 
> Based on the tests I ran, binary-search-based elimination almost always 
> performs similarly to, or outperforms, IntervalTree-based search. The cost 
> of IntervalTree 

[jira] [Commented] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.

2019-12-20 Thread Chandrasekhar Thumuluru (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17001037#comment-17001037
 ] 

Chandrasekhar Thumuluru commented on CASSANDRA-15397:
-

[~benedict] — Thanks for your inputs. 
 * I'll rename the class. I intentionally didn't do it in the first version of 
PR so it looks less distracting. 
 * I'll definitely do the performance comparison with million+ SSTables. Please 
note, my previous tests were not produced from read SSTables. The SSTable 
metadata was generated with random distributions. You can refer to the test 
files attached and let me know if you have any suggestions. I guess not using 
the real SSTables is fair to compare the performance of IntervalTree?
 *  I definitely share your concern on potential slowness due to linear scan, 
but I shared some code reference in this 
[doc|https://docs.google.com/document/d/1vwo9ArZbtgWUwJcvZGes_69YVh4yiP9c7NQFgI0iynQ/edit?usp=sharing]
  which makes me believe we are still good. Let me know your thought on that 
too. 
* I'm willing to try the improvement proposed to the algorithm. I'll talk to my 
team to gather context around what you are suggesting and get back to you if 
I've any questions. 
* I'm definitely willing to try the proposed changes and don't mind even it the 
assumption turns out to be wrong. 

> IntervalTree performance comparison with Linear Walk and Binary Search based 
> Elimination. 
> --
>
> Key: CASSANDRA-15397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15397
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/SSTable
> Reporter: Chandrasekhar Thumuluru
> Assignee: Chandrasekhar Thumuluru
> Priority: Low
> Labels: pull-request-available
> Attachments: 95p_1_SSTable_with_5000_Searches.png, 
> 95p_15000_SSTable_with_5000_Searches.png, 
> 95p_2_SSTable_with_5000_Searches.png, 
> 95p_25000_SSTable_with_5000_Searches.png, 
> 95p_3_SSTable_with_5000_Searches.png, 
> 95p_5000_SSTable_with_5000_Searches.png, 
> 99p_1_SSTable_with_5000_Searches.png, 
> 99p_15000_SSTable_with_5000_Searches.png, 
> 99p_2_SSTable_with_5000_Searches.png, 
> 99p_25000_SSTable_with_5000_Searches.png, 
> 99p_3_SSTable_with_5000_Searches.png, 
> 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, 
> IntervalListWithElimination.java, IntervalTreeSimplified.java, 
> Mean_1_SSTable_with_5000_Searches.png, 
> Mean_15000_SSTable_with_5000_Searches.png, 
> Mean_2_SSTable_with_5000_Searches.png, 
> Mean_25000_SSTable_with_5000_Searches.png, 
> Mean_3_SSTable_with_5000_Searches.png, 
> Mean_5000_SSTable_with_5000_Searches.png, TESTS-TestSuites.xml.lz4, 
> replace_intervaltree_with_intervallist.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Cassandra uses IntervalTrees to identify the SSTables that overlap with 
> search interval. In Cassandra, IntervalTrees are not mutated. They are 
> recreated each time a mutation is required. This can be an issue during 
> repairs. In fact we noticed such issues during repair. 
> Since lists are cache friendly compared to linked lists and trees, I decided 
> to compare the search performance with:
> * Linear Walk.
> * Elimination using Binary Search (idea is to eliminate intervals using start 
> and end points of search interval). 
> Based on the tests I ran, binary-search-based elimination almost always 
> performs similarly to, or outperforms, IntervalTree-based search. The cost 
> of IntervalTree construction is also substantial, and it produces a lot of 
> garbage during repairs. 
> I ran the tests using random intervals to build the tree/lists and another 
> randomly generated search interval, with 5000 iterations. I'm attaching all 
> the relevant graphs. The x-axis in the graphs is the search interval 
> coverage: 10p means the search interval covered 10% of the intervals. The 
> y-axis is the time the search took, in nanoseconds. 
> PS: 
> # For the purposes of the test, I simplified the IntervalTree by removing the 
> data portion of the interval, and changed the templated version (Java 
> generics) to a specialized version. 
> # I used the code from Cassandra version _3.11_.
> # Times in the graphs are in nanoseconds. 
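
The two list-based strategies compared in the description can be sketched roughly as follows. This is an illustrative reconstruction under stated assumptions, not the attached `IntervalList.java` or `IntervalListWithElimination.java`: the class name `IntervalListSketch`, the closed-interval semantics, and the choice to eliminate only on the start point (every interval that starts after the query's end) are all assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: intervals kept in flat arrays, sorted by start point.
class IntervalListSketch {
    private final long[] starts; // sorted ascending
    private final long[] ends;   // ends[i] is the end of the interval starting at starts[i]

    IntervalListSketch(long[][] intervals) {
        long[][] sorted = intervals.clone();
        Arrays.sort(sorted, Comparator.comparingLong((long[] iv) -> iv[0]));
        starts = new long[sorted.length];
        ends = new long[sorted.length];
        for (int i = 0; i < sorted.length; i++) {
            starts[i] = sorted[i][0];
            ends[i] = sorted[i][1];
        }
    }

    // Linear walk: test every interval against the closed query [qs, qe].
    List<long[]> searchLinear(long qs, long qe) {
        return collect(starts.length, qs, qe);
    }

    // Elimination: binary search drops every interval that starts after qe,
    // then only the remaining prefix is walked.
    List<long[]> searchWithElimination(long qs, long qe) {
        return collect(firstStartAfter(qe), qs, qe);
    }

    // Index of the first start strictly greater than key (upper bound).
    private int firstStartAfter(long key) {
        int lo = 0, hi = starts.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (starts[mid] <= key) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    // Collects overlapping intervals among the first `limit` entries.
    private List<long[]> collect(int limit, long qs, long qe) {
        List<long[]> out = new ArrayList<>();
        for (int i = 0; i < limit; i++) {
            if (starts[i] <= qe && ends[i] >= qs) {
                out.add(new long[] { starts[i], ends[i] });
            }
        }
        return out;
    }

    public static void main(String[] args) {
        IntervalListSketch list = new IntervalListSketch(
                new long[][] { {5, 8}, {1, 3}, {10, 12}, {2, 6} });
        for (long[] iv : list.searchWithElimination(4, 9)) {
            System.out.println(Arrays.toString(iv));
        }
    }
}
```

Both searches return the same result set; the elimination variant simply walks a shorter prefix when the query ends early in the sorted order. Rebuilding the structure costs a sort, which is the construction cost that would need to be weighed against rebuilding an IntervalTree.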






[jira] [Commented] (CASSANDRA-14688) Update protocol spec and class level doc with protocol checksumming details

2019-12-20 Thread Sam Tunnicliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17000784#comment-17000784
 ] 

Sam Tunnicliffe commented on CASSANDRA-14688:
-

[~csplinter] the linked commit reflects the protocol as it currently stands, 
but it will need revising again when CASSANDRA-15299 lands, which should be 
early in the new year, so I'm holding off on this until then.

> Update protocol spec and class level doc with protocol checksumming details
> ---
>
> Key: CASSANDRA-14688
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14688
> Project: Cassandra
> Issue Type: Task
> Components: Legacy/Documentation and Website
> Reporter: Sam Tunnicliffe
> Assignee: Sam Tunnicliffe
> Priority: Normal
> Labels: protocolv5
> Fix For: 4.0, 4.0-beta
>
>
> CASSANDRA-13304 provides an option to add checksumming to the frame body of 
> native protocol messages. The native protocol spec needs to be updated to 
> reflect this ASAP. We should also verify that the javadoc comments describing 
> the on-wire format in 
> {{o.a.c.transport.frame.checksum.ChecksummingTransformer}} are up to date.
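
To illustrate the general idea of checksumming a message body, here is a minimal sketch. It is not the native protocol's on-wire format and does not reproduce `ChecksummingTransformer`: the class name `ChecksummedPayload`, the length-body-checksum layout, and the use of CRC32 are assumptions chosen only to show the technique.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical layout for illustration: [int length][body bytes][int crc32(body)].
class ChecksummedPayload {
    // Prefixes the body with its length and appends a CRC32 of the body.
    static byte[] wrap(byte[] body) {
        return ByteBuffer.allocate(4 + body.length + 4)
                .putInt(body.length)
                .put(body)
                .putInt(crc32(body))
                .array();
    }

    // Validates the length and checksum, then returns the body.
    static byte[] unwrap(byte[] framed) {
        ByteBuffer buf = ByteBuffer.wrap(framed);
        int length = buf.getInt();
        if (length < 0 || length != buf.remaining() - 4) {
            throw new IllegalArgumentException("length field does not match payload size");
        }
        byte[] body = new byte[length];
        buf.get(body);
        int expected = buf.getInt();
        if (crc32(body) != expected) {
            throw new IllegalStateException("checksum mismatch");
        }
        return body;
    }

    private static int crc32(byte[] bytes) {
        CRC32 crc = new CRC32();
        crc.update(bytes, 0, bytes.length);
        return (int) crc.getValue(); // CRC32 fits in the low 32 bits
    }

    public static void main(String[] args) {
        byte[] framed = wrap("SELECT 1".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(unwrap(framed), StandardCharsets.UTF_8));
    }
}
```

A receiver that gets a corrupted body fails in `unwrap` instead of decoding bad data. Details of this kind (field order, widths, checksum algorithm) are what the protocol spec and javadoc need to state precisely.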


