[jira] [Commented] (CASSANDRA-14346) Scheduled Repair in Cassandra

2018-08-20 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586930#comment-16586930
 ] 

mck commented on CASSANDRA-14346:
-

[~kohlisankalp],
> any timeline when we can expect a patch for it? 

We hope to have an answer to this by next week.

> Scheduled Repair in Cassandra
> -
>
> Key: CASSANDRA-14346
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14346
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Repair
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Major
>  Labels: 4.0-feature-freeze-review-requested, 
> CommunityFeedbackRequested
> Fix For: 4.0
>
> Attachments: ScheduledRepairV1_20180327.pdf
>
>
> There have been many attempts to automate repair in Cassandra, which makes 
> sense given that it is necessary to give our users eventual consistency. Most 
> recently CASSANDRA-10070, CASSANDRA-8911 and CASSANDRA-13924 have all looked 
> for ways to solve this problem.
> At Netflix we've built a scheduled repair service within Priam (our sidecar), 
> which we spoke about last year at NGCC. Given the positive feedback at NGCC 
> we focussed on getting it production ready and have now been using it in 
> production to repair hundreds of clusters, tens of thousands of nodes, and 
> petabytes of data for the past six months. Also based on feedback at NGCC we 
> have invested effort in figuring out how to integrate this natively into 
> Cassandra rather than open sourcing it as an external service (e.g. in Priam).
> As such, [~vinaykumarcse] and I would like to re-work and merge our 
> implementation into Cassandra, and have created a [design 
> document|https://docs.google.com/document/d/1RV4rOrG1gwlD5IljmrIq_t45rz7H3xs9GbFSEyGzEtM/edit?usp=sharing]
>  showing how we plan to make it happen, including the user interface.
> As we work on the code migration from Priam to Cassandra, any feedback would 
> be greatly appreciated about the interface or v1 implementation features. I 
> have tried to call out in the document features which we explicitly consider 
> future work (as well as a path forward to implement them in the future) 
> because I would very much like to get this done before the 4.0 merge window 
> closes, and to do that I think aggressively pruning scope is going to be a 
> necessity.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14346) Scheduled Repair in Cassandra

2018-08-20 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586848#comment-16586848
 ] 

Jeff Jirsa commented on CASSANDRA-14346:


For the record, GPL is [category X (excluded / 
banned)|https://www.apache.org/legal/resolved.html#category-x] , so that's not 
an option. The category B CDDL option here is the right one, which is [what 
hadoop 
does|https://github.com/apache/hadoop/blob/00013d6ef7fdf65fa8a0f6eb56c0aef2f6e19444/LICENSE.txt#L941-L951]









[jira] [Comment Edited] (CASSANDRA-14631) Add RSS support for Cassandra blog

2018-08-20 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586672#comment-16586672
 ] 

Dinesh Joshi edited comment on CASSANDRA-14631 at 8/20/18 11:52 PM:


The weird <%HTML%> issue seems to be a bug in the RSS feed reader that I was 
using. I tried a different feed reader and that worked fine.

The alignment looks good. Regarding the timestamp, the feed uses the post's 
timestamp so we should be good.

I'm +1 on this.

[~zznate] could you please help commit this? :)


was (Author: djoshi3):
The alignment looks good. Regarding the timestamp, the feed uses the post's 
timestamp so we should be good.

I'm +1 on this.


[~zznate] could you please help commit this? :)

> Add RSS support for Cassandra blog
> --
>
> Key: CASSANDRA-14631
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14631
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: Jacques-Henri Berthemet
>Assignee: Jeff Beck
>Priority: Major
> Attachments: 14631-site.txt, Screen Shot 2018-08-17 at 5.32.08 
> PM.png, Screen Shot 2018-08-17 at 5.32.25 PM.png
>
>
> It would be convenient to add RSS support to Cassandra blog:
> [http://cassandra.apache.org/blog/2018/08/07/faster_streaming_in_cassandra.html]
> And maybe also for other resources like new versions, but this ticket is 
> about blog.
>  
> {quote}From: Scott Andreas
> Sent: Wednesday, August 08, 2018 6:53 PM
> To: [d...@cassandra.apache.org|mailto:d...@cassandra.apache.org]
> Subject: Re: Apache Cassandra Blog is now live
>  
> Please feel free to file a ticket (label: Documentation and Website).
>  
> It looks like Jekyll, the static site generator used to build the website, 
> has a plugin that generates Atom feeds if someone would like to work on 
> adding one: [https://github.com/jekyll/jekyll-feed]
> {quote}






[jira] [Updated] (CASSANDRA-14631) Add RSS support for Cassandra blog

2018-08-20 Thread Dinesh Joshi (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-14631:
-
Status: Ready to Commit  (was: Patch Available)




[jira] [Commented] (CASSANDRA-14631) Add RSS support for Cassandra blog

2018-08-20 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586672#comment-16586672
 ] 

Dinesh Joshi commented on CASSANDRA-14631:
--

The alignment looks good. Regarding the timestamp, the feed uses the post's 
timestamp so we should be good.

I'm +1 on this.


[~zznate] could you please help commit this? :)




[jira] [Commented] (CASSANDRA-14346) Scheduled Repair in Cassandra

2018-08-20 Thread sankalp kohli (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586492#comment-16586492
 ] 

sankalp kohli commented on CASSANDRA-14346:
---

As per the dev mailing list, Reaper is also being considered for this, which is 
great news. Let's see how we can get the best out of these implementations.

[~michaelsembwever] any timeline when we can expect a patch for it? 




[jira] [Commented] (CASSANDRA-14409) Transient Replication: Support ring changes when transient replication is in use (add/remove node, change RF, add/remove DC)

2018-08-20 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586489#comment-16586489
 ] 

Benedict commented on CASSANDRA-14409:
--

I've pushed a couple of new commits to include the unit tests that I forgot 
to add. These also exposed some inconsistencies in the semantics that would 
not have affected any of the current use cases, but were suboptimal and worth 
fixing to avoid any future surprises. Specifically, they arise when filtering 
an immutable view of a Mutable with a predicate that matches all items (and in 
similar scenarios where the subset is the same as the collection being 
operated on).

To fix this, I've introduced a new variable {{isSnapshot}} that tracks whether 
the collection is a view or a snapshot (i.e. can be expected to be truly 
immutable / stable).
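
To make the view-versus-snapshot distinction concrete, here is a minimal Python sketch. It is not the code from the patch: the class names, the {{is_snapshot}} attribute and the "return self when nothing was filtered out" shortcut are all assumptions, chosen as one plausible reading of the problem described above.

```python
from typing import Callable, List


class EndpointList:
    """Hypothetical read-only collection; not Cassandra's actual class."""

    def __init__(self, items: List[str], is_snapshot: bool):
        self._items = items              # may be shared with a mutable owner
        self.is_snapshot = is_snapshot   # True only if nobody can mutate _items

    def __len__(self) -> int:
        return len(self._items)

    def filter(self, predicate: Callable[[str], bool]) -> "EndpointList":
        kept = [x for x in self._items if predicate(x)]
        # Returning `self` when nothing was filtered out is only safe for a
        # snapshot; a view could still change underneath the caller.
        if self.is_snapshot and len(kept) == len(self._items):
            return self
        return EndpointList(kept, is_snapshot=True)


class MutableEndpointList:
    """Hypothetical mutable owner that can hand out a read-only view."""

    def __init__(self) -> None:
        self._items: List[str] = []

    def add(self, item: str) -> None:
        self._items.append(item)

    def as_view(self) -> EndpointList:
        # Shares the backing list, so this is a view, not a snapshot.
        return EndpointList(self._items, is_snapshot=False)


mutable = MutableEndpointList()
mutable.add("10.0.0.1")
view = mutable.as_view()
filtered = view.filter(lambda _: True)   # predicate matches every item
mutable.add("10.0.0.2")                  # a later change to the owner
print(len(view), len(filtered))          # the view grows, the filtered copy does not
```

Without the flag, the shortcut would hand back the live view, and the "filtered" result would silently change when the owner does.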

> Transient Replication: Support ring changes when transient replication is in 
> use (add/remove node, change RF, add/remove DC)
> 
>
> Key: CASSANDRA-14409
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14409
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Coordination, Core, Documentation and Website
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
>Priority: Major
> Fix For: 4.0
>
>
> The additional state transitions that transient replication introduces 
> require streaming and nodetool cleanup to behave differently. We already have 
> code that does the streaming, but in some cases we shouldn't stream any data 
> and in others when we stream to receive data we have to make sure we stream 
> from a full replica and not a transient replica.
> Transitioning from not replicated to transiently replicated means that a node 
> must stay pending until the next incremental repair completes at which point 
> the data for that range is known to be available at full replicas.
> Transitioning from transiently replicated to fully replicated requires 
> streaming from a full replica and is identical to how we stream from not 
> replicated to replicated. The transition must be managed so the transient 
> replica is not read from as a full replica until streaming completes. It can 
> be used immediately for a write quorum.
> Transitioning from fully replicated to transiently replicated requires 
> cleanup to remove repaired data from the transiently replicated range to 
> reclaim space. It can be used immediately for a write quorum.
> Transitioning from transiently replicated to not replicated requires cleanup 
> to be run to remove the formerly transiently replicated data.
> nodetool move, removenode, cleanup, decommission, and rebuild need to handle 
> these issues as does bootstrap.
> Update web site, documentation, NEWS.txt with a description of the steps for 
> doing common operations. Add/remove DC, Add/remove node(s), replace node, 
> change RF.






[jira] [Commented] (CASSANDRA-14631) Add RSS support for Cassandra blog

2018-08-20 Thread Jeff Beck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586463#comment-16586463
 ] 

Jeff Beck commented on CASSANDRA-14631:
---

Updated the patch with better alignment. 

The main concern I have is that whenever I generate the site locally, the blog 
actually changes due to the different dates in the post and the published 
versions. Are you seeing that, or is it working correctly for you?




[jira] [Updated] (CASSANDRA-14631) Add RSS support for Cassandra blog

2018-08-20 Thread Jeff Beck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Beck updated CASSANDRA-14631:
--
Attachment: 14631-site.txt




[jira] [Updated] (CASSANDRA-14631) Add RSS support for Cassandra blog

2018-08-20 Thread Jeff Beck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Beck updated CASSANDRA-14631:
--
Attachment: (was: 14631-site.txt)




[jira] [Updated] (CASSANDRA-13304) Add checksumming to the native protocol

2018-08-20 Thread Dinesh Joshi (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-13304:
-
Reviewers: Dinesh Joshi, Jordan West

> Add checksumming to the native protocol
> ---
>
> Key: CASSANDRA-13304
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13304
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Michael Kjellman
>Assignee: Sam Tunnicliffe
>Priority: Blocker
>  Labels: client-impacting
> Fix For: 4.x
>
> Attachments: 13304_v1.diff, boxplot-read-throughput.png, 
> boxplot-write-throughput.png
>
>
> The native binary transport implementation doesn't include checksums. This 
> makes it highly susceptible to silently inserting corrupted data either due 
> to hardware issues causing bit flips on the sender/client side, C*/receiver 
> side, or network in between.
> Attaching an implementation that makes checksum'ing mandatory (assuming both 
> client and server know about a protocol version that supports checksums) -- 
> and also adds checksumming to clients that request compression.
> The serialized format looks something like this:
> {noformat}
>  *  1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
>  *  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |  Number of Compressed Chunks  | Compressed Length (e1)/
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * /  Compressed Length cont. (e1) |Uncompressed Length (e1)   /
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * | Uncompressed Length cont. (e1)| CRC32 Checksum of Lengths (e1)|
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * | Checksum of Lengths cont. (e1)|Compressed Bytes (e1)+//
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |  CRC32 Checksum (e1) ||
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |Compressed Length (e2) |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |   Uncompressed Length (e2)|
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |CRC32 Checksum of Lengths (e2) |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * | Compressed Bytes (e2)   +//
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |  CRC32 Checksum (e2) ||
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |Compressed Length (en) |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |   Uncompressed Length (en)|
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |CRC32 Checksum of Lengths (en) |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |  Compressed Bytes (en)  +//
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |  CRC32 Checksum (en) ||
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 
> {noformat}
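
The layout in the diagram above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the attached patch: the field widths are read off the diagram (16-bit chunk count, 32-bit lengths and checksums), big-endian byte order is assumed, the body checksum is assumed to cover the compressed bytes, and zlib stands in for LZ4 because it ships with the standard library. The function names are hypothetical.

```python
import struct
import zlib
from typing import List


def encode_chunks(chunks: List[bytes]) -> bytes:
    """Serialize chunks as: count, then per chunk lengths, CRC, bytes, CRC."""
    out = [struct.pack(">H", len(chunks))]            # 16-bit number of chunks
    for raw in chunks:
        compressed = zlib.compress(raw)               # stand-in for LZ4
        lengths = struct.pack(">II", len(compressed), len(raw))
        out.append(lengths)                           # compressed + uncompressed length
        out.append(struct.pack(">I", zlib.crc32(lengths)))      # CRC32 of the lengths
        out.append(compressed)
        out.append(struct.pack(">I", zlib.crc32(compressed)))   # CRC32 of the bytes
    return b"".join(out)


def decode_chunks(frame: bytes) -> List[bytes]:
    """Parse a frame built by encode_chunks, verifying every checksum."""
    (count,) = struct.unpack_from(">H", frame, 0)
    pos = 2
    result = []
    for _ in range(count):
        lengths = frame[pos:pos + 8]
        compressed_len, raw_len = struct.unpack(">II", lengths)
        (lengths_crc,) = struct.unpack_from(">I", frame, pos + 8)
        if zlib.crc32(lengths) != lengths_crc:
            raise ValueError("corrupted chunk lengths")
        pos += 12
        compressed = frame[pos:pos + compressed_len]
        pos += compressed_len
        (body_crc,) = struct.unpack_from(">I", frame, pos)
        pos += 4
        if zlib.crc32(compressed) != body_crc:
            raise ValueError("corrupted chunk body")
        raw = zlib.decompress(compressed)
        if len(raw) != raw_len:
            raise ValueError("unexpected uncompressed length")
        result.append(raw)
    return result


frame = encode_chunks([b"first chunk", b"second chunk"])
print(decode_chunks(frame))
```

Checksumming the lengths separately means a flipped bit in a length field is caught before the reader tries to slice the body with a bogus size.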
> The first pass here adds checksums only to the actual contents of the frame 
> body itself (and doesn't actually checksum lengths and headers). While it 
> would be great to fully add checksumming across the entire protocol, the 
> proposed implementation will ensure we at least catch corrupted data and 
> likely protect ourselves pretty well anyways.
> I didn't go to the trouble of implementing a Snappy Checksum'ed Compressor 
> implementation as it's been deprecated for a while -- is really slow and 
> crappy compared to LZ4 -- and we should do everything in our power to make 
> sure no one in the community is still using it. I left it in (for obvious 
> backwards compatibility reasons) for old clients that don't know about the 
> new protocol.
> The current protocol has a 256MB (max) frame body -- where the serialized 
> contents are simply written in to the frame body.
> If the client sends a compression option in the startup, we will install a 
> FrameCompressor inline. Unfortunately, we went with a decision to treat the 
> frame body separately from the header bits etc in a given message. So, 
> instead we put a compressor implementation in the options and then if it's 
> not null, we push the serialized bytes for the frame body 

[jira] [Commented] (CASSANDRA-14631) Add RSS support for Cassandra blog

2018-08-20 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586308#comment-16586308
 ] 

Dinesh Joshi commented on CASSANDRA-14631:
--

Hi [~beckje01], I'm getting the same XML so that's not the issue. I think it is 
the way RSS Follower is interpreting the feed. I was concerned that other RSS 
readers may also interpret it the same way. This could be a bug with RSS 
Follower. If you could try getting the alignment issue fixed, I'm +1 on the 
patch.




[jira] [Commented] (CASSANDRA-8303) Create a capability limitation framework

2018-08-20 Thread Sam Tunnicliffe (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586185#comment-16586185
 ] 

Sam Tunnicliffe commented on CASSANDRA-8303:


bq. Say you have an application user with role RESTRICTED as locked down as 
possible. Now you have a tool that needs MULTI_PARTITION_READ, shouldn't you be 
able to just grant that capability based on RESTRICTED? Instead you'd have to 
create a new role, RESTRICTED_W_MULTI for that use case.

I think you'd want to be more granular than that if your intention is to have 
multiple user roles with a fair amount of overlap between their perms and 
restrictions. So for instance, you could deconstruct your RESTRICTED role into 
2 - one with the necessary permissions (r1) and the other with the most locked 
down restrictions (r2) and grant both to a third, user role (u1). For your new 
use case, you would add another role with fewer restrictions (r3) and 
another user role (u2) which is granted r1 & r3. I can appreciate that this may 
be a little verbose, but obviously the better the decomposition, the less 
duplication required. I also think it's a more intuitive way to manage 
restrictions (see the next point).
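
The decomposition described above (r1, r2, r3 granted to u1 and u2) can be sketched as follows. This is a toy model, not the proposed implementation: the {{Role}} class, the rule that restrictions from granted roles are simply unioned, and the capability name ALLOW_FILTERING are assumptions made for illustration. Only MULTI_PARTITION_READ comes from the discussion itself.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set, Tuple


@dataclass(frozen=True)
class Role:
    """Hypothetical role: own permissions/restrictions plus granted roles."""
    name: str
    permissions: FrozenSet[str] = frozenset()
    restrictions: FrozenSet[str] = frozenset()
    granted: Tuple["Role", ...] = ()

    def effective_permissions(self) -> Set[str]:
        result = set(self.permissions)
        for role in self.granted:
            result |= role.effective_permissions()
        return result

    def effective_restrictions(self) -> Set[str]:
        result = set(self.restrictions)
        for role in self.granted:
            result |= role.effective_restrictions()
        return result

    def can(self, permission: str, capability: str) -> bool:
        # Capabilities are enabled unless some granted role restricts them.
        return (permission in self.effective_permissions()
                and capability not in self.effective_restrictions())


# r1 carries the permissions, r2 the most locked-down restrictions,
# r3 a looser set of restrictions for the new use case.
r1 = Role("r1", permissions=frozenset({"SELECT"}))
r2 = Role("r2", restrictions=frozenset({"MULTI_PARTITION_READ", "ALLOW_FILTERING"}))
r3 = Role("r3", restrictions=frozenset({"ALLOW_FILTERING"}))

u1 = Role("u1", granted=(r1, r2))   # the original restricted application user
u2 = Role("u2", granted=(r1, r3))   # the tool that needs multi-partition reads

print(u1.can("SELECT", "MULTI_PARTITION_READ"))
print(u2.can("SELECT", "MULTI_PARTITION_READ"))
```

In this model u1 is refused multi-partition reads while u2 is allowed them, and neither role can use the filtering capability. The permission role r1 is shared, so only the restriction sets differ between the two users.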

bq. If you opt-in to use the capability system, new roles should not have any 
capabilities enabled, comparable to permissions for roles. Then, based on your 
data model and use cases, you should be able to grant the minimal set of 
capabilities for that particular use case and role.

I actually disagree with this. I think it would be much more practical for all 
capabilities to be enabled by default and then restrictions added as required. 
This is obviously the situation right now (i.e. granting permissions allows 
a user to use them to their fullest extent) and I think that's the way that 
users will expect this to work. Plus, it makes enabling the feature on an 
existing system more straightforward and less surprising.

bq. Although it still isn't clear to me what we aim for by restricting CLs.

It can be quite common for operators to want to enforce specific CLs on 
particular tables, e.g. to ensure that latency SLAs are not breached due to 
clients blocking on cross-DC requests, or to help enforce availability 
guarantees by disallowing CL.ALL.

I'm going to rebase this and give it another look over, on the off chance that 
someone has a chance to review it before the 4.0 freeze.

> Create a capability limitation framework
> 
>
> Key: CASSANDRA-8303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8303
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Distributed Metadata
>Reporter: Anupam Arora
>Priority: Major
> Fix For: 4.x
>
>
> In addition to our current Auth framework that acts as a white list, and 
> regulates access to data, functions, and roles, it would be beneficial to 
> have a different, capability limitation framework, that would be orthogonal 
> to Auth, and would act as a blacklist.
> Example uses:
> - take away the ability to TRUNCATE from all users but the admin (TRUNCATE 
> itself would still require MODIFY permission)
> - take away the ability to use ALLOW FILTERING from all users but 
> Spark/Hadoop (SELECT would still require SELECT permission)
> - take away the ability to use UNLOGGED BATCH from everyone (the operation 
> itself would still require MODIFY permission)
> - take away the ability to use certain consistency levels (make certain 
> tables LWT-only for all users, for example)
> Original description:
> Please provide a "strict mode" option in cassandra that will kick out any CQL 
> queries that are expensive, e.g. any query with ALLOW FILTERING, 
> multi-partition queries, secondary index queries, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14657) Handle failures in upgradesstables/cleanup/relocate

2018-08-20 Thread Marcus Eriksson (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-14657:

Reviewer: Benedict

> Handle failures in upgradesstables/cleanup/relocate
> ---
>
> Key: CASSANDRA-14657
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14657
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Major
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> If a compaction in {{parallelAllSSTableOperation}} throws an exception, all 
> current transactions are closed. This can make us close a transaction that 
> has not yet finished (since we can run many of these compactions in 
> parallel). This causes this error:
> {code}
> java.lang.IllegalStateException: Cannot prepare to commit unless IN_PROGRESS; 
> state is ABORTED
> {code}
> and this can get the leveled manifest (if running LCS) in a bad state causing 
> this error message:
> {code}
> Could not acquire references for compacting SSTables ...
> {code}






[jira] [Commented] (CASSANDRA-14657) Handle failures in upgradesstables/cleanup/relocate

2018-08-20 Thread Marcus Eriksson (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586183#comment-16586183
 ] 

Marcus Eriksson commented on CASSANDRA-14657:
-

patch here: https://github.com/krummas/cassandra/commits/marcuse/14657

needs a couple of tests, will add soon

> Handle failures in upgradesstables/cleanup/relocate
> ---
>
> Key: CASSANDRA-14657
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14657
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Major
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> If a compaction in {{parallelAllSSTableOperation}} throws an exception, all 
> current transactions are closed. This can make us close a transaction that 
> has not yet finished (since we can run many of these compactions in 
> parallel). This causes this error:
> {code}
> java.lang.IllegalStateException: Cannot prepare to commit unless IN_PROGRESS; 
> state is ABORTED
> {code}
> and this can get the leveled manifest (if running LCS) in a bad state causing 
> this error message:
> {code}
> Could not acquire references for compacting SSTables ...
> {code}






[jira] [Created] (CASSANDRA-14657) Handle failures in upgradesstables/cleanup/relocate

2018-08-20 Thread Marcus Eriksson (JIRA)
Marcus Eriksson created CASSANDRA-14657:
---

 Summary: Handle failures in upgradesstables/cleanup/relocate
 Key: CASSANDRA-14657
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14657
 Project: Cassandra
  Issue Type: New Feature
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
 Fix For: 3.0.x, 3.11.x, 4.x


If a compaction in {{parallelAllSSTableOperation}} throws an exception, all 
current transactions are closed. This can make us close a transaction that has 
not yet finished (since we can run many of these compactions in parallel). This 
causes this error:
{code}
java.lang.IllegalStateException: Cannot prepare to commit unless IN_PROGRESS; 
state is ABORTED
{code}
and this can get the leveled manifest (if running LCS) in a bad state causing 
this error message:
{code}
Could not acquire references for compacting SSTables ...
{code}






[jira] [Updated] (CASSANDRA-14657) Handle failures in upgradesstables/cleanup/relocate

2018-08-20 Thread Marcus Eriksson (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-14657:

Issue Type: Bug  (was: New Feature)

> Handle failures in upgradesstables/cleanup/relocate
> ---
>
> Key: CASSANDRA-14657
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14657
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Major
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> If a compaction in {{parallelAllSSTableOperation}} throws an exception, all 
> current transactions are closed. This can make us close a transaction that 
> has not yet finished (since we can run many of these compactions in 
> parallel). This causes this error:
> {code}
> java.lang.IllegalStateException: Cannot prepare to commit unless IN_PROGRESS; 
> state is ABORTED
> {code}
> and this can get the leveled manifest (if running LCS) in a bad state causing 
> this error message:
> {code}
> Could not acquire references for compacting SSTables ...
> {code}






[jira] [Commented] (CASSANDRA-14655) Upgrade C* to use latest guava (26.0)

2018-08-20 Thread Andy Tolbert (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586142#comment-16586142
 ] 

Andy Tolbert commented on CASSANDRA-14655:
--

We are tentatively planning on releasing version 3.6.0 of the driver which will 
include guava 26 compatibility sometime in the latter part of next week.  I'll 
add a comment to this ticket when that's released (y).

> Upgrade C* to use latest guava (26.0)
> -
>
> Key: CASSANDRA-14655
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14655
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Libraries
>Reporter: Sumanth Pasupuleti
>Assignee: Sumanth Pasupuleti
>Priority: Minor
> Fix For: 4.x
>
>
> C* currently uses guava 23.3. This JIRA is about changing C* to use latest 
> guava (26.0). Originated from a discussion in the mailing list.






[jira] [Commented] (CASSANDRA-14649) Index summaries fail when their size gets > 2G and use more space than necessary

2018-08-20 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586097#comment-16586097
 ] 

Benedict commented on CASSANDRA-14649:
--

I think you need to set yourself up on a CircleCI account that supports larger 
instances, then modify the .circleci/config.yml

[For 
example|https://github.com/belliottsmith/cassandra/commit/b1cbd819274e3095f348402bca257ad4e6765f22]

> Index summaries fail when their size gets > 2G and use more space than 
> necessary
> 
>
> Key: CASSANDRA-14649
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14649
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Major
>
> After building a summary, {{IndexSummaryBuilder}} tries to trim the memory 
> writers by calling {{SafeMemoryWriter.setCapacity(capacity())}}. Instead of 
> trimming, this ends up allocating at least as much extra space and failing 
> the {{Buffer.position()}} call when the size is greater than 
> {{Integer.MAX_VALUE}}.
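
The failure mode in the description, a size beyond {{Integer.MAX_VALUE}} reaching an int-indexed {{Buffer}}, can be reproduced with plain java.nio. The sketch below is an illustration only: it does not use {{SafeMemoryWriter}} or {{IndexSummaryBuilder}}, and the class and method names are made up for the example.

```java
import java.nio.ByteBuffer;

// Illustration only: plain java.nio, not Cassandra's SafeMemoryWriter.
final class BufferPositionOverflow
{
    // Narrows a long size to int the way an unchecked cast would.
    static int narrow(long size)
    {
        return (int) size;
    }

    // Returns true if ByteBuffer.position(int) rejects the narrowed size.
    static boolean positionRejects(long size)
    {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        try
        {
            buffer.position(narrow(size));
            return false;
        }
        catch (IllegalArgumentException e)
        {
            // Thrown when the new position is negative or larger than the limit.
            return true;
        }
    }

    public static void main(String[] args)
    {
        long size = Integer.MAX_VALUE + 1L; // one byte past the int range
        System.out.println("narrowed = " + narrow(size));
        System.out.println("rejected = " + positionRejects(size));
    }
}
```

Once the size no longer fits in an int, the narrowed value wraps to a negative number and the position call fails, which matches the symptom reported for summaries over 2G.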






[jira] [Commented] (CASSANDRA-14497) Add Role login cache

2018-08-20 Thread Sam Tunnicliffe (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586078#comment-16586078
 ] 

Sam Tunnicliffe commented on CASSANDRA-14497:
-

Sorry for taking so long to respond [~jay.zhuang]...
{quote}1. Why do we need login?
{quote}
Permissions can be grouped together by assigning them to roles, which can then 
be granted to other roles. {{LOGIN}} is the way to differentiate these logical 
roles from ones which represent 'real' database users.
{quote}2. Should login check be part of authN or authZ
{quote}
Both login and superuser privs are part of authz really. They can be thought of 
as permissions which aren't linked to a particular resource in the authz 
hierarchy and for this reason they're defined and managed purely at the role 
level.

FWIW, this is mostly based on the postgres approach 
([https://www.postgresql.org/docs/current/static/role-attributes.html]), though 
some role-level attributes they define *can* be easily modelled as C* 
permissions (e.g. role creation) or don't apply to us (replication permission).

bq. So would it be better to have canLogin, isSuper information in 
CredentialsCache (maybe we should change the cache name)?
Each of the various auth caches has what amounts to a 1:1 relationship with a 
(pluggable) component of the auth subsystem. The `IAuthenticator` impl is 
responsible purely for validating credentials, and the same goes for the 
`CredentialsCache`. The `IAuthorizer` handles permissions, the 
`INetworkAuthorizer` handles DC access rights, etc. The components are intended to be 
granular enough to allow different implementations to be swapped in 
independently (though of course, this isn't perfect). We could compose the 
caches differently so that they interact with multiple components, but that 
seems complicated and a bit unnecessary.

IMO, the important thing to address in this issue is that the current 
`RolesCache` implementation just doesn't contain all of the frequently used 
role-level info, which means we have to hit the actual role manager far too 
often.
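
As a rough sketch of the idea, caching the frequently used role-level attributes so that the role manager is consulted only on a miss could look like the following. All names here ({{RoleAttributeCache}}, {{RoleAttributes}}, the loader function) are hypothetical and are not the existing `RolesCache` API; expiry and invalidation, which a real auth cache needs, are left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: none of these types exist in Cassandra.
final class RoleAttributeCache
{
    // Role-level attributes that are needed on every login.
    static final class RoleAttributes
    {
        final boolean canLogin;
        final boolean isSuper;

        RoleAttributes(boolean canLogin, boolean isSuper)
        {
            this.canLogin = canLogin;
            this.isSuper = isSuper;
        }
    }

    private final Map<String, RoleAttributes> cache = new ConcurrentHashMap<>();
    private final Function<String, RoleAttributes> roleManager;

    // roleManager stands in for the expensive lookup (e.g. a system_auth read).
    RoleAttributeCache(Function<String, RoleAttributes> roleManager)
    {
        this.roleManager = roleManager;
    }

    // Consults the role manager only on a cache miss.
    boolean canLogin(String role)
    {
        return cache.computeIfAbsent(role, roleManager).canLogin;
    }

    boolean isSuper(String role)
    {
        return cache.computeIfAbsent(role, roleManager).isSuper;
    }

    public static void main(String[] args)
    {
        AtomicInteger lookups = new AtomicInteger();
        RoleAttributeCache cache = new RoleAttributeCache(role -> {
            lookups.incrementAndGet(); // count how often the role manager is hit
            return new RoleAttributes(true, false);
        });
        for (int i = 0; i < 1000; i++)
            cache.canLogin("app_user");
        System.out.println("role manager lookups: " + lookups.get());
    }
}
```

Both attributes are loaded together, so a login check and a superuser check for the same role share one lookup.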

> Add Role login cache
> 
>
> Key: CASSANDRA-14497
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14497
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Auth
>Reporter: Jay Zhuang
>Assignee: Sam Tunnicliffe
>Priority: Major
>  Labels: security
> Fix For: 4.0
>
>
> The 
> [{{ClientState.login()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/ClientState.java#L313]
>  function is used for all auth messages: 
> [{{AuthResponse.java:82}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/messages/AuthResponse.java#L82].
>  But the 
> [{{role.canLogin}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraRoleManager.java#L521]
>  information is not cached. So it hits the database every time: 
> [{{CassandraRoleManager.java:407}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraRoleManager.java#L407].
>  For a cluster with lots of new connections, it's causing a performance issue. 
> The mitigation for us is to increase the {{system_auth}} replication factor 
> to match the number of nodes, so 
> [{{local_one}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraRoleManager.java#L488]
>  would be very cheap. The P99 dropped immediately, but I don't think it is 
> a good solution.
> I would propose to add {{Role.canLogin}} to the RolesCache to improve the 
> auth performance.






[jira] [Commented] (CASSANDRA-14649) Index summaries fail when their size gets > 2G and use more space than necessary

2018-08-20 Thread Branimir Lambov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586017#comment-16586017
 ] 

Branimir Lambov commented on CASSANDRA-14649:
-

CircleCI doesn't seem to like the new test but AFAICT is otherwise fine: 
[2.2|https://circleci.com/gh/blambov/workflows/cassandra/tree/14649-2.2] 
[3.0|https://circleci.com/gh/blambov/workflows/cassandra/tree/14649-3.0] 
[3.11|https://circleci.com/gh/blambov/workflows/cassandra/tree/14649-3.11] 
[trunk|https://circleci.com/gh/blambov/workflows/cassandra/tree/14649-trunk]

Should I remove the >2G test, or is there something I need to set up to be able 
to run tests needing more memory?

> Index summaries fail when their size gets > 2G and use more space than 
> necessary
> 
>
> Key: CASSANDRA-14649
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14649
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Major
>
> After building a summary, {{IndexSummaryBuilder}} tries to trim the memory 
> writers by calling {{SafeMemoryWriter.setCapacity(capacity())}}. Instead of 
> trimming, this ends up allocating at least as much extra space and failing 
> the {{Buffer.position()}} call when the size is greater than 
> {{Integer.MAX_VALUE}}.






[jira] [Assigned] (CASSANDRA-14656) Full query log needs to log the keyspace

2018-08-20 Thread Marcus Eriksson (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson reassigned CASSANDRA-14656:
---

Assignee: Marcus Eriksson

> Full query log needs to log the keyspace
> 
>
> Key: CASSANDRA-14656
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14656
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Major
>
> If the full query log is enabled and a set of clients have already executed 
> "USE <keyspace>" we can't figure out which keyspace the following queries are 
> executed against.
> We need this for CASSANDRA-14618






[jira] [Created] (CASSANDRA-14656) Full query log needs to log the keyspace

2018-08-20 Thread Marcus Eriksson (JIRA)
Marcus Eriksson created CASSANDRA-14656:
---

 Summary: Full query log needs to log the keyspace
 Key: CASSANDRA-14656
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14656
 Project: Cassandra
  Issue Type: New Feature
Reporter: Marcus Eriksson


If the full query log is enabled and a set of clients have already executed 
"USE <keyspace>" we can't figure out which keyspace the following queries are 
executed against.

We need this for CASSANDRA-14618






[jira] [Commented] (CASSANDRA-14409) Transient Replication: Support ring changes when transient replication is in use (add/remove node, change RF, add/remove DC)

2018-08-20 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585874#comment-16585874
 ] 

Benedict commented on CASSANDRA-14409:
--

I have pushed a patch with some major refactors to {{ReplicaCollection}} 
[here|https://github.com/belliottsmith/cassandra/tree/14409-replicas-wip]; it 
is seemingly at parity for unit/dtests.  There are some things I would 
like to improve, but we’re too pressed for time, and at the point of 
diminishing returns.  

The patch does include some minor behavioural changes to fix some unit tests 
that were broken prior to it, and some (in progress) review feedback comments 
I’m collecting.

In summary, it introduces {{EndpointsForRange}}, {{EndpointsForToken}} and 
{{RangesAtEndpoint}}.  There are also {{Endpoints}}, a super class of the first 
two, and {{AbstractReplicaCollection}} that implements the majority of 
functionality.

Main improvements:
* These classes are immutable by default; I think this actually reduces copying 
on average, and improves clarity
* There are {{Mutable}} variants, for clearly delineating where this is 
necessary
* We now declare in all locations what our expected semantics are, and we 
enforce them
* There is only one implementation for all non-boilerplate 
{{ReplicaCollection}} methods, so a reader can quickly determine the 
characteristics at a call site
* (Almost) all O(n^2) call sites have been eliminated, mostly by our stronger 
constraints and maintaining a {{LinkedHashMap}}.  This is eagerly constructed 
for a brand-new collection, and lazily for a filtered/sorted collection.

Some open questions: 
* I have limited mutability to append only, which permits easy immutable 
snapshots, but may be fiddly to change later.
* I have made the {{Mutable}} variants extend the immutable ones, but this is 
not necessary, as an immutable view can always be constructed - this might be 
cleaner, and might also permit easier sharing of {{Mutable}} implementation 
details
* {{AbstractReplicaCollection}} introduces a method called {{select()}} for 
efficiently getting a subset of the collection based on a sequence of filters - 
this considerably clarifies one caller, and might clarify others I didn’t spot, 
but is optional.
* Conversely, since we often filter and sort together, this class could be 
extended to perform them together, and avoid some unnecessary work, but since 
this would be harder to rollback it probably needs consensus first.
* We maintain a {{LinkedHashMap}} to avoid code bloat while enforcing the same 
iteration order in our derived collections, but we could introduce an 
{{AbstractMap}} that proxies to a pure {{HashMap}}, and iterates our list.  
This further complicates the code (slightly), but improves performance for 
sorting and building.

Suboptimality:
* {{Endpoints}} is necessary in a number of places, because we conflate 
range and point ops in just a handful of places; namely writing batch/hints to 
‘any’ other host, and in Data/DigestResolver, which can be for range or point 
queries.  This means some uglier generics in places, and some code is less 
clear than it might be.  The places where this conflation occurs are well 
delineated now, at least.
* We don’t have isolation when getting natural and pending replicas, so we 
ignore conflicting endpoints when concatenating these.
* {{Endpoints.newMutable}} - this is an instance method for a new {{Mutable}} 
that matches the type of the instance.  It’s ugly, but presently needed for the 
above conflation.  Possibly the select() method could be modified to support 
these cases instead.
* {{EndpointsForToken}} and {{EndpointsForRange}} *require* a {{Token}} and 
{{Range}} respectively - this means a bit of boilerplate occasionally to pass 
the correct value to empty(), as well as some unnecessary allocations.  We 
could probably relax this, but it makes the semantics of {{newMutable}} tricky 
(if we improved select() to replace this, we could probably avoid this)
* {{EndpointsByRange}} and {{RangesByEndpoint}} have immutable and mutable 
variants, though I don’t love these at all.  Nor do I like 
{{EndpointsByReplica}}.  But they will probably suffice.

> Transient Replication: Support ring changes when transient replication is in 
> use (add/remove node, change RF, add/remove DC)
> 
>
> Key: CASSANDRA-14409
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14409
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Coordination, Core, Documentation and Website
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
>Priority: Major
> Fix For: 4.0
>
>
> The additional state transitions that transient replication introduces 
> require 

[jira] [Commented] (CASSANDRA-14631) Add RSS support for Cassandra blog

2018-08-20 Thread Jeff Beck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585859#comment-16585859
 ] 

Jeff Beck commented on CASSANDRA-14631:
---

[~djoshi3] what env are you building this in?

Here is the feed xml I get locally: 
[https://gist.github.com/beckje01/941ab7ce1c3f4364747110cc2798db0c]. I am not 
getting any of the <%HTML%> in the xml itself, so I'm wondering if you can share the 
xml you got.

I also ran the gist through a feed validator and it all checks out.

The alignment is a bit challenging as those are iframes from twitter. I'll try 
and get them a bit better aligned. 

> Add RSS support for Cassandra blog
> --
>
> Key: CASSANDRA-14631
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14631
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: Jacques-Henri Berthemet
>Assignee: Jeff Beck
>Priority: Major
> Attachments: 14631-site.txt, Screen Shot 2018-08-17 at 5.32.08 
> PM.png, Screen Shot 2018-08-17 at 5.32.25 PM.png
>
>
> It would be convenient to add RSS support to Cassandra blog:
> [http://cassandra.apache.org/blog/2018/08/07/faster_streaming_in_cassandra.html]
> And maybe also for other resources like new versions, but this ticket is 
> about blog.
>  
> {quote}From: Scott Andreas
> Sent: Wednesday, August 08, 2018 6:53 PM
> To: [d...@cassandra.apache.org|mailto:d...@cassandra.apache.org]
> Subject: Re: Apache Cassandra Blog is now live
>  
> Please feel free to file a ticket (label: Documentation and Website).
>  
> It looks like Jekyll, the static site generator used to build the website, 
> has a plugin that generates Atom feeds if someone would like to work on 
> adding one: [https://github.com/jekyll/jekyll-feed]
> {quote}






[jira] [Commented] (CASSANDRA-14653) The performance of "NonPeriodicTasks" pools defined in class ScheduledExecutors is low

2018-08-20 Thread Peter Xie (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585843#comment-16585843
 ] 

Peter Xie commented on CASSANDRA-14653:
---

@[~jjirsa]: The release version is 3.11.2.

I increased the sstable_size from 160M (the default) to 1024M, and the 
compression chunk size to 1M. These changes avoid the disk space issue. 

Increasing the sstable size reduces the number of compaction tasks, so the 
number of stale sstables is reduced as well.

But I think it would be better to make the thread pool grow dynamically when 
the cleanup load is heavy. 
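
For reference, the single-thread behaviour can be seen with the plain JDK executor. The sketch below is an illustration only: it uses {{ScheduledThreadPoolExecutor}} directly rather than {{DebuggableScheduledThreadPoolExecutor}}, and the class and method names are made up. The JDK documentation describes this executor as a fixed-size pool using corePoolSize threads and an unbounded queue, so the backlog alone never adds threads.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustration only: uses the plain JDK executor, not Cassandra's
// DebuggableScheduledThreadPoolExecutor.
final class ScheduledPoolSizeDemo
{
    // Queues `tasks` blocking tasks on a pool with the given core size and
    // returns the largest number of threads the pool ever had.
    static int largestPoolSize(int corePoolSize, int tasks)
    {
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(corePoolSize);
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++)
            pool.execute(() -> awaitQuietly(release)); // workers block, so the queue backs up

        int largest = pool.getLargestPoolSize();
        release.countDown(); // let the queued tasks drain
        pool.shutdown();
        try
        {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
        }
        return largest;
    }

    private static void awaitQuietly(CountDownLatch latch)
    {
        try
        {
            latch.await();
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args)
    {
        System.out.println("core=1: " + largestPoolSize(1, 100));
        System.out.println("core=4: " + largestPoolSize(4, 100));
    }
}
```

In this sketch the only thing that changes the number of worker threads is the core pool size, so any dynamic growth would have to adjust that value rather than rely on the queue filling up.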

> The performance of "NonPeriodicTasks" pools defined in class 
> ScheduledExecutors is low
> --
>
> Key: CASSANDRA-14653
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14653
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
> Environment: Cassandra nodes :
> 3 nodes, 330G physical memory per node , and four data directory (ssd)  per 
> node.
>Reporter: Peter Xie
>Priority: Major
>
> We use Cassandra as backend storage for JanusGraph. When loading a large 
> amount of data (~2 billion vertices, ~10 billion edges), we ran into some 
> problems.
>  
> At first we used STCS as the compaction strategy, but hit the exception 
> below. We checked that "max memory lock" is unlimited and "file map count" is 
> 1 million; these values should be enough for loading the data. Eventually we 
> found that the problem was caused by Cassandra consuming all of the virtual 
> memory, so no additional virtual memory was available to the compaction task 
> and the exception below was thrown.
> {quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
> JVMStabilityInspector.java:74 - OutOfMemory error letting the JVM handle the 
> error:
>  java.lang.OutOfMemoryError: Map failed
> {quote}
> So we changed the compaction strategy to LCS, which seems to resolve the 
> virtual memory problem. But we found another problem: many sstables which 
> have already been compacted are still retained on disk. These old sstables 
> consume so much disk space that there is not enough left for the real data. 
> We also found that many files like "mc_txn_compaction_xxx.log" are created 
> under the data directory. 
> After some investigation, we found that this problem is caused by the 
> "NonPeriodicTasks" thread pool. This pool always uses only one thread for 
> processing cleanup tasks after compaction. The pool is an instance of class 
> DebuggableScheduledThreadPoolExecutor, which inherits from class 
> ScheduledThreadPoolExecutor.
> By reading the code of class DebuggableScheduledThreadPoolExecutor, we found 
> that it uses an unbounded task queue and a core pool size of 1. I think using 
> an unbounded queue is wrong here. With an unbounded queue, the thread pool 
> will not add threads even when many tasks are waiting in the queue, because 
> an unbounded queue is never full. I think a bounded queue should be used 
> here, so that more threads are created when the cleanup load is heavy. 
> {quote}public DebuggableScheduledThreadPoolExecutor(int corePoolSize, String 
> threadPoolName, int priority)
>  { super(corePoolSize, new NamedThreadFactory(threadPoolName, priority)); 
> setRejectedExecutionHandler(rejectedExecutionHandler); }
>   
> public ScheduledThreadPoolExecutor(int corePoolSize,
>  ThreadFactory threadFactory)
>  { super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS, new 
> DelayedWorkQueue(), threadFactory); }
> {quote}
>  Below is an example of a cleanup task after compaction. There is a delay of 
> nearly 3 hours before file "mc-56525" is removed. 
> {quote} 
> TRACE [CompactionExecutor:81] 2018-08-16 21:22:29,664 
> LifecycleTransaction.java:363 - Staging for obsolescence 
> BigTableReader(path='/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big-Data.db')
>  ..
>  TRACE [CompactionExecutor:81] 2018-08-16 21:22:41,162 Tracker.java:165 - 
> removing 
> /sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big from 
> list of files tracked for test_2.edgestore
>  
>  TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,179 SSTableReader.java:2175 - 
> Async instance tidier for 
> /sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
> before barrier
>  TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,180 SSTableReader.java:2181 - 
> Async instance tidier for 
> /sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
> after barrier
>  TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,182 SSTableReader.java:2196 - 
> Async instance tidier for 
> 

[jira] [Commented] (CASSANDRA-14631) Add RSS support for Cassandra blog

2018-08-20 Thread Jacques-Henri Berthemet (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585704#comment-16585704
 ] 

Jacques-Henri Berthemet commented on CASSANDRA-14631:
-

I don't get the RSS icon with either Chrome or Firefox (Windows).

> Add RSS support for Cassandra blog
> --
>
> Key: CASSANDRA-14631
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14631
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: Jacques-Henri Berthemet
>Assignee: Jeff Beck
>Priority: Major
> Attachments: 14631-site.txt, Screen Shot 2018-08-17 at 5.32.08 
> PM.png, Screen Shot 2018-08-17 at 5.32.25 PM.png
>
>
> It would be convenient to add RSS support to Cassandra blog:
> [http://cassandra.apache.org/blog/2018/08/07/faster_streaming_in_cassandra.html]
> And maybe also for other resources like new versions, but this ticket is 
> about blog.
>  
> {quote}From: Scott Andreas
> Sent: Wednesday, August 08, 2018 6:53 PM
> To: [d...@cassandra.apache.org|mailto:d...@cassandra.apache.org]
> Subject: Re: Apache Cassandra Blog is now live
>  
> Please feel free to file a ticket (label: Documentation and Website).
>  
> It looks like Jekyll, the static site generator used to build the website, 
> has a plugin that generates Atom feeds if someone would like to work on 
> adding one: [https://github.com/jekyll/jekyll-feed]
> {quote}






[jira] [Comment Edited] (CASSANDRA-10726) Read repair inserts should not be blocking

2018-08-20 Thread Alex Petrov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585612#comment-16585612
 ] 

Alex Petrov edited comment on CASSANDRA-10726 at 8/20/18 8:24 AM:
--

Thank you! 

A couple more nits: 
   * 
[here|https://github.com/bdeggleston/cassandra/commit/7b753828e38bb92884219867e10d01e4f4184da6#diff-b677a5a6a3f1a90a889bcf906c1f8001R56]
 the switch to {{ConcurrentMap}} seems to be unnecessary. If it's for an explicit 
contract, I don't mind leaving it.
  * 
[here|https://github.com/bdeggleston/cassandra/commit/7b753828e38bb92884219867e10d01e4f4184da6#diff-0246c72855070863c2fdbee6d97f494dR156]
 we break unconditionally, so we might want to remove the loop. I realise that the 
collection might be empty though, in which case this might be justified.
  * 
[here|https://github.com/bdeggleston/cassandra/commit/7b753828e38bb92884219867e10d01e4f4184da6#diff-0246c72855070863c2fdbee6d97f494dR170]
 I'm not 100% on board with the rename as we're delegating the call to 
{{repair.maybeSendAdditionalRepairs}}; I think renaming {{awaitReads}} was 
enough. Or, if writes here are meant in the context of repair anyway, we 
should rename the delegated method. Whichever way you prefer.

Other than that, it looks good to me. +1


> Read repair inserts should not be blocking
> --
>
> Key: CASSANDRA-10726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
> Reporter: Richard Low
> Assignee: Blake Eggleston
> Priority: Major
> Fix For: 4.x
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> that updates out-of-date replicas is blocking. This means that if the insert 
> fails, the read fails with a timeout. If a node is dropping writes (maybe it 
> is overloaded or the mutation stage is backed up for some other reason), all 
> reads to a replica set could fail. Further, replicas that drop writes fall 
> further out of sync and so will require more read repair.
> The comment in the code explaining why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any replica that's
> // behind on writes in case the out-of-sync row is read multiple times in quick succession
> {code}
> but the bad side effect is that reads time out. Either the writes should not 
> be blocking or we should return success for the read even if the write times 
> out.
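
A minimal sketch of the second option above (return success for the read even if the repair write times out), contrasted with the blocking behaviour the description complains about. All class and method names here are hypothetical and are not Cassandra APIs; the repair write is modelled as a plain {{CompletableFuture}} that completes when the replica acknowledges it.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch; none of these names are Cassandra classes or methods.
final class ReadRepairSketch
{
    /** The merged read value plus whether the repair write was acknowledged in time. */
    static final class ReadResult
    {
        final String value;
        final boolean repairAcked;

        ReadResult(String value, boolean repairAcked)
        {
            this.value = value;
            this.repairAcked = repairAcked;
        }
    }

    /** Behaviour described in the ticket: an unacknowledged repair write fails the read. */
    static String blockingRead(String merged, CompletableFuture<Void> repairAck, long timeoutMillis)
            throws TimeoutException
    {
        await(repairAck, timeoutMillis);
        return merged;
    }

    /** Proposed alternative: still wait briefly, but return the merged value either way. */
    static ReadResult lenientRead(String merged, CompletableFuture<Void> repairAck, long timeoutMillis)
    {
        try
        {
            await(repairAck, timeoutMillis);
            return new ReadResult(merged, true);
        }
        catch (TimeoutException e)
        {
            // The replica stays out of sync for now; the read itself still succeeds.
            return new ReadResult(merged, false);
        }
    }

    /** Waits for the acknowledgement, surfacing only the timeout as a checked exception. */
    private static void await(CompletableFuture<Void> ack, long timeoutMillis) throws TimeoutException
    {
        try
        {
            ack.get(timeoutMillis, TimeUnit.MILLISECONDS);
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        catch (ExecutionException e)
        {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

The first option in the description (not blocking at all) would correspond to skipping the wait entirely and returning the merged value as soon as the repair writes have been sent.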



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-10726) Read repair inserts should not be blocking

2018-08-20 Thread Alex Petrov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585612#comment-16585612
 ] 

Alex Petrov commented on CASSANDRA-10726:
-

Thank you! 

A couple more nits: 
  * 
[here|https://github.com/bdeggleston/cassandra/commit/7b753828e38bb92884219867e10d01e4f4184da6#diff-b677a5a6a3f1a90a889bcf906c1f8001R56]
 the switch to {{ConcurrentMap}} seems unnecessary. If it's there to make the 
contract explicit, I don't mind leaving it.
  * 
[here|https://github.com/bdeggleston/cassandra/commit/7b753828e38bb92884219867e10d01e4f4184da6#diff-0246c72855070863c2fdbee6d97f494dR156]
 we break unconditionally, so we might want to remove the loop. I realise that 
the collection might be empty though, in which case the loop might be justified.
  * 
[here|https://github.com/bdeggleston/cassandra/commit/7b753828e38bb92884219867e10d01e4f4184da6#diff-0246c72855070863c2fdbee6d97f494dR170]
 I'm not 100% on board with the rename, as we're delegating the call to 
{{repair.maybeSendAdditionalRepairs}}; I think renaming {{awaitReads}} was 
enough. Or, if writes here are meant in the context of repair anyway, we 
should rename the delegated method. Whichever way you prefer.



