Re: Stabilising Internode Messaging in 4.0

2019-04-09 Thread Joseph Lynch
Let's try this again, apparently email is hard ... I am relatively new to these code paths—especially compared to the committers that have been working on these issues for years such as the 15066 authors as well as Jason Brown—but like many Cassandra users I am familiar with many of the classes

Re: Stabilising Internode Messaging in 4.0

2019-04-09 Thread Joseph Lynch
*I am relatively new to these code paths—especially compared to the committers that have been working on these issues for years such as the 15066 authors as well as Jason Brown—but like many Cassandra users I am familiar with many of the classes of issues Aleksey and Benedict have identified with

Re: Choosing a supported Python 3 major version for cqlsh

2019-03-19 Thread Joseph Lynch
Since we'll be maintaining backwards compatibility with python 2.7, we can't really use python 3 only language features or reserved keywords anyways so we should probably just target the lowest common denominator (so 3.4 or 3.5 probably) and then after Python 2 is officially EOL in 2020 perhaps we

Re: Audit logging to tables.

2019-02-27 Thread Joseph Lynch
Hi Sagar, Vinay can confirm, but as far as I am aware we have no current plans to implement audit logging to a table directly, but the implementation is fully pluggable (like compaction, compression, etc ...). Check out the blog post [1] and documentation [2] Vinay wrote for more details, but the

Re: [VOTE] Release Apache Cassandra 2.2.14

2019-02-05 Thread Joseph Lynch
2.2.14-tentative unit and dtest run: https://circleci.com/gh/jolynch/cassandra/tree/2.2.14-tentative unit tests: 0 failures dtests: 5 failures * test_closing_connections - thrift_hsha_test.TestThriftHSHA ( https://issues.apache.org/jira/browse/CASSANDRA-14595) * test_multi_dc_tokens_default -

Re: [VOTE] Release Apache Cassandra 3.11.4

2019-02-03 Thread Joseph Lynch
3.11.4-tentative unit and dtest run: https://circleci.com/gh/jolynch/cassandra/tree/3.11.4-tentative unit tests: 0 failures dtests: 1 failure * test_closing_connections - thrift_hsha_test.TestThriftHSHA ( https://issues.apache.org/jira/browse/CASSANDRA-14595) +1 non binding -Joey On Sat, Feb

Re: [VOTE] Release Apache Cassandra 3.0.18

2019-02-03 Thread Joseph Lynch
3.0.18-tentative unit and dtest run: https://circleci.com/gh/jolynch/cassandra/tree/3.0.18-tentative unit tests: 0 failures dtests: 1 failure * test_closing_connections - thrift_hsha_test.TestThriftHSHA ( https://issues.apache.org/jira/browse/CASSANDRA-14595) +1 non binding -Joey On Sat, Feb

Re: [VOTE] Change Jira Workflow

2018-12-18 Thread Joseph Lynch
+1 non-binding On Tue, Dec 18, 2018 at 1:15 AM Sylvain Lebresne wrote: > +1 > -- > Sylvain > > > On Tue, Dec 18, 2018 at 9:34 AM Oleksandr Petrov < > oleksandr.pet...@gmail.com> > wrote: > > > +1 > > > > On Mon, Dec 17, 2018 at 7:12 PM Nate McCall wrote: > > > > > > On Tue, Dec 18, 2018 at

Re: JIRA Workflow Proposals

2018-12-11 Thread Joseph Lynch
Just my 2c 1. D C B E A 2. B, C, A 3. A 4. +0.5 -Joey On Tue, Dec 11, 2018 at 8:28 AM Benedict Elliott Smith wrote: > Just to re-summarise the questions for people: > > 1. (A) Only contributors may edit or transition issues; (B) Only > contributors may transition issues; (C) Only Jira-users

Re: JIRA Workflow Proposals

2018-11-26 Thread Joseph Lynch
Benedict, Thank you for putting this document together, I think something like this will really improve the quality and usefulness of the Jira tickets! A few pieces of overall feedback on the proposal: * I agree with Jeremy and Joshua on keeping labels. Labels are the only way that contributors

Re: 4.0 Testing Signup

2018-11-08 Thread Joseph Lynch
On Thu, Nov 8, 2018 at 1:42 PM kurt greaves wrote: > Been thinking about this for a while and agree it's how we should approach > it. Bikeshedding but seems like a nice big table would be suitable here, > and I think rather than a separate confluence page per component we just > create separate

Re: 4.0 Testing Signup

2018-11-08 Thread Joseph Lynch
On Thu, Nov 8, 2018 at 11:04 AM Romain Hardouin wrote: > > Hi, > I volunteer to be a contributor on the Metrics or Tooling component. Are we > supposed/allowed to edit the Confluence page directly? Btw I think that tooling > should be split, maybe one ticket per tool? > Awesome! Yes feel free to add

4.0 Testing Signup

2018-11-07 Thread Joseph Lynch
Following up on Jon's call for QA, I put together the start of a confluence page for people to list

Re: MD5 in the read path

2018-09-26 Thread Joseph Lynch
> > Thank you all for the response. > For RandomPartitioner, MD5 is used to avoid collision. However, why is it > necessary for comparing data between different replicas? Is it not feasible > to use CRC for data comparison? > My understanding is that it is not necessary to use MD5 and we can

Re: MD5 in the read path

2018-09-26 Thread Joseph Lynch
Michael Kjellman and others (Jason, Sam, et al.) have already done a lot of work in 4.0 to help change the use of MD5 to something more modern [1][2]. Also I cut a ticket a little while back about the significant performance penalty of using MD5 for digests when doing quorum reads of wide

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-24 Thread Joseph Lynch
I am a big fan of lowering the default number of tokens for many reasons (availability, repair, etc...). I also agree there are some usability blockers to "just lowering the number today", but I very much agree that the current default of 256 random tokens is a huge bug I hope we fix by 4.0

Re: QA signup

2018-09-12 Thread Joseph Lynch
> In looking at the Confluence space restrictions, it appears the main page is > open for editing and I don't see restrictions on page creation; can you try > to sign in, create one, and let me know if that doesn't work? I signed in and went to "Jira reports" and then tried to hit "Add Jira

Re: [VOTE] Development Approach for Apache Cassandra Management process

2018-09-12 Thread Joseph Lynch
> I'd like to ask those of you that are +1'ing, are you willing to contribute > or are you just voting we start an admin tool from scratch because you > think it'll somehow produce a perfect codebase? Roopa, Vinay, Sumanth and I are voting as community members (and a sizeable user) and our

Re: [VOTE] Development Approach for Apache Cassandra Management process

2018-09-12 Thread Joseph Lynch
+1 for piecemeal (option b). I think I've explained my opinion on all the various threads and tickets. -Joey On Wed, Sep 12, 2018 at 10:48 AM Vinay Chella wrote: > > +1 for option b, considering the advantages mentioned in dev email thread > that Sankalp linked. > > ~Vinay > > > On Wed, Sep 12,

Re: Proposing an Apache Cassandra Management process

2018-09-08 Thread Joseph Lynch
On Fri, Sep 7, 2018 at 10:00 PM Blake Eggleston wrote: > > Right, I understand the arguments for starting a new project. I’m not saying > reaper is, technically speaking, the best place to start. The point I’m > trying to make is that the non-technical advantages of using an existing > project

Re: Proposing an Apache Cassandra Management process

2018-09-07 Thread Joseph Lynch
> What’s the benefit of doing it that way vs starting with reaper and > integrating the netflix scheduler? If reaper was just a really inappropriate > choice for the cassandra management process, I could see that being a better > approach, but I don’t think that’s the case. > The benefit, as

Re: Proposing an Apache Cassandra Management process

2018-09-07 Thread Joseph Lynch
On Fri, Sep 7, 2018 at 5:03 PM Jonathan Haddad wrote: > > We haven’t even defined any requirements for an admin tool. It’s hard to > make a case for anything without agreement on what we’re trying to build. > We were/are trying to sketch out scope/requirements in the #14395 and #14346 tickets as

Re: QA signup

2018-09-07 Thread Joseph Lynch
I don't think anyone has mentioned this yet but we probably want to consider releasing 4.0 alpha jars to maven central soon so the open source ecosystem can start testing a consistent Cassandra 4.0; for example I had to hack 4.0 into Priam's build [1] by manually building a jar and checking it in

Re: Yet another repair solution

2018-08-28 Thread Joseph Lynch
I'm pretty interested in seeing and understanding your solution! When we started on CASSANDRA-14346 reading your design documents and plan you sketched out in CASSANDRA-10070 were really helpful in improving our design. I'm particularly interested in how the Scheduler/Job/Task APIs turned out

Re: Reaper as cassandra-admin

2018-08-28 Thread Joseph Lynch
I and the rest of the Netflix Cassandra team share Dinesh's concerns. I was excited to work on this project precisely because we were taking only the best designs, techniques, and functionality out of the community sidecars such as Priam, Reaper, and any other community tool and building the

Re: JIRAs in Review

2018-08-22 Thread Joseph Lynch
Just want to bump this up if any reviewers have time before the 9/1 window. I think these are all patch available and ready for review at this point. Useful improvements for 4.0: > > https://issues.apache.org/jira/browse/CASSANDRA-14303 and > https://issues.apache.org/jira/browse/CASSANDRA-14557

Re: Side Car New Repo vs not

2018-08-20 Thread Joseph Lynch
I think that the pros of incubating the sidecar in tree as a tool first outweigh the alternatives at this point in time. Rough tradeoffs that I see: Unique pros of in tree sidecar: * Faster iteration speed in general. For example when we need to add a new JMX endpoint that the sidecar needs, or

Re: Proposing an Apache Cassandra Management process

2018-08-20 Thread Joseph Lynch
> We are looking to contribute Reaper to the Cassandra project. > Just to clarify are you proposing contributing Reaper as a project via donation or you are planning on contributing the features of Reaper as patches to Cassandra? If the former how far along are you on the donation process? If the

Re: Proposing an Apache Cassandra Management process

2018-08-17 Thread Joseph Lynch
While I would love to use a different build system (e.g. gradle) for the sidecar, I agree with Dinesh that a separate repo would make sidecar development much harder to verify, especially on the testing and compatibility front. As Jeremiah mentioned we can always choose later to release the

Re: JIRAs in Review

2018-07-20 Thread Joseph Lynch
We have a few improvements and bug fixes that could use reviewer feedback. Useful improvements for 4.0: https://issues.apache.org/jira/browse/CASSANDRA-14303 and https://issues.apache.org/jira/browse/CASSANDRA-14557 - Makes the user interface for creating keyspaces easier to use and less error

Re: reroll the builds?

2018-07-17 Thread Joseph Lynch
We ran the tests against 3.0, 2.2 and 3.11 using circleci and there are various failing dtests but all three have green unit tests. 3.11.3 tentative (31d5d87, test branch , unit tests

Re: [VOTE] Release Apache Cassandra 2.2.13

2018-07-03 Thread Joseph Lynch
+1 nb Tests look reasonable with passing unit tests and about 13 failing dtests On Tue, Jul 3, 2018 at 1:55 PM kurt greaves

Re: Difference between heartbeat and generation on a Gossip packet

2018-06-28 Thread Joseph Lynch
Hi Abdelkarim, Other people on this list are much more knowledgeable than me and can correct me if I'm wrong, but my understanding is that the combination of generation and version (aka heartbeat) form a logical clock tuple consisting of (generation, version) and that combination is the
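As a rough illustration of the (generation, version) ordering described in this message, here is a minimal Python sketch. The class and function names are hypothetical and are not Cassandra's actual gossip classes; the sketch only assumes that the pair is compared lexicographically, with generation taking precedence over version.

```python
from typing import NamedTuple


class GossipClock(NamedTuple):
    """Hypothetical (generation, version) pair; not Cassandra's real class."""
    generation: int  # assumed to change when a node restarts
    version: int     # assumed to increase as the node heartbeats


def is_newer(remote: GossipClock, local: GossipClock) -> bool:
    # Tuples compare lexicographically: generation first, then version.
    return remote > local
```

Under this assumption, a state with a higher generation counts as newer than one with a lower generation, whatever the two versions are.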

Re: Quantifying Virtual Node Impact on Cassandra Availability

2018-04-17 Thread Joseph Lynch
r does only nodetool clean do that?) > Pre-subdivided sstables for manually managed tokens would REALLY pay big dividends in large-scale cluster expansion. Say you wanted to double or triple the cluster. Si

Re: Quantifying Virtual Node Impact on Cassandra Availability

2018-04-17 Thread Joseph Lynch
could just be a drive detach / drive attach. >> On Tue, Apr 17, 2018 at 7:37 AM, kurt greaves <k...@instaclustr.com> wrote: >> Great write up. Glad

Re: Quantifying Virtual Node Impact on Cassandra Availability

2018-04-16 Thread Joseph Lynch
nch/python_performance_toolkit/raw/master/notebooks/cassandra_availability/whitepaper/cassandra-availability-virtual.pdf> On Mon, Apr 16, 2018 at 1:14 PM, Joseph Lynch <joe.e.ly...@gmail.com> wrote: > Josh Snyder and I have been working on evaluating virtual nodes for large > scale deployments and w

Quantifying Virtual Node Impact on Cassandra Availability

2018-04-16 Thread Joseph Lynch
Josh Snyder and I have been working on evaluating virtual nodes for large scale deployments and while it seems like there is a lot of anecdotal support for reducing the vnode count [1], we couldn't find any concrete math on the topic, so we had some fun and took a whack at quantifying how

Re: Repair scheduling tools

2018-04-12 Thread Joseph Lynch
Given the feedback here and on the ticket, I've written up a proposal for a repair sidecar tool in the ticket's design document. If there are no major concerns we're going to start working

Re: Repair scheduling tools

2018-04-12 Thread Joseph Lynch
> > I personally would rather see improvements to reaper and supporting reaper > so the repair tool improvements aren't tied to Cassandra releases. If we > get to a place where the repair tools are stable then figuring out how to > bundle for the best install makes sense to me. > I view the

Re: Roadmap for 4.0

2018-04-12 Thread Joseph Lynch
The Netflix team prefers September as well. We don't have time before that to do a full certification (e2e and performance testing), but can probably work it into end of Q3 / start of Q4. I personally hope that the extra time gives us as a community a chance to come up with a compelling user

Re: Repair scheduling tools

2018-04-05 Thread Joseph Lynch
> > We see this in larger clusters regularly. Usually folks have just > 'grown into it' because it was the default. > I could understand a few dozen nodes with 256 vnodes, but hundreds is surprising. I have a whitepaper draft lying around showing how vnodes decrease availability in large clusters

Re: Repair scheduling tools

2018-04-05 Thread Joseph Lynch
es of data, and I really do believe would not cause significant, if any, heap pressure. The repairs *themselves* certainly would create heap pressure, but that happens regardless of the scheduler. -Joey On Thu, Apr 5, 2018 at 7:25 PM, Joseph Lynch <joe.e.ly...@gmail.com> wrote: > I

Re: Repair scheduling tools

2018-04-05 Thread Joseph Lynch
> > I wouldn't trivialize it, scheduling can end up dealing with more than a > single repair. If there are 1000 keyspace/tables, with 400 nodes and 256 > vnodes on each, that's a lot of repairs to plan out and keep track of and can > easily cause heap allocation spikes if opted in. > > Chris The
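To put rough numbers on the scenario quoted above, here is a back-of-the-envelope Python sketch. It assumes, purely for illustration, one repair unit per (table, token range) pair and one token range per vnode; the function name is hypothetical, and a real scheduler may group ranges quite differently.

```python
def repair_units(tables: int, nodes: int, vnodes_per_node: int) -> int:
    """Count (table, token range) pairs under the stated assumptions."""
    token_ranges = nodes * vnodes_per_node  # one range per vnode (assumption)
    return tables * token_ranges


# Figures taken from the quoted message: 1000 tables, 400 nodes, 256 vnodes.
print(repair_units(1000, 400, 256))  # 102400000
```

Whether tracking that many units causes meaningful heap pressure depends on how much state is kept per unit, which this count alone does not settle.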

Re: Repair scheduling tools

2018-04-05 Thread Joseph Lynch
, have we looked into how other NoSQL databases do repair? >>> Is there a side car process? >>> On Tue, Apr 3, 2018 a

Re: Repair scheduling tools

2018-04-05 Thread Joseph Lynch
>>>>> great addition to the database. I am hoping, we as a community will make it easy for teams to operate and run Cassandra by enhancing the core product, >>> and

Re: Repair scheduling tools

2018-04-03 Thread Joseph Lynch
I just want to say I think it would be great for our users if we moved repair scheduling into Cassandra itself. The team here at Netflix has opened the ticket and has written a detailed design document