Re: [Discuss] Repair inside C*

Jaydeep Chovatia Sun, 25 Feb 2024 12:24:53 -0800

Thanks, Josh. I've just updated the CEP
<https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-37+%28DRAFT%29+Apache+Cassandra+Official+Repair+Solution>
and included all the solutions you mentioned below.


Jaydeep

On Thu, Feb 22, 2024 at 9:33 AM Josh McKenzie <jmcken...@apache.org> wrote:

> Very late response from me here (basically necro'ing this thread).
>
> I think it'd be useful to get this condensed into a CEP that we can then
> discuss in that format. It's clearly something we all agree we need and
> having an implementation that works, even if it's not in your preferred
> execution domain, is vastly better than nothing IMO.
>
> I don't have cycles (nor background ;) ) to do that, but it sounds like
> you do Jaydeep given the implementation you have on a private fork + design.
>
> A non-exhaustive list of things that might be useful incorporating into or
> referencing from a CEP:
> Slack thread:
> https://the-asf.slack.com/archives/CK23JSY2K/p1690225062383619
> Joey's old C* ticket:
> https://issues.apache.org/jira/browse/CASSANDRA-14346
> Even older automatic repair scheduling:
> https://issues.apache.org/jira/browse/CASSANDRA-10070
> Your design gdoc:
> https://docs.google.com/document/d/1CJWxjEi-mBABPMZ3VWJ9w5KavWfJETAGxfUpsViPcPo/edit#heading=h.r112r46toau0
> PR with automated repair:
> https://github.com/jaydeepkumar1984/cassandra/commit/ef6456d652c0d07cf29d88dfea03b73704814c2c
>
> My intuition is that we're all basically in agreement that this is
> something the DB needs, we're all willing to bikeshed for our personal
> preference on where it lives and how it's implemented, and at the end of
> the day, code talks. I don't think anyone's said they'll die on the hill of
> implementation details, so that feels like CEP time to me.
>
> If you were willing and able to get a CEP together for automated repair
> based on the above material, given you've done the work and have the proof
> points it's working at scale, I think this would be a *huge contribution*
> to the community.
>
> On Thu, Aug 24, 2023, at 7:26 PM, Jaydeep Chovatia wrote:
>
> Is anyone going to file an official CEP for this?
> As mentioned in this email thread, here is one of the solution's design
> doc
> <https://docs.google.com/document/d/1CJWxjEi-mBABPMZ3VWJ9w5KavWfJETAGxfUpsViPcPo/edit#heading=h.r112r46toau0>
> and source code on a private Apache Cassandra patch. Could you go through
> it and let me know what you think?
>
> Jaydeep
>
> On Wed, Aug 2, 2023 at 3:54 PM Jon Haddad <rustyrazorbl...@apache.org>
> wrote:
>
> > That said I would happily support an effort to bring repair scheduling
> to the sidecar immediately. This has nothing blocking it, and would
> potentially enable the sidecar to provide an official repair scheduling
> solution that is compatible with current or even previous versions of the
> database.
>
> This is something I hadn't thought much about, and is a pretty good
> argument for using the sidecar initially.  There's a lot of deployments out
> there and having an official repair option would be a big win.
>
>
> On 2023/07/26 23:20:07 "C. Scott Andreas" wrote:
> > I agree that it would be ideal for Cassandra to have a repair scheduler
> in-DB.
> >
> > That said I would happily support an effort to bring repair scheduling
> to the sidecar immediately. This has nothing blocking it, and would
> potentially enable the sidecar to provide an official repair scheduling
> solution that is compatible with current or even previous versions of the
> database.
> >
> > Once TCM has landed, we’ll have much stronger primitives for repair
> orchestration in the database itself. But I don’t think that should block
> progress on a repair scheduling solution in the sidecar, and there is
> nothing that would prevent someone from continuing to use a sidecar-based
> solution in perpetuity if they preferred.
> >
> > - Scott
> >
> > > On Jul 26, 2023, at 3:25 PM, Jon Haddad <rustyrazorbl...@apache.org>
> wrote:
> > >
> > > I'm 100% in favor of repair being part of the core DB, not the
> sidecar.  The current (and past) state of things where running the DB
> correctly *requires* running a separate process (either community
> maintained or official C* sidecar) is incredibly painful for folks.  The
> idea that your data integrity needs to be opt-in has never made sense to me
> from the perspective of either the product or the end user.
> > >
> > > I've worked with way too many teams that have either configured this
> incorrectly or not at all.
> > >
> > > Ideally Cassandra would ship with repair built in and on by default.
> Power users can disable if they want to continue to maintain their own
> repair tooling for some reason.
> > >
> > > Jon
> > >
> > >> On 2023/07/24 20:44:14 German Eichberger via dev wrote:
> > >> All,
> > >> We had a brief discussion in [2] about the Uber article [1] where
> they talk about having integrated repair into Cassandra and how great that
> is. I expressed my disappointment that they didn't work with the community
> on that (Uber, if you are listening time to make amends 🙂) and it turns
> out Joey already had the idea and wrote the code [3] - so I wanted to start
> a discussion to gauge interest and maybe how to revive that effort.
> > >> Thanks,
> > >> German
> > >> [1]
> https://www.uber.com/blog/how-uber-optimized-cassandra-operations-at-scale/
> > >> [2] https://the-asf.slack.com/archives/CK23JSY2K/p1690225062383619
> > >> [3] https://issues.apache.org/jira/browse/CASSANDRA-14346
> >
>
>
>

Re: [Discuss] Repair inside C*

Reply via email to