Re: [DISCUSS] CASSANDRA-16767, CASSANDRA-16768, and CASSANDRA-16769 for 3.11.x

2021-08-10 Thread bened...@apache.org
Hi Scott,

I wonder if it’s possible that too few people who saw your email consider 
themselves sufficiently involved in this part of the codebase.  People tend to 
keep quiet about stuff they don’t participate in deeply, which is why I haven’t 
responded – and I wonder if this might explain the tumbleweed. I wonder how we 
might generally track if areas of the codebase are adequately covered by active 
contributors.

To answer your question, I don’t personally believe it is problematic to add 
additional features to command line tools in a patch version – they’re not 
scary systems where new features introduce much risk of high impact bugs. 
Others have stricter interpretations of the rules, but if they haven’t spoken 
up yet I’d say you’re clear to post some patches – but you might want to first 
make sure there’s somebody able and willing to review them.




Re: [DISCUSS] CASSANDRA-16767, CASSANDRA-16768, and CASSANDRA-16769 for 3.11.x

2021-08-09 Thread Brandon Williams
On Mon, Aug 9, 2021 at 2:05 PM Scott Carey  wrote:
> Does anyone else have any comments or opinions?  Can a decision be reached
> one way or another?  It is my understanding that we'll need more than one
> +1 to move forward here.

Lazy consensus says 72 hours
(https://community.apache.org/committers/lazyConsensus.html) with
considerations for availability, but I think it has been more than
long enough at this point.  If no one else objects after this
prodding, I think you should move forward.

Kind Regards,
Brandon

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [DISCUSS] CASSANDRA-16767, CASSANDRA-16768, and CASSANDRA-16769 for 3.11.x

2021-08-09 Thread Scott Carey
Thank you, Brandon, for answering my questions on Slack, for providing early
feedback on these ideas more than a month before I created the patches, and
for replying here.

Does anyone else have any comments or opinions?  Can a decision be reached
one way or another?  It is my understanding that we'll need more than one
+1 to move forward here.

I understand that the 4.0 release was a busy time, and that many probably
saw this, thought about replying, but got too busy and never did.
However, in light of the recent discussions around attracting new
contributors, I would like to highlight that being left in limbo with no
resolution is worse than being told "no", especially for new contributors.






Re: [DISCUSS] CASSANDRA-16767, CASSANDRA-16768, and CASSANDRA-16769 for 3.11.x

2021-07-02 Thread Brandon Williams
On Tue, Jun 29, 2021 at 5:49 PM Scott Carey  wrote:
>
> I'd like to discuss the inclusion of the above tickets for a 3.11.x
> release.  These are not a pure 'bug fix' so I'll need a waiver to get them
> into 3.11.x  (and implicitly, 4.0.x).
>
> The first two are straightforward oversights:  neither *nodetool
> garbagecollect* nor *nodetool scrub* currently accepts a *--user-defined*
> parameter list of SSTables in the same way that *nodetool compact* does.
>
> This is an operational problem for large tables.
>
> I often need to scrub just one file that is corrupted for some reason, and
> not scrub an entire 1TB+ of data for a table on a node.  This renders
> 'nodetool scrub' operationally useless for large tables.

Given that the lack of user-defined options for these compaction types
is clearly an oversight, and that the alternative of deleting the large
1TB+ sstable and then repairing is a cure worse than the disease, I
think this should be added to 3.11.x and 4.0.x. I am +1 here.

> For *garbagecollect* it is often operationally easy to identify which
> tables are likely to be full of bloat, and operationally useful to do this
> task in small increments.  The existing order that garbagecollect processes
> SSTables prevents it from being useful in any incremental fashion -- if you
> stop it and later restart, it will first process the SSTables you just
> garbage collected.
>
> The third ticket adds an option for *nodetool garbagecollect*,
> *--oldest-fraction*, that can select a fraction of the oldest table data in
> bytes, and garbagecollect only the SSTables that 'cover' that percentage of
> data.  Operationally, this lends itself to easy automation -- for example
> running this once a week on 10% of a table's data would imply that there is
> no data on disk that has been overwritten within the last 10 weeks.  This
> caps data bloat in ways neither LCS nor STCS can currently achieve without
> regular major compactions or full-pass garbagecollect.

This is a less obvious thing to be added, and I personally lack the
operational experience to comment on how much relief this would
provide firsthand, so I'll leave that to others.  But it does make
sense to me and since it isn't heavily modifying anything my
inclination is that this could be an acceptable addition as well.
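
For illustration only, the selection that *--oldest-fraction* describes
could be sketched as below. This is a minimal Python sketch, not the code
in the CASSANDRA-16769 patch: SSTableInfo and select_oldest_fraction are
hypothetical names, "oldest" is assumed to mean the smallest maximum write
timestamp, and the SSTable that crosses the byte threshold is assumed to be
included in the selection.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SSTableInfo:
    """Hypothetical stand-in for SSTable metadata; not a Cassandra class."""
    name: str
    size_bytes: int
    max_timestamp: int  # newest write in the file; an assumed measure of age


def select_oldest_fraction(sstables: List[SSTableInfo],
                           fraction: float) -> List[SSTableInfo]:
    """Return the oldest SSTables whose combined size covers `fraction`
    of the table's total bytes (assumed semantics of --oldest-fraction)."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    total = sum(t.size_bytes for t in sstables)
    target = fraction * total
    selected: List[SSTableInfo] = []
    covered = 0
    # Walk from oldest to newest, stopping once enough bytes are covered.
    for table in sorted(sstables, key=lambda t: t.max_timestamp):
        if covered >= target:
            break
        selected.append(table)
        covered += table.size_bytes
    return selected
```

With SSTables of 100, 300, and 600 bytes (oldest first), a fraction of
0.25 selects the first two, because the oldest one alone covers only 10%
of the bytes.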

Kind Regards,
Brandon
