+1 to Benedict and Ekaterina's points - adding new flags to explicitly introduce the new behavior and documenting the hell out of both the default and the new flags seems like the right play. There's quite possibly a lot of tooling out there in the wild that relies on the current behavior and breaking that for users is Bad.
~Josh On Tue, Jul 26, 2022, at 10:05 AM, Ekaterina Dimitrova wrote: > I also like the idea with the flags plus improving the documentation as my > understanding was that it is not really documented and can be confusing and > risky for end users. > > On Tue, 26 Jul 2022 at 9:28, Benedict Elliott Smith <bened...@apache.org> > wrote: >> >> I think a change like this could be dangerous for a lot of existing >> automation built atop nodetool. >> >> I’m not sure this change is worthwhile. I think it would be better to >> introduce e.g. -ste and -ete for “start token exclusive” and “end token >> exclusive” so that users can opt-in to whichever scheme they prefer for >> their tooling, without breaking existing users. >> >> > On 26 Jul 2022, at 14:22, Brandon Williams <dri...@gmail.com> wrote: >> > >> > +1, I think that makes the most sense. >> > >> > Kind Regards, >> > Brandon >> > >> > On Tue, Jul 26, 2022 at 8:19 AM J. D. Jordan <jeremiah.jor...@gmail.com> >> > wrote: >> >> >> >> I like the third option, especially if it makes it consistent with >> >> repair, which has supported ranges longer and I would guess most people >> >> would think the compact ranges work the same as the repair ranges. >> >> >> >> -Jeremiah Jordan >> >> >> >>> On Jul 26, 2022, at 6:49 AM, Andrés de la Peña <adelap...@apache.org> >> >>> wrote: >> >>> >> >>> >> >>> Hi all, >> >>> >> >>> CASSANDRA-17575 has detected that token ranges in nodetool compact are >> >>> interpreted as closed on both sides. For example, the command "nodetool >> >>> compact -st 10 -et 50" will compact the tokens in [10, 50]. This way of >> >>> interpreting token ranges is unusual since token ranges are usually >> >>> half-open, and I think that in the previous example one would expect >> >>> that the compacted tokens would be in (10, 50]. That's for example the >> >>> way nodetool repair works, and indeed the class >> >>> org.apache.cassandra.dht.Range is always half-open. >> >>> >> >>> It's worth mentioning that, differently from nodetool repair, the help >> >>> and doc for nodetool compact doesn't specify whether the supplied >> >>> start/end tokens are inclusive or exclusive. >> >>> >> >>> I think that ideally nodetool compact should interpret the provided >> >>> token ranges as half-open, to be consistent with how token ranges are >> >>> usually interpreted. However, this would change the way the tool has >> >>> worked until now. This change might be problematic for existing users >> >>> relying on the old behaviour. That would be especially severe for the >> >>> case where the begin and end token are the same, because interpreting >> >>> [x, x] we would compact a single token, whereas I think that >> >>> interpreting (x, x] would compact all the tokens. As for compacting >> >>> ranges including multiple tokens, I think the change wouldn't be so bad, >> >>> since probably the supplied token ranges come from tools that are >> >>> already presenting the ranges as half-open. Also, if we are splitting >> >>> the full ring into smaller ranges, half-open intervals would still work >> >>> and would save us some repetitions. >> >>> >> >>> So my question is: Should we change the behaviour of nodetool compact to >> >>> interpret the token ranges as half-opened, aligning it with the usual >> >>> interpretation of ranges? Or should we just document the current odd >> >>> behaviour to prevent compatibility issues? >> >>> >> >>> A third option would be changing to half-opened ranges and also >> >>> forbidding ranges where the begin and end token are the same, to prevent >> >>> the accidental compaction of the entire ring. Note that nodetool repair >> >>> also forbids this type of token ranges. >> >>> >> >>> What do you think? >> >>