+1, I think that makes the most sense.

Kind Regards,
Brandon

On Tue, Jul 26, 2022 at 8:19 AM J. D. Jordan <jeremiah.jor...@gmail.com> wrote:
>
> I like the third option, especially if it makes it consistent with repair,
> which has supported ranges longer and I would guess most people would think
> the compact ranges work the same as the repair ranges.
>
> -Jeremiah Jordan
>
> > On Jul 26, 2022, at 6:49 AM, Andrés de la Peña <adelap...@apache.org> wrote:
> >
> > Hi all,
> >
> > CASSANDRA-17575 has detected that token ranges in nodetool compact are
> > interpreted as closed on both sides. For example, the command "nodetool
> > compact -st 10 -et 50" will compact the tokens in [10, 50]. This way of
> > interpreting token ranges is unusual since token ranges are usually
> > half-open, and I think that in the previous example one would expect that
> > the compacted tokens would be in (10, 50]. That's for example the way
> > nodetool repair works, and indeed the class org.apache.cassandra.dht.Range
> > is always half-open.
> >
> > It's worth mentioning that, differently from nodetool repair, the help and
> > doc for nodetool compact doesn't specify whether the supplied start/end
> > tokens are inclusive or exclusive.
> >
> > I think that ideally nodetool compact should interpret the provided token
> > ranges as half-open, to be consistent with how token ranges are usually
> > interpreted. However, this would change the way the tool has worked until
> > now. This change might be problematic for existing users relying on the old
> > behaviour. That would be especially severe for the case where the begin and
> > end token are the same, because interpreting [x, x] we would compact a
> > single token, whereas I think that interpreting (x, x] would compact all
> > the tokens. As for compacting ranges including multiple tokens, I think the
> > change wouldn't be so bad, since probably the supplied token ranges come
> > from tools that are already presenting the ranges as half-open. Also, if we
> > are splitting the full ring into smaller ranges, half-open intervals would
> > still work and would save us some repetitions.
> >
> > So my question is: Should we change the behaviour of nodetool compact to
> > interpret the token ranges as half-opened, aligning it with the usual
> > interpretation of ranges? Or should we just document the current odd
> > behaviour to prevent compatibility issues?
> >
> > A third option would be changing to half-opened ranges and also forbidding
> > ranges where the begin and end token are the same, to prevent the
> > accidental compaction of the entire ring. Note that nodetool repair also
> > forbids this type of token ranges.
> >
> > What do you think?
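
For readers following the thread, here is a small sketch of the two interpretations being compared. It is not Cassandra code: the function names (in_closed, in_half_open, check_option_three) are hypothetical, tokens are plain integers, and the wrap-around rule for the half-open case is an assumption modelled on the thread's remark that (x, x] would cover all the tokens.

```python
def in_closed(token, start, end):
    """Closed interpretation [start, end], as the thread describes the
    current nodetool compact behaviour. Assumes start <= end (no wrap-around)."""
    return start <= token <= end


def in_half_open(token, start, end):
    """Half-open interpretation (start, end].

    Assumption: when start >= end the range wraps around the ring, so
    (x, x] matches every token, which is the case the thread warns about.
    """
    if start < end:
        return start < token <= end
    # Wrapping range: everything after start, plus everything up to end.
    return token > start or token <= end


def check_option_three(start, end):
    """Hypothetical guard for the 'third option': half-open ranges,
    with start == end rejected to avoid compacting the whole ring."""
    if start == end:
        raise ValueError("start and end token must differ")
    return (start, end)


if __name__ == "__main__":
    # "-st 10 -et 50": token 10 is included only under the closed reading.
    print(in_closed(10, 10, 50), in_half_open(10, 10, 50))
    # Equal start and end: a single token vs. the entire ring.
    print(in_closed(8, 7, 7), in_half_open(8, 7, 7))
```

Under these assumptions, the two readings differ only at the start token for ordinary ranges, but differ completely when the start and end tokens are equal, which is why the third option rejects that input.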