On Tue, Jan 21, 2020 at 7:41 PM Jonathan Koppenhofer <j...@koppedomain.com> wrote:
> If someone isn't explicitly setting vnodes, and the default changes, it
> will vary from the number of assigned tokens for existing clusters, right?
> Won't this cause the node to fail to start?
>

Nope. You can have 32 tokens on some instances and 256 in other instances
in the same dc/cluster. No error. The hosts with 256 tokens will just have
8x as much data as the hosts with 32 tokens. And that's why changing
defaults is hard.

>
> I am in favor of changing these defaults, but should provide very clear
> guidance on vnodes (unless I am wrong).
>
> I'm sure there are others that would be safe to change. I'll review our
> defaults we typically set and report back tomorrow.
>
> On Tue, Jan 21, 2020, 7:22 PM Jeremy Hanna <jeremy.hanna1...@gmail.com>
> wrote:
>
> > I mentioned this in the contributor meeting as a topic to bring up on
> > the list - should we take the opportunity to update defaults for
> > Cassandra 4.0?
> >
> > The rationale is two-fold:
> > 1) There are best practices and tribal knowledge around certain
> > properties where people just know to update those properties
> > immediately as a starting point. If it's pretty much a given that we
> > set something as a starting point different than the current defaults,
> > why not make that the new default?
> > 2) We should align the defaults with what we test with. There may be
> > exceptions if we have one-off tests but on the whole, we should be
> > testing with defaults.
> >
> > As a starting point, compaction throughput and number of vnodes seem
> > like good candidates but it would be great to get feedback for any
> > others.
> >
> > For compaction throughput
> > (https://jira.apache.org/jira/browse/CASSANDRA-14902), I've made a
> > basic case on the ticket to default to 64 just as a starting point
> > because the decision for 16 was made when spinning disk was most
> > common. Hence most people I know change that and I think without too
> > much bikeshedding, 64 is a reasonable starting point. A case could be
> > made that empirically the compaction throughput throttle may have less
> > effect than many people think, but I still think an updated default
> > would make sense.
> >
> > For number of vnodes, Michael Shuler made the point in the discussion
> > that we already test with 32, which is a far better number than the
> > 256 default. I know many new users that just leave the 256 default and
> > then discover later that it's better to go lower. I think 32 is a good
> > balance. One could go lower with the new algorithm but I think 32 is
> > much better than 256 without being too skewed, and it's what we
> > currently test.
> >
> > Jeff brought up a good point that we want to be careful with defaults
> > since changing them could come as an unpleasant surprise to people who
> > don't explicitly set them. As a general rule, we should always update
> > release notes to clearly state that a default has changed. For these
> > two defaults in particular, I think it's safe. For compaction
> > throughput I think a release note is sufficient in case they want to
> > modify it. For number of vnodes, it won't affect existing deployments
> > with data - it would be for new clusters, which would honestly benefit
> > from this anyway.
> >
> > The other point is whether it's too late to go into 4.0. For these two
> > changes, I think significant testing can still be done with these new
> > defaults before release and I think testing more explicitly with 32
> > vnodes in particular will give people more confidence in the lower
> > number with a wider array of testing (where we don't already use 32
> > explicitly).
> >
> > In summary, are people okay with considering updating these defaults
> > and possibly others in the alpha stage of a new major release? Are
> > there other properties to consider?
> >
> > Jeremy
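
To make the "8x as much data" point concrete, here is a rough sketch of the arithmetic. It assumes a node's expected share of the ring is simply proportional to its token count, and ignores replication factor, racks, and the actual token placement. The node names and the `expected_ownership` helper are made up for illustration and are not part of Cassandra.

```python
from fractions import Fraction


def expected_ownership(tokens_per_node):
    """Expected ring share per node, assuming share is proportional to token count."""
    total = sum(tokens_per_node.values())
    return {node: Fraction(count, total) for node, count in tokens_per_node.items()}


# Hypothetical mixed cluster: one node kept the old default of 256 tokens,
# two nodes were started with 32.
cluster = {"node-a": 256, "node-b": 32, "node-c": 32}
shares = expected_ownership(cluster)

# How much more data the 256-token node is expected to hold than a 32-token node.
ratio = shares["node-a"] / shares["node-b"]
print(ratio)  # 8
```

Under that assumption the 256-token node ends up with 8 times the share of each 32-token node, which is why a silently changed default can unbalance a cluster that never set the value explicitly.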