If someone isn't explicitly setting num_tokens and the default changes, it
will differ from the number of tokens already assigned on nodes in existing
clusters, right? Won't this cause those nodes to fail to start?

I am in favor of changing these defaults, but we should provide very clear
guidance on vnodes (unless I am wrong about the above).
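
To make the concern concrete, here is a sketch of the failure mode I have
in mind (num_tokens is the cassandra.yaml key; the exact startup behavior
is the part I'd want confirmed):

    # cassandra.yaml on a node that originally bootstrapped with the
    # old default
    num_tokens: 256

    # If this line were absent and the shipped default became 32, the
    # node's 256 saved tokens would no longer match the configured
    # count, and I believe startup fails with a "cannot change the
    # number of tokens" error rather than silently re-assigning them.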

I'm sure there are other defaults that would be safe to change. I'll review
the defaults we typically set and report back tomorrow.

On Tue, Jan 21, 2020, 7:22 PM Jeremy Hanna <jeremy.hanna1...@gmail.com>
wrote:

> I mentioned this in the contributor meeting as a topic to bring up on the
> list - should we take the opportunity to update defaults for Cassandra 4.0?
>
> The rationale is two-fold:
> 1) There are best practices and tribal knowledge around certain properties
> that people just know to update immediately as a starting point.  If it's
> pretty much a given that we set something different than the current
> default, why not make that the new default?
> 2) We should align the defaults with what we test with.  There may be
> exceptions for one-off tests, but on the whole we should be testing with
> the defaults.
>
> As a starting point, compaction throughput and number of vnodes seem like
> good candidates but it would be great to get feedback for any others.
>
> For compaction throughput (
> https://jira.apache.org/jira/browse/CASSANDRA-14902), I've made a basic
> case on the ticket for defaulting to 64, because the decision for 16 was
> made when spinning disk was most common.  Most people I know change that
> setting, and I think that without too much bikeshedding, 64 is a
> reasonable starting point.  A case could be made that empirically the
> compaction throughput throttle has less effect than many people think,
> but I still think an updated default makes sense.
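>
> Concretely, the change under discussion is this one line in cassandra.yaml
> (it's also settable at runtime, so operators who disagree can adjust it
> without a restart):
>
>     compaction_throughput_mb_per_sec: 64   # proposed; currently 16
>
> nodetool setcompactionthroughput 64 applies the same value to a live node.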
>
> For number of vnodes, Michael Shuler made the point in the discussion that
> we already test with 32, which is a far better number than the 256
> default.  I know many new users who just leave the 256 default and then
> discover later that it's better to go lower.  I think 32 is a good
> balance.  One could go lower with the new token allocation algorithm, but
> I think 32 is much better than 256 without being too skewed, and it's what
> we currently test with.
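>
> In cassandra.yaml terms, the proposal is something like this (the keyspace
> name below is just a placeholder for illustration):
>
>     num_tokens: 32
>     # Optionally pair the lower count with the newer token allocation
>     # algorithm for better balance on new clusters:
>     # allocate_tokens_for_keyspace: my_keyspace
>
> where allocate_tokens_for_keyspace is, if I have the name right, the yaml
> hook for the "new algorithm" mentioned above.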
>
> Jeff brought up a good point that we want to be careful with defaults,
> since changing them could come as an unpleasant surprise to people who
> don't explicitly set them.  As a general rule, we should always update the
> release notes to clearly state that a default has changed.  For these two
> defaults in particular, I think it's safe.  For compaction throughput, I
> think a release note is sufficient in case people want to change it back.
> For number of vnodes, it won't affect existing deployments with data - it
> would only apply to new clusters, which would honestly benefit from this
> anyway.
>
> The other point is whether it's too late to go into 4.0.  For these two
> changes, I think significant testing can still be done with the new
> defaults before release.  Testing more explicitly with 32 vnodes in
> particular will give people more confidence in the lower number across a
> wider array of tests (where we don't already use 32 explicitly).
>
> In summary, are people okay with considering updating these defaults and
> possibly others in the alpha stage of a new major release?  Are there other
> properties to consider?
>
> Jeremy
