I know the dtests take a long time and this will make them longer. As a
counterpoint, most people run individual dtests locally and the full set on
dedicated test infrastructure. For the dedicated test infrastructure, Mick also
improved the wall-clock runtime when parallelizing the dtests in
https://issues.apache.org/jira/browse/CASSANDRA-16006.

Even with the longer full dtest runtime, I firmly believe that, for the sake of
new users and given how hard it is to change num_tokens once data is written,
this change to the default of num_tokens is long overdue. Another hidden benefit
of this change is that, with the new defaults, the dtests will now run
bootstraps the way operators should run them in practice. That will make the
more common default case much better tested and hopefully catch regressions in
that execution path faster.

So while it is not a trivial change in full dtest runtime, the benefits to the 
community and project are also not trivial. I’m really grateful to all who have 
put in effort to make this a reality and know that new users in 4.0 will 
benefit from these improved defaults.

In other words, my non-binding vote is to merge this and look to improve
execution time separately, with that effort not being as urgent for the reasons
stated above.

Jeremy

> On Aug 20, 2020, at 2:49 AM, Mick Semb Wever <m...@apache.org> wrote:
> 
> It was agreed¹ that 4.0 should have the new configuration defaults of
>  num_tokens: 16
>  allocate_tokens_for_local_replication_factor: 3
> 
> 13701's patches: against cassandra, cassandra-builds, cassandra-dtest, ccm;
> are reviewed, tested, and ready to commit. But the ccm and dtest patches
> required ccm having to now start nodes sequentially, and adding some longer
> timeout values in the dtests.
> 
> The consequence of this is CI runs now take longer. ci-cassandra.a.o's
> dtests take ~30% longer, and circleci's dtests (with vnodes) have gone from
> ~22 to ~43 minutes. The general opinion (on slack²) is to commit, and work
> on improving ccm and dtest startup times in a subsequent ticket.
> 
> 13701 was intended to be committed before the first beta release because of
> its user-facing changes. But these numbers are significant enough it makes
> sense to touch base with dev@
> 
> Does anyone (strongly) object to the "commit + follow up ticket" approach?
> 
> regards,
> Mick
> 
> 
> ¹ –
> https://lists.apache.org/thread.html/ra829084fcf344e9e96fa5c61cb31e909c8629091993471594b65ea89%40%3Cdev.cassandra.apache.org%3E
> ² – https://the-asf.slack.com/archives/CK23JSY2K/p1597747395032600 and
> https://the-asf.slack.com/archives/CK23JSY2K/p1597849774078200?thread_ts=1597762085.048300&cid=CK23JSY2K
