I was speaking to Caleb in Slack, so I am putting my main comments from there here.

I am not -1 on this new dependency; I am more asking what we should use for random testing moving forward. ATM we have the following:

1) QuickTheories - I feel like I am the only user at this point
2) One-offs - many reinvent random testing for a specific class, using Random, ThreadLocalRandom, UUID.randomUUID(), and lang3 classes (such as org.apache.commons.lang3.RandomUtils)
3) Harry - even though the main API is for cluster testing, it is built on top of random generation, so it could be used for low-level random testing (it is just less fleshed out for this use case)
4) Simulator - same as Harry: built on top of a random generator and not fleshed out for low-level random testing

Another reason I ask is that I have developed a fuzz test for Accord that generates random valid CQL statements to make sure we “do the right thing”, and I have been struggling with the questions “where do I put this?” and “what random do I use?”. I built it on QuickTheories because I have a lot of utilities there for building all supported Tables and Types, so it was really quick to bootstrap, and every other random testing option we have is less fleshed out. So if we add yet another random testing library, what “should” we be using? Do we build on top of it to get to the same level as QuickTheories (see org.apache.cassandra.utils.Generators, org.apache.cassandra.utils.CassandraGenerators, and org.apache.cassandra.utils.AbstractTypeGenerators)?

> On Dec 13, 2022, at 9:21 AM, Caleb Rackliffe <calebrackli...@gmail.com> wrote:
>
> We need random generators no matter what for these tests, so I think what we
> need to decide is whether to continue to use Carrot or migrate those to
> QuickTheories, along the lines of what we have now in
> org.apache.cassandra.utils.Generators.
>
> When it comes to a library like this, the thing I would optimize for is how
> much it already provides (and therefore how much we need to write and
> maintain ourselves).
> If you look at something like NumericTypeSortingTest in
> the 18058 branch <https://github.com/maedhroz/cassandra/pull/6>, it's pretty
> compact w/ Carrot's RandomizedTest in use, but I suppose it could also use
> IntegersDSL from QT...
>
> (Not that it matters, but just for reference, we do use com.carrotsearch.hppc
> already.)
>
> On Tue, Dec 13, 2022 at 10:14 AM Mike Adamson <madam...@datastax.com> wrote:
> Can you talk more about why? There are several ways to do random testing
> in-tree ATM, so wondering why we need another one
>
> I can see one mechanism for random testing in-tree. That is the Simulator but
> that seems primarily involved in the random orchestration of operations. My
> apologies if I have simplified its significance. Apart from that, I can only
> see different usages of Random in unit tests. I admit I have not looked
> beyond this at dtests.
>
> The random testing in SAI is more focussed on the behaviour of the low-level
> index structures and flow of data to / from these. Using randomly generated
> values in tests has proved invaluable in highlighting edge conditions in the
> code. This above library was only added to provide us with a rich set of
> random generators. I am happy to look at removing this library if its
> inclusion is contentious.
>
>
> On Mon, 12 Dec 2022 at 19:41, David Capwell <dcapw...@apple.com> wrote:
>> com.carrotsearch.randomizedtesting.randomizedtesting-runner 2.1.2 - test
>> dependency
>
> Can you talk more about why? There are several ways to do random testing
> in-tree ATM, so wondering why we need another one
>
>
>> On Dec 8, 2022, at 6:51 AM, Mike Adamson <madam...@datastax.com> wrote:
>>
>> Hi,
>>
>> I wanted to discuss the addition of the following dependencies for CEP-7.
>> The dependencies are:
>>
>> org.apache.lucene.lucene-core 7.5.0
>> org.apache.lucene.lucene-analyzers-common 7.5.0
>> com.carrotsearch.randomizedtesting.randomizedtesting-runner 2.1.2 - test
>> dependency
>>
>> Lucene is an apache project so is licensed APL2. Carrotsearch is not an
>> apache project but is licensed APL2
>>
>> We are also removing the dependency on com.github.rholder.snowball-stemmer.
>> This library is used by SASI stemming filters but a later version of the
>> same library is available in the lucene libraries.
>>
>> Does anyone have any concerns about these changes?
>>
>> Mike Adamson
>
> --
> Mike Adamson
> Engineering