[
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17624364#comment-17624364
]
Alex Petrov commented on CASSANDRA-17240:
-----------------------------------------
{quote}I don't think that attributing desires with expressions such as "you
personally or members of your team" helps us in anything but creating conflict.
{quote}
Re-reading it now, my wording might have been suboptimal, so let me try to
rephrase. What I meant was that Harry adoption might have been seen as
unnecessary, or that something might have been preventing its adoption. There
was no attempt at attributing desires, rather the opposite - stating that
there is no (visible) inclination towards adoption. This was an unnecessary
assumption on my part. Regardless, there definitely was no bad intention in
what I was attempting to convey, on the contrary - I've offered my help with
Harry tests previously, and have repeated that offer in the last paragraph.
{quote}Are those tests publicly available?
{quote}
Since some of them are Transactional-Metadata-specific, I haven't posted them
just yet. I am working to make them available on trunk, which requires some
minor changes to them, along with a simple, two-command stress-like tool with
validation abilities.
{quote}However, eight months have passed and I can't find a single class
extending that FuzzTestBase
{quote}
Since I was mostly working on Transactional Metadata all that time, my
intention in pushing 16262 out was to help folks working on SAI to adopt it,
as was discussed in the cassandra-sai Slack channel. But most of the actual
fuzz-testing was as simple as creating clusters and running Harry with
different schemas and workloads.
{quote}CASSANDRA-16262 was meant to add fuzz testing for coordination and
replication. We had it as a blocker for 4.0 for some time, but we finally
released without it.
{quote}
The code that got merged into the Cassandra tree was intended to enable people
to write new tests, such as bootstrap/decommission and others, in-tree. The
fuzz testing itself was done by running Harry with different configurations
against Cassandra clusters. Even though most of these tests did not run on
Apache infrastructure, all issues found by them were published, and the
stability of 4.0 can, in part, be attributed to them, since several issues in
the storage engine might not have been triggered without them, or would have
been harder to find. Even
[ScyllaDB|https://github.com/apache/cassandra-harry/blob/trunk/scylla-usage.md]
folks are using Harry for validation of foundational functionality. So saying
that we released 4.0 without it is a bit unfair.
{quote}Marking any new features as experimental until they are tested with
Harry is mostly equivalent to force people to use it, isn't it?
{quote}
This heavily depends on the feature. With SAI - we would have to write several
new models. I can certainly help to write them; we can collaborate on what's
to be tested and find the best way to model SAI, which is not that hard. With
features like memtables - again, I'd say that just having a bake test and a
lengthy read/write workload would already give us quite a bit of confidence.
Besides, I'm not saying it absolutely has to be Harry; I did mention
"equivalent rigour" in my previous message: it could be any property-based,
model-backed integration testing tool that tests the feature not in isolation,
but tests the database's behaviour with this feature in place. Since Harry is
already available, I just think it makes sense to use it.
{quote}Maybe I'm missing some public, community-owned repo containing a
gazillion tests using Harry.
{quote}
The thing is, using Harry for testing is really as easy as calling
{{visitor.visit();}} in a loop, followed by {{model.validate();}}, with any
additional calls (such as streaming) you would like to do in between. And this
test is, in itself, a gazillion tests, since {{validate}} exercises paging,
single partition reads, reverse reads, slices, ranges, and so on, while
{{visit}} exercises partition deletions, range tombstones, etc.
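To make that concrete, here is a rough sketch of such a loop. Only
{{visitor.visit()}} and {{model.validate()}} map directly to the above; the
{{Visitor}}/{{Model}} shapes and the way the query to validate is produced are
simplified stand-ins, not the actual Harry API:
{code:java}
import java.util.function.Supplier;

// Rough sketch of a Harry-style fuzz loop. Only visitor.visit() and
// model.validate() are taken from the description above; the interface
// shapes and the query supplier are simplified placeholders.
public class HarryLoopSketch
{
    interface Visitor { void visit(); }              // assumed shape
    interface Model { void validate(Object query); } // assumed shape

    static void fuzz(Visitor visitor, Model model, Supplier<Object> queries)
    {
        for (long i = 0; i < 100_000; i++)
        {
            // Apply one more generated batch of mutations: writes,
            // partition deletions, range tombstones, etc.
            visitor.visit();

            // Interleave any additional calls you want to exercise here
            // (streaming, node bounce, flush, compaction, ...).

            // Periodically check every read path (paging, single-partition
            // reads, reverse reads, slices, ranges) against the model.
            if (i % 1_000 == 0)
                model.validate(queries.get());
        }
    }
}
{code}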
In case of trie-based memtables, I think we just need to run bake tests for
several hundred (or more) cluster-hours (which is, of course, parallelizable),
and make sure we trigger conditions such as sstable/memtable merges, range
tombstones, partition deletions, etc., as sketched below.
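A rough sketch of such a bake loop, reusing the simplified shapes from the
snippet above and leaving the flush mechanism (nodetool, JMX, in-JVM dtest -
whatever the harness exposes) as a placeholder:
{code:java}
import java.time.Duration;
import java.util.function.Supplier;

// Rough bake-test sketch, reusing the assumed Visitor/Model shapes from the
// previous snippet. How flushes are triggered is deliberately left as a
// Runnable placeholder, since it depends on the test harness used.
public class MemtableBakeSketch
{
    static void bake(HarryLoopSketch.Visitor visitor,
                     HarryLoopSketch.Model model,
                     Supplier<Object> queries,
                     Runnable flushAllNodes, // placeholder: e.g. nodetool flush on each node
                     Duration duration)
    {
        long deadline = System.nanoTime() + duration.toNanos();
        long i = 0;
        while (System.nanoTime() < deadline)
        {
            // The generated workload should include partition deletions and
            // range tombstones, so memtable and sstable data get merged on read.
            visitor.visit();

            // Flush periodically so reads cross the memtable/sstable boundary.
            if (++i % 10_000 == 0)
                flushAllNodes.run();

            // Validate all read paths against the model from time to time.
            if (i % 100_000 == 0)
                model.validate(queries.get());
        }
    }
}
{code}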
In order to test SAI we will actually need some new models, and the same goes
for transactions and transactional metadata, but for features as foundational
as memtables you don't even need to write any new code.
{quote}Those wanting to use them just used it, and that led others by example.
I'd suggest the same approach for Harry.
{quote}
Fair enough. My impression was that allowing people to _use_ the code would be
enough to let them test with it, but you're right, maybe it didn't go far
enough. I've put together a simple bake test for trie-based (or any other type
of) memtable that I'll do my best to publish soon.
> CEP-19: Trie memtable implementation
> ------------------------------------
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Memtable
> Reporter: Branimir Lambov
> Assignee: Branimir Lambov
> Priority: Normal
> Fix For: 4.2
>
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png,
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png,
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
> Time Spent: 13.5h
> Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].