I have also recently worked with a team that lost critical data as a result
of gossip issues combined with collisions in our token allocation. I
haven’t filed a JIRA yet as it slipped my mind, but I’ve seen it in my own
testing as well. I’ll get a JIRA in describing it in detail.

It’s severe enough that it should probably block 5.0.

Jon

On Thu, May 16, 2024 at 10:37 AM Jordan West <jw...@apache.org> wrote:

> I’m a big +1 on 18917 or more testing of gossip. While I appreciate that
> it makes TCM more complicated, gossip and schema propagation bugs have been
> the source of our two worst data loss events in the last 3 years. Data loss
> should immediately cause us to evaluate what we can do better.
>
> We will likely live with gossip for at least 1, maybe 2, more years.
> Otherwise, outside of bug fixes (and to some degree even still), I think the
> only other solution is to not touch gossip *at all* until we are all
> TCM-only, which I don’t think is practical or realistic. Recent changes to
> gossip in 4.1 introduced several subtle bugs that had serious impact (from
> data loss to loss of ability to safely replace nodes in the cluster).
>
> I am happy to contribute some time to this if lack of folks is the issue.
>
> Jordan
>
> On Mon, May 13, 2024 at 17:05 David Capwell <dcapw...@apple.com> wrote:
>
>> So, I created https://issues.apache.org/jira/browse/CASSANDRA-18917 which
>> lets you do deterministic gossip simulation testing across large clusters
>> within seconds… I stopped this work as it conflicted with TCM (they were
>> trying to merge that week) and it hit issues where some nodes never
>> converged… I didn’t have time to debug so I had to drop the patch…
>>
>> This type of change would be a good reason to resurrect that patch as
>> testing gossip is super dangerous right now… its behavior is only in a few
>> people’s heads and even then it’s just bits and pieces scattered across
>> multiple people (and likely missing pieces)…
>>
>> My brain is far too fried right now to say your idea is safe or not, but
>> I honestly feel that we would need to improve our tests (we have 0) before
>> making such a change…
>>
>> I do welcome the patch though...
>>
>>
>> On May 12, 2024, at 8:05 PM, Zemek, Cameron via dev <
>> dev@cassandra.apache.org> wrote:
>>
>> In looking into CASSANDRA-19580 I noticed something that raises a
>> question. Gossip SYN handling doesn't check for missing digests. If the
>> digest list is empty for a shadow round, it will add everything from
>> endpointStateMap to the reply. But why not include missing entries in
>> normal replies? The branching for reply handling of SYN requests could then
>> be merged into a single code path (though the shadow round handles empty
>> state differently with CASSANDRA-16213). A potential downside is the
>> performance impact, as this requires doing a set difference.
>>
>> For example, something along the lines of:
>>
>> ```
>>         // Endpoints we know about that the sender's digest list did not mention
>>         Set<InetAddressAndPort> missing = new HashSet<>(endpointStateMap.keySet());
>>         missing.removeAll(gDigestList.stream().map(GossipDigest::getEndpoint).collect(Collectors.toSet()));
>>         // Add a (generation 0, version 0) digest for each missing endpoint
>>         for (InetAddressAndPort endpoint : missing)
>>         {
>>             gDigestList.add(new GossipDigest(endpoint, 0, 0));
>>         }
>> ```
>>
>> It seems odd to me that after the shadow round a new node has an
>> endpointStateMap with only itself as an entry. Then the only way it gets
>> the gossip state is by another node choosing to send the new node a gossip
>> SYN. The choice is random. Yes, this happens every second, so
>> eventually it's going to receive one (outside the issue of CASSANDRA-19580,
>> where it doesn't if it's in a dead state like hibernate), but doesn't this
>> open up bootstrapping to failures on very large clusters, as it can take
>> longer before it's sent a SYN (as the odds of being chosen for SYN get
>> lower)? For years I've been seeing bootstrap failures with 'Unable to contact
>> any seeds', but they are infrequent and I've never been able to figure out
>> how to reproduce them in order to open a ticket. I wonder if some of them
>> have been due to not receiving a SYN message before the seenAnySeed check
>> runs.
>>
>>
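For anyone who wants to experiment with the set-difference idea from Cameron's message outside the Cassandra code base, here is a minimal standalone sketch. The `Digest` record and the string endpoints are hypothetical stand-ins for `GossipDigest` and `InetAddressAndPort`, and `withMissing` is an illustrative name, not an existing Cassandra method.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins: Digest mimics GossipDigest, and endpoints are
// plain strings instead of InetAddressAndPort.
final class DigestFillSketch
{
    record Digest(String endpoint, int generation, int maxVersion) {}

    // Returns the received digests followed by a (0, 0) digest for every
    // locally known endpoint that the sender did not mention.
    static List<Digest> withMissing(List<Digest> received, Set<String> knownEndpoints)
    {
        // Set difference: known endpoints minus endpoints present in the SYN
        Set<String> missing = new HashSet<>(knownEndpoints);
        for (Digest d : received)
            missing.remove(d.endpoint());

        List<Digest> result = new ArrayList<>(received);
        for (String endpoint : missing)
            result.add(new Digest(endpoint, 0, 0));
        return result;
    }

    public static void main(String[] args)
    {
        Set<String> known = Set.of("10.0.0.1", "10.0.0.2", "10.0.0.3");
        List<Digest> received = List.of(new Digest("10.0.0.1", 5, 12));
        // Order of the appended digests is unspecified (HashSet iteration)
        System.out.println(withMissing(received, known));
    }
}
```

The extra work per SYN in this sketch is one pass over the known endpoints plus one pass over the received digests, which is the performance cost the message mentions.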
>>
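The question about the odds of being chosen for a SYN on large clusters can be explored with a toy probability model. This is only a sketch under loud assumptions: `senders` nodes already know about the new node, each sends one SYN per round to a uniformly random choice among `candidates` endpoints, and all choices are independent. Cassandra's real target selection is more involved, so the numbers are illustrative only; `probNoSyn` is a hypothetical helper.

```java
// Toy model, NOT Cassandra's actual gossip target selection.
final class SynWaitSketch
{
    // Probability that the new node receives no SYN at all in `rounds`
    // rounds, assuming each of `senders` nodes independently picks one of
    // `candidates` endpoints uniformly at random per round.
    static double probNoSyn(int senders, int candidates, int rounds)
    {
        if (senders < 0 || candidates < 1 || rounds < 0)
            throw new IllegalArgumentException("senders, rounds >= 0 and candidates >= 1 required");
        double missPerSyn = 1.0 - 1.0 / candidates;
        return Math.pow(missPerSyn, (double) senders * rounds);
    }

    public static void main(String[] args)
    {
        int rounds = 30; // assumed waiting window, in gossip rounds
        for (int n : new int[] { 10, 100, 1000 })
        {
            // Case 1: only one node knows the new node; case 2: all n do
            System.out.printf("n=%d  one sender: %.4f  n senders: %.6f%n",
                              n, probNoSyn(1, n, rounds), probNoSyn(n, n, rounds));
        }
    }
}
```

In this model the outcome depends heavily on how many nodes already know the new node. If only a handful do, the chance of hearing nothing grows with cluster size; if all n nodes do, the per-round miss probability (1 - 1/n)^n approaches 1/e regardless of n.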
