> starting a new node with the same id as an existing live node will cause a collision
Is this not fixed if we add a simple collision check for existing host ids? We can file a bug report and add this check, which should be fairly straightforward.

> it would be pretty untenable to base any new/improved cluster membership or data placement implementations on host id if the system isn't in control of assigning those.

Do we intend to encode any information in the host UUID in the near future? If not, I don't see why we can't just keep treating these as permanent opaque UUIDs, as they always have been. We can always remove this if we change the host identifier to be something else in the future.

On Wed, 27 Apr 2022 at 14:38, Sam Tunnicliffe <s...@beobal.com> wrote:

> Like I mentioned, the possibility of easily introducing divergent views of
> the ring between live nodes is pretty dangerous, e.g. starting a new node
> with the same id as an existing live node will cause a collision. The
> existing node will not add the new node to the ring (although it will
> remain in gossip). Other nodes will remove the existing node from token
> metadata, but won't mark it down. There's no requirement for the new node
> to have the same tokens as the existing one either, so the topology has
> just completely changed without any constraints or movement of existing
> data. Subsequent reads and writes will be directed to different replica
> sets, depending on which coordinator they land on. The ownership of the
> host id as well as the status of nodes in the token metadata of peers will
> continue to flap if those nodes go down and come back up, as the resolution
> of who rightfully owns the host id is decided at startup time.
>
> As for things further down the line, it would be pretty untenable to base
> any new/improved cluster membership or data placement implementations on
> host id if the system isn't in control of assigning those.
> So even if only a handful of power users might actually make use of the
> feature, its very existence would constrain what we can assume/assert about
> host ids going forward. Given that drawback, the fact that this is a very
> niche feature makes it even less compelling.
>
> On 27 Apr 2022, at 18:20, Paulo Motta <pauloricard...@gmail.com> wrote:
>
> Fully agree we should add a collision check but I don't understand why
> this optional feature is bad/dangerous after we add this ability? Can you
> provide an example of a potential issue?
>
> I don't expect this property to be used by most users, except power users
> who normally know what they're doing. We have tons of potentially
> dangerous knobs and I don't get why this particular one is any different.
>
> On Wed, 27 Apr 2022 at 14:05, Sam Tunnicliffe <s...@beobal.com> wrote:
>
>> CASSANDRA-14582 added support for users to supply an arbitrary value for
>> HOST_ID when booting a new node. IMO it's a pretty bad and potentially
>> dangerous idea for the unique identifier to be settable in this way. Hint
>> delivery is already routed by host id and there have been several JIRAs
>> which have called for more fundamental reworking of cluster metadata using
>> permanent opaque identifiers rather than IPs to address members
>> (CASSANDRA-11559, CASSANDRA-15823, etc). Using host id for anything like
>> that in future would be made much more difficult with this capability.
>>
>> Aside from the longer term implications, it seems that the feature as
>> currently implemented has some issues. There doesn't appear to be any
>> validation that a supplied host id isn't already in use by a live node, so
>> it's trivial to trigger a collision which can lead to divergent ring views
>> between nodes and ultimately to data loss.
>>
>> Although this landed in trunk almost 11 months ago it hasn't been
>> included in a release yet, so I propose we revert it before cutting 4.1
>> (although, as the revert isn't a feature, I guess technically we could do
>> that during the freeze). I'm not completely convinced about encoding
>> metadata into host ids, but even if that is something we want to do, I
>> don't think it's wise to completely remove control over the identifiers
>> from Cassandra itself.
>>
>> Thanks,
>> Sam
>>
>> On 25 Apr 2022, at 16:17, Ekaterina Dimitrova <e.dimitr...@gmail.com>
>> wrote:
>>
>> Hi everyone,
>>
>> Kind reminder that 1st May is around the corner. What does this mean? Our
>> code freeze starts on 1st May and my understanding is that only bug fixing
>> can go into the 4.1 branch.
>> If anyone has anything to raise, now is a good time. On my end I saw a
>> few things for this week that we should probably put to completion:
>> - CASSANDRA-17571 <https://issues.apache.org/jira/browse/CASSANDRA-17571> -
>> I have to close this one, it is in progress; new types in Config is good to
>> be in before the freeze I guess, even if it is not a yaml change
>> - CASSANDRA-17557 <https://issues.apache.org/jira/browse/CASSANDRA-17557> -
>> we need to take care of the parameters so we don't have to deprecate and
>> support anything not actually needed; I think it is probably more or less
>> done
>> - CASSANDRA-17379 <https://issues.apache.org/jira/browse/CASSANDRA-17379> -
>> adds a new flag around config; I think it is more or less done, depends on
>> final CI and second reviewer maybe needed?
>> - JMX intercept Cassandra exceptions, I think David mentioned a rebase
>> was needed
>> - CASSANDRA-17212 - The config property minimum_keyspace_rf and its
>> nodetool getter and setter commands are new to 4.1.
>> They are suitable to be ported to guardrails, and if we do this port in
>> 4.1 we won't need to deprecate that property and nodetool commands in the
>> next release, just one release after their introduction.
>>
>> I guess the failing tests we see could be fixed after the freeze but no
>> API changes.
>>
>> Thanks everyone for all the hard work. Please don’t hesitate to raise the
>> flag with questions, concerns or any help needed.
>>
>> Best regards,
>> Ekaterina
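
For illustration only, the collision check discussed in this thread (rejecting a user-supplied host id that a live node already owns) could be sketched roughly as below. This is a toy Python model, not Cassandra code: `validate_host_id`, `HostIdCollisionError` and the `live_host_ids` mapping are hypothetical names standing in for whatever view of the ring a booting node has learned from its peers.

```python
import uuid


class HostIdCollisionError(Exception):
    """Raised when a supplied host id is already owned by another live endpoint."""


def validate_host_id(supplied, live_host_ids, local_endpoint):
    """Return the supplied host id as a UUID, or raise on a collision.

    live_host_ids is a hypothetical mapping of endpoint address -> host id
    (uuid.UUID) describing the nodes this node currently believes are live.
    """
    host_id = uuid.UUID(str(supplied))
    for endpoint, existing in live_host_ids.items():
        # A node restarting with its own id is fine; only another live
        # endpoint owning the same id counts as a collision.
        if existing == host_id and endpoint != local_endpoint:
            raise HostIdCollisionError(
                f"host id {host_id} is already in use by live node {endpoint}"
            )
    return host_id


# Usage: a two-node ring, with a third node trying to join.
ring = {"10.0.0.1": uuid.UUID(int=1), "10.0.0.2": uuid.UUID(int=2)}
validate_host_id(uuid.UUID(int=3), ring, "10.0.0.3")  # accepted, id is unused
```

A check like this is only as good as the ring view it is run against, so it would not by itself address the longer-term concern raised above about the system not controlling host id assignment.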