>So if we moved to determanisticId for both branches of if / else and remove force, we should be good? RepairJobDesc#deterministicId is basically unique? The challenge with using RepairJobDesc#deterministicId in the SNAPSHOT_MSG is that it will create N snapshots for N token-splits for a table. However, CLEANUP_MSG is only invoked once for a table, complicating the cleanup of the snapshots.
Jaydeep On Tue, May 13, 2025 at 8:56 AM Štefan Miklošovič <smikloso...@apache.org> wrote: > That true / false for "force" argument seems to be purely derived from > whether it is global repair (force set to false) or not (set to true). > > > https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/repair/RepairMessageVerbHandler.java#L185-L192 > > On Tue, May 13, 2025 at 5:25 PM David Capwell <dcapw...@apple.com> wrote: > >> That method can be backported even if it’s just for this single case. For >> the vtable I needed something more friendly so made this for that. It’s >> also used for dedup logic in repair retry works. >> >> I didn’t look closely, what sets force to be true or false? I assumed >> this was the force flag in repair (as these are repair snapshots) but >> sounds like that isn’t the case? >> >> Sent from my iPhone >> >> On May 12, 2025, at 11:58 PM, Štefan Miklošovič <smikloso...@apache.org> >> wrote: >> >> >> David, >> >> that "force" you copied from nodetool, that has nothing to do with >> "force" in these snapshots but I think you already know that. >> >> org.apache.cassandra.repair.RepairJobDesc#determanisticId is only in 4.1+ >> included, not in 4.0. That's quite a bummer but we could limit the patch >> for 4.1+ only and leave 4.0 be. >> >> When we do that, then "force" will become redundant and we can perhaps >> remove it? >> >> How I look at it is that the introduction of "force" flag in >> TableRepairManager is the result of a developer trying to consolidate the >> name of snapshots from parent id / session id to both parent id / parent id >> but there was the realisation of that being problematic because else branch >> might be invoked multiple times so it would start to clash. So they said >> "ok and here we just override and take no matter what" but the >> implementation of that was not finished as it clearly fails. >> >> So if we moved to determanisticId for both branches of if / else and >> remove force, we should be good? RepairJobDesc#deterministicId is basically >> unique? I see it is created from parent id, session id, keyspace, table and >> ranges. I do not think there would ever be two cases of snapshots like that >> with exact same values. >> >> WDYT? >> >> Regards >> >> On Mon, May 12, 2025 at 6:55 PM David Capwell <dcapw...@apple.com> wrote: >> >>> > "force" can be true / false. When true, then it will not check if a >>> snapshot exists >>> >>> >>> My mental node for “force” was only to deal with down nodes, and nothing >>> to do with snapshots… so this feels like a weird behavior >>> >>> @Option(title = "force", name = {"-force", "--force"}, description = >>> "Use -force to filter out down endpoints”) >>> >>> Our documentation even only calls out dealing with downed enodes… >>> >>> >>> > a) we will remove existing snapshot and take a new one >>> >>> So this is racy and can make repairs brittle. The snapshot is at the >>> RepairSession (global) or RepairJob level (a subset of ranges for a single >>> table, different RepairJobs can work on the same table, just different >>> ranges). With this code path you also have 1 more variable at play that >>> makes things complex: isGlobal >>> >>> public boolean isGlobal() {return dataCenters.isEmpty() && >>> hosts.isEmpty();} >>> >>> Global does the full range, where as non-global covers the partial range >>> >>> So when you do host or dc this snapshot is isolated to the RepairJob and >>> not the session logically, but physically its isolated to the session; >>> which doesn’t make any sense. >>> >>> > 2) if we do not want to fix "force", then it is most probably just >>> redundant and we would remove it but it order to not fail, we would need to >>> go back to naming snapshots not parent id / parent id but parent id / >>> session id for global and non-global repair respectively (basically we >>> would return to pre-14116 behavior). >>> >>> org.apache.cassandra.repair.RepairJobDesc#determanisticId >>> >>> This is scoped to a single RepairJob, so wouldn’t have the issues with >>> concurrency. So if we were going to alter what the snapshot name is, I >>> would strongly prefer this (its also present in the repair vtable, so not >>> some hidden UUID) >>> >>> I personally feel the lowest risk is to switch to the job name and away >>> from the session name we had… I do wonder about removing this “force” >>> semantic as its not documented to users as far as I can tell, so is there >>> any place that defines this behavior? >>> >>> > we might just fix the trunk and keep the rest broken, not ideal but >>> >>> With some patches its hard to tell if something is a bug or a feature… >>> so altering the semantic is that making an improvement to allow better >>> combinations of repair options, or is it fixing a bug (aka worthy of a back >>> port)…. I know most users do parallel repair (default) so this snapshot >>> logic doesn’t impact most users, but its also why its been so long before >>> this issue got noticed… so coverage around snapshots currently isn’t the >>> best. >>> >>> Sorry for this reply, its less of an answer and more a dump while I >>> think through the problem you stated…. >>> >>> > On May 12, 2025, at 9:18 AM, Štefan Miklošovič <smikloso...@apache.org> >>> wrote: >>> > >>> > Hello, >>> > >>> > I am looking for advice for (1). There is quite a quality archaeology >>> to go through. I have asked around already (thanks for that!) but nobody >>> seems to ultimately remember / tell what it is for. >>> > >>> > There is (2). "force" can be true / false. When true, then it will not >>> check if a snapshot exists. If it exists and "force" is true, this has >>> never worked - that is what CASSANDRA-20490 is for. It would try to hard >>> link twice and it would error out. >>> > >>> > This "force" logic is called here (3). If ParentRepairSession (prs) >>> is global, "force" is false, but if it is not (not global means that some >>> dc was specified), then it is true. >>> > >>> > Notice that both branches of that if/else use >>> desc.parentSessionId.toString() as snapshots' name. >>> > >>> > This "force" logic was introduced in (4) but it seems to me that it >>> never worked. AFAICT this was just broken on delivery. >>> > >>> > One commit before (4), it looked like (5). The first branch of "if" >>> used parentSessionId as a snapshot name and it checked if it exists or not >>> before taking it, the else branch took sessionId (not parent session id) >>> and took a snapshot of that name without any check if it exists or not. >>> > >>> > From javadoc on TableRepairManager, "force" means: (introduced in (4) >>> CASSANDRA-14116) >>> > >>> > "If force is true, a snapshot should be taken even if one already >>> exists with that name.". >>> > >>> > Questions: >>> > >>> > 1) do we want to fix the behaviour of "force"? (patch 20490) >>> > if the answer is yes, what is the "semantics" about this. What does >>> "force" mean exactly? It can mean >>> > a) we will remove existing snapshot and take a new one >>> > b) we will add more SSTables onto existing snapshot, we just >>> hardlink more SSTables into same snapshot (that is what 20490 is doing) >>> > >>> > 2) if we do not want to fix "force", then it is most probably just >>> redundant and we would remove it but it order to not fail, we would need to >>> go back to naming snapshots not parent id / parent id but parent id / >>> session id for global and non-global repair respectively (basically we >>> would return to pre-14116 behavior). >>> > >>> > I think the introduction of "force" flag was there because there might >>> be multiple calls to "else" branch in RepairMessageVerbHandler and calling >>> it all over again with parentSessionId would be erroneous when already >>> there, so "force" was there for not caring and just taking a snapshot >>> again. If we go back to sessionId only, then we do not have this problem I >>> guess. >>> > >>> > Is it important for people to keep the name of a snapshot in both >>> cases (global / not global) equal to parentSessionId? Can we just go back? >>> > >>> > If we want to keep naming snapshots as done now, we might just do >>> 20490, I was trying to fix older branches as well but it is very difficult, >>> we might just fix the trunk and keep the rest broken, not ideal but ... I >>> am just reinventing a wheel in 4.0+, it is just too cumbersome / complex to >>> copy trunk's behaviour in 20490 to older stuff. >>> > >>> > Thanks and regards >>> > >>> > (1) https://issues.apache.org/jira/browse/CASSANDRA-20490 >>> > (2) >>> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/repair/CassandraTableRepairManager.java#L84 >>> > (3) >>> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/repair/RepairMessageVerbHandler.java#L185-L192 >>> > (4) https://issues.apache.org/jira/browse/CASSANDRA-14116 >>> > (5) >>> https://github.com/apache/cassandra/blob/478c1a9fdf027af0f373f9e26521803ae8775fdb/src/java/org/apache/cassandra/repair/RepairMessageVerbHandler.java#L106-L121 >>> >>>