sbp opened a new issue, #639:
URL: https://github.com/apache/tooling-trusted-releases/issues/639
Consider a new release on ATR with no files, `{}`, in r1. Alice, a release
manager, uploads file `A` giving `{A}` in r2 which is based on the base
revision r1. Unbeknownst to her, Bob, another release manager, uploads file `B`
simultaneously, giving `{B}` in r3, which is also based on the base revision
r1. Bob's r3, the new revision, is _not_ based on Alice's r2, the prior
revision, because her upload process did not finish before Bob started his
upload. ATR currently resolves this conflict by taking the state of the latest
revision as the outcome, so the release ends up containing `{B}`.
What is the consequence of this? Alice checks the file list and sees `{B}`.
But she just uploaded `A`. Where did `B` come from? This situation is made more
likely when release managers use unattended scripts; the use of scripts could
even cause merge conflicts between two of a single user's own changes. One
potential solution to #628 exacerbates the problem still further, as it uses a
new revision for every file in a concurrent JS upload.
The obvious solution here is to apply 3-way file merges where it is possible
to do so automatically. When a revision is created, it knows the revision that
it is based on. It also knows the revision serial number that it has been
allocated, and if it is not consecutive with the base revision then it knows
that there must have been at least one intervening revision, the most recent of
which is the prior revision. In the example above, when `{B}` is being
published it would see that its base is r1, but it's being allocated r3, and it
can look up the content of r1 and the prior revision r2 and find that they are
`{}` and `{A}` respectively. Since there is no conflict here, it can
effectively rebase on `{A}`, to obtain `{A, B}`.
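
The detection step described above can be sketched as follows. This is only an illustration, not ATR's code: the in-memory `revisions` mapping and the helper names are hypothetical, and it assumes serial numbers are allocated consecutively so that the prior revision is always the one numbered just below the new one.

```python
def needs_merge(base_serial: int, new_serial: int) -> bool:
    """True when at least one revision landed between the base and the new one."""
    return new_serial - base_serial > 1


def prior_serial(new_serial: int) -> int:
    """Assumes consecutive serials: the prior revision directly precedes the new one."""
    return new_serial - 1


# Hypothetical in-memory store: serial number -> set of file names.
revisions = {1: set(), 2: {"A"}}

base, new = 1, 3      # Bob's upload is based on r1 but is allocated r3
uploaded = {"B"}      # the content Bob's revision would have had on its own

if needs_merge(base, new):
    prior = revisions[prior_serial(new)]
    # In this example no path is touched by both sides, so a plain union
    # is conflict-free. The general per-path rules are given by the tables.
    merged = prior | uploaded
else:
    merged = uploaded

revisions[new] = merged
```

With these inputs, `revisions[3]` ends up as `{"A", "B"}` rather than `{"B"}`.
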
For simplicity, we can consider an automatic merge possible for a path as long
as the base and new revisions agree on it, even when the prior revision differs. In order
to do this, we need to keep a list of file hashes for the latest revision and
any revisions which are being used as the base for new revisions. We can still
use inodes for efficiency, but will have to consult the hashes in some cases.
Adding this list of file hashes to the database will be tracked by a separate
issue.
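
As a rough idea of what such a list of file hashes could look like, the sketch below maps each relative path in a revision directory to a digest. The function name, the directory-per-revision layout, and the choice of SHA-256 are all assumptions made for illustration; the issue does not specify them.

```python
import hashlib
from pathlib import Path


def file_hashes(revision_dir: Path) -> dict[str, str]:
    """Map each file's path, relative to revision_dir, to its SHA-256 hex digest."""
    hashes = {}
    for path in sorted(revision_dir.rglob("*")):
        if path.is_file():
            rel = path.relative_to(revision_dir).as_posix()
            # Reads the whole file into memory; a real implementation
            # would probably hash large artifacts in chunks.
            hashes[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return hashes
```

Two revisions then agree at a path exactly when both mappings hold the same digest for it, or both lack the path.
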
The specific algorithm will be as defined by the following tables. B, P, and
N refer to the base, prior, and new revisions respectively. The values b, p,
and n refer to the hashes of files first introduced in B, P, or N respectively
for a specific path, and a hyphen means that the file is absent at that path.
In other words, these tables apply to each individual path. The cases are
numbered because the general principles regarding them will be summarised
afterwards.
**No intervening change, so no action needed:**
| # | B | P | N | Result |
|---|---|---|---|--------|
| 1 | - | - | - | - |
| 2 | - | - | n | n |
| 3 | b | b | - | - |
| 4 | b | b | n | n |
| 5 | b | b | b | b |
**Identical intervening change, so no action needed:**
| # | B | P | N | Result |
|---|---|---|---|--------|
| 6 | - | p | p | p |
| 7 | b | - | - | - |
| 8 | b | p | p | p |
**Automatically resolved by merge:**
| # | B | P | N | Result | Resolution action |
|---|---|---|---|--------|--------|
| 9 | - | p | - | p | Hard link to p in P |
| 10 | b | - | b | - | Remove b from N |
| 11 | b | p | b | p | Replace b in N with hard link to p in P |
**Conflict, so N wins:**
| # | B | P | N | Result |
|---|---|---|---|--------|
| 12 | - | p | n | n |
| 13 | b | - | n | n |
| 14 | b | p | - | - |
| 15 | b | p | n | n |
In 1-5, B and P agree so N wins. In 6-8, P and N agree so P and N win. In
9-11, B and N agree, so P wins. In 12-15, nobody agrees so N wins.
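
The four rules above translate fairly directly into code. The following is a minimal sketch, assuming each revision is represented as a mapping from path to file hash, with `None` standing in for the hyphen. The names `merge_path` and `merge_revision` are hypothetical.

```python
from typing import Optional

Hash = Optional[str]  # None plays the role of the hyphen: no file at this path


def merge_path(b: Hash, p: Hash, n: Hash) -> Hash:
    """Resolve one path given its hash in the base, prior, and new revisions."""
    if b == p:
        return n  # cases 1-5: no intervening change, so N wins
    if p == n:
        return p  # cases 6-8: identical intervening change
    if b == n:
        return p  # cases 9-11: N did not touch the path, so P wins
    return n      # cases 12-15: conflict, so N wins


def merge_revision(
    base: dict[str, str], prior: dict[str, str], new: dict[str, str]
) -> dict[str, str]:
    """Apply merge_path to every path that appears in any of the three revisions."""
    merged = {}
    for path in base.keys() | prior.keys() | new.keys():
        result = merge_path(base.get(path), prior.get(path), new.get(path))
        if result is not None:
            merged[path] = result
    return merged
```

For the opening example, `merge_revision({}, {"A": "ha"}, {"B": "hb"})` gives `{"A": "ha", "B": "hb"}`: path `A` falls under case 9 and path `B` under case 2.
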
There are some edge cases when a move is performed. Let's say that P moves a
file from path X to path Y, and N does nothing. In this case, the current
behaviour of "N wins all" means that the file remains at X and P's changes are
lost. In the new model, the file moves to Y correctly. But if P moves a file
from path X to path Z, and N moves a different file from path Y to the same
path Z, the current behaviour is that X still exists and Y is moved to Z. Under
the new merge algorithm, Y would be moved to Z but X would no longer exist.
This is probably worse, but these and similar edge cases concerning moved files
are all arguable.
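
To make the two move scenarios concrete, here is a small sketch that models revisions as path-to-hash mappings. The `merge` helper is a hypothetical, compact restatement of the per-path rule from the tables (P wins only where B and N agree, otherwise N wins), and the hash values are placeholders.

```python
def merge(base: dict, prior: dict, new: dict) -> dict:
    """Per-path rule from the tables: P wins where B and N agree, else N wins."""
    merged = {}
    for path in base.keys() | prior.keys() | new.keys():
        b, p, n = base.get(path), prior.get(path), new.get(path)
        result = p if b == n else n
        if result is not None:
            merged[path] = result
    return merged


# Scenario 1: P moves X to Y, and N changes nothing.
simple_move = merge(
    base={"X": "h1"},
    prior={"Y": "h1"},
    new={"X": "h1"},
)

# Scenario 2: P moves X to Z, while N moves a different file from Y to Z.
colliding_move = merge(
    base={"X": "h1", "Y": "h2"},
    prior={"Y": "h2", "Z": "h1"},
    new={"X": "h1", "Z": "h2"},
)
```

Here `simple_move` is `{"Y": "h1"}`, so P's move survives. `colliding_move` is `{"Z": "h2"}`: N's file lands at Z and the content that started at X is gone, whereas the current "N wins all" behaviour would leave `{"X": "h1", "Z": "h2"}`.
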
In any case, if N introduces a file, that file is always introduced, as
guaranteed by the tables, and that is probably the most important aspect. Cases
9-11 are the only new behaviours that need to be introduced to resolve this
issue, because in all other cases N wins, which is also the currently
implemented behaviour.
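
The resolution actions for cases 9-11 could look roughly like the sketch below. It assumes each revision is a directory on the same filesystem, so that hard links between them are possible. `apply_prior_change` is a hypothetical helper, not an existing ATR function.

```python
import os
from pathlib import Path


def apply_prior_change(prior_dir: Path, new_dir: Path, rel_path: str) -> None:
    """Make new_dir match prior_dir at rel_path (intended for cases 9-11 only)."""
    src = prior_dir / rel_path
    dst = new_dir / rel_path
    if dst.exists():
        # Cases 10 and 11: remove b from N. If b is itself a hard link,
        # this only drops N's directory entry, not the copy in B.
        dst.unlink()
    if src.exists():
        # Cases 9 and 11: hard link to p in P.
        dst.parent.mkdir(parents=True, exist_ok=True)
        os.link(src, dst)
```

The caller is expected to invoke this only for paths where the base and new revisions agree and the prior revision differs. For every other case the new revision is left untouched.
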
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]