Re: Status of branches/pristine-checksum-salt

Branko Čibej Mon, 19 Jan 2026 22:29:25 -0800

On 20. 1. 26 00:16, Evgeny Kotkov via dev wrote:

Evgeny Kotkov writes:

With a fresh look, I think that in the current state we might want to
indeed have the full content comparison for pristineful working copies,
and only use checksum-based comparison for pristineless working copies
(as described in your response).

I'll see if I can put together a patch for this approach.

Please find the patch attached.

With the additional information I gained from the code, I realized that
while it may be possible to rely on the global WC state, it also introduces
a potential race condition and a non-transactional state dependency, where
the global settings could conflict with the state of the pristine we are
accessing transactionally.

The refined approach makes a decision based on the current state of an
individual pristine (which technically appears to be the correct source
of truth for this layer of operations), and uses bytewise comparison if
the pristine content is available.

If there are no objections, I could commit the patch shortly.



I have no objections at all, the patch looks good.

But I do have one question that's only somewhat related to the patch itself. In the new, refactored function compare_exact(), there's this explanation:


/* We don't have pristine contents.  To make the comparison work without
   it, let's check for two things:

   1) That the checksum of the detranslated contents matches the recorded
      pristine checksum, as in the case of a non-exact comparison, ...

   2) ...and additionally, that the contents of the working file does not
      change when retranslated according to its properties.

   Technically we're going to do that with a single read of the file
   contents, while checksumming it's original, detranslated and
   retranslated versions.
*/

The code then proceeds to compute three checksums: of the original working copy contents, of the untranslated contents, which we presume would be the pristine text, and of the retranslated contents.

I don't understand why we need two of these three checksums. We have the working stream and the retranslated stream: why not just do a byte-for-byte comparison between them instead of burning CPU cycles by computing checksums on exactly the same sets of data?

Not only is comparing the data much faster than computing its checksum, but if the original and retranslated streams differ, the comparison would stop early without having to read the whole file.
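To illustrate what I mean, here is a minimal sketch in plain C of a chunked byte-for-byte comparison that stops at the first mismatch. It is only an illustration: it uses stdio FILE handles rather than Subversion's stream API, and streams_equal() and CHUNK_SIZE are hypothetical names, not anything from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CHUNK_SIZE 4096  /* illustrative buffer size, not taken from the patch */

/* Hypothetical helper: compare two streams chunk by chunk.
   Returns 1 if the contents are identical, 0 if they differ,
   and -1 on a read error.  Stops reading at the first differing chunk. */
int
streams_equal(FILE *a, FILE *b)
{
  char buf_a[CHUNK_SIZE];
  char buf_b[CHUNK_SIZE];

  assert(a != NULL && b != NULL);

  for (;;)
    {
      /* fread() returns a short count only at end-of-file or on error. */
      size_t len_a = fread(buf_a, 1, sizeof(buf_a), a);
      size_t len_b = fread(buf_b, 1, sizeof(buf_b), b);

      if (ferror(a) || ferror(b))
        return -1;

      /* Different lengths or different bytes: bail out early. */
      if (len_a != len_b || memcmp(buf_a, buf_b, len_a) != 0)
        return 0;

      /* Both streams hit end-of-file in the same place. */
      if (len_a < sizeof(buf_a))
        return 1;
    }
}
```

In this sketch, a difference in the first chunk means nothing beyond that chunk is read from either stream, whereas a checksum-based check has to consume both streams to the end before it can say anything.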


-- Brane
