Just a couple of observations on the comments.
+ /* EXPANDED_SIZE is 0. If the MD5 does not match the one for empty
+ * contents, we know that we need to correct EXPANDED_SIZE. */
+ empty_md5 = svn_checksum_empty_checksum(svn_checksum_md5, scratch_pool);
+ if (memcmp(empty_md5->digest, rep->md5_digest, sizeof(rep->md5_digest)))
+ {
+ rep->expanded_size = rep->size;
+ return SVN_NO_ERROR;
+ }
The comment could be clearer. At every step though this function we
"know that we need to correct EXPANDED_SIZE", if "correct" can mean
either "change to a correct value" or "confirm that 0 is the correct
value".
I suggest:
/* EXPANDED_SIZE is 0. If the MD5 does not match the one for empty
* contents, the content is definitely non-empty. It must be a PLAIN rep, as
* EXPANDED_SIZE is always set correctly in a delta rep. Therefore the
* rep SIZE field is also the expanded size. */
+ /* Only two cases are left here.
+ * (1) A non-empty PLAIN rep with a MD5 collision on EMPTY_MD5.
+ * (2) An empty DELTA rep. */
+
+ /* SVN always stores an empty DELTA rep as an empty sequence of txdelta
+ * windows, i.e. as "SVN\1". In that case, SIZE is 4 bytes. There is
[...]
+ * Note that it technically legal to have DELTA reps with a 0 length
+ * output window. Their on-disk size would be longer. [...]
The phrase "empty DELTA rep" is ambiguous. I suggest:
[...] (2) A DELTA rep with zero-length output. [...] SVN always stores
a DELTA rep with zero-length output as [...]
- Julian