[
https://issues.apache.org/jira/browse/SVN-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242752#comment-15242752
]
Stefan Hett commented on SVN-4554:
----------------------------------
I looked into the state of 1.7 and 1.8 and found that the issue is still present
in the latest bugfix releases there (i.e. 1.7.22 and 1.8.15).
Proposed patch for 1.7: ^/subversion/branches/1.7.x-issue4554
Proposed patch for 1.8: ^/subversion/branches/1.8.x-issue4554-v2
Obviously, since 1.7 is no longer supported, it won't be fixed there.
> Wrong file length with PLAIN representations in FSFS
> ----------------------------------------------------
>
> Key: SVN-4554
> URL: https://issues.apache.org/jira/browse/SVN-4554
> Project: Subversion
> Issue Type: Bug
> Components: libsvn_fs_fs
> Affects Versions: 1.8.x
> Reporter: Stefan Fuhrmann
> Assignee: Stefan Fuhrmann
> Priority: Blocker
> Fix For: 1.9.0
>
>
> FSFS allows PLAIN, i.e. non-deltified non-compressed, representations to
> store the content length as 0, i.e. to omit it, because it equals the on-disk
> size. Up to and including 1.8.x, there is no restriction on the representation
> type for that omission.
> In practice, however, it can be difficult to decide whether a 0 value
> represents an omission or an actually empty file: A self-deltified empty file
> has a length of 0 but a 4-byte on-disk size. When representations are read,
> their header tells us whether it is indeed a PLAIN or a DELTA representation
> and that is enough to resolve any ambiguity.
> The problem occurs when we omit the length value for file contents and call
> {{svn_fs_file_length()}} on it. FSFS will report the length as 0 and that
> causes e.g. 'svnadmin dump' to write broken dump files where the skipped /
> empty contents do not match the checksum.
> Up to 1.7.x, we used this omission rule only for hash data, i.e. props and
> directories, never for file contents. Thus, there is no problem with these
> Subversion releases. Starting with 1.9.0, the "structure" document
> explicitly restricts the omission to property reps. Furthermore, 1.9+ will
> not omit length values at all.
> 3rd party implementations like SVNKit (?) may have produced instances of
> omitted length values for file contents, though. We need to handle those
> correctly and extend the API implementation accordingly.
> Moreover, 1.8.x generalized the rep sharing mechanism. If a file's contents
> happened to match a property representation, e.g. "END\n", it would now use
> the property representation. The latter is PLAIN by default and stores a 0
> length value in the rep cache. Hence, the file contents rep will also report
> a 0 length.
> Reproduction sketch (requires 1.8.x):
> * Create repo. Keep rep sharing on and prop deltification off.
> * Add empty file, set prop on file and commit.
> * Remove prop on file and commit.
> * Set file contents to "END\n" and commit.
> * Run 'svn ls -v' on the parent folder => file length is shown as 0
> The following things need to be fixed:
> * Don't omit the length value - even for properties.
> That prevents new instances caused by incoming data. (already fixed in 1.9;
> fix for 1.8 still needed.)
> * Update the "structure" document with info when the omission is safe.
> * Compare the size and length values of the rep returned by the rep cache
> with the data of the new rep. Only replace new with old if those match. This
> prevents new instances caused by the rep cache.
> * Fix {{svn_fs_fs__file_length()}} to return 0 lengths only for files that
> are known empty. This will be the actual bug fix.
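
The last two fixes in the list above can be illustrated with a minimal sketch.
It is written in Python rather than the C of libsvn_fs_fs and is not taken from
the Subversion sources. The names Rep, resolve_length() and
can_reuse_cached_rep() are hypothetical, and the on-disk size of 13 for the
deltified "END\n" file is an invented value used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rep:
    """Simplified model of an FSFS representation (hypothetical, not the real struct)."""
    kind: str           # "PLAIN" or "DELTA", as read from the rep header
    size: int           # on-disk size in bytes
    expanded_size: int  # stored content length; 0 may mean "omitted"


def resolve_length(rep: Rep) -> int:
    """Return the content length, treating 0 as 'empty' only when that is certain."""
    if rep.expanded_size != 0:
        return rep.expanded_size   # length was stored explicitly
    if rep.kind == "PLAIN":
        return rep.size            # omitted length: content length equals on-disk size
    return 0                       # DELTA with length 0: a genuinely empty file


def can_reuse_cached_rep(new: Rep, cached: Rep) -> bool:
    """Only let the rep cache replace a new rep if size and length both match."""
    return (new.size, new.expanded_size) == (cached.size, cached.expanded_size)


# Self-deltified empty file: length 0, but 4 bytes on disk.
empty_file = Rep(kind="DELTA", size=4, expanded_size=0)

# PLAIN property rep holding "END\n" with the length omitted (stored as 0).
prop_rep = Rep(kind="PLAIN", size=4, expanded_size=0)

# New file rep with contents "END\n"; the on-disk size 13 is a made-up value.
new_file_rep = Rep(kind="DELTA", size=13, expanded_size=4)
```

With these values, resolve_length(prop_rep) yields 4 rather than 0, while the
empty file still resolves to 0, and can_reuse_cached_rep(new_file_rep, prop_rep)
rejects the cached property rep because size and length differ.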