[ 
https://issues.apache.org/jira/browse/SVN-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Zhakov updated SVN-4554:
-----------------------------
    Fix Version/s: 1.8.16

> Wrong file length with PLAIN representations in FSFS
> ----------------------------------------------------
>
>                 Key: SVN-4554
>                 URL: https://issues.apache.org/jira/browse/SVN-4554
>             Project: Subversion
>          Issue Type: Bug
>          Components: libsvn_fs_fs
>    Affects Versions: 1.8.x
>            Reporter: Stefan Fuhrmann
>            Assignee: Stefan Fuhrmann
>            Priority: Blocker
>             Fix For: 1.8.16, 1.9.0
>
>
> FSFS allows PLAIN, i.e. non-deltified non-compressed, representations to 
> store the content length as 0 as it matches the on-disk size.  Up to and 
> including 1.8.x, there is no restriction on the representation type for that 
> omission.
> In practice, however, it can be difficult to decide whether a 0 value 
> represents an omission or an actually empty file: A self-deltified empty file 
> has a length of 0 but an 4 byte on-disk size.  When representations are read, 
> their header tells us whether it is indeed a PLAIN or a DELTA representation 
> and that is enough to resolve any ambiguity.
> The problem occurs when we omit the length value for file contents and call 
> {{svn_fs_file_length()}} on it.  FSFS will report the length as 0 and that 
> causes e.g. 'svnadmin dump' to write broken dump files where the skipped / 
> empty contents does not match the checksum.
> Up to 1.7.x, we used this omission rule only for hash data, i.e. props and 
> directories, never for file contents.  Thus, there is no problem with these 
> Subversion releases.  Starting 1.9.0 and up, the "structure" document 
> explicitly restricts the omission to property reps.  Furthermore, 1.9+ will 
> not omit length values at all.
> 3rd party implementations like SVNkit (?) may have produced instances of 
> omitted length values for file contents, though.  We need to handle those 
> correctly and extend the API implementation accordingly.
> Moreover, 1.8.x generalized the rep sharing mechanism.  If a file contents 
> happened to match a property representation, e.g. "END\n", it would now use 
> the property representation.  The latter is PLAIN by default and stores a 0 
> length value in the rep cache.  Hence, the file contents rep will also report 
> a 0 length.
> Reproduction sketch (requires 1.8.x):
> * Create repo. Keep rep sharing on and prop deltification off.
> * Add empty file, set prop on file and commit.
> * Remove prop on file and commit.
> * Set file contents to "END\n" and commit.
> * run 'svn ls -v' on the parent folder => file length is shown as 0
> The following things need to be fixed:
> * Don't omit the length value - even for properties.
>   That prevents new instances caused by incoming data. (already fixed in 1.9; 
> fix for 1.8 still needed.)
> * Update the "structure" document with info when the omission is safe.
> * Compare the size and length values of the rep returned by the rep cache   
> with the data of the new rep.  Only replace new with old if those match. The 
> prevents new instances caused by the rep cache.
> * Fix {{svn_fs_fs__file_length()}} to return 0 lengths only for files that   
> are known empty.  This will be the actual bug fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to