Julian Foad wrote on Fri, Jun 25, 2021 at 14:27:30 +0100:
> Now filed as: https://issues.apache.org/jira/browse/SVN-4877
> 

"Now"?  The issue is a month old.

> THE UNWANTED BEHAVIOUR
> 
> The code we are concerned with is a section of the FSFS commit finalization 
> code, in commit_body() called from svn_fs_fs__commit(),  in 
> subversion/libsvn_fs_fs/transaction.c, around line 3800.
> 
> Within the following code section:
> 
>     1. Get a write handle on the proto revision file.
>     2. Complete the construction of the proto-rev file.
>     3. Move the finished rev file into place.
>     4. Unlock the proto-rev file.
>     5. Continue with the rest of the commit finalization.
> 
> If an error is thrown during steps 2 and 3, and the calling process re-tries 
> the commit, the second try fails with:
> 
>     160044 - Cannot write to the prototype revision file of transaction '1-2' 
> because a previous representation is currently being written by this process
>     at subversion/libsvn_fs_fs/transaction.c:312
> 
> What is happening:
> 
>   - when the initial error is encountered, FSFS returns the error;
>   - FSFS does not call unlock_proto_rev()...
>   - ... so the proto-rev lock file remains on disk, with a file-lock held on 
> it and the "being_written" flag remaining set in the transaction object in 
> memory;
>   - on re-try, when the code tries to acquire a proto-rev write lock, it 
> finds the lock is already held, and throws SVN_ERR_FS_REP_BEING_WRITTEN.
> 

Okay, so it fails with a wrong error message.

> Subversion itself doesn't care. It never re-tries at this level. It always 
> discards the transaction if an error is returned.
> 

Even under mod_dav_svn when the client is a generic DAV client?

> DESIRED BEHAVIOUR
> 
> The commit re-try should fail in exactly the same way as the first try, if 
> the underlying cause of the error persists.
> 
> Or, if the underlying cause of the error was transient and has gone away, 
> then the commit should succeed.
> 

I think it's possible that the second try would legitimately fail with
a different error message than the first try.

Thus, it's not clear what the desired _change_ in behaviour is.

> PROPOSED FIX
> 
> My proposed fix is to unlock the lock when exiting that code section, no 
> matter whether successful or throwing an error.
> 
> Pseudo-Python:
> 
>     try:
>         1. Get a write handle on the proto revision file.
>         2. Complete the construction of the proto-rev file.
>         3. Move the finished rev file into place.
>     finally:
>         4. Unlock the proto-rev file.
>     5. Continue with the rest of the commit finalization.
> 

Why would this be correct?  Why would this have no downside?

Reply via email to