Re: Re: Transitioning from existence-based lockfiles and /var/lock to flock

2025-10-18 Thread Josh Triplett
Russ Allbery wrote:
> Could you open a Policy bug?

Done: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1118371



Re: Transitioning from existence-based lockfiles and /var/lock to flock

2025-10-18 Thread Bastian Blank
On Mon, Oct 13, 2025 at 09:07:21PM +0200, Bill Allombert wrote:
> There are also issues with NFS to consider.

Only if we talk about things outside /dev; the original bug, at least,
was scoped to /dev.

But in this case flock() on NFS is already better than /var/lock on a
local tmpfs.  Okay, we could suddenly get overlocking, because flock
behaves differently.  It's also likely that the old behaviour is already
broken in this case.  Do you have an example of such behaviour?

Bastian

-- 
The face of war has never changed.  Surely it is more logical to heal
than to kill.
-- Surak of Vulcan, "The Savage Curtain", stardate 5906.5



Re: Transitioning from existence-based lockfiles and /var/lock to flock

2025-10-18 Thread Simon McVittie

On Mon, 13 Oct 2025 at 11:26:39 -0700, Russ Allbery wrote:
> Way back in the day, it used to matter whether you used flock or fcntl and
> one type of lock was potentially invisible to the other type of lock.


It still matters (usually). The devil is in the details: see my other 
mail to this thread for more on this.


smcv



Re: Transitioning from existence-based lockfiles and /var/lock to flock

2025-10-18 Thread Josh Triplett
Russ Allbery wrote:
> Way back in the day, it used to matter whether you used flock or fcntl and
> one type of lock was potentially invisible to the other type of lock. Has
> this been fixed, or is that still a concern?

It's still a concern, in that those are two different kinds of locks in
most circumstances. (In the broader universe of POSIX it's even less
specified, but in the narrower universe of things Debian supports we can
primarily focus on fcntl and flock.)

I intended to specify in the full text more precisely that there are
multiple kinds of locks, and it's okay to use one or the other as long
as all cooperating software agrees.

Generally speaking, I think very few programs actually want to use the
byte-range locking that fcntl can do, and that in practice the options
usually boil down to either flock or whole-file fcntl, which are roughly
equivalent but unfortunately distinct.

The main advantage of flock, apart from simplicity, is that flock always
follows the file descriptor, while fcntl *normally* follows the process
(unless you use the non-portable "open file description locks", which
act like flock).
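
A minimal sketch of that difference, assuming Linux, a local filesystem and
Python's fcntl module. The helper name `second_open_can_lock` is made up for
illustration, and `fcntl.lockf` here is Python's wrapper around fcntl-style
locks, not lockf(3):

```python
import fcntl
import os
import tempfile


def second_open_can_lock(lock_fn):
    """Lock a file via one descriptor, then try again via a second open().

    Returns True if a second, independent open of the same file in the
    same process also obtains an exclusive lock.
    """
    with tempfile.NamedTemporaryFile() as tmp:
        fd1 = os.open(tmp.name, os.O_RDWR)
        fd2 = os.open(tmp.name, os.O_RDWR)
        try:
            lock_fn(fd1, fcntl.LOCK_EX | fcntl.LOCK_NB)
            try:
                lock_fn(fd2, fcntl.LOCK_EX | fcntl.LOCK_NB)
                return True
            except OSError:
                return False
        finally:
            os.close(fd1)
            os.close(fd2)


if __name__ == "__main__":
    print("flock:", second_open_can_lock(fcntl.flock))
    print("fcntl:", second_open_can_lock(fcntl.lockf))
```

With flock the second open should be refused, because the lock belongs to
the first open file description. With a process-scoped fcntl lock the second
request should simply succeed, because the process already owns the lock.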



Re: Transitioning from existence-based lockfiles and /var/lock to flock

2025-10-17 Thread Russ Allbery
Josh Triplett writes:

> - Software should not use existence-based lockfiles (where the existence
>   of the lockfile constitutes holding the lock); software should use
>   file-based locking (`flock`) on an appropriate file instead.
> - Where possible, software should apply `flock` to an appropriate target
>   file rather than a dedicated lockfile. For instance, if locking a
>   device or a data file, software should `flock` the device file or data
>   file, rather than creating a separate file to lock.

Way back in the day, it used to matter whether you used flock or fcntl and
one type of lock was potentially invisible to the other type of lock. Has
this been fixed, or is that still a concern?

This otherwise looks good to me. Could you open a Policy bug?

-- 
Russ Allbery ([email protected])  



Re: Transitioning from existence-based lockfiles and /var/lock to flock

2025-10-13 Thread Josh Triplett
Bill Allombert wrote:
> There are also issues with NFS to consider.

flock has worked over NFS since Linux 2.6.12, which was released in June
2005, making it more than 20 years old. Some of the text in Policy that
tries to account for locks "not working" on NFS is deeply outdated in
that regard. I think it's safe to say that flock on NFS is better than
other potential alternatives (including existence-based lockfiles).

There are, of course, still some filesystems that flock doesn't work on;
if nothing else, an arbitrarily broken FUSE filesystem could do that.

I would argue that it's reasonable for programs to assume that /dev will
support locking, that it's possible for programs to tell the difference
between "locking works and I didn't acquire the lock" and "locking
doesn't work", and that in the latter case programs can choose (based on
what they use locking for) whether they want to fail and notify the
user, warn and proceed anyway, or try falling back to a well-defined
proxy location on another filesystem before giving up (e.g.
`$XDG_RUNTIME_DIR`, where locks will also always work).
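
As a rough sketch of telling those cases apart (Python, Linux assumed): the
function name `try_flock` and the set of errno values treated as "locking
doesn't work" are assumptions made for illustration, not something any
specification lists.

```python
import errno
import fcntl
import os

# Assumption: errno values taken to mean "this filesystem cannot do flock".
# The exact set is a guess for illustration only.
UNSUPPORTED_ERRNOS = {errno.ENOLCK, errno.EOPNOTSUPP, errno.ENOSYS, errno.EINVAL}


def try_flock(path):
    """Return (status, fd); status is 'acquired', 'busy' or 'unsupported'.

    fd is an open descriptor holding the lock when status is 'acquired',
    otherwise None.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        os.close(fd)
        if exc.errno in (errno.EWOULDBLOCK, errno.EAGAIN):
            return "busy", None  # locking works, someone else holds it
        if exc.errno in UNSUPPORTED_ERRNOS:
            return "unsupported", None  # caller decides: fail, warn, or fall back
        raise
    return "acquired", fd
```

A caller that gets "unsupported" can then pick whichever of the three
responses suits what it uses the lock for.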

I don't think Policy should cover exact procedures there, but I think
it'd be reasonable for Policy to cover "locks must work on /dev and on
$XDG_RUNTIME_DIR", and lump the rest in with other deference to "if
coordinating software all agrees" together with the *hint* of
potentially locking a (non-existence-based) lockfile in a proxy
location.



Re: Transitioning from existence-based lockfiles and /var/lock to flock

2025-10-13 Thread Bastian Blank
On Mon, Oct 13, 2025 at 07:55:55PM +0100, Simon McVittie wrote:
> There are several orthogonal advisory lock mechanisms, and I don't think
> Policy should take a general position on which one should be used, as long
> as all programs that might want to exclude each other by holding a lock can
> agree on which one they are going to use. The ones I know about are:
> 
> * flock(2) (POSIX) and its command-line interface flock(1) (util-linux)

s/POSIX/BSD/

And let's say we already have the "policy" to use flock(2), in the form of
https://systemd.io/BLOCK_DEVICE_LOCKING/, just restricted to block
devices.  We really don't need to invent something new.
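
For illustration, locking the target file itself rather than a separate
lockfile could look roughly like this in Python. The helper name `flocked`
and the device path in the usage note are hypothetical, and this is only a
sketch, not the procedure from the page above:

```python
import contextlib
import fcntl
import os


@contextlib.contextmanager
def flocked(path, exclusive=True):
    """Hold a flock(2) lock on `path` itself for the duration of the block."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Blocks until the lock becomes available.
        fcntl.flock(fd, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        yield fd
    finally:
        # Closing the only descriptor for this open file description drops
        # the lock; the kernel does the same if the process dies, so no
        # stale lock is left behind.
        os.close(fd)
```

Usage would be something like `with flocked("/dev/ttyS0"): ...`, assuming
the caller is allowed to open that node.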

> For example, when Flatpak wants to prevent a concurrent Flatpak process from
> deleting a runtime that is in use by an app, it implements that by locking
> the file ${runtime}/.ref with fcntl F_SETLK. It would be fine for a program
> that interacts with Flatpak (or a newer version of Flatpak itself) to use
> either F_SETLK or F_OFD_SETLK on ${runtime}/.ref, because F_OFD_SETLK is
> documented to be mutually exclusive with an incompatible F_SETLK, but it
> would be a potentially serious bug for it to use flock(2), because it is
> unspecified whether flock(2) and F_SETLK exclude each other (and on Linux
> they don't, unless NFS happens to be involved).

How is what Flatpak does internally of concern here?  If they document
this as an externally usable lock, then they also need to specify which
method is to be used.  But that would be unrelated to Debian Policy.

> According to #1110980 and #1110981, the FHS, which Policy incorporates by
> reference, specifies the use of lock files in /var/lock/ for serial ports.

More current versions removed /var/lock completely, so we can just
update the reference:

https://uapi-group.org/specifications/specs/linux_file_system_hierarchy/

Bastian

-- 
Murder is contrary to the laws of man and God.
-- M-5 Computer, "The Ultimate Computer", stardate 4731.3



Re: Transitioning from existence-based lockfiles and /var/lock to flock

2025-10-13 Thread Bill Allombert
On Mon, Oct 13, 2025 at 11:26:39AM -0700, Russ Allbery wrote:
> Josh Triplett writes:
> 
> > - Software should not use existence-based lockfiles (where the existence
> >   of the lockfile constitutes holding the lock); software should use
> >   file-based locking (`flock`) on an appropriate file instead.
> > - Where possible, software should apply `flock` to an appropriate target
> >   file rather than a dedicated lockfile. For instance, if locking a
> >   device or a data file, software should `flock` the device file or data
> >   file, rather than creating a separate file to lock.
> 
> Way back in the day, it used to matter whether you used flock or fcntl and
> one type of lock was potentially invisible to the other type of lock. Has
> this been fixed, or is that still a concern?

There are also issues with NFS to consider.

Cheers,
-- 
Bill. 

Imagine a large red swirl here. 



Re: Transitioning from existence-based lockfiles and /var/lock to flock

2025-10-13 Thread Simon McVittie

On Mon, 13 Oct 2025 at 11:04:05 -0700, Josh Triplett wrote:

> - Software should not use existence-based lockfiles (where the existence
>   of the lockfile constitutes holding the lock); software should use
>   file-based locking (`flock`) on an appropriate file instead.


There are several orthogonal advisory lock mechanisms, and I don't think 
Policy should take a general position on which one should be used, as 
long as all programs that might want to exclude each other by holding a 
lock can agree on which one they are going to use. The ones I know about 
are:


* flock(2) (POSIX) and its command-line interface flock(1) (util-linux)
* fcntl F_SETLK and friends (POSIX)
* fcntl F_OFD_SETLK and friends (Linux-specific)
* lockf(3) (POSIX)
  - which wraps one of the fcntl locks on GNU/Linux, but might be something
    else on other kernel/libc combinations

There might be others.

In general it would be a significant bug to replace one of these with 
another of these without domain-specific attention being paid to the 
subtleties of their semantics in terms of which ones exclude each other, 
which ones can be inherited from parent to child, which ones are scoped 
to a process or a thread or an open file description, and so on.


For example, when Flatpak wants to prevent a concurrent Flatpak process 
from deleting a runtime that is in use by an app, it implements that by 
locking the file ${runtime}/.ref with fcntl F_SETLK. It would be fine 
for a program that interacts with Flatpak (or a newer version of Flatpak 
itself) to use either F_SETLK or F_OFD_SETLK on ${runtime}/.ref, 
because F_OFD_SETLK is documented to be mutually exclusive with an 
incompatible F_SETLK, but it would be a potentially serious bug for it 
to use flock(2), because it is unspecified whether flock(2) and F_SETLK 
exclude each other (and on Linux they don't, unless NFS happens to be 
involved).
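
A small sketch of that last point, assuming Linux and a local (non-NFS)
filesystem. The function name is invented for illustration, both locks are
taken from one process for brevity, and `fcntl.lockf` is Python's wrapper
for F_SETLK-style locks:

```python
import fcntl
import os
import tempfile


def flock_and_fcntl_coexist():
    """Take an exclusive flock and an exclusive fcntl lock on one file.

    Returns True if both succeed, i.e. neither mechanism noticed the other.
    """
    with tempfile.NamedTemporaryFile() as tmp:
        fd_a = os.open(tmp.name, os.O_RDWR)
        fd_b = os.open(tmp.name, os.O_RDWR)
        try:
            fcntl.flock(fd_a, fcntl.LOCK_EX | fcntl.LOCK_NB)
            try:
                # F_SETLK-style lock on the same file, via a second descriptor.
                fcntl.lockf(fd_b, fcntl.LOCK_EX | fcntl.LOCK_NB)
                return True
            except OSError:
                return False
        finally:
            os.close(fd_a)
            os.close(fd_b)
```

If this returns True, two programs that each believe they hold "the"
exclusive lock could run at the same time, which is exactly the bug
described above.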


Similarly, it would be a potentially serious bug if one program locked 
the file ${runtime}/.ref, but another took out a lock on the directory 
itself, ${runtime}, intending to exclude the other program. Either one 
of those two locking disciplines is OK in isolation, but the two 
programs must agree on which one they are going to use. Clusters of 
closely-cooperating programs can just agree this among themselves 
without any special coordination and without any Policy involvement, but 
broader or looser categories of programs could benefit from coordination 
in Policy.
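
The file-versus-directory mismatch can be sketched the same way (Python,
Linux assumed). A temporary directory stands in for ${runtime}, and the
function name is made up for illustration:

```python
import fcntl
import os
import tempfile


def dir_lock_excludes_file_lock():
    """Return True if locking a directory blocks locking a file inside it."""
    with tempfile.TemporaryDirectory() as runtime:  # stand-in for ${runtime}
        ref = os.path.join(runtime, ".ref")
        open(ref, "w").close()
        dir_fd = os.open(runtime, os.O_RDONLY | os.O_DIRECTORY)
        ref_fd = os.open(ref, os.O_RDWR)
        try:
            # One "program" locks the directory...
            fcntl.flock(dir_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            try:
                # ...the other locks the .ref file inside it.
                fcntl.flock(ref_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                return False
            except BlockingIOError:
                return True
        finally:
            os.close(dir_fd)
            os.close(ref_fd)
```

The two locks are on different files as far as the kernel is concerned, so
neither holder should exclude the other.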


In particular:

Policy does specify (in §11.6) how to lock the mailboxes in /var/mail/, 
because that is an example of a single domain-specific context where 
it's necessary that everything agrees. (It already calls for this to be 
done inside /var/mail/ rather than involving /var/lock/ or /run/lock/, 
so it's out-of-scope for #1115317.)


According to #1110980 and #1110981, the FHS, which Policy incorporates 
by reference, specifies the use of lock files in /var/lock/ for serial 
ports. If we want programs like uucp to prefer to use flock or fcntl 
locks for this purpose, then we will need to document a FHS exception in 
Policy for this, and specify which of the various advisory locking 
mechanisms is to be used for it - preferably one that is already 
supported in software that locks serial ports, or already used in other 
distros. In #1110980, Luca recommended "BSD locks" and mentions that 
some serial-port-related software already supports those, but I'm not 
sure which specific API that was intended to refer to - as approximately 
POSIX-compliant OS distributions, the BSDs presumably support both 
flock(2) and fcntl F_SETLK, and possibly others.


I think it would be best to have a specific, narrowly-scoped bug to 
agree on how programs like uucp should lock serial ports, with its 
conclusion documented in Policy. I don't know whether there are other 
non-closely-cooperating groups of programs currently using /run/lock/ or 
(equivalently) /var/lock/ that need similar coordination.


smcv