> On Jan. 15, 2019, 6:22 p.m., Benjamin Mahler wrote:
> > Should we be surfacing a close EINTR as an error or let that be silent?
> > I think these errors need some message pre-fixing? E.g.
> > 
> > ```
> > Failed to close '3': Bad file number
> > ```
> > 
> > As it stands the error messages will only show "Bad file number" and it 
> > won't be clear if the write or close produced this? It's also an issue here 
> > between open/write/flush but would be great to resolve this now.
> 
> Benjamin Bannier wrote:
>     If I read the POSIX spec for `close` correctly, if `close` fails with 
> `EINTR` the passed file descriptor is left in an unspecified state so it 
> e.g., would not be safe to assume that pending data was successfully flushed. 
> Am I missing something?

There's quite a rabbit hole of reading around this, but with respect to the 
POSIX close spec:

> If close() is interrupted by a signal that is to be caught, it shall
> return -1 with errno set to [EINTR] and the state of fildes is unspecified.
> If an I/O error occurred while reading from or writing to the file system
> during close(), it may return -1 with errno set to [EIO]; if this error is
> returned, the state of fildes is unspecified.

My interpretation so far from reading more about this is that the "state of 
fildes is unspecified" is *only* referring to whether it's closed or open. 
Howerver, Linus says:

> The error return just tells you that soem error happened on the file: for
> example, in the case of EINTR, the close() may not have flushed all the
> pending data synchronously.

https://lkml.org/lkml/2005/9/10/129

But, even if close returns successfully we cannot assume pending data was 
flushed (although some of our code in question here uses an explicit flush 
before closing as suggested), from linux's close man page:

> A successful close does not guarantee that the data has been
> successfully saved to disk, as the kernel uses the buffer cache to
> defer writes.  Typically, filesystems do not flush buffers when a
> file is closed.  If you need to be sure that the data is physically
> stored on the underlying disk, use fsync(2).  (It will depend on the
> disk hardware at this point.)

Based on this, EINTR (which is more like EINPROGRESS in terms of semantics, see 
https://ewontfix.com/4/) seems to provide the same guarantee as return code of 
0:

* `0`: filedes is closed, previous `write()`s are in kernel buffer cache but 
data might not make it to disk
* `EINTR`: filedes is closed, close call was interrupted by a signal, previous 
`write()`s are in kernel buffer cache but data might not make it to disk

In the case of an `fsync()` before close, it seems to mean:

* `0`: filedes is closed, nothing in kernel buffer cache, data is already 
fsynced
* `EINTR`: filedes is closed, nothing in kernel buffer cache, data is already 
fsynced, close call was interrupted by a signal (not sure if EINTR is possible 
with an `fsync()` beforehand..)

Looking through the linux source, it calls through these:
https://github.com/torvalds/linux/blob/d7393226d15add056285c8fc86723d54d7e0c77d/fs/open.c#L1152-L1169
https://github.com/torvalds/linux/blob/903b77c631673eeec9e9114e9524171cdf9a2646/fs/file.c#L617-L641
https://github.com/torvalds/linux/blob/ad1d69735878a6bf797705b5d2a20316d35e1113/fs/open.c#L1126-L1150
and the interruptible part appears to be the flush here:
https://github.com/torvalds/linux/blob/ad1d69735878a6bf797705b5d2a20316d35e1113/fs/open.c#L1140

Trying to figure out what this flush is, I see this: 
https://lwn.net/Articles/576478/

> In fact, it is difficult to even return EINTR from close() on Linux, 
> according to Christoph Hellwig. If the driver or filesystem's release() 
> method returns an error, it is explicitly ignored. The only path that would 
> allow a driver to return EINTR is if it provides a flush() method that does 
> so. Hellwig plans to post a patch that would enforce a no-EINTR policy on 
> that path as well.

> If EINTR can never be returned, there is no real reason to map it to 
> EINPROGRESS in glibc. But, since glibc may be used on an older kernel that 
> can return EINTR in some rare situations, mapping it to something probably 
> makes sense. That could be EINPROGRESS or, perhaps better still, just zero 
> for success, as suggested by Rich Felker. There really isn't much the 
> application programmer can do if close() returns an error.

I'm not sure under which circumstances flushing is occurring on close (not sure 
which drivers provide the flush method), but this makes it sound rare / driver 
dependent, and it sounds like the plan would make EINTR impossible.

So, I guess surfacing EINTR up to the caller sounds like the simplest thing to 
do, and we'll have to see whether this actually occurs in practice.


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69082/#review212026
-----------------------------------------------------------


On Jan. 16, 2019, 1:06 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69082/
> -----------------------------------------------------------
> 
> (Updated Jan. 16, 2019, 1:06 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Chun-Hung Hsiao, and Alex Clemmer.
> 
> 
> Bugs: MESOS-9331
>     https://issues.apache.org/jira/browse/MESOS-9331
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> When e.g., writing to disk, errors from write might only manifest at
> ::close time. We update some instances to correctly propagate these
> errors instead of dropping them silently. We only propagate the
> ::close error if the write operation succeeded, otherwise we just
> propagate the error from the write operation.
> 
> 
> Diffs
> -----
> 
>   3rdparty/stout/include/stout/os/posix/mktemp.hpp 
> 63b3d1a7720d07f877fa1d4eb7f32a548916637a 
>   3rdparty/stout/include/stout/os/write.hpp 
> f7538f94f5a953a7a90a05bc1d2f138b6c17f814 
>   3rdparty/stout/include/stout/protobuf.hpp 
> eb4adef56f1701e3c101284e05e4e6c66eef9180 
> 
> 
> Diff: https://reviews.apache.org/r/69082/diff/3/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>

Reply via email to