Re: What's the practical use of the error close() returns?

2012-07-29 Thread Daniel Shahaf
Shachar Shemesh wrote on Sun, Jul 29, 2012 at 06:35:18 +0300:
> On 07/29/2012 02:12 AM, Daniel Shahaf wrote:
> > So if the disk hardware fails after close() returns but before the OS
> > caches are flushed... 
> 
> It is not part of close(2)'s job description to protect against this
> scenario. If you want to protect against this scenario, use sync(2).

No argument here.  Just wanted to explicitly point out that, when you
wrote "cannot fail due to X", it still could fail due to Y.

Cheers,

Daniel

___
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il


Re: What's the practical use of the error close() returns?

2012-07-28 Thread Shachar Shemesh
On 07/29/2012 02:12 AM, Daniel Shahaf wrote:
> So if the disk hardware fails after close() returns but before the OS
> caches are flushed... 

It is not part of close(2)'s job description to protect against this
scenario. If you want to protect against this scenario, use sync(2).

Shachar

-- 
Shachar Shemesh
Lingnu Open Source Consulting Ltd.
http://www.lingnu.com



Re: What's the practical use of the error close() returns?

2012-07-28 Thread Daniel Shahaf
Shachar Shemesh wrote on Sat, Jul 28, 2012 at 09:56:40 +0300:
> On 07/27/2012 02:52 PM, Elazar Leibovich wrote:
> >
> > (as mentioned earlier, the "no space left" could just as well happen
> > after the file was closed, so I don't mind that much it's not reported
> > on a close())
> >
> Ehm, no.
> 
> First of all, please note a subtle but important difference between your
> question and Orna's answer. You asked about close, she answered about
> fclose.
> 
> Fclose is an stdio function, and performs a flush of the (user space)
> buffers. As such, it is possible for it to run out of disk space. Close,
> on the other hand, might only run out of disk space on remote file
> systems (such as nfs), and even then, it depends on the local cache
> coherency policy. I fail to see how an "out of disk space" might
> possibly happen AFTER close, as all that's left then are caches.
> 
> Please note that just because the data is in the OS caches, rather than
> on disk, this does not mean that anything can change as far as file
> system notion of the data goes. By the time close returns, the OS has
> already allocated space for the data, and decided where on the disk this
> data should go. It is not possible for it to fail after that point
> because of lack of disk space.

So if the disk hardware fails after close() returns but before the OS
caches are flushed...



Re: What's the practical use of the error close() returns?

2012-07-28 Thread Amos Shapira
BTW - everyone here keeps assuming that close(2) is called on disk files,
what about other types of file descriptors (sockets, pipes, character and
block devices, "virtual filesystem" files)? How would you adjust your
answers for that case?

On 28 July 2012 16:56, Shachar Shemesh  wrote:

>  On 07/27/2012 02:52 PM, Elazar Leibovich wrote:
>
>
> (as mentioned earlier, the "no space left" could just as well happen after
> the file was closed, so I don't mind that much it's not reported on a
> close())
>
>   Ehm, no.
>
> First of all, please note a subtle but important difference between your
> question and Orna's answer. You asked about close, she answered about
> fclose.
>
> Fclose is an stdio function, and performs a flush of the (user space)
> buffers. As such, it is possible for it to run out of disk space. Close, on
> the other hand, might only run out of disk space on remote file systems
> (such as nfs), and even then, it depends on the local cache coherency
> policy. I fail to see how an "out of disk space" might possibly happen
> AFTER close, as all that's left then are caches.
>
> Please note that just because the data is in the OS caches, rather than on
> disk, this does not mean that anything can change as far as file system
> notion of the data goes. By the time close returns, the OS has already
> allocated space for the data, and decided where on the disk this data
> should go. It is not possible for it to fail after that point because of
> lack of disk space.
>
> As I said before, for local file systems, this is also true of "write".
> The main reason we need to check the return code of "close" is because you
> never know who will run your program over NFS. Over NFS, for performance
> reasons, it is possible for a "write" command to fail after the kernel has
> already returned to user space with a success indication. Doing it
> otherwise will result in horrible latency for anyone working over NFS. All
> such errors will get reported on the next operation over the file
> descriptor: close.
>
> Shachar
>
> --
> Shachar Shemesh
> Lingnu Open Source Consulting Ltd.
> http://www.lingnu.com





Re: What's the practical use of the error close() returns?

2012-07-27 Thread Shachar Shemesh
On 07/27/2012 02:52 PM, Elazar Leibovich wrote:
>
> (as mentioned earlier, the "no space left" could just as well happen
> after the file was closed, so I don't mind that much it's not reported
> on a close())
>
Ehm, no.

First of all, please note a subtle but important difference between your
question and Orna's answer. You asked about close, she answered about
fclose.

Fclose is an stdio function, and performs a flush of the (user space)
buffers. As such, it is possible for it to run out of disk space. Close,
on the other hand, might only run out of disk space on remote file
systems (such as nfs), and even then, it depends on the local cache
coherency policy. I fail to see how an "out of disk space" might
possibly happen AFTER close, as all that's left then are caches.

Please note that just because the data is in the OS caches, rather than
on disk, this does not mean that anything can change as far as file
system notion of the data goes. By the time close returns, the OS has
already allocated space for the data, and decided where on the disk this
data should go. It is not possible for it to fail after that point
because of lack of disk space.

As I said before, for local file systems, this is also true of "write".
The main reason we need to check the return code of "close" is because
you never know who will run your program over NFS. Over NFS, for
performance reasons, it is possible for a "write" command to fail after
the kernel has already returned to user space with a success indication.
Doing it otherwise will result in horrible latency for anyone working
over NFS. All such errors will get reported on the next operation over
the file descriptor: close.

Shachar

-- 
Shachar Shemesh
Lingnu Open Source Consulting Ltd.
http://www.lingnu.com



Re: What's the practical use of the error close() returns?

2012-07-27 Thread Nadav Har'El
On Fri, Jul 27, 2012, Elazar Leibovich wrote about "Re: What's the practical 
use of the error close() returns?":
> You nailed it! closing a file twice is an error that makes sense to be
> issued at close. So simple, how could I miss it?

Yes, that's a good reason - nice catch. I also didn't think about it.

But note that closing a non-open fd gives an EBADF, which is different
from the EIO errors we've been talking about in this thread.

Also note that an EBADF usually indicates a *logic error* in your
program, i.e., a bug (there's a code path that makes you close a non-open
fd). If you have this bug, you can't really count on the EBADF - because
fd numbers are reused, it's quite possible you'll close another opened
fd, not a non-opened one, and not get any error.
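
A small sketch of both cases, assuming a single-threaded program so that
nothing else opens descriptors in between. The function names are made
up for the demonstration.

```c
#include <errno.h>
#include <unistd.h>

/* Close fd twice; return the errno of the first close() that fails,
 * or 0 if both "succeeded". */
int second_close_errno(int fd)
{
    if (close(fd) < 0)
        return errno;       /* the first close already failed */
    if (close(fd) < 0)
        return errno;       /* fd is no longer open: EBADF expected */
    return 0;
}

/* Return 1 if a stale close() silently closed an unrelated descriptor,
 * 0 if it did not, -1 if the demo could not be set up. */
int stale_close_is_silent(void)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;

    int stale = p[0];
    close(p[0]);                /* legitimate close; 'stale' now dangles */

    int fresh = dup(p[1]);      /* dup() returns the lowest free number */
    if (fresh < 0) {
        close(p[1]);
        return -1;
    }

    int rc = close(stale);      /* the bug: no EBADF if the number was reused */
    int silent = (fresh == stale && rc == 0);

    if (!silent)
        close(fresh);
    close(p[1]);
    return silent;
}
```

In the second function the buggy close() returns 0, and the code that
owns the new descriptor is the one that later gets the surprise.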

-- 
Nadav Har'El|   Friday, Jul 27 2012, 9 Av 5772
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |Disclaimer: The opinions expressed above
http://nadav.harel.org.il   |are not my own.



Re: What's the practical use of the error close() returns?

2012-07-27 Thread Oleg Goldshmidt
> This might not really be helpful to know, the file descriptor is left in an
> unspecified state according to POSIX, see:
> http://utcc.utoronto.ca/~cks/space/blog/unix/CloseEINTR

The fact that POSIX does not specify the fd's state is, indeed, not
helpful. The *reason* why it does not specify it is very helpful and
relevant, and it is spelled out even in the above link: by the time
POSIX was developed there already were well-established UNIX systems
some of which always closed the fd when returning EINTR and others
always left it open. So POSIX decided not to decide.

The fact that POSIX does not specify the outcome does not mean you do
not know the outcome in every case (or, at least, in most cases).
Linux always closes the fd, so there is nothing to be done if you notice
EINTR. However, if you write portable code and HP-UX or other
platforms that leave the fd open (and there are arguments - with which
one may or may not agree, just be aware of their existence - for that
behaviour) are among your targets, then you might want to use the
preprocessor to select the error handling code accordingly.
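
A rough sketch of what that could look like. CLOSE_EINTR_LEAVES_FD_OPEN
is a hypothetical macro, not something any system header defines; the
assumption is that your build system sets it only for platforms known
to leave the fd open after EINTR.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper around close() that hides the EINTR difference.
 * Returns 0 on success, -1 with errno set otherwise. */
int close_portably(int fd)
{
#ifdef CLOSE_EINTR_LEAVES_FD_OPEN
    int rc;
    do {
        rc = close(fd);         /* fd is still open after EINTR: retry */
    } while (rc < 0 && errno == EINTR);
    return rc;
#else
    int rc = close(fd);
    if (rc < 0 && errno == EINTR)
        return 0;               /* Linux: fd is already gone; a retry could
                                   close a number reused by another thread */
    return rc;
#endif
}
```

Treating EINTR as success in the second branch is a design choice of
this sketch: the descriptor is released, but you learn nothing about any
pending I/O error.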

A quick google seems to show that the issue is under review at this
very moment: http://austingroupbugs.net/view.php?id=529 (if you have
not heard of Austin group, see http://www.opengroup.org/austin/).

-- 
Oleg Goldshmidt | p...@goldshmidt.org



Re: What's the practical use of the error close() returns?

2012-07-27 Thread Baruch Even
On Fri, Jul 27, 2012 at 7:06 PM, Oleg Goldshmidt  wrote:
>
> On Fri, Jul 27, 2012 at 2:52 PM, Elazar Leibovich  wrote:
>
> > You nailed it! closing a file twice is an error that makes sense to be
> > issued at close. So simple, how could I miss it?
>
> Not only for catching your bugs. If fclose(3) returns an error any
> further access to the stream, including another call to
> fclose(3), results in undefined behaviour, and I'd regard that as the
> scariest thing that can happen to a program.
>
> There is another reason to check the status: a call to close() might
> have been interrupted by a signal. In some cases the call may be
> resumed after the signal is handled, in some cases the call returns an
> error status (errno=EINTR usually?). I suppose you want to know
> whether the descriptor was or was not closed.


This might not really be helpful to know, the file descriptor is left in an
unspecified state according to POSIX, see:
http://utcc.utoronto.ca/~cks/space/blog/unix/CloseEINTR

Baruch



Re: What's the practical use of the error close() returns?

2012-07-27 Thread Oleg Goldshmidt
On Fri, Jul 27, 2012 at 2:52 PM, Elazar Leibovich  wrote:

> You nailed it! closing a file twice is an error that makes sense to be
> issued at close. So simple, how could I miss it?

Not only for catching your bugs. If fclose(3) returns an error any
further access to the stream, including another call to
fclose(3), results in undefined behaviour, and I'd regard that as the
scariest thing that can happen to a program.

There is another reason to check the status: a call to close() might
have been interrupted by a signal. In some cases the call may be
resumed after the signal is handled, in some cases the call returns an
error status (errno=EINTR usually?). I suppose you want to know
whether the descriptor was or was not closed.

-- 
Oleg Goldshmidt | p...@goldshmidt.org



Re: What's the practical use of the error close() returns?

2012-07-27 Thread Orna Agmon Ben-Yehuda
On Fri, Jul 27, 2012 at 2:52 PM, Elazar Leibovich  wrote:

> Thanks!
>
> You nailed it! closing a file twice is an error that makes sense to be
> issued at close. So simple, how could I miss it?
>
> (as mentioned earlier, the "no space left" could just as well happen after
> the file was closed, so I don't mind that much it's not reported on a
> close())
>

It can happen later, true. But if you know that it happened at close time
you can take measures. For example, if I get this, I issue a pop-up message
with the error message without trying to write the error to the log. This
way I have a better chance to reach the user with the information of what
went wrong.

>
>
> On Fri, Jul 27, 2012 at 12:29 PM, Orna Agmon Ben-Yehuda <
> ladyp...@gmail.com> wrote:
>
>> My practical answer:
>>
>> I always check fclose(). It has happened to me that it returned errors when
>> the file I was trying to close was already closed (which usually meant I
>> had a bug, because I closed one file twice and another never). It also failed
>> with "no space left on device", when it was trying to flush the rest of the
>> data that was on the way to the file.
>>
>> Orna
>>
>>
>> On Thu, Jul 26, 2012 at 11:49 PM, Elazar Leibovich wrote:
>>
>>> I was always intrigued by this unix tidbit, closing a file can return an
>>> error. In practice, it is rarely checked (as far as I've seen)
>>>
>>> What does it mean? If I understand it correctly, recent write can lie
>>> about its success.
>>>
>>> But when do you really need it? If you have a piece of information you
>>> want to make sure it hits the disk, the reasonable thing to do is to fsync
>>> the file, and check the error of the fsync. If you don't care about it that
>>> much, then don't check the error, you don't have much to do even if it
>>> failed. It seems to me that one can make close return void, and point the
>>> one who wishes to make sure data hit the disk to fsync.
>>>
>>> What's the practical use case, where you care about close() error, but
>>> you don't care enough to need an fsync.
>>>
>>> Another question is, why let write lie about its success, what does it
>>> gain you? Let close return void, and force write never to defer its error
>>> reporting.
>>>
>>>
>>>
>>
>>
>> --
>> Orna Agmon Ben-Yehuda.
>> http://ladypine.org
>>
>
>


-- 
Orna Agmon Ben-Yehuda.
http://ladypine.org


Re: What's the practical use of the error close() returns?

2012-07-27 Thread Elazar Leibovich
Thanks!

You nailed it! closing a file twice is an error that makes sense to be
issued at close. So simple, how could I miss it?

(as mentioned earlier, the "no space left" could just as well happen after
the file was closed, so I don't mind that much it's not reported on a
close())

On Fri, Jul 27, 2012 at 12:29 PM, Orna Agmon Ben-Yehuda
wrote:

> My practical answer:
>
> I always check fclose(). It has happened to me that it returned errors when
> the file I was trying to close was already closed (which usually meant I
> had a bug, because I closed one file twice and another never). It also failed
> with "no space left on device", when it was trying to flush the rest of the
> data that was on the way to the file.
>
> Orna
>
>
> On Thu, Jul 26, 2012 at 11:49 PM, Elazar Leibovich wrote:
>
>> I was always intrigued by this unix tidbit, closing a file can return an
>> error. In practice, it is rarely checked (as far as I've seen)
>>
>> What does it mean? If I understand it correctly, recent write can lie
>> about its success.
>>
>> But when do you really need it? If you have a piece of information you
>> want to make sure it hits the disk, the reasonable thing to do is to fsync
>> the file, and check the error of the fsync. If you don't care about it that
>> much, then don't check the error, you don't have much to do even if it
>> failed. It seems to me that one can make close return void, and point the
>> one who wishes to make sure data hit the disk to fsync.
>>
>> What's the practical use case, where you care about close() error, but
>> you don't care enough to need an fsync.
>>
>> Another question is, why let write lie about its success, what does it
>> gain you? Let close return void, and force write never to defer its error
>> reporting.
>>
>>
>>
>
>
> --
> Orna Agmon Ben-Yehuda.
> http://ladypine.org
>


Re: What's the practical use of the error close() returns?

2012-07-27 Thread Orna Agmon Ben-Yehuda
My practical answer:

I always check fclose(). It has happened to me that it returned errors when
the file I was trying to close was already closed (which usually meant I
had a bug, because I closed one file twice and another never). It also failed
with "no space left on device", when it was trying to flush the rest of the
data that was on the way to the file.
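
Roughly what that check looks like, as a sketch only. save_text() is a
made-up helper name, and falling back to EIO when errno is unset is an
assumption of this example.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: write a string through stdio and report any
 * error, including one that only shows up when fclose() flushes the
 * user-space buffer.  Returns 0 on success, or an errno value. */
int save_text(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return errno;

    int err = 0;
    if (fputs(text, fp) == EOF)
        err = errno ? errno : EIO;

    /* Short writes usually sit in the stdio buffer until now, so this
     * is where "no space left on device" tends to surface. */
    if (fclose(fp) == EOF && err == 0)
        err = errno ? errno : EIO;

    return err;     /* fp must not be touched again, whatever fclose said */
}
```

On Linux, pointing it at /dev/full is a handy way to provoke the
"no space left on device" path: the fputs() succeeds and the error
only appears at fclose().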

Orna


On Thu, Jul 26, 2012 at 11:49 PM, Elazar Leibovich wrote:

> I was always intrigued by this unix tidbit, closing a file can return an
> error. In practice, it is rarely checked (as far as I've seen)
>
> What does it mean? If I understand it correctly, recent write can lie
> about its success.
>
> But when do you really need it? If you have a piece of information you
> want to make sure it hits the disk, the reasonable thing to do is to fsync
> the file, and check the error of the fsync. If you don't care about it that
> much, then don't check the error, you don't have much to do even if it
> failed. It seems to me that one can make close return void, and point the
> one who wishes to make sure data hit the disk to fsync.
>
> What's the practical use case, where you care about close() error, but you
> don't care enough to need an fsync.
>
> Another question is, why let write lie about its success, what does it
> gain you? Let close return void, and force write never to defer its error
> reporting.
>
>
>


-- 
Orna Agmon Ben-Yehuda.
http://ladypine.org


Re: What's the practical use of the error close() returns?

2012-07-26 Thread Elazar Leibovich
On Fri, Jul 27, 2012 at 12:12 AM, Nadav Har'El wrote:

>
> So it seems to me that checking the close() only *sometimes* lets you
> know of write errors which you'll otherwise miss. But since you'll
> anyway miss other write errors (those coming after the close()), it's
> not clear what exactly you're gaining.
>

It seems to me you're agreeing with me, right? You could just as well have
dup return an error for recent write errors; close is just an arbitrary
point where one can report a write error, with no significant gain over any
other point.


Re: What's the practical use of the error close() returns?

2012-07-26 Thread Nadav Har'El
On Thu, Jul 26, 2012, Elazar Leibovich wrote about "What's the practical use of 
the error close() returns?":
> I was always intrigued by this unix tidbit, closing a file can return an
> error. In practice, it is rarely checked (as far as I've seen)

Here are my two cents:

In Unix/Linux, writing to a file usually writes only to kernel memory,
and the actual writing - to disk, to a network file system, or whatever -
only happens later. So an error might be discovered only after a write()
returns.

Now, imagine a program that does a series of writes and finally a
close(). Imagine that sometime in the middle of this, there is a write
error - perhaps the disk is bad, perhaps the filesystem becomes full, or
whatever. If the error comes between two write()s, you'll get it as an
error from the second write(). If it comes between the last write() and
the close(), you'll get it from the close. If it comes *after* the
close(), as far as I know you'll never know that the write failed...

So it seems to me that checking the close() only *sometimes* lets you
know of write errors which you'll otherwise miss. But since you'll
anyway miss other write errors (those coming after the close()), it's
not clear what exactly you're gaining.
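
For concreteness, here is a sketch of such a program with every return
value checked. write_file_checked() is a made-up name; the point is only
where a deferred error can show up, not a claim that this catches
everything (errors after the close() are still lost).

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes from buf to path, checking each
 * step.  Returns 0 on success, or an errno value on failure. */
int write_file_checked(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return errno;

    const char *p = buf;
    int err = 0;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data went out */
            err = errno;        /* may stem from an earlier, deferred write */
            break;
        }
        p += n;                 /* short write: carry on with the rest */
        len -= (size_t)n;
    }

    /* Last chance to hear about errors detected since the final write(). */
    if (close(fd) < 0 && err == 0)
        err = errno;
    return err;
}
```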

BTW, I recently heard a talk at SYSTOR
(http://www.research.ibm.com/haifa/conferences/systor2012/) where the
speaker said that for these and other reasons, many programmers started
sticking fsync() calls all over the place, e.g., before close(). These
fsync() calls are supposed to force not only an immediate write from
memory to the hard disk, but also an actual write to the hard-disk
surface (not just its cache). That ruins the benefits of the hard disk's
write caching layer, so hard disk manufacturers actually started to
report writes as done when they actually weren't! I.e., you can do an
fsync(), thinking that the write to the hard disk was successful, when
actually it wasn't done yet, and may fail when actually done.

> Another question is, why let write lie about its success, what does it gain
> you? Let close return void, and force write never to defer its error
> reporting.

If this is what you want, use the O_SYNC or O_DIRECT options to open(2).
Normally, you don't want this because it hurts performance.
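
A minimal sketch of the O_SYNC variant. open_synchronous() is a made-up
name. O_DIRECT is left out on purpose, since it typically comes with
buffer alignment requirements that would clutter the example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open path for appending so that each write()
 * returns only after the data has been handed to the storage device.
 * Returns the descriptor, or -1 with errno set. */
int open_synchronous(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_APPEND | O_SYNC, 0644);
}
```

With a descriptor opened this way, a failing write is more likely to be
reported by that very write() call, at the price of waiting for the
device every time.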


-- 
Nadav Har'El| Thursday, Jul 26 2012, 8 Av 5772
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |A language is a dialect with an army.
http://nadav.harel.org.il   |
