On Tue, Aug 4, 2015 at 5:53 AM, Peter Lieven <p...@kamp.de> wrote:

> Am 04.08.2015 um 14:29 schrieb Peter Lieven:
>
>> Am 04.08.2015 um 14:09 schrieb Paolo Bonzini:
>>
>>>
>>> On 04/08/2015 13:57, Peter Lieven wrote:
>>>
>>>> Okay, what I found out is that in aio_poll I get revents = POLLIN for
>>>> the nfs file descriptor. But there is no data available on the socket.
>>>>
>>> Does read return 0 or EAGAIN?
>>>
>>> If it returns EAGAIN, the bug is in the QEMU main loop or the kernel.
>>> It should never happen that poll returns POLLIN and read returns EAGAIN.
>>>
>>> If it returns 0, it means the other side called shutdown(fd, SHUT_WR).
>>> Then I think the bug is in the libnfs driver or more likely libnfs.  You
>>> should stop polling the POLLIN event after read has returned 0 once.
>>>
>>
>> You might be right. Ronnie originally used the FIONREAD ioctl before
>> every read and considered
>> the socket as disconnected if the available bytes returned where 0.
>> I found that I get available bytes == 0 from that ioctl even if the
>> socket was not closed.
>>
>
You only get >0 from this call if there are actual bytes available to read.

For context,  the problem was that

75
<http://git.qemu.org/?p=qemu.git;a=blob;f=block/nfs.c;h=c026ff6883311f2a9b16a4326ad1da97c6c3d4f4;hb=2be4f242b50a84bf360df02480b173bfed161107#l75>
static void nfs_process_read(void *arg)
76
<http://git.qemu.org/?p=qemu.git;a=blob;f=block/nfs.c;h=c026ff6883311f2a9b16a4326ad1da97c6c3d4f4;hb=2be4f242b50a84bf360df02480b173bfed161107#l76>
{
77
<http://git.qemu.org/?p=qemu.git;a=blob;f=block/nfs.c;h=c026ff6883311f2a9b16a4326ad1da97c6c3d4f4;hb=2be4f242b50a84bf360df02480b173bfed161107#l77>
    NFSClient *client = arg;
78
<http://git.qemu.org/?p=qemu.git;a=blob;f=block/nfs.c;h=c026ff6883311f2a9b16a4326ad1da97c6c3d4f4;hb=2be4f242b50a84bf360df02480b173bfed161107#l78>
    nfs_service(client->context, POLLIN);
79
<http://git.qemu.org/?p=qemu.git;a=blob;f=block/nfs.c;h=c026ff6883311f2a9b16a4326ad1da97c6c3d4f4;hb=2be4f242b50a84bf360df02480b173bfed161107#l79>
    nfs_set_events(client);
80
<http://git.qemu.org/?p=qemu.git;a=blob;f=block/nfs.c;h=c026ff6883311f2a9b16a4326ad1da97c6c3d4f4;hb=2be4f242b50a84bf360df02480b173bfed161107#l80>
}

sometimes trigger and call nfs_service(POLLIN) eventhough the socket is not
readable.
I verified this by adding an extra call to poll() at around line 78
to check whether POLLIN was actually set on the fd or not. Sometimes it
would not be but I got lost in the sources and could not find if or where
this happens or even if qemu even guarantees "only call the POLLIN
callbacks if the filedescriptor is actually readable".


The old code in libnfs used to assume that IF we are called for POLLIN and
the if ioctl(FIONREAD) returns that there are 0 bytes available to read
then there was a problem with the socket.

:-(




> This seems to be some kind of bug in Linux - at least what I have thought.
>>
>> See BUGS in the select(2) manpage.
>>
>>        Under Linux, select() may report a socket file descriptor as
>> "ready for reading", while nevertheless a subsequent read blocks. This
>> could for example happen when data  has  arrived  but
>>        upon  examination  has  wrong checksum and is discarded. There may
>> be other circumstances in which a file descriptor is spuriously reported as
>> ready.  Thus it may be safer to use O_NON‐
>>        BLOCK on sockets that should not block.
>>
>> I will debug further, but it seems to be that I receive a POLLIN even if
>> there is no data available. I see 0 bytes from the recv call inside libnfs
>> and continue without a deadlock - at least
>> so far.
>>
>> Would it be a good idea to count the number of 0 bytes from recv and
>> react after I received 0 bytes for a number of consecutive times?
>>
>> And then: stop polling POLLIN or reconnect?
>>
>
> Okay, got it. Ronnie was using FIONREAD without checking for EAGAIN or
> EINTR.
>
> I will send a patch for libnfs to reconnect if count == 0. Libiscsi is not
> affected, it reconnects if count is 0.


Thanks, and merged.


>
>
> Peter
>
>

Reply via email to