On 09/07/2010 11:43 AM, Mathieu Desnoyers wrote:
* Pierre-Marc Fournier ([email protected]) wrote:
On 09/06/2010 11:29 AM, Mathieu Desnoyers wrote:
* Mathieu Desnoyers ([email protected]) wrote:
* Pierre-Marc Fournier ([email protected]) wrote:
I disagree with you Mathieu. These retvals are the same as i/o
syscalls (read/write/send/recv/...)and therefore should in my opinion
remain as is.
Well, this function is not technically the same as i/o syscalls at all.
It uses I/O syscalls, but it is not an I/O syscall per se, so the return
value transformation to a more standard pattern (neg err val, 0 ok)
should happen right in this function rather than to let all callers
handle this. I/O syscalls use positive return values to indicate the
number of bytes read/written/etc. Here, this function arbitrarily choose
1 to indicate that "something has been sent" without caring about the
amount of data moved at all.
So as it doesn't need the whole positive range to spell out the amount
of data moved, it doesn't need to do the same special-cases that the I/O
syscalls are doing. It adds a lot of error values management oddness
without adding anything.
So even though I agree with you that this function is close to the I/O
system calls because it calls it, it is very far from the I/O syscalls
semantically (we don't care about the number of bytes written), and even
though we might be tempted to use the same error values as system calls
for them, the fact that we just don't care about the number of bytes
written combined with the fact that standardizing error value across the
code makes it much easier to follow and to write just call for this
change.
By the way, looking at include/share.h:patient_write(), in the case
where write returns 0, I think we should consider this as a success and
loop again to retry write rather than consider that an error occurred.
The same apply to patient_send(). See the manpages for details:
write(2):
RETURN VALUE
On success, the number of bytes written is returned (zero indicates
nothing was written). On error, -1 is returned, and errno is set
appropriately.
If count is zero and fd refers to a regular file, then write() may
return a failure status if one of the errors below is detected. If no
errors are detected, 0 will be returned without causing any other
effect. If count is zero and fd refers to a file other than a regular
file, the results are not specified.
send(2):
RETURN VALUE
On success, these calls return the number of characters sent. On
error, -1 is returned, and errno is set appropriately.
No. 0 indicates end of stream
By end of stream, you mean this sequence ?
server:
(block SIGPIPE)
sockwrite = socket(AF_UNIX, SOCK_STREAM, 0);
addr.sun_family = AF_UNIX;
strcpy(addr.sun_path, "./mysocket");
/* wait for client to create the unix socket */
sleep(2);
connect(sockwrite, (struct sockaddr *)&addr, sizeof(addr));)
/* let's do one write which will be consumed by the client */
write(sockwrite, "blah", sizeof("blah"));
/* wait for client to close FD */
sleep(2);
ret = write(sockwrite, "blah", sizeof("blah"));
( or even with ret = send(sockwrite, "blah", sizeof("blah"), MSG_NOSIGNAL); )
printf("write ret %ld errno %ld\n", ret, errno);
client:
(block SIGPIPE)
sockserv = socket(AF_UNIX, SOCK_STREAM, 0);
addr.sun_family = AF_UNIX;
strcpy(addr.sun_path, "./mysocket");
bind(sockserv, (struct sockaddr *)&addr, sizeof(addr));
listen(sockserv, 1);
sockread = accept(sockserv, NULL, NULL);
recv(sockread, buf, sizeof(buf), 0);
close(sockread);
The server printf returns:
"write ret -1 errno 32"
So the return value is -1, with an error number: EPIPE: Broken pipe
and looping on it will result in an
infinite loop. You need to refer to the specific backend driver to
understand these specific semantics. The read/write manpages are
notorious for their non-clarity about this.
This is what I just did by creating my small test program. Unless I am
misunderstanding your definition of "end of stream", your assumptions
about the way the unix sockets work seem incorrect.
Well, just that one assumption. But I accept your explanation and my
foolishness. ;-) patient_write() and patient_send() need to be fixed as
you describe.
Note that as the manpage says, if the file descriptor refers to a
regular file, then write() can return 0, with an error number set, which
could indicate that an error occured. If the file descriptor refers to a
file other than a regular file, the results are unspecified. So I think
the sane way to handle this would be to check if errno is 0 when write
returns 0. If errno != 0, then we have an error, but if errno == 0, I'd
consider that 0 bytes were written and retry.
By the way, you forgot to reply to my first message above which covers
the topic of semantics, so I'm re-pasting it here.
(about ustcomm_send_request())
Well, this function is not technically the same as i/o syscalls at all.
It uses I/O syscalls, but it is not an I/O syscall per se, so the return
value transformation to a more standard pattern (neg err val, 0 ok)
should happen right in this function rather than to let all callers
handle this. I/O syscalls use positive return values to indicate the
number of bytes read/written/etc. Here, this function arbitrarily choose
1 to indicate that "something has been sent" without caring about the
amount of data moved at all.
Let me copy the description of the return values here.
/*
* Return value:
* 0: Success, but no reply because recv() returned 0
* 1: Success
* -1: Error
*
* On error, the error message is printed, except on
* ECONNRESET, which is normal when the application dies.
*/
So as it doesn't need the whole positive range to spell out the amount
of data moved, it doesn't need to do the same special-cases that the I/O
syscalls are doing. It adds a lot of error values management oddness
without adding anything.
There are 3 cases the caller must consider when calling ustcomm_send_msg().
- An error occurred, there is a problem and it is necessary to complain (-1)
- The remote connection was closed so we were not able to send the
message (0)
- Everything worked fine (1)
Callers need to know if the connection was closed in order not to
complain but at the same time take consequent measures.
So even though I agree with you that this function is close to the I/O
system calls because it calls it, it is very far from the I/O syscalls
semantically (we don't care about the number of bytes written), and even
though we might be tempted to use the same error values as system calls
for them, the fact that we just don't care about the number of bytes
written combined with the fact that standardizing error value across the
code makes it much easier to follow and to write just call for this
change.
If you want to propose different semantics that still allow the caller
to know which of the 3 cases happened, I'm all open.
Thanks
pmf
_______________________________________________
ltt-dev mailing list
[email protected]
http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev