Re: Weird piecemeal reads over socketpair() pipe breaks up small writes into even smaller reads.
Isn't it easier to reclassify the bug as "uipc_send() wakes up the reader before it's done appending the data from a write() to the socket buffer" and use my patch? I don't think it makes sense for uipc_send() to depend on sorwakeup() not actually waking up anyone in certain situations.

Bill

To Unsubscribe: send mail to majord...@freebsd.org with "unsubscribe freebsd-current" in the body of the message
Re: Weird piecemeal reads over socketpair() pipe breaks up small writes into even smaller reads.
:kern_subr.c can't do that, because `struct uio' doesn't give the original
:count.
:
:>Would you like to do it or should I? This isn't high priority but it
:>should definitely not be rescheduling after the first 96 bytes. That's
:>just a waste of cpu.
:
:The waste for rescheduling should be insignificant, since it should only
:occur every ROUNDROBIN_INTERVAL (default 100 msec). It actually seems
:to be rescheduling more often. Rescheduling _after_ the first 96 bytes
:is surprising, since the rescheduling is done before doing any i/o, so
:sync effects from sleep(1) should cause rescheduling before any i/o is
:done. Then the reader won't run, but other processes may.
:
:Bruce

sleep() is not relevant... what is relevant is that the sub-process is doing a write() while the parent is sitting in a select() -- the parent process is thus going to have priority over the child. The moment the child allows a reschedule, the parent gets the cpu.

I would like to point out that this type of situation will occur with most piping situations. The reader is almost always blocked waiting to read while the writer is obviously always running, in the middle of the write.

-Matt
Matthew Dillon
Re: Weird piecemeal reads over socketpair() pipe breaks up small writes into even smaller reads.
>Ok, so perhaps tweaking the rescheduling changes in kern_subr.c to
>not try to do it in the first few thousand bytes copied is the solution?

kern_subr.c can't do that, because `struct uio' doesn't give the original count.

>Would you like to do it or should I? This isn't high priority but it
>should definitely not be rescheduling after the first 96 bytes. That's
>just a waste of cpu.

The waste for rescheduling should be insignificant, since it should only occur every ROUNDROBIN_INTERVAL (default 100 msec). It actually seems to be rescheduling more often. Rescheduling _after_ the first 96 bytes is surprising, since the rescheduling is done before doing any i/o, so sync effects from sleep(1) should cause rescheduling before any i/o is done. Then the reader won't run, but other processes may.

Bruce
Re: Weird piecemeal reads over socketpair() pipe breaks up small writes into even smaller reads.
> When the writer blocks, the reader runs and uses a buggy loop to read
> only the first chunk of input.
>
> On an otherwise idle system, the need_resched() condition seems to be
> true always. I would have expected the synchronisation provided by the
> sleep(1) to bias need_resched() in the opposite direction. A reschedule
> has been done, normally just after the previous hardclock() call, just
> before the writer wakes up, so another one should not be done soon
> (until after the next hardclock() call).

Sorry everyone, I'll be away for a week and won't put in my scheduler fixes until I get back. Most of the changes are on Freefall in my home directory.

I hate to be so passive about committing tested code, but my schedule over the last few months has been such that I'm never around to fix things up if the unexpected happens. I'm working hard on a proposal that will let me spend some quality time on this - wish me luck.

Meanwhile, I'm off all the lists. I'll check e-mail sent to either dufa...@hda.com or dufa...@freebsd.org intermittently.

Peter

--
Peter Dufault (dufa...@hda.com)   Realtime development, Machine control,
HD Associates, Inc.               Safety critical systems, Agency approval
Re: Weird piecemeal reads over socketpair() pipe breaks up small writes into even smaller reads.
How about a 1-line fix:

Index: uipc_usrreq.c
===
RCS file: /home/ncvs/src/sys/kern/uipc_usrreq.c,v
retrieving revision 1.37
diff -u -r1.37 uipc_usrreq.c
--- uipc_usrreq.c 1998/10/25 17:44:51 1.37
+++ uipc_usrreq.c 1999/02/15 07:09:12
@@ -348,7 +348,8 @@
                 unp->unp_conn->unp_mbcnt = rcv->sb_mbcnt;
                 snd->sb_hiwat -= rcv->sb_cc - unp->unp_conn->unp_cc;
                 unp->unp_conn->unp_cc = rcv->sb_cc;
-                sorwakeup(so2);
+                if (!(flags & PRUS_MORETOCOME))
+                        sorwakeup(so2);
                 m = 0;
 #undef snd
 #undef rcv

Unfortunately, this apparently unearths a bug in the ?:?:?: expression in sosend(), so try this diff too.

Index: uipc_socket.c
===
RCS file: /home/ncvs/src/sys/kern/uipc_socket.c,v
retrieving revision 1.51
diff -u -r1.51 uipc_socket.c
--- uipc_socket.c 1999/01/20 17:45:22 1.51
+++ uipc_socket.c 1999/02/15 07:09:25
@@ -388,6 +405,7 @@
         register long space, len, resid;
         int clen = 0, error, s, dontroute, mlen;
         int atomic = sosendallatonce(so) || top;
+        int pru_flags;
 
         if (uio)
                 resid = uio->uio_resid;
@@ -518,21 +536,24 @@
                     } while (space > 0 && atomic);
                     if (dontroute)
                             so->so_options |= SO_DONTROUTE;
+                    pru_flags = 0;
+                    if (flags & MSG_OOB)
+                            pru_flags |= PRUS_OOB;
+                    /*
+                     * If the user set MSG_EOF, the protocol
+                     * understands this flag and nothing left to
+                     * send then set PRUS_EOF.
+                     */
+                    if ((flags & MSG_EOF) &&
+                        (so->so_proto->pr_flags & PR_IMPLOPCL) &&
+                        (resid <= 0))
+                            pru_flags |= PRUS_EOF;
+                    /* If there is more to send set PRUS_MORETOCOME */
+                    if (resid > 0 && space > 0)
+                            pru_flags |= PRUS_MORETOCOME;
                     s = splnet(); /* XXX */
                     error = (*so->so_proto->pr_usrreqs->pru_send)(so,
-                        (flags & MSG_OOB) ? PRUS_OOB :
-                    /*
-                     * If the user set MSG_EOF, the protocol
-                     * understands this flag and nothing left to
-                     * send then use PRU_SEND_EOF instead of PRU_SEND.
-                     */
-                        ((flags & MSG_EOF) &&
-                         (so->so_proto->pr_flags & PR_IMPLOPCL) &&
-                         (resid <= 0)) ?
-                            PRUS_EOF :
-                    /* If there is more to send set PRUS_MORETOCOME */
-                        (resid > 0 && space > 0) ? PRUS_MORETOCOME : 0,
-                        top, addr, control, p);
+                        pru_flags, top, addr, control, p);
                     splx(s);
                     if (dontroute)
                             so->so_options &= ~SO_DONTROUTE;

Bill
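The message does not say what the bug in the chained ?: expression is, so the following is only a sketch of one visible difference between the two forms in the diff: a chained conditional yields at most one flag, while the rewritten code tests each condition separately and ORs the bits together. The DEMO_* constants and both function names are made up for illustration and are not the kernel's PRUS_* definitions.

```c
#include <assert.h>

/* Placeholder bit values for illustration only (assumption, not kernel values). */
enum { DEMO_OOB = 0x1, DEMO_EOF = 0x2, DEMO_MORETOCOME = 0x4 };

/* Old shape: the first condition that holds wins and the rest are ignored. */
int flags_chained(int oob, int eof, int more)
{
    return oob ? DEMO_OOB :
           eof ? DEMO_EOF :
           more ? DEMO_MORETOCOME : 0;
}

/* New shape: every condition is tested on its own and OR-ed into the result. */
int flags_accumulated(int oob, int eof, int more)
{
    int f = 0;

    if (oob)
        f |= DEMO_OOB;
    if (eof)
        f |= DEMO_EOF;
    if (more)
        f |= DEMO_MORETOCOME;
    return f;
}

/* Small self-check showing where the two shapes agree and where they differ. */
void demo_compare(void)
{
    /* Only one condition true: both shapes agree. */
    assert(flags_chained(0, 0, 1) == flags_accumulated(0, 0, 1));
    /* Two conditions true: the chain drops the second flag. */
    assert(flags_chained(1, 0, 1) != flags_accumulated(1, 0, 1));
}
```

With the accumulated form, a receiver that checks for the "more to come" bit sees it regardless of which other flags were set in the same call.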
Re: Weird piecemeal reads over socketpair() pipe breaks up small writes into even smaller reads.
:
:>This isn't a 'bug', per se, but it bothers me that a small 128 byte
:>write() is being somehow broken apart into two smaller read()s. It isn't
:>efficient, and it shouldn't be happening.
:
:Breaking apart write() into read()s would be a BUG :-).
:
:Breaking apart read() into read()s seems to be caused by my rescheduling
:changes in kern_subr.c.
:
:>fcntl(fds[0], F_SETFL, O_NONBLOCK);
:
:Here you permit non-atomic reads for block sizes <= PIPE_BUF, so you should
:be prepared to get them.

Well of course. Why do you think I said "This isn't a 'bug', per se", eh?

:>if (fork() == 0) {
:>sleep(1);
:>write(fds[1], buf, sizeof(buf));
:>_exit(1);
:>}
:
:The write() apparently begins by copyout()ing only 96 bytes, and if the
:need_resched() condition is true, then the process doing the write will
:block.
:
:>select(fds[0] + 1, &rfds, NULL, NULL, NULL);
:>while ((n = read(fds[0], buf, sizeof(buf))) > 0)
:>printf("read %d\n", n);
:>return(0);
:
:When the writer blocks, the reader runs and uses a buggy loop to read
:only the first chunk of input.
:
:Bruce

Ok, so perhaps tweaking the rescheduling changes in kern_subr.c to not try to do it in the first few thousand bytes copied is the solution?

Would you like to do it or should I? This isn't high priority but it should definitely not be rescheduling after the first 96 bytes. That's just a waste of cpu.

-Matt
Matthew Dillon
Re: Weird piecemeal reads over socketpair() pipe breaks up small writes into even smaller reads.
>This isn't a 'bug', per se, but it bothers me that a small 128 byte
>write() is being somehow broken apart into two smaller read()s. It isn't
>efficient, and it shouldn't be happening.

Breaking apart write() into read()s would be a BUG :-).

Breaking apart read() into read()s seems to be caused by my rescheduling changes in kern_subr.c.

>fcntl(fds[0], F_SETFL, O_NONBLOCK);

Here you permit non-atomic reads for block sizes <= PIPE_BUF, so you should be prepared to get them.

>if (fork() == 0) {
>sleep(1);
>write(fds[1], buf, sizeof(buf));
>_exit(1);
>}

The write() apparently begins by copyout()ing only 96 bytes, and if the need_resched() condition is true, then the process doing the write will block.

>select(fds[0] + 1, &rfds, NULL, NULL, NULL);
>while ((n = read(fds[0], buf, sizeof(buf))) > 0)
>printf("read %d\n", n);
>return(0);

When the writer blocks, the reader runs and uses a buggy loop to read only the first chunk of input.

On an otherwise idle system, the need_resched() condition seems to be true always. I would have expected the synchronisation provided by the sleep(1) to bias need_resched() in the opposite direction. A reschedule has been done, normally just after the previous hardclock() call, just before the writer wakes up, so another one should not be done soon (until after the next hardclock() call).

Bruce
Weird piecemeal reads over socketpair() pipe breaks up small writes into even smaller reads.
This is under -current. I don't know when it started, but I think whatever change is causing this was in the last week or two.

This isn't a 'bug', per se, but it bothers me that a small 128 byte write() is being somehow broken apart into two smaller read()s. It isn't efficient, and it shouldn't be happening.

-Matt
Matthew Dillon

apollo:/home/dillon> ./x
read 96
read 32

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/un.h>             /* unix domain sockets */
#include <netinet/in.h>         /* internet sockets */
#include <netinet/tcp.h>        /* TCP_NODELAY sockopt */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(int ac, char **av)
{
    int fds[2];
    int n;
    char buf[128];
    fd_set rfds;

    if (socketpair(PF_UNIX, SOCK_STREAM, IPPROTO_IP, fds) < 0)
        perror("socketpair");
    fcntl(fds[0], F_SETFL, O_NONBLOCK);
    FD_ZERO(&rfds);
    FD_SET(fds[0], &rfds);
    if (fork() == 0) {
        sleep(1);
        write(fds[1], buf, sizeof(buf));
        _exit(1);
    }
    select(fds[0] + 1, &rfds, NULL, NULL, NULL);
    while ((n = read(fds[0], buf, sizeof(buf))) > 0)
        printf("read %d\n", n);
    return(0);
}
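Bruce's reply points out that a non-blocking reader on a stream socket has to be prepared for short reads. As a rough sketch of what that could look like in user code (not something proposed in the thread), the helper below keeps reading until the requested count has arrived or the peer closes, and waits with poll() when the descriptor has nothing ready. The names read_full() and set_nonblocking() are hypothetical, and poll() is used here in place of the select() call in the test program.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: put fd into non-blocking mode, keeping its other flags. */
int set_nonblocking(int fd)
{
    int fl = fcntl(fd, F_GETFL, 0);

    if (fl < 0)
        return -1;
    return fcntl(fd, F_SETFL, fl | O_NONBLOCK);
}

/*
 * Hypothetical helper: read until `want` bytes have arrived or the peer
 * closes. Returns the number of bytes read (short only at EOF), or -1 on
 * error.
 */
ssize_t read_full(int fd, void *buf, size_t want)
{
    char *p = buf;
    size_t got = 0;

    assert(buf != NULL);
    while (got < want) {
        ssize_t n = read(fd, p + got, want - got);

        if (n > 0) {
            got += (size_t)n;           /* partial data: keep going */
        } else if (n == 0) {
            break;                      /* EOF: peer closed */
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* Nothing ready yet: sleep until the descriptor is readable. */
            struct pollfd pfd = { .fd = fd, .events = POLLIN };

            if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
                return -1;
        } else if (errno != EINTR) {
            return -1;                  /* real error */
        }
    }
    return (ssize_t)got;
}
```

With a loop of this shape the reader sees one 128 byte result whether the kernel delivers the data as 96 + 32 or in a single piece; it does nothing about the extra wakeup and context switch that the thread is actually discussing.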