This bug has been there as far back as 2.6.x. It is known variously as "userspace process sits in Z state", "occasional hang", and so on, and maintainers have tagged it "unreproducible" for seven-odd years now. I agree: on stock UML it is difficult to reproduce, and it may take half a day to hit it even with a specially designed test case. You have to bring network IO up to a reasonable performance level before it shows up on a regular basis.
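For anyone who wants to poke at the underlying behaviour outside UML, here is a minimal sketch. It assumes the channel to the UBD IO helper thread is written in non-blocking mode with pointer-sized requests, which is what the -EAGAIN check in the driver suggests. It counts how many such requests a pipe or an AF_UNIX stream socketpair accepts before the kernel refuses one, and reports the errno of the refused write. The names is_transient, set_nonblock, fill_until_full and measure_channel are made up for this illustration and are not part of UML.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: errors that mean "channel is full, retry later". */
bool is_transient(int err)
{
    return err == EAGAIN || err == ENOBUFS;
}

/* Put fd into non-blocking mode; returns 0 on success, -1 on failure. */
static int set_nonblock(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Write pointer-sized messages to wfd until one is not accepted in full.
 * Returns the number of messages accepted; *err_out receives the errno of
 * the refused write (0 if the write was merely short).
 */
static long fill_until_full(int wfd, int *err_out)
{
    void *msg = NULL;   /* stand-in for a request pointer */
    long count = 0;

    assert(err_out != NULL);
    for (;;) {
        ssize_t n = write(wfd, &msg, sizeof(msg));

        if (n != (ssize_t)sizeof(msg)) {
            *err_out = (n < 0) ? errno : 0;
            return count;
        }
        count++;
    }
}

/*
 * Fill a pipe (use_socket == false) or an AF_UNIX stream socketpair
 * (use_socket == true) with nobody reading from the other end.
 * Returns the number of queued messages, or -1 if setup failed.
 */
long measure_channel(bool use_socket, int *err_out)
{
    int fds[2];
    long count;
    int rc = use_socket ? socketpair(AF_UNIX, SOCK_STREAM, 0, fds)
                        : pipe(fds);

    if (rc < 0) {
        *err_out = errno;
        return -1;
    }
    if (set_nonblock(fds[1]) < 0) {
        *err_out = errno;
        count = -1;
    } else {
        count = fill_until_full(fds[1], err_out);
    }
    close(fds[0]);
    close(fds[1]);
    return count;
}
```

Calling measure_channel(false, &err) and measure_channel(true, &err) from a small main() and printing the results shows how many requests each channel queues on a given host. The numbers depend on the kernel version and buffer settings, so treat them as a measurement, not a constant.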
In any case, this fix also gives a considerable performance improvement for "real life" network applications that need both disk IO and network IO. There you hit the "pipe is no longer consuming requests" barrier on a very regular basis. For example, when wgetting a file over a 1 Gbit link and storing it to disk, I get ~340 Mbit/s at most without the fix. With it I can hit line rate.

I have filed it as Debian 741077; the other two are 741076 and 741075.

By the way, the fact that this workaround helps means that the request disposal and handling of failed requests in the UBD driver is wrong somewhere else. No idea where.

A.

On 08/03/14 06:51, anton.iva...@kot-begemot.co.uk wrote:
> From: Anton Ivanov <antiv...@cisco.com>
>
> For more details see:
>
> http://stackoverflow.com/questions/4624071/pipe-buffer-size-is-4k-or-64k
>
> The observations on that thread have been confirmed by us for UML's
> UBD driver. If you load UML with network IO and do disk IO at the same
> time the UBD IO helper thread fails to process requests fast enough.
>
> In most cases this will lead to a slowdown in disk IO as well as
> inability to execute new processes until the network load goes away.
> In some cases however it will not recover. Example - with our new high
> performance 1G+ network drivers a wget of 1G file from the network
> is a nearly guaranteed crash if it writes out to UBD.
>
> The crashes and the slowdowns are not observed with this patch -
> it switches the IPC to socket which does not have the pipe granularity
> and queue size problems.
>
> This signifies a problem with overall UBD error handling which this
> patch fails to fix. The workaround should be good enough for most
> cases.
>
> Signed-off-by: Anton Ivanov <antiv...@cisco.com>
> ---
>  arch/um/drivers/ubd_kern.c | 2 +-
>  arch/um/drivers/ubd_user.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
> index 944453a..c9a5717 100644
> --- a/arch/um/drivers/ubd_kern.c
> +++ b/arch/um/drivers/ubd_kern.c
> @@ -1291,7 +1291,7 @@ static void do_ubd_request(struct request_queue *q)
>                  n = os_write_file(thread_fd, &io_req,
>                                    sizeof(struct io_thread_req *));
>                  if(n != sizeof(struct io_thread_req *)){
> -                        if(n != -EAGAIN)
> +                        if(!((n == -EAGAIN) || (n == -ENOBUFS)))
>                                  printk("write to io thread failed, "
>                                         "errno = %d\n", -n);
>                          else if(list_empty(&dev->restart))
> diff --git a/arch/um/drivers/ubd_user.c b/arch/um/drivers/ubd_user.c
> index 007b94d..f1f84a4 100644
> --- a/arch/um/drivers/ubd_user.c
> +++ b/arch/um/drivers/ubd_user.c
> @@ -32,7 +32,7 @@ int start_io_thread(unsigned long sp, int *fd_out)
>  {
>          int pid, fds[2], err;
>
> -        err = os_pipe(fds, 1, 1);
> +        err = socketpair(AF_UNIX, SOCK_STREAM, 0, (int *) &fds);
>          if(err < 0){
>                  printk("start_io_thread - os_pipe failed, err = %d\n", -err);
>                  goto out;

--
"If you think it's expensive to hire a professional to do the job, wait until you hire an amateur."
Paul Neal "Red" Adair

A. R. Ivanov
E-mail: anton.iva...@kot-begemot.co.uk

_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel