On Wed, Feb 9, 2011 at 4:34 PM, Glenn Fowler <g...@research.att.com> wrote:
>
> the test does not exercise the original problem
> it doesn't even use lseek(), which is central to the problem
>
> the problem can be solved by making
>       lseek(fd, 0, SEEK_CUR)
> a readonly query
> no need for locks or semaphores in or out of the kernel
>
> here is the observed behavior on solaris showing that
> one lseek(SEEK_CUR) and one write() call are not atomic
>
> (the lseek(fd,0,SEEK_CUR) call is part of some user code
> to detect if an fd is seekable -- on solaris it sometimes
> results in loss of write() data from another process)
>
> given two processes p1 and p2
> and one fd at offset o1
>
>       p1                              p2
>       --                              --
>       lseek(fd,0,SEEK_CUR) starts
>                                       write(fd,buf,n) starts
>                                       write(fd,buf,n) completes
>       lseek(fd,0,SEEK_CUR) finishes
>       lseek(fd,0,SEEK_CUR) == o1
>                                       lseek(fd,0,SEEK_CUR) != o1+n
>
> if p1 were to set fd to any offset other than the current offset
> then there would be a race between p1 and p2 as to the location
> of the p2 write data
>
> but p1 `sets' it to the current offset
> and for atomic lseek() and write() we can have two standard conforming
> outcomes
> both with p2 lseek returning o1+n
>
>       (1) p1 lseek() returns o1, p2 lseek() returns o1+n
>       (2) p1 lseek() returns o1+n, p2 lseek() returns o1+n
>
> for solaris a third incorrect outcome is sometimes observed
>
>       (3) p1 lseek returns o1, p2 lseek() returns o1
>
> On Wed, 09 Feb 2011 15:39:28 +0100 casper....@oracle.com wrote:
>> >No, that will work fine since the file offset is ignored in O_APPEND mode
>> >and writes are serialized by lower-level (file system) mechanisms.
>> >(Note that I used O_APPEND for the output file in my test program.)
>> >
>> >>> I would appreciate it if some people out there would run this test
>> >>> on other systems (Red Hat Linux, NETBSD, Apple OS X, HP-UX, IRIX)
>> >>> and post the results.
>> >>
>> >> I tried to run the test on the following systems:
>> >>
>> >> 2.6.32-27-generic-pae #49-Ubuntu SMP
>> >> 2.6.18-194.el5 #1 SMP (Red Hat)
>> >> 2.6.16.60-0.21-smp #1 SMP (SUSE)
>> >>
>> >> It failed on all of them.
>> >>
>> >> Thanks,
>> >> Dmitry
>> >
>> >Thanks for running the test,
>
>> I'm not really surprised, when multiple threads read from the same
>> fd, are we supposed to serialize all the reads to make sure that you can't
>> have concurrent reads starting at the same offset?  I don't think that
>> POSIX prevents that.  How does it fail on the Linux kernels?  On Solaris,
>> the output file is somewhat larger but not much (when using tmpfs)
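
For anyone who wants a test that actually goes through lseek(), here is
a minimal sketch of the p1/p2 schedule Glenn describes above. It is an
illustration, not the test program discussed in this thread: it uses
Python's os.lseek()/os.write() wrappers as stand-ins for the C calls,
and the names is_seekable() and count_lost_offsets() are made up for
this example. p1 (the forked child) only ever calls
lseek(fd,0,SEEK_CUR); p2 (the parent) is the only writer and checks
that its own lseek() reports o1+n after each write().

```python
import os
import signal
import tempfile


def is_seekable(fd):
    """Probe an fd the way the user code mentioned in the thread does."""
    try:
        os.lseek(fd, 0, os.SEEK_CUR)   # "set" the offset to where it already is
        return True
    except OSError:                    # e.g. ESPIPE for pipes and sockets
        return False


def count_lost_offsets(rounds=2000, n=64):
    """Return how often p2 saw lseek() != o1+n right after its own write()."""
    buf = b"x" * n
    lost = 0
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        pid = os.fork()                # child shares the same file offset
        if pid == 0:
            # p1: only queries the offset, never moves it on purpose
            try:
                while True:
                    os.lseek(fd, 0, os.SEEK_CUR)
            finally:
                os._exit(1)            # never fall through into parent code
        try:
            # p2: the only process that writes to fd
            for _ in range(rounds):
                o1 = os.lseek(fd, 0, os.SEEK_CUR)
                written = os.write(fd, buf)
                if os.lseek(fd, 0, os.SEEK_CUR) != o1 + written:
                    lost += 1          # outcome (3): the offset update was lost
        finally:
            os.kill(pid, signal.SIGKILL)
            os.waitpid(pid, 0)
    return lost


if __name__ == "__main__":
    print("lost offset updates:", count_lost_offsets())
```

If lseek(fd,0,SEEK_CUR) behaves as a readonly query, p1 can never
disturb the offset and the count should stay at 0 (outcomes (1) and (2)
above). A nonzero count would correspond to outcome (3). Whether a given
run hits the window depends on scheduling, so a 0 from a short run
proves nothing by itself.
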
BTW: Please keep Roger and Casper in the CC: loop... AFAIK they are not
subscribed to ksh93-integration-discuss@ yet...

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.ma...@nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)