Ignore this message. I didn't see Elaine's second message about the scheduler.

Becky
--
Becky Ligon
HPC Admin Staff
PVFS/OrangeFS Developer
Clemson University/Omnibond.com OrangeFS Support
864-650-4065

> Don't know for sure, but Elaine looked at the job scheduler and it appears
> that the scheduler schedules based on handles. So, the two writes
> shouldn't be in that part of the code at the same time. We might want to
> see if there is a problem with the scheduler or with the state machine
> calls before we change the underlying structure.
>
> Becky
> --
> Becky Ligon
> HPC Admin Staff
> PVFS/OrangeFS Developer
> Clemson University/Omnibond.com OrangeFS Support
> 864-650-4065
>
>> With some additional offline information from Benjamin the problem has been
>> tracked down to dbpf_bstream_direct_write_op_svc(). The issue is that two
>> write calls to different, contiguous, sections of the file occur without
>> locking around retrieval of the current file size. The flow is similar to
>> this, assuming two writes X -> Y, Y+1 -> Z
>> both writes enter dbpf_bstream_direct_write_op_svc()
>> write X->Y gets the current file size
>> write X->Y makes the pwrite call
>> write Y+1 -> Z gets the current file size
>> write X->Y updates the file size
>> write Y+1 -> Z makes the pwrite call (padding zeros from the previous end of file)
>> write Y+1 -> Z updates the file size
>>
>> I can certainly add some locking to prevent this. Mostly to Phil or Sam, was
>> there something in place that should be preventing this before I add another
>> wheel?
>>
>> I did try moving the flocks from direct_locked_write() around the get file
>> size and update but it looks like the fd is being closed causing the locks
>> to be tossed.
>>
>> Thanks,
>> Michael
>>
>> On Mon, Jun 13, 2011 at 4:47 PM, Becky Ligon <[email protected]> wrote:
>>
>>> Benjamin:
>>>
>>> Thanks for the extra information. I have added it to our tracking
>>> system.
>>>
>>> Becky
>>> --
>>> Becky Ligon
>>> HPC Admin Staff
>>> PVFS/OrangeFS Developer
>>> Clemson University/Omnibond.com OrangeFS Support
>>> 864-650-4065
>>>
>>> > On Wednesday, June 08, 2011 07:33:45 Benjamin Severs wrote:
>>> >> On Wednesday, June 08, 2011 06:21:45 AM Michael Moore wrote:
>>> >> > Hi Benjamin,
>>> >> >
>>> >> > I don't have a quick fix, I don't know if others have seen this kind of
>>> >> > issue before. I've created a ticket in Trac for it:
>>> >> > https://www.orangefs.org/trac/orangefs/ticket/34
>>> >> >
>>> >> > I'm planning on just copying your setup to recreate the problem. Anything
>>> >> > special about the contents of the 10GB file or just urandom type data?
>>> >> >
>>> >> > Thanks for letting us know about this!
>>> >> >
>>> >> > Michael
>>> >>
>>> >> There was nothing special about the file. My file was just a repeating
>>> >> sequence of characters (ex. 200 'A's, followed by 200 'B's, and so on).
>>> >>
>>> >> - Benjamin Severs
>>> >
>>> > More information...
>>> >
>>> > I believe this issue is caused by some race condition among the directio
>>> > threads. From my testing, it appears that the corruption stops (or at least
>>> > hasn't shown up in my testing) if I either switch to using the alt-aio trove
>>> > method or if I reduce the number of directio threads to 1.
>>> >
>>> > Hope this is useful.
>>> >
>>> > --
>>> > Benjamin Severs
>>> > _______________________________________________
>>> > Pvfs2-developers mailing list
>>> > [email protected]
>>> > http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
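
The interleaving Michael lists for dbpf_bstream_direct_write_op_svc() can be replayed deterministically. What follows is a minimal Python sketch of that flow, not the OrangeFS C code. SimBstream, racy_interleaving, locked_write and locked_writes are hypothetical names invented for illustration, and the zero-fill rule is an assumption based on the "padding zeros from the previous end of file" remark. The per-object lock shows only that serializing the get-size / write / update-size sequence removes the corruption. Whether the job scheduler should already provide that serialization is left open in the thread.

```python
import threading


class SimBstream:
    """Hypothetical in-memory stand-in for a bstream with a separately tracked size."""

    def __init__(self):
        self.data = bytearray()
        self.size = 0                 # logical file size, updated after each write
        self.lock = threading.Lock()  # assumption: one lock per file object

    def get_size(self):
        return self.size

    def pwrite(self, offset, payload, seen_size):
        # Assumed behaviour: a write past the size it observed zero-fills the gap.
        if offset > seen_size:
            self._store(seen_size, bytes(offset - seen_size))
        self._store(offset, payload)

    def update_size(self, offset, length):
        self.size = max(self.size, offset + length)

    def _store(self, offset, payload):
        end = offset + len(payload)
        if len(self.data) < end:
            self.data.extend(bytes(end - len(self.data)))
        self.data[offset:end] = payload


def racy_interleaving():
    """Replay the steps from the thread in the order given there."""
    f = SimBstream()
    first, second = b"AAAA", b"BBBB"   # writes X->Y and Y+1->Z
    size_first = f.get_size()          # write X->Y gets the current file size
    f.pwrite(0, first, size_first)     # write X->Y makes the pwrite call
    size_second = f.get_size()         # write Y+1->Z gets a stale size (0)
    f.update_size(0, len(first))       # write X->Y updates the file size
    f.pwrite(4, second, size_second)   # zero-fill from the stale EOF clobbers X->Y
    f.update_size(4, len(second))      # write Y+1->Z updates the file size
    return bytes(f.data)


def locked_write(f, offset, payload):
    """Hold the lock across get-size, write and update-size."""
    with f.lock:
        seen = f.get_size()
        f.pwrite(offset, payload, seen)
        f.update_size(offset, len(payload))


def locked_writes():
    f = SimBstream()
    threads = [
        threading.Thread(target=locked_write, args=(f, 0, b"AAAA")),
        threading.Thread(target=locked_write, args=(f, 4, b"BBBB")),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return bytes(f.data)
```

In this model racy_interleaving() loses the first write's bytes to the zero-fill, while locked_writes() keeps both regions intact whichever thread takes the lock first.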
