Ignore this message.  I didn't see Elaine's second message about the
scheduler.

Becky
-- 
Becky Ligon
HPC Admin Staff
PVFS/OrangeFS Developer
Clemson University/Omnibond.com OrangeFS Support
864-650-4065

> Don't know for sure, but Elaine looked at the job scheduler and it appears
> that the scheduler schedules based on handles.  So, the two writes
> shouldn't be in that part of the code at the same time.  We might want to
> see if there is a problem with the scheduler or with the state machine
> calls before we change the underlying structure.
>
> Becky
>
>> With some additional offline information from Benjamin, the problem has
>> been tracked down to dbpf_bstream_direct_write_op_svc(). The issue is that
>> two write calls to different, contiguous sections of the file occur without
>> locking around retrieval of the current file size. The flow is similar to
>> this, assuming two writes X -> Y and Y+1 -> Z:
>> both writes enter dbpf_bstream_direct_write_op_svc()
>> write X -> Y gets the current file size
>> write X -> Y makes the pwrite call
>> write Y+1 -> Z gets the current file size
>> write X -> Y updates the file size
>> write Y+1 -> Z makes the pwrite call (padding zeros from the previous end of file)
>> write Y+1 -> Z updates the file size
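
The interleaving listed above can be replayed deterministically with a small in-memory model. This is only a sketch under stated assumptions: `struct sim_file`, `padded_write()`, `replay_unlocked()` and `locked_write()` are hypothetical names, not PVFS code, and the zero-fill rule (pad from the file size the writer saw up to its own offset) is taken from the description in the message rather than from the dbpf source. It shows how a stale size lets the second write zero out the first, and how holding one lock across the get-size, write, and update-size steps would avoid that.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define SIM_CAP 64

/* In-memory stand-in for a bstream file; NOT the PVFS data structures. */
struct sim_file {
    unsigned char data[SIM_CAP];
    size_t size;            /* tracked file size */
    pthread_mutex_t lock;   /* hypothetical lock used only by locked_write() */
};

static void sim_init(struct sim_file *f)
{
    memset(f->data, 0xFF, sizeof f->data);  /* 0xFF marks "never written" */
    f->size = 0;
    pthread_mutex_init(&f->lock, NULL);
}

/* Write len bytes at off, zero-filling from the size the caller saw
 * earlier (seen_size) up to off, as described in the message. */
static void padded_write(struct sim_file *f, size_t seen_size, size_t off,
                         const unsigned char *buf, size_t len)
{
    assert(off <= SIM_CAP && len <= SIM_CAP - off);
    if (off > seen_size)
        memset(f->data + seen_size, 0, off - seen_size);
    memcpy(f->data + off, buf, len);
}

static void update_size(struct sim_file *f, size_t end)
{
    if (end > f->size)
        f->size = end;
}

/* Replays the interleaving from the message with no locking:
 * write A covers [0, len), write B covers [len, 2*len). */
static void replay_unlocked(struct sim_file *f, const unsigned char *a,
                            const unsigned char *b, size_t len)
{
    size_t seen_a = f->size;               /* write A gets the file size */
    padded_write(f, seen_a, 0, a, len);    /* write A makes its write */
    size_t seen_b = f->size;               /* write B gets a stale size */
    update_size(f, len);                   /* write A updates the size */
    padded_write(f, seen_b, len, b, len);  /* B's padding zeroes A's data */
    update_size(f, 2 * len);               /* write B updates the size */
}

/* Same three steps, but serialized as one critical section. */
static void locked_write(struct sim_file *f, size_t off,
                         const unsigned char *buf, size_t len)
{
    pthread_mutex_lock(&f->lock);
    size_t seen = f->size;
    padded_write(f, seen, off, buf, len);
    update_size(f, off + len);
    pthread_mutex_unlock(&f->lock);
}
```

Calling `replay_unlocked()` with two 8-byte buffers leaves bytes 0 through 7 zeroed, while two `locked_write()` calls in either order leave both regions intact. Whether a mutex, the existing flocks, or the job scheduler is the right place to serialize this in the server is a separate question that the model does not answer.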
>>
>> I can certainly add some locking to prevent this. Mostly to Phil or Sam,
>> was there something in place that should be preventing this before I add
>> another wheel?
>>
>> I did try moving the flocks from direct_locked_write() around the get
>> file size and update, but it looks like the fd is being closed, causing the
>> locks to be tossed.
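
The "locks to be tossed" behaviour matches how flock(2) works in general: the lock belongs to the open file description, so closing the last descriptor that refers to it releases the lock. The following is a minimal sketch, assuming Linux semantics on a local filesystem, where two separate open() calls on the same path are treated independently by flock(). The helper names `can_lock_via_new_fd()` and `flock_lost_on_close()` are hypothetical and unrelated to the PVFS code.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/file.h>
#include <unistd.h>

/* Try to take an exclusive flock through a fresh descriptor.
 * Returns 1 if the lock was free, 0 if someone holds it, -1 on error. */
static int can_lock_via_new_fd(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    int rc = flock(fd, LOCK_EX | LOCK_NB);
    int saved = errno;
    close(fd);                      /* also drops the lock if we got it */
    if (rc == 0)
        return 1;
    return saved == EWOULDBLOCK ? 0 : -1;
}

/* Returns 1 if an exclusive flock was held while the descriptor was open
 * and was gone after close(); 0 if not; -1 on error. */
static int flock_lost_on_close(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX) != 0) {
        close(fd);
        return -1;
    }
    int held = can_lock_via_new_fd(path);   /* lock is still held here */
    close(fd);                              /* closing releases the flock */
    int after = can_lock_via_new_fd(path);  /* lock should now be free */
    if (held < 0 || after < 0)
        return -1;
    return held == 0 && after == 1;
}
```

If the descriptor is closed between the size lookup and the size update, any flock taken on it stops protecting that window, which would be consistent with what was observed here.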
>>
>> Thanks,
>> Michael
>>
>> On Mon, Jun 13, 2011 at 4:47 PM, Becky Ligon <[email protected]> wrote:
>>
>>> Benjamin:
>>>
>>> Thanks for the extra information.  I have added it to our tracking system.
>>>
>>> Becky
>>>
>>> > On Wednesday, June 08, 2011 07:33:45 Benjamin Severs wrote:
>>> >> On Wednesday, June 08, 2011 06:21:45 AM Michael Moore wrote:
>>> >> > Hi Benjamin,
>>> >> >
>>> >> > I don't have a quick fix, and I don't know if others have seen this
>>> >> > kind of issue before. I've created a ticket in Trac for it:
>>> >> > https://www.orangefs.org/trac/orangefs/ticket/34
>>> >> >
>>> >> > I'm planning on just copying your setup to recreate the problem.
>>> >> > Is there anything special about the contents of the 10GB file, or is
>>> >> > it just urandom-type data?
>>> >> >
>>> >> > Thanks for letting us know about this!
>>> >> >
>>> >> > Michael
>>> >>
>>> >> There was nothing special about the file.  My file was just a
>>> >> repeating sequence of characters (e.g. 200 'A's, followed by 200 'B's,
>>> >> and so on).
>>> >>
>>> >> - Benjamin Severs
>>> >
>>> > More information...
>>> >
>>> > I believe this issue is caused by some race condition among the
>>> > directio threads.  From my testing, it appears that the corruption stops
>>> > (or at least hasn't shown up in my testing) if I either switch to using
>>> > the alt-aio trove method or reduce the number of directio threads to 1.
>>> >
>>> > Hope this is useful.
>>> >
>>> > --
>>> > Benjamin Severs
>>> > _______________________________________________
>>> > Pvfs2-developers mailing list
>>> > [email protected]
>>> > http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
>>> >

