On 06/24/2011 10:15 AM, Michael Moore wrote:
With some additional offline information from Benjamin the problem has
been tracked down to dbpf_bstream_direct_write_op_svc(). The issue is
that two write calls to different, contiguous, sections of the file
occur without locking around retrieval of the current file size. The
flow is similar to this, assuming two writes, X -> Y and Y+1 -> Z:
both writes enter dbpf_bstream_direct_write_op_svc()
write X->Y gets the current file size
write X->Y makes the pwrite call
write Y+1 -> Z gets the current file size
write X->Y updates the file size
write Y+1 -> Z makes the pwrite call (padding zeros from the previous
end of file)
write Y+1 -> Z updates the file size
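The interleaving above can be replayed deterministically in a few lines of C. This is only a sketch under stated assumptions: the byte array, tracked_size, and both helper functions are hypothetical stand-ins, not the Trove code, and it assumes the zero padding covers the range from the observed end of file up to the write offset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FILE_CAP 64

/* Hypothetical stand-ins: a byte array for the bstream and a size that is
 * tracked separately from the data. */
static unsigned char bstream[FILE_CAP];
static size_t tracked_size;

/* Write len bytes at off using the size the caller observed earlier.
 * ASSUMPTION: a write starting past the observed EOF zero-fills the gap. */
static void write_with_observed_size(size_t observed, size_t off,
                                     const char *buf, size_t len)
{
    assert(off + len <= FILE_CAP);
    if (off > observed)
        memset(bstream + observed, 0, off - observed);
    memcpy(bstream + off, buf, len);
}

/* Grow the tracked size if the write extended the file. */
static void update_size(size_t off, size_t len)
{
    if (off + len > tracked_size)
        tracked_size = off + len;
}

/* Replays the steps listed above; returns 1 if the first write was clobbered. */
int replay_race(void)
{
    memset(bstream, 0, sizeof bstream);
    tracked_size = 0;

    size_t seen_a = tracked_size;                   /* X->Y gets the size (0)    */
    write_with_observed_size(seen_a, 0, "AAAA", 4); /* X->Y makes the write      */
    size_t seen_b = tracked_size;                   /* Y+1->Z gets the size (0)  */
    update_size(0, 4);                              /* X->Y updates the size     */
    write_with_observed_size(seen_b, 4, "BBBB", 4); /* Y+1->Z pads [0,4) with 0s */
    update_size(4, 4);                              /* Y+1->Z updates the size   */

    return memcmp(bstream, "AAAA", 4) != 0;
}
```

Under these assumptions the second write sees a stale size of 0, so its padding overwrites the four bytes the first write had already stored, even though the final tracked size is correct.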
I can certainly add some locking to prevent this. Mostly to Phil or
Sam, was there something in place that should be preventing this
before I add another wheel?
I can't speak for Sam, but your analysis sounds correct to me. I guess
it is the "padding zeros" part that is corrupting the data, right?
Thanks for tracking that down!
I did try moving the flocks from direct_locked_write() around the get
file size and update but it looks like the fd is being closed causing
the locks to be tossed.
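Locks vanishing when an fd is closed is what POSIX fcntl() record locks do: a process loses all of its locks on a file as soon as it closes any descriptor for that file. Whether that is the lock type used here is an assumption. A minimal sketch, with a hypothetical temp-file path and helper names, that probes the lock from a child process before and after such a close:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if another process would be blocked by a lock on path,
 * 0 if not, -1 on error. The probe runs in a child because F_GETLK
 * never reports locks held by the calling process itself. */
static int lock_still_held(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct flock probe = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                               .l_start = 0, .l_len = 0 };
        int fd = open(path, O_RDWR);
        if (fd < 0 || fcntl(fd, F_GETLK, &probe) < 0)
            _exit(2);
        _exit(probe.l_type == F_UNLCK ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 2 ? -1 : WEXITSTATUS(status);
}

/* Takes a write lock, then opens and closes a second descriptor for the
 * same file, recording whether the lock is visible before and after. */
int demo_lock_dropped_on_close(int *held_before, int *held_after)
{
    char path[] = "/tmp/fcntl_demo_XXXXXX";   /* hypothetical scratch file */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    struct flock lk = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                        .l_start = 0, .l_len = 0 };
    if (fcntl(fd, F_SETLK, &lk) < 0) {
        close(fd);
        unlink(path);
        return -1;
    }
    *held_before = lock_still_held(path);

    int fd2 = open(path, O_RDONLY);
    if (fd2 >= 0)
        close(fd2);   /* releases every fcntl lock this process holds on the file */
    *held_after = lock_still_held(path);

    close(fd);
    unlink(path);
    return 0;
}
```

If the locks are really flock() locks the rule differs: those belong to the open file description and go away once the last descriptor referring to it is closed.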
I think it is an fcntl lock, right? Either way that would probably be
tricky to use to protect the file size query. I think that hits the db
rather than the underlying file so it won't be affected by the lock.
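If the size query does go to the db, one option is an in-process mutex held across the whole get-size, write, update-size sequence instead of a file lock. A rough sketch, assuming pthreads and that all writers live in one server process; every name below is hypothetical and the byte array merely stands in for the bstream:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define FILE_CAP 64

static unsigned char bstream[FILE_CAP];   /* stand-in for the file data    */
static size_t tracked_size;               /* stand-in for the size in db   */
static pthread_mutex_t size_lock = PTHREAD_MUTEX_INITIALIZER;

struct write_req {
    size_t off;
    const char *buf;
    size_t len;
};

/* One write, with size lookup, padding, data copy, and size update all
 * performed while holding the same mutex. */
static void *locked_write(void *arg)
{
    struct write_req *r = arg;
    assert(r->off + r->len <= FILE_CAP);

    pthread_mutex_lock(&size_lock);
    size_t observed = tracked_size;
    if (r->off > observed)                 /* pad only a real gap past EOF */
        memset(bstream + observed, 0, r->off - observed);
    memcpy(bstream + r->off, r->buf, r->len);
    if (r->off + r->len > tracked_size)
        tracked_size = r->off + r->len;
    pthread_mutex_unlock(&size_lock);
    return NULL;
}

/* Runs two contiguous writes concurrently; returns 1 if both survive. */
int run_two_locked_writes(void)
{
    struct write_req a = { 0, "AAAA", 4 };
    struct write_req b = { 4, "BBBB", 4 };
    pthread_t ta, tb;

    memset(bstream, 0, sizeof bstream);
    tracked_size = 0;
    if (pthread_create(&ta, NULL, locked_write, &a) != 0)
        return 0;
    if (pthread_create(&tb, NULL, locked_write, &b) != 0) {
        pthread_join(ta, NULL);
        return 0;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);

    return memcmp(bstream, "AAAABBBB", 8) == 0 && tracked_size == 8;
}
```

Whichever thread wins the mutex, the other one sees an up-to-date size, so the padding never lands on data that is already written. A single global mutex serializes all writes, so a real fix would probably want a lock per file handle.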
Kind of a separate topic, but if the fd is being closed then we might
want to look into that too. Trove has an fd open cache that is supposed
to keep it from repeatedly opening and closing the same underlying file.
-Phil