Thanks Mike, this is good stuff. :)

+ Justin


On 06/09/2014, at 8:19 PM, mike wrote:
> I upgraded the client to Gluster 3.5.2, but there is no difference.
> 
> The bug is almost certainly in the Fuse client. If I remount the filesystem 
> with NFS, the problem is no longer observable.
> 
> I spent a little time looking through the xlator/fuse-bridge to see where the 
> offsets are coming from, but I'm really not familiar enough with the code, so 
> it is slow going.
> 
> Unfortunately, I'm still having trouble reproducing this in a python script 
> that could be readily attached to a bug report.
> 
> I'll take a crack at that again, but I will file a bug anyway for 
> completeness.
> 
> On Sep 5, 2014, at 7:10 PM, mike <[email protected]> wrote:
> 
>> I have narrowed down the source of the bug. 
>> 
>> Here is an strace of glusterfsd http://fpaste.org/131455/40996378/
>> 
>> The first line represents a write that does *not* make it into the 
>> underlying file.
>> 
>> The last line is the write that stomps the earlier write.
>> 
>> As I said, the client file is opened in O_APPEND mode, but on the glusterfsd 
>> side, the file is just O_CREAT|O_WRONLY. This means the offsets to pwrite() 
>> need to be valid.
>> 
>> I correlated this to a tcpdump I took and I can see that in fact, the RPCs 
>> being sent have the wrong offset.  Interestingly, glusterfs.write-is-append 
>> = 0, which I wouldn't have expected.
>> 
>> I think the bug lies in the glusterfs fuse client.
>> 
>> As to your question about Gluster 3.5.2, I may be able to do that if I am 
>> unable to find the bug in the source.
>> 
>> -Mike
>> 
>> On Sep 5, 2014, at 6:16 PM, Justin Clift <[email protected]> wrote:
>> 
>>> On 06/09/2014, at 12:10 AM, mike wrote:
>>>> I have found that the O_APPEND flag is key to this failure - I had 
>>>> overlooked that flag when reading the strace and trying to cobble up a 
>>>> minimal reproduction.
>>>> 
>>>> I now have a small pair of python scripts that can reliably reproduce this 
>>>> failure.
>>> 
>>> 
>>> As a thought, is there a reasonable way you can test this on GlusterFS 
>>> 3.5.2?
>>> 
>>> There were some important bug fixes in 3.5.2 (from 3.5.1).
>>> 
>>> Note I'm not saying yours is one of them, I'm just asking if it's
>>> easy to test and find out. :)
>>> 
>>> Regards and best wishes,
>>> 
>>> Justin Clift
>>> 
>>> --
>>> GlusterFS - http://www.gluster.org
>>> 
>>> An open source, distributed file system scaling to several
>>> petabytes, and handling thousands of clients.
>>> 
>>> My personal twitter: twitter.com/realjustinclift
>>> 
>> 
> 
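The overwrite described in the quoted strace analysis can be illustrated locally, without Gluster. This is only a sketch under assumptions: the file names, payloads, and offsets below are made up for illustration and are not taken from the strace or tcpdump. It shows why explicit offsets matter when the descriptor is opened with just O_CREAT|O_WRONLY: two pwrite() calls that carry the same stale offset leave only the second payload in the file, whereas a descriptor opened with O_APPEND keeps both.

```python
import os
import tempfile


def write_at_offsets(path, writes):
    """Apply (offset, data) pairs with pwrite() on a descriptor opened
    WITHOUT O_APPEND, so every offset is honoured exactly as given."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
    try:
        for offset, data in writes:
            os.pwrite(fd, data, offset)
    finally:
        os.close(fd)


def write_appending(path, chunks):
    """Write chunks through a descriptor opened WITH O_APPEND, so the
    kernel places each write at the current end of file."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_APPEND, 0o644)
    try:
        for data in chunks:
            os.write(fd, data)
    finally:
        os.close(fd)


def read_all(path):
    with open(path, "rb") as handle:
        return handle.read()


def demo():
    """Return (contents after stale-offset pwrites, contents after appends)."""
    with tempfile.TemporaryDirectory() as workdir:
        stale_path = os.path.join(workdir, "stale.log")
        append_path = os.path.join(workdir, "append.log")

        # Hypothetical offsets: both writes claim offset 0, mimicking a
        # writer that never learned about the first write.
        write_at_offsets(stale_path, [(0, b"first\n"), (0, b"second\n")])
        write_appending(append_path, [b"first\n", b"second\n"])

        return read_all(stale_path), read_all(append_path)


if __name__ == "__main__":
    stale, appended = demo()
    print("stale-offset pwrite:", stale)
    print("O_APPEND writes:    ", appended)
```

The sketch says nothing about where the wrong offset originates in the FUSE client; it only shows the effect of a wrong offset once it reaches a pwrite() on a non-append descriptor.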

_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users
