It seems using the 'sync' mount option on the server uploader01 is also a
valid workaround.
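
An application-level alternative to the mount option might be to fsync each
uploaded file before the upload is reported as complete. The following is a
minimal sketch only, assuming the upload handler can be changed;
write_and_flush is a hypothetical helper name, and it has not been confirmed
that this avoids the stale zero pages on the reading clients:

```python
import os


def write_and_flush(path, data):
    """Write data to path and request a flush to storage before returning.

    Hypothetical helper for illustration; not taken from the upload code
    discussed in this thread.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # push Python's userspace buffer to the kernel
        os.fsync(f.fileno())  # ask the kernel to write this file's dirty pages
```

Unlike the 'sync' mount option, this only affects the files written through
this helper, so other writes on the mount stay asynchronous.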

Is it a problem that the metadata is available to other cephfs clients
ahead of the file contents being flushed by the client doing the write?

I think having an invalid page cache of zeros is a problem, but it's not
clear to me what the expected behavior is when a cephfs client tries to
read the contents of a file that is still being flushed to the file
system by the cephfs client that created it.
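
To spot an affected file on a reading client without relying on 'sum', a
check along these lines could be used. This is a sketch, assuming an affected
file reads back as nothing but zero bytes (which is what the 00000 checksum
suggests); is_all_zeros is a hypothetical helper name:

```python
def is_all_zeros(path, chunk_size=1 << 20):
    """Return True if the file is non-empty and every byte read is zero."""
    seen_data = False
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            seen_data = True
            if any(chunk):  # any non-zero byte means real content is present
                return False
    return seen_data
```

An empty file returns False here, so only files with a non-zero size and
all-zero contents are flagged.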

On Tue, Aug 30, 2016 at 5:49 PM, Sean Redmond <[email protected]>
wrote:

> Hi,
>
> I have been able to pick through the process a little further and
> replicate it via the command line. The flow seems to look like this:
>
> 1) The user uploads an image to the web server 'uploader01', and it gets
> written to a path such as '/cephfs/webdata/static/456/JHL/66448H-755h.jpg'
> on cephfs.
>
> 2) The MDS makes the file metadata available for this new file
> immediately to all clients.
>
> 3) The 'uploader01' server asynchronously commits the file contents to
> disk as sync is not explicitly called during the upload.
>
> 4) Before step 3 is done, the visitor requests the file via one of two web
> servers, server1 or server2. The MDS provides the metadata, but the
> contents of the file are not committed to disk yet, so the read returns
> zeros. This is then cached by the file system page cache until it
> expires or is flushed manually.
>
> 5) As step 4 typically only happens on one of the two web servers before
> step 3 is complete, we get a mismatch between the file system page caches
> of server1 and server2.
>
> *The below demonstrates how to reproduce this issue*
> http://pastebin.com/QK8AemAb
>
> As we can see, the checksum of the file returned by the web server is 0, as
> the file contents have not been flushed to disk from server uploader01.
>
> *If however we call ‘sync’ as shown below the checksum is correct:*
>
> http://pastebin.com/p4CfhEFt
>
> *If we also wait for 10 seconds for the kernel to flush the dirty pages,
> we can also see the checksum is valid:*
>
> http://pastebin.com/1w6UZzNQ
>
> It looks like it may be a race between the time it takes the uploader01
> server to commit the file to the file system and the fast incoming read
> request from the visiting user to server1 or server2.
>
> Thanks
>
>
> On Tue, Aug 30, 2016 at 10:21 AM, Sean Redmond <[email protected]>
> wrote:
>
>> You are correct, it only seems to impact recently modified files.
>>
>> On Tue, Aug 30, 2016 at 3:36 AM, Yan, Zheng <[email protected]> wrote:
>>
>>> On Tue, Aug 30, 2016 at 2:11 AM, Gregory Farnum <[email protected]>
>>> wrote:
>>> > On Mon, Aug 29, 2016 at 7:14 AM, Sean Redmond <[email protected]>
>>> > wrote:
>>> >> Hi,
>>> >>
>>> >> I am running cephfs (10.2.2) with kernel 4.7.0-1. I have noticed that
>>> >> static files frequently show as empty when served via a web server
>>> >> (apache). I have tracked this down further and can see that, when
>>> >> running a checksum against the file on the cephfs file system on the
>>> >> node serving the empty http response, the checksum is '00000'.
>>> >>
>>> >> The below shows the checksum on a defective node.
>>> >>
>>> >> [root@server2]# ls -al /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>>> >> -rw-r--r-- 1 apache apache 53317 Aug 28 23:46
>>> >> /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>>>
>>> It seems this file was modified recently. Maybe the web server
>>> silently modifies the files. Please check if this issue happens on
>>> older files.
>>>
>>> Regards
>>> Yan, Zheng
>>>
>>> >>
>>> >> [root@server2]# sum /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>>> >> 00000    53
>>> >
>>> > So can we presume there are no file contents, and it's just 53 blocks
>>> > of zeros?
>>> >
>>> > This doesn't sound familiar to me; Zheng, do you have any ideas?
>>> > Anyway, ceph-fuse shouldn't be susceptible to this bug even with the
>>> > page cache enabled; if you're just serving stuff via the web it's
>>> > probably a better idea anyway (harder to break, easier to update,
>>> > etc).
>>> > -Greg
>>> >
>>> >>
>>> >> The below shows the checksum on a working node.
>>> >>
>>> >> [root@server1]# ls -al /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>>> >> -rw-r--r-- 1 apache apache 53317 Aug 28 23:46
>>> >> /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>>> >>
>>> >> [root@server1]# sum /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>>> >> 03620    53
>>> >> [root@server1]#
>>> >>
>>> >> If I flush the cache as shown below, the checksum returns as expected
>>> >> and the web server serves up valid content.
>>> >>
>>> >> [root@server2]# echo 3 > /proc/sys/vm/drop_caches
>>> >> [root@server2]# sum /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>>> >> 03620    53
>>> >>
>>> >> After some time, typically less than 1hr, the issue repeats. It seems
>>> >> not to repeat if I take any one of the servers out of the LB and only
>>> >> serve requests from one of the servers.
>>> >>
>>> >> I may try the FUSE client, which has a mount option, direct_io, that
>>> >> looks to disable the page cache.
>>> >>
>>> >> I have been hunting in the ML and tracker but could not see anything
>>> >> really close to this issue. Any input or feedback on similar
>>> >> experiences is welcome.
>>> >>
>>> >> Thanks
>>> >>
>>> >>
>>> >> _______________________________________________
>>> >> ceph-users mailing list
>>> >> [email protected]
>>> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>> >>
>>>
>>
>>
>
