Hi,

It seems to be using the mmap() syscall; from what I have read, this
indicates it is using memory-mapped IO.

Please see a strace here: http://pastebin.com/6wjhSNrP
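
For context on why the read path matters: read() and mmap() both serve data
from the same page cache, so a stale cached page would be visible either way.
A minimal, non-CephFS-specific sketch (the file name is just a stand-in):

```python
# Minimal sketch: read() and mmap() both return page-cache-backed data,
# so a stale cached page is visible through either path until the cache
# is invalidated. Not CephFS-specific; the path is a placeholder.
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "66448H-755h.jpg")
with open(path, "wb") as f:
    f.write(b"fake jpeg bytes")

with open(path, "rb") as f:
    via_read = f.read()                               # buffered read()
    mm = mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ)
    via_mmap = bytes(mm[:])                           # memory-mapped read
    mm.close()

assert via_read == via_mmap == b"fake jpeg bytes"
```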

Thanks

On Wed, Aug 31, 2016 at 5:51 PM, Sean Redmond <[email protected]>
wrote:

> I am not sure how to tell?
>
> Server1 and Server2 mount the ceph file system using kernel client 4.7.2,
> and I can replicate the problem using '/usr/bin/sum' to read the file or
> an HTTP GET request via a web server (apache).
>
> On Wed, Aug 31, 2016 at 2:38 PM, Yan, Zheng <[email protected]> wrote:
>
>> On Wed, Aug 31, 2016 at 12:49 AM, Sean Redmond <[email protected]>
>> wrote:
>> > Hi,
>> >
>> > I have been able to pick through the process a little further and
>> > replicate it via the command line. The flow looks like this:
>> >
>> > 1) A user uploads an image to the web server 'uploader01' and it gets
>> > written to a path such as '/cephfs/webdata/static/456/JHL/66448H-755h.jpg'
>> > on cephfs.
>> >
>> > 2) The MDS immediately makes the metadata for this new file available
>> > to all clients.
>> >
>> > 3) The 'uploader01' server commits the file contents to disk
>> > asynchronously, as sync is not explicitly called during the upload.
>> >
>> > 4) Before step 3 is done, a visitor requests the file via one of the two
>> > web servers (server1 or server2). The MDS provides the metadata, but the
>> > contents of the file are not yet committed to disk, so the read returns
>> > zeros. These zeros are then cached by the file system page cache until
>> > they expire or are flushed manually.
>>
>> do server1 or server2 use memory-mapped IO to read the file?
>>
>> Regards
>> Yan, Zheng
>>
>> >
>> > 5) As step 4 typically happens on only one of the two web servers before
>> > step 3 completes, we get a mismatch between the server1 and server2 file
>> > system page caches.
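>> >
If the flow above is right, one writer-side mitigation is to have the
uploader flush the data before the file becomes visible under its final
name. A sketch only, with a hypothetical helper name and placeholder
paths, not the application's actual upload code:

```python
# Sketch of an uploader-side mitigation for the race described above:
# fsync() the data before the file appears under its final name, so
# readers on other clients never see not-yet-flushed zeros.
# The function name and destination path are hypothetical.
import os
import tempfile

def publish(path: str, data: bytes) -> None:
    tmp = path + ".part"
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())   # force the data itself to stable storage
    os.rename(tmp, path)       # readers never see a half-written file

dest = os.path.join(tempfile.mkdtemp(), "66448H-755h.jpg")
publish(dest, b"fake jpeg bytes")
assert open(dest, "rb").read() == b"fake jpeg bytes"
```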
>> >
>> > The below demonstrates how to reproduce this issue
>> >
>> > http://pastebin.com/QK8AemAb
>> >
>> > As we can see, the checksum of the file returned by the web server is 0,
>> > as the file contents have not been flushed to disk from server uploader01.
>> >
>> > If, however, we call ‘sync’ as shown below, the checksum is correct:
>> >
>> > http://pastebin.com/p4CfhEFt
>> >
>> > If we instead wait 10 seconds for the kernel to flush the dirty pages,
>> > we can also see that the checksum is valid:
>> >
>> > http://pastebin.com/1w6UZzNQ
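>> >
The ~10 second wait is in the right ballpark for the kernel's periodic
dirty-page writeback. On a stock Linux kernel, these procfs knobs (values
in centiseconds) control the cadence; note that CephFS capability flushing
layers its own timing on top, so this is only part of the picture:

```python
# Read the kernel's periodic-writeback tunables (Linux only).
# dirty_writeback_centisecs: how often the flusher thread wakes up.
# dirty_expire_centisecs: how old a dirty page must be before it is written.
def centisecs(name: str) -> int:
    with open(f"/proc/sys/vm/{name}") as f:
        return int(f.read().strip())

for knob in ("dirty_writeback_centisecs", "dirty_expire_centisecs"):
    print(f"{knob} = {centisecs(knob) / 100:.0f} s")
```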
>> >
>> > It looks like it may be a race between the time it takes the uploader01
>> > server to commit the file to the file system and the fast incoming read
>> > request from a visiting user to server1 or server2.
>> >
>> > Thanks
>> >
>> >
>> > On Tue, Aug 30, 2016 at 10:21 AM, Sean Redmond <[email protected]
>> >
>> > wrote:
>> >>
>> >> You are correct it only seems to impact recently modified files.
>> >>
>> >> On Tue, Aug 30, 2016 at 3:36 AM, Yan, Zheng <[email protected]> wrote:
>> >>>
>> >>> On Tue, Aug 30, 2016 at 2:11 AM, Gregory Farnum <[email protected]>
>> >>> wrote:
>> >>> > On Mon, Aug 29, 2016 at 7:14 AM, Sean Redmond <
>> [email protected]>
>> >>> > wrote:
>> >>> >> Hi,
>> >>> >>
>> >>> >> I am running cephfs (10.2.2) with kernel 4.7.0-1. I have noticed
>> >>> >> that static files are frequently showing up empty when served via a
>> >>> >> web server (apache). I have tracked this down further and can see
>> >>> >> that, on the node serving the empty HTTP response, a checksum of
>> >>> >> the file on the cephfs file system returns '00000'.
>> >>> >>
>> >>> >> The below shows the checksum on a defective node.
>> >>> >>
>> >>> >> [root@server2]# ls -al /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>> >>> >> -rw-r--r-- 1 apache apache 53317 Aug 28 23:46
>> >>> >> /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>> >>>
>> >>> It seems this file was modified recently. Maybe the web server
>> >>> silently modifies the files. Please check if this issue happens on
>> >>> older files.
>> >>>
>> >>> Regards
>> >>> Yan, Zheng
>> >>>
>> >>> >>
>> >>> >> [root@server2]# sum /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>> >>> >> 00000    53
>> >>> >
>> >>> > So can we presume there are no file contents, and it's just 53
>> blocks
>> >>> > of zeros?
>> >>> >
>> >>> > This doesn't sound familiar to me; Zheng, do you have any ideas?
>> >>> > Anyway, ceph-fuse shouldn't be susceptible to this bug even with the
>> >>> > page cache enabled; if you're just serving stuff via the web it's
>> >>> > probably a better idea anyway (harder to break, easier to update,
>> >>> > etc).
>> >>> > -Greg
>> >>> >
>> >>> >>
>> >>> >> The below shows the checksum on a working node.
>> >>> >>
>> >>> >> [root@server1]# ls -al /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>> >>> >> -rw-r--r-- 1 apache apache 53317 Aug 28 23:46
>> >>> >> /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>> >>> >>
>> >>> >> [root@server1]# sum /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>> >>> >> 03620    53
>> >>> >> [root@server1]#
>> >>> >>
>> >>> >> If I flush the cache as shown below the checksum returns as
>> expected
>> >>> >> and the
>> >>> >> web server serves up valid content.
>> >>> >>
>> >>> >> [root@server2]# echo 3 > /proc/sys/vm/drop_caches
>> >>> >> [root@server2]# sum /cephfs/webdata/static/456/JHL/66448H-755h.jpg
>> >>> >> 03620    53
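>> >>> >>
Dropping the whole cache with drop_caches is a heavy hammer; a
finer-grained sketch (assuming a local Linux client; on CephFS the kernel
client may simply repopulate stale data from its caps, so this is a
diagnostic aid rather than a fix) discards cached pages for just the
suspect file:

```python
# Finer-grained than 'echo 3 > /proc/sys/vm/drop_caches': ask the kernel
# to discard cached pages for a single file (Linux only). The helper
# name is hypothetical.
import os

def drop_file_cache(path: str) -> None:
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)  # write back any dirty pages first
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

A subsequent read then refills the cache from the backing store instead of
reusing the possibly-stale pages.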
>> >>> >>
>> >>> >> After some time, typically less than 1 hour, the issue repeats. It
>> >>> >> does not seem to repeat if I take any one of the servers out of the
>> >>> >> LB and serve requests from only one server.
>> >>> >>
>> >>> >> I may try the FUSE client, which has a mount option 'direct_io'
>> >>> >> that looks to disable the page cache.
>> >>> >>
>> >>> >> I have been hunting in the ML and tracker but could not find
>> >>> >> anything really close to this issue. Any input or feedback on
>> >>> >> similar experiences is welcome.
>> >>> >>
>> >>> >> Thanks
>> >>> >>
>> >>> >>
>> >>> >> _______________________________________________
>> >>> >> ceph-users mailing list
>> >>> >> [email protected]
>> >>> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >>> >>
>> >>
>> >>
>> >
>>
>
>
