I am not sure how to tell. Server1 and Server2 mount the Ceph file system using kernel client 4.7.2, and I can replicate the problem using '/usr/bin/sum' to read the file or an HTTP GET request via a web server (Apache).
On Wed, Aug 31, 2016 at 2:38 PM, Yan, Zheng <[email protected]> wrote:

> On Wed, Aug 31, 2016 at 12:49 AM, Sean Redmond <[email protected]> wrote:
> >
> > Hi,
> >
> > I have been able to pick through the process a little further and replicate it via the command line. The flow looks like this:
> >
> > 1) The user uploads an image to web server 'uploader01'; it gets written to a path such as '/cephfs/webdata/static/456/JHL/66448H-755h.jpg' on cephfs.
> >
> > 2) The MDS immediately makes the metadata for this new file available to all clients.
> >
> > 3) The 'uploader01' server asynchronously commits the file contents to disk, as sync is not explicitly called during the upload.
> >
> > 4) Before step 3 is done, the visitor requests the file via one of the two web servers, server1 or server2 - the MDS provides the metadata, but the contents of the file are not yet committed to disk, so the read returns 0's. This is then cached by the file system page cache until it expires or is flushed manually.
>
> Do server1 or server2 use memory-mapped IO to read the file?
>
> Regards
> Yan, Zheng
>
> > 5) As step 4 typically only happens on one of the two web servers before step 3 is complete, we get the mismatch between the server1 and server2 file system page caches.
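The race in steps 1-5 quoted above is a classic write-then-publish gap: the file becomes visible before its contents are durable. One way an uploader can close it, sketched below with a hypothetical helper using plain POSIX semantics (not Ceph-specific), is to write to a temporary file, fsync it, and only then rename it into place, so readers never see the final path until the bytes are committed:

```python
import os
import tempfile

def publish_upload(data, dest_path):
    """Write, fsync, then atomically rename into place (sketch).

    Until the rename, dest_path does not exist, so a racing reader
    can never see metadata for contents that are still dirty in the
    writer's page cache.
    """
    dest_dir = os.path.dirname(dest_path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir)
    try:
        os.write(fd, data)
        os.fsync(fd)  # force the bytes to stable storage before publishing
    finally:
        os.close(fd)
    os.rename(tmp_path, dest_path)  # atomic within a directory on POSIX

# Demo against a local temp directory (stand-in for the cephfs path):
demo_dir = tempfile.mkdtemp()
publish_upload(b"jpeg bytes", os.path.join(demo_dir, "66448H-755h.jpg"))
```

This only guarantees ordering on the writer's side; whether it is sufficient across cephfs clients depends on the client cache coherency being discussed in this thread.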
> > The below demonstrates how to reproduce this issue:
> >
> > http://pastebin.com/QK8AemAb
> >
> > As we can see, the checksum of the file returned by the web server is 0, as the file contents have not been flushed to disk from server uploader01.
> >
> > If however we call 'sync' as shown below, the checksum is correct:
> >
> > http://pastebin.com/p4CfhEFt
> >
> > If we instead wait 10 seconds for the kernel to flush the dirty pages, we can also see the checksum is valid:
> >
> > http://pastebin.com/1w6UZzNQ
> >
> > It looks like it may be a race between the time it takes the uploader01 server to commit the file to the file system and the fast incoming read request from the visiting user to server1 or server2.
> >
> > Thanks
> >
> > On Tue, Aug 30, 2016 at 10:21 AM, Sean Redmond <[email protected]> wrote:
> >>
> >> You are correct, it only seems to impact recently modified files.
> >>
> >> On Tue, Aug 30, 2016 at 3:36 AM, Yan, Zheng <[email protected]> wrote:
> >>>
> >>> On Tue, Aug 30, 2016 at 2:11 AM, Gregory Farnum <[email protected]> wrote:
> >>> > On Mon, Aug 29, 2016 at 7:14 AM, Sean Redmond <[email protected]> wrote:
> >>> >>
> >>> >> Hi,
> >>> >>
> >>> >> I am running cephfs (10.2.2) with kernel 4.7.0-1. I have noticed that frequently static files are showing empty when served via a web server (apache). I have tracked this down further and can see that, when running a checksum against the file on the cephfs file system on the node serving the empty http response, the checksum is '00000'.
> >>> >>
> >>> >> The below shows the checksum on a defective node.
> >>> >>
> >>> >> [root@server2]# ls -al /cephfs/webdata/static/456/JHL/66448H-755h.jpg
> >>> >> -rw-r--r-- 1 apache apache 53317 Aug 28 23:46 /cephfs/webdata/static/456/JHL/66448H-755h.jpg
> >>>
> >>> It seems this file was modified recently. Maybe the web server silently modifies the files.
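A global 'sync', as in the second pastebin above, flushes every dirty page on uploader01. If the upload code can be modified, an fsync(2) on just the uploaded file gives the same guarantee for that file more cheaply. A minimal sketch (the helper name is made up):

```python
import os

def fsync_path(path):
    """Flush one file's dirty pages to the backing store,
    rather than issuing a machine-wide sync."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
```

Most runtimes expose the same call on an already-open handle, so an uploader could fsync before close instead of relying on a global sync or on the kernel's periodic dirty-page writeback.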
> >>> Please check if this issue happens on older files.
> >>>
> >>> Regards
> >>> Yan, Zheng
> >>>
> >>> >> [root@server2]# sum /cephfs/webdata/static/456/JHL/66448H-755h.jpg
> >>> >> 00000 53
> >>> >
> >>> > So can we presume there are no file contents, and it's just 53 blocks of zeros?
> >>> >
> >>> > This doesn't sound familiar to me; Zheng, do you have any ideas? Anyway, ceph-fuse shouldn't be susceptible to this bug even with the page cache enabled; if you're just serving stuff via the web it's probably a better idea anyway (harder to break, easier to update, etc).
> >>> > -Greg
> >>> >
> >>> >> The below shows the checksum on a working node.
> >>> >>
> >>> >> [root@server1]# ls -al /cephfs/webdata/static/456/JHL/66448H-755h.jpg
> >>> >> -rw-r--r-- 1 apache apache 53317 Aug 28 23:46 /cephfs/webdata/static/456/JHL/66448H-755h.jpg
> >>> >>
> >>> >> [root@server1]# sum /cephfs/webdata/static/456/JHL/66448H-755h.jpg
> >>> >> 03620 53
> >>> >>
> >>> >> If I flush the cache as shown below, the checksum returns as expected and the web server serves up valid content.
> >>> >>
> >>> >> [root@server2]# echo 3 > /proc/sys/vm/drop_caches
> >>> >> [root@server2]# sum /cephfs/webdata/static/456/JHL/66448H-755h.jpg
> >>> >> 03620 53
> >>> >>
> >>> >> After some time, typically less than 1hr, the issue repeats. It does not seem to repeat if I take any one of the servers out of the LB and only serve requests from one of the servers.
> >>> >>
> >>> >> I may try the FUSE client, which has a mount option 'direct_io' that looks to disable the page cache.
> >>> >>
> >>> >> I have been hunting in the ML and tracker but could not see anything really close to this issue. Any input or feedback on similar experiences is welcome.
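The '00000 53' output quoted above is consistent with Greg's reading that the file is 53 blocks of zeros: the default (BSD) algorithm used by 'sum' is a 16-bit rotate-and-add checksum that stays at zero over all-zero input, plus the size in 1 KiB blocks rounded up, and ceil(53317 / 1024) = 53. A quick sketch of that algorithm confirms it:

```python
def bsd_sum(data):
    """BSD checksum as used by 'sum' by default: a 16-bit
    right-rotating additive checksum, plus the length in
    1 KiB blocks (rounded up)."""
    checksum = 0
    for byte in data:
        # rotate right by one bit within 16 bits, then add the byte
        checksum = (checksum >> 1) + ((checksum & 1) << 15)
        checksum = (checksum + byte) & 0xFFFF
    blocks = (len(data) + 1023) // 1024
    return checksum, blocks

# 53317 bytes of zeros reproduces the defective node's output:
print("%05d %d" % bsd_sum(b"\x00" * 53317))  # 00000 53
```

So the defective node is serving a correctly-sized file whose cached pages are all zero, matching the unflushed-write theory rather than a truncated file.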
> >>> >>
> >>> >> Thanks
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
