On Sun, Jul 22, 2018 at 1:41 PM, Raghavendra Gowdappa <[email protected]> wrote:
> George,
>
> Sorry, I sent you a stale version of the fix. Can you try with:
> https://review.gluster.org/20549
>
> This patch passes the test case you've given.
>

Patchset 1 solves this problem. However, it ran into dbench failures because
md-cache was slow to update its cache. Once I fixed that, I am seeing
failures again. With performance.stat-prefetch off the error goes away, but I
can see only ctime changes. I am wondering whether this is related to the
ctime translator or is an issue in md-cache. Note that md-cache also caches
stats from codepaths that do not result in a stat update in the kernel. So it
could be either:

* a bug in md-cache, or
* a bug where a wrong/changed stat was sent on one of those codepaths.

I'll probe the first hypothesis. @Pranith/@Ravi, what do you think about the
second hypothesis?

regards,
Raghavendra

>
> regards,
> Raghavendra
>
> On Fri, Jul 20, 2018 at 2:59 PM, Lian, George (NSB - CN/Hangzhou) <
> [email protected]> wrote:
>
>> Hi,
>>
>> Sorry, there still seems to be an issue.
>>
>> We used the "dd" utility from the Linux tools instead of my demo program,
>> and if the file does not exist before dd runs, the issue is still there.
>>
>> The test command is:
>>
>> rm -rf /mnt/test/file.txt ; dd if=/dev/zero of=/mnt/test/file.txt bs=512
>> count=1 oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz
>>
>> 1) If we set md-cache-timeout to 0, the issue does not happen.
>> 2) If we set md-cache-timeout to 1, the issue is 100% reproducible!
>>    (with the new patch you mentioned in the mail)
>>
>> Please see the detailed test results below:
>>
>> bash-4.4# gluster v set export md-cache-timeout 0
>> volume set: failed: Volume export does not exist
>> bash-4.4# gluster v set test md-cache-timeout 0
>> volume set: success
>> bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1
>> oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz
>> /mnt/test/file.txt;stat /mnt/test/file.txt^C
>> bash-4.4# rm /mnt/test/file.txt
>> bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1
>> oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz
>> /mnt/test/file.txt;stat /mnt/test/file.txt
>> 1+0 records in
>> 1+0 records out
>> 512 bytes copied, 0.00932571 s, 54.9 kB/s
>>   File: /mnt/test/file.txt
>>   Size: 512            Blocks: 1          IO Block: 131072 regular file
>> Device: 33h/51d        Inode: 9949244856126716752  Links: 1
>> Access: (0644/-rw-r--r--)  Uid: ( 0/ root)   Gid: ( 0/ root)
>> Access: 2018-07-13 17:55:02.756000000 +0000
>> Modify: 2018-07-13 17:55:02.764000000 +0000
>> Change: 2018-07-13 17:55:02.768000000 +0000
>>  Birth: -
>> tar: Removing leading `/' from member names
>> /mnt/test/file.txt
>>   File: /mnt/test/file.txt
>>   Size: 512            Blocks: 1          IO Block: 131072 regular file
>> Device: 33h/51d        Inode: 9949244856126716752  Links: 1
>> Access: (0644/-rw-r--r--)  Uid: ( 0/ root)   Gid: ( 0/ root)
>> Access: 2018-07-13 17:55:02.776000000 +0000
>> Modify: 2018-07-13 17:55:02.764000000 +0000
>> Change: 2018-07-13 17:55:02.768000000 +0000
>>  Birth: -
>> bash-4.4# gluster v set test md-cache-timeout 1
>> volume set: success
>> bash-4.4# rm /mnt/test/file.txt
>> bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1
>> oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz
>> /mnt/test/file.txt;stat /mnt/test/file.txt
>> 1+0 records in
>> 1+0 records out
>> 512 bytes copied, 0.0107589 s, 47.6 kB/s
>>   File: /mnt/test/file.txt
>>   Size: 512            Blocks: 1          IO Block: 131072 regular file
>> Device: 33h/51d        Inode: 13569976446871695205  Links: 1
>> Access: (0644/-rw-r--r--)  Uid: ( 0/ root)   Gid: ( 0/ root)
>> Access: 2018-07-13 17:55:11.548000000 +0000
>> Modify: 2018-07-13 17:55:11.560000000 +0000
>> Change: 2018-07-13 17:55:11.560000000 +0000
>>  Birth: -
>> tar: Removing leading `/' from member names
>> /mnt/test/file.txt
>> tar: /mnt/test/file.txt: file changed as we read it
>>   File: /mnt/test/file.txt
>>   Size: 512            Blocks: 1          IO Block: 131072 regular file
>> Device: 33h/51d        Inode: 13569976446871695205  Links: 1
>> Access: (0644/-rw-r--r--)  Uid: ( 0/ root)   Gid: ( 0/ root)
>> Access: 2018-07-13 17:55:11.580000000 +0000
>> Modify: 2018-07-13 17:55:11.560000000 +0000
>> Change: 2018-07-13 17:55:11.564000000 +0000
>>  Birth: -
>>
>> Best Regards,
>>
>> George
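
To make the failure mode in the transcripts above concrete: tar records the
file's stat before archiving it, reads the data, and then takes an fstat; if
the ctime differs between the two calls it warns "file changed as we read
it". A stale or late-updated ctime served by md-cache on either side of that
read is enough to trip the check. A rough sketch of that comparison (my own
illustration for this thread, not tar's or GlusterFS's actual code):

------------------------------------ sketch: tar-style ctime check

#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int
main (int argc, char *argv[])
{
        struct stat before, after;
        char buf[4096];
        int fd;

        if (argc < 2) {
                fprintf (stderr, "usage: %s <file>\n", argv[0]);
                return 1;
        }

        /* stat before reading, as tar does when it scans the file */
        if (stat (argv[1], &before) < 0) {
                perror ("stat");
                return 1;
        }

        fd = open (argv[1], O_RDONLY);
        if (fd < 0) {
                perror ("open");
                return 1;
        }

        /* consume the file, as tar would while archiving it */
        while (read (fd, buf, sizeof (buf)) > 0)
                ;

        /* fstat after reading and compare the change times */
        if (fstat (fd, &after) < 0) {
                perror ("fstat");
                return 1;
        }

        if (before.st_ctim.tv_sec != after.st_ctim.tv_sec ||
            before.st_ctim.tv_nsec != after.st_ctim.tv_nsec)
                fprintf (stderr, "%s: file changed as we read it\n", argv[1]);

        close (fd);
        return 0;
}
---------------------------------------------------------------------------------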
>>
>> From: [email protected] [mailto:[email protected]]
>> On Behalf Of Raghavendra Gowdappa
>> Sent: Friday, July 20, 2018 4:01 PM
>> To: Lian, George (NSB - CN/Hangzhou) <[email protected]>
>> Cc: Zhang, Bingxuan (NSB - CN/Hangzhou) <[email protected]>;
>> Raghavendra G <[email protected]>; [email protected]
>> Subject: Re: [Gluster-devel] The ctime of fstat is not correct which
>> lead to "tar" utility error
>>
>> On Fri, Jul 20, 2018 at 1:22 PM, Lian, George (NSB - CN/Hangzhou) <
>> [email protected]> wrote:
>>
>> >>>We recently identified an issue with stat-prefetch. Fix can be found at:
>> >>>https://review.gluster.org/#/c/20410/11
>> >>>Can you let us know whether this helps?
>>
>> The patch resolves this issue; I have verified it on Gluster 4.2 (master
>> trunk branch) and Gluster 3.12.3!
>>
>> Thanks, we'll merge it.
>>
>> Thanks & Best Regards,
>> George
>>
>> From: [email protected] [mailto:[email protected]]
>> On Behalf Of Raghavendra Gowdappa
>> Sent: Thursday, July 19, 2018 5:06 PM
>> To: Lian, George (NSB - CN/Hangzhou) <[email protected]>
>> Cc: Zhang, Bingxuan (NSB - CN/Hangzhou) <[email protected]>;
>> [email protected]; Raghavendra G <[email protected]>
>> Subject: Re: [Gluster-devel] The ctime of fstat is not correct which
>> lead to "tar" utility error
>>
>> On Thu, Jul 19, 2018 at 2:29 PM, Lian, George (NSB - CN/Hangzhou) <
>> [email protected]> wrote:
>>
>> Hi, Gluster Experts,
>>
>> In glusterfs version 3.12.3 there seems to be an "fstat" issue with ctime
>> after we use fsync.
>>
>> We have a demo binary, named "tt", which writes some data to a file and
>> then does an fsync on it. If we run the tar command right after the "tt"
>> command, tar always fails with "tar: /mnt/test/file1.txt: file changed as
>> we read it".
>>
>> The command output is listed below; the source code and volume info
>> configuration are attached FYI. This issue is 100% reproducible!
>> (/mnt/test is the mountpoint of the glusterfs volume "test", whose volume
>> info is attached to the mail.)
>>
>> ------------------------------------------------------------------
>> ./tt;tar -czvf /tmp/abc.gz /mnt/test/file1.txt
>> mtime:1531247107.272000000
>> ctime:1531247107.272000000
>> tar: Removing leading `/' from member names
>> /mnt/test/file1.txt
>> tar: /mnt/test/file1.txt: file changed as we read it
>> ----------------------------------------------------------------------
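
The "tt" source attached to the original mail is not reproduced here, but a
minimal reproducer along the lines described above (my own sketch, assuming
"tt" simply writes one block, fsyncs it, and prints the mtime/ctime returned
by fstat; the hard-coded path is only illustrative) would look roughly like:

------------------------------------ sketch: minimal "tt"-like reproducer

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

int
main (void)
{
        const char *path = "/mnt/test/file1.txt";  /* illustrative path */
        char block[512];
        struct stat st;
        int fd;

        memset (block, 0, sizeof (block));

        fd = open (path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd < 0) {
                perror ("open");
                return 1;
        }

        /* write some data and force it to disk, as described for "tt" */
        if (write (fd, block, sizeof (block)) != (ssize_t) sizeof (block) ||
            fsync (fd) < 0) {
                perror ("write/fsync");
                return 1;
        }

        /* print the times reported by fstat, matching the output above */
        if (fstat (fd, &st) < 0) {
                perror ("fstat");
                return 1;
        }
        printf ("mtime:%ld.%09ld\n", (long) st.st_mtim.tv_sec,
                (long) st.st_mtim.tv_nsec);
        printf ("ctime:%ld.%09ld\n", (long) st.st_ctim.tv_sec,
                (long) st.st_ctim.tv_nsec);

        close (fd);
        return 0;
}
---------------------------------------------------------------------------------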
>>
>> After my investigation: the xattrop for the changelog happens later than
>> the fsync response. That is, the function "afr_fsync_cbk" calls
>> afr_delayed_changelog_wake_resume (this, local->fd, stub);
>>
>> In our case there is always a pending changelog, so glusterfs saves the
>> metadata information in the stub and handles the pending changelog first.
>> But the changelog also changes the ctime. From the packets captured with
>> tcpdump, the response packet of the xattrop does not include the metadata
>> information, and the wake_resume does not handle this metadata-changed
>> case either.
>>
>> So in this case the metadata in the md-cache is not right, and while the
>> cache is still valid the application will get WRONG metadata!
>>
>> To verify my guess, if I change the configuration for this volume with
>>
>> "gluster v set test md-cache-timeout 0" or
>> "gluster v set export stat-prefetch off"
>>
>> the issue is GONE!
>>
>> We recently identified an issue with stat-prefetch. Fix can be found at:
>> https://review.gluster.org/#/c/20410/11
>>
>> Can you let us know whether this helps?
>>
>> Then I restored the configuration to the defaults, which means
>> stat-prefetch is on and md-cache-timeout is 1 second, and I tried
>> invalidating the md-cache in the source code as shown below, in the
>> function mdc_fsync_cbk in md-cache.c.
>>
>> The issue is also GONE!
>>
>> So, GlusterFS experts, could you please verify this issue and share your
>> comments on my investigation? Your final solution is highly appreciated!
>>
>> Does the following fix you've posted solve the problem?
>>
>> ------------------------------------ changes in function "mdc_fsync_cbk"
>>
>> int
>> mdc_fsync_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
>>                int32_t op_ret, int32_t op_errno,
>>                struct iatt *prebuf, struct iatt *postbuf, dict_t *xdata)
>> {
>>         mdc_local_t *local = NULL;
>>
>>         local = frame->local;
>>
>>         if (op_ret != 0)
>>                 goto out;
>>
>>         if (!local)
>>                 goto out;
>>
>>         mdc_inode_iatt_set_validate (this, local->fd->inode, prebuf,
>>                                      postbuf, _gf_true);
>>
>>         /* newly added for the ctime issue */
>>         mdc_inode_iatt_invalidate (this, local->fd->inode);
>>         /* end of new addition */
>>
>> out:
>>         MDC_STACK_UNWIND (fsync, frame, op_ret, op_errno, prebuf,
>>                           postbuf, xdata);
>>
>>         return 0;
>> }
>> ---------------------------------------------------------------------------------
>>
>> Best Regards,
>>
>> George
>>
>
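
One note on the mdc_fsync_cbk workaround quoted above: invalidating the
cached iatt right after every fsync forces the next stat to be served from
below md-cache, which hides the stale ctime but also gives up the caching
benefit on that path. Presumably that is why the proper fix is being pursued
in the reviews referenced earlier in the thread
(https://review.gluster.org/#/c/20410/11 and
https://review.gluster.org/20549) rather than by unconditional invalidation.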
_______________________________________________
Gluster-devel mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-devel
