Re: [Gluster-devel] The ctime of fstat is not correct which lead to "tar" utility error

2018-07-23 Thread Lian, George (NSB - CN/Hangzhou)
Hi,

I tested both patchset1 and patchset2 of  https://review.gluster.org/20549, the 
ctime issue seems both be there.
And I use my test c program and “dd” program, the issue both be there.

But when use the patch of https://review.gluster.org/#/c/20410/11,
My test C program and “dd” to an exist file will pass,
ONLY “dd” to new file will be failed.

Best Regards,
George



From: gluster-devel-boun...@gluster.org 
[mailto:gluster-devel-boun...@gluster.org] On Behalf Of Raghavendra Gowdappa
Sent: Monday, July 23, 2018 10:37 AM
To: Lian, George (NSB - CN/Hangzhou) 
Cc: Zhang, Bingxuan (NSB - CN/Hangzhou) ; 
Raghavendra G ; Gluster-devel@gluster.org
Subject: Re: [Gluster-devel] The ctime of fstat is not correct which lead to 
"tar" utility error



On Sun, Jul 22, 2018 at 1:41 PM, Raghavendra Gowdappa 
mailto:rgowd...@redhat.com>> wrote:
George,
Sorry. I sent you a version of the fix which was stale. Can you try with:
https://review.gluster.org/20549
This patch passes the test case you've given.

Patchset 1 solves this problem. However, it ran into dbench failures as 
md-cache was slow to update its cache. Once I fixed it, I am seeing failures 
again. with performance.stat-prefetch off, the error goes away. But, I can see 
only ctime changes. Wondering whether this is related to ctime translator or an 
issue in  md-cache. Note that md-cache caches stats from codepaths which don't 
result in stat updation in kernel too. So, it could be either,
* a bug in md-cache
* or a bug where in those codepaths wrong/changed stat was sent.

I'll probe the first hypothesis. @Pranith/@Ravi,

What do you think about second hypothesis?

regards,
Raghavendra

regards,
Raghavendra

On Fri, Jul 20, 2018 at 2:59 PM, Lian, George (NSB - CN/Hangzhou) 
mailto:george.l...@nokia-sbell.com>> wrote:
Hi,

Sorry, there seems still have issue.

We use “dd” application of linux tools instead of my demo program, and if the 
file is not exist before dd, the issue still be there.

The test command is
rm -rf /mnt/test/file.txt ; dd if=/dev/zero of=/mnt/test/file.txt bs=512 
count=1 oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz

1) If we set md-cache-timeout to 0, the issue will not happen

2) If we set md-cache-timeout to 1, the issue will 100% reproduced! (with 
new patch you mentioned in the mail)


Please see detail test result as the below:

bash-4.4# gluster v set export md-cache-timeout 0
volume set: failed: Volume export does not exist
bash-4.4# gluster v set test md-cache-timeout 0
volume set: success
bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1 oflag=sync;stat 
/mnt/test/file.txt;tar -czvf /tmp/abc.gz /mnt/test/file.txt;stat 
/mnt/test/file.txt^C
bash-4.4# rm /mnt/test/file.txt
bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1 oflag=sync;stat 
/mnt/test/file.txt;tar -czvf /tmp/abc.gz /mnt/test/file.txt;stat 
/mnt/test/file.txt
1+0 records in
1+0 records out
512 bytes copied, 0.00932571 s, 54.9 kB/s
  File: /mnt/test/file.txt
  Size: 512 Blocks: 1  IO Block: 131072 regular file
Device: 33h/51d Inode: 9949244856126716752  Links: 1
Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
Access: 2018-07-13 17:55:02.75600 +
Modify: 2018-07-13 17:55:02.76400 +
Change: 2018-07-13 17:55:02.76800 +
Birth: -
tar: Removing leading `/' from member names
/mnt/test/file.txt
  File: /mnt/test/file.txt
  Size: 512 Blocks: 1  IO Block: 131072 regular file
Device: 33h/51d Inode: 9949244856126716752  Links: 1
Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
Access: 2018-07-13 17:55:02.77600 +
Modify: 2018-07-13 17:55:02.76400 +
Change: 2018-07-13 17:55:02.76800 +
Birth: -
bash-4.4# gluster v set test md-cache-timeout 1
volume set: success
bash-4.4# rm /mnt/test/file.txt
bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1 oflag=sync;stat 
/mnt/test/file.txt;tar -czvf /tmp/abc.gz /mnt/test/file.txt;stat 
/mnt/test/file.txt
1+0 records in
1+0 records out
512 bytes copied, 0.0107589 s, 47.6 kB/s
  File: /mnt/test/file.txt
  Size: 512 Blocks: 1  IO Block: 131072 regular file
Device: 33h/51d Inode: 13569976446871695205  Links: 1
Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
Access: 2018-07-13 17:55:11.54800 +
Modify: 2018-07-13 17:55:11.56000 +
Change: 2018-07-13 17:55:11.56000 +
Birth: -
tar: Removing leading `/' from member names
/mnt/test/file.txt
tar: /mnt/test/file.txt: file changed as we read it
  File: /mnt/test/file.txt
  Size: 512 Blocks: 1  IO Block: 131072 regular file
Device: 33h/51d Inode: 13569976446871695205  Links: 1
Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
Access: 2018-07-13 17:55:11.58000 +
Modify: 2018-07-13 17:55:11.56000 +
Change: 2018-07-13 17:55:11.56400 +
Birth: -


Best Regards,
George
From: 

Re: [Gluster-devel] The ctime of fstat is not correct which lead to "tar" utility error

2018-07-22 Thread Raghavendra Gowdappa
On Mon, Jul 23, 2018 at 8:40 AM, Lian, George (NSB - CN/Hangzhou) <
george.l...@nokia-sbell.com> wrote:

> Hi,
>
>
>
> I tested both patchset1 and patchset2 of  https://review.gluster.org/20549,
> the ctime issue seems both be there.
>
> And I use my test c program and “dd” program, the issue both be there.
>

Strange. Tests on my laptop succeeded.


>
> But when use the patch of https://review.gluster.org/#/c/20410/11,
>
> My test C program and “dd” to an exist file will pass,
>
> ONLY “dd” to new file will be failed.
>

Thanks. I'll think more about this.


>
> Best Regards,
>
> George
>
>
>
>
>
>
>
> *From:* gluster-devel-boun...@gluster.org [mailto:gluster-devel-bounces@
> gluster.org] *On Behalf Of *Raghavendra Gowdappa
> *Sent:* Monday, July 23, 2018 10:37 AM
>
> *To:* Lian, George (NSB - CN/Hangzhou) 
> *Cc:* Zhang, Bingxuan (NSB - CN/Hangzhou) ;
> Raghavendra G ; Gluster-devel@gluster.org
> *Subject:* Re: [Gluster-devel] The ctime of fstat is not correct which
> lead to "tar" utility error
>
>
>
>
>
>
>
> On Sun, Jul 22, 2018 at 1:41 PM, Raghavendra Gowdappa 
> wrote:
>
> George,
>
> Sorry. I sent you a version of the fix which was stale. Can you try with:
> https://review.gluster.org/20549
>
> This patch passes the test case you've given.
>
>
>
> Patchset 1 solves this problem. However, it ran into dbench failures as
> md-cache was slow to update its cache. Once I fixed it, I am seeing
> failures again. with performance.stat-prefetch off, the error goes away.
> But, I can see only ctime changes. Wondering whether this is related to
> ctime translator or an issue in  md-cache. Note that md-cache caches stats
> from codepaths which don't result in stat updation in kernel too. So, it
> could be either,
>
> * a bug in md-cache
>
> * or a bug where in those codepaths wrong/changed stat was sent.
>
>
>
> I'll probe the first hypothesis. @Pranith/@Ravi,
>
>
>
> What do you think about second hypothesis?
>
>
>
> regards,
>
> Raghavendra
>
>
>
> regards,
>
> Raghavendra
>
>
>
> On Fri, Jul 20, 2018 at 2:59 PM, Lian, George (NSB - CN/Hangzhou) <
> george.l...@nokia-sbell.com> wrote:
>
> Hi,
>
>
>
> Sorry, there seems still have issue.
>
>
>
> We use “dd” application of linux tools instead of my demo program, and if
> the file is not exist before dd, the issue still be there.
>
>
>
> The test command is
>
> rm -rf /mnt/test/file.txt ; dd if=/dev/zero of=/mnt/test/file.txt bs=512
> count=1 oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz
>
> 1) If we set md-cache-timeout to 0, the issue will not happen
>
> 2) If we set md-cache-timeout to 1, the issue will 100% reproduced!
> (with new patch you mentioned in the mail)
>
>
>
> Please see detail test result as the below:
>
>
>
> bash-4.4# gluster v set export md-cache-timeout 0
>
> volume set: failed: Volume export does not exist
>
> bash-4.4# gluster v set test md-cache-timeout 0
>
> volume set: success
>
> bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1
> oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz
> /mnt/test/file.txt;stat /mnt/test/file.txt^C
>
> bash-4.4# rm /mnt/test/file.txt
>
> bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1
> oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz
> /mnt/test/file.txt;stat /mnt/test/file.txt
>
> 1+0 records in
>
> 1+0 records out
>
> 512 bytes copied, 0.00932571 s, 54.9 kB/s
>
>   File: /mnt/test/file.txt
>
>   Size: 512 Blocks: 1  IO Block: 131072 regular file
>
> Device: 33h/51d Inode: 9949244856126716752  Links: 1
>
> Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
>
> Access: 2018-07-13 17:55:02.75600 +
>
> Modify: 2018-07-13 17:55:02.76400 +
>
> Change: 2018-07-13 17:55:02.76800 +
>
> Birth: -
>
> tar: Removing leading `/' from member names
>
> /mnt/test/file.txt
>
>   File: /mnt/test/file.txt
>
>   Size: 512 Blocks: 1  IO Block: 131072 regular file
>
> Device: 33h/51d Inode: 9949244856126716752  Links: 1
>
> Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
>
> Access: 2018-07-13 17:55:02.77600 +
>
> Modify: 2018-07-13 17:55:02.76400 +
>
> Change: 2018-07-13 17:55:02.76800 +
>
> Birth: -
>
> bash-4.4# gluster v set test md-cache-timeout 1
>
> volume set: success
>
> bash-4.4# rm /mnt/test/file.txt
>
> bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1
> oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz
> /mnt/test/file.txt;stat /mnt/test/file.txt
>
> 1+0 records in
>
> 1+0 records out
>
> 512 bytes copied, 0.0107589 s, 47.6 kB/s
>
>   File: /mnt/test/file.txt
>
>   Size: 512 Blocks: 1  IO Block: 131072 regular file
>
> Device: 33h/51d Inode: 13569976446871695205  Links: 1
>
> Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
>
> Access: 2018-07-13 17:55:11.54800 +
>
> Modify: 2018-07-13 17:55:11.56000 +
>
> Change: 2018-07-13 17:55:11.56000 

Re: [Gluster-devel] The ctime of fstat is not correct which lead to "tar" utility error

2018-07-22 Thread Raghavendra Gowdappa
On Sun, Jul 22, 2018 at 1:41 PM, Raghavendra Gowdappa 
wrote:

> George,
>
> Sorry. I sent you a version of the fix which was stale. Can you try with:
> https://review.gluster.org/20549
>
> This patch passes the test case you've given.
>

Patchset 1 solves this problem. However, it ran into dbench failures as
md-cache was slow to update its cache. Once I fixed it, I am seeing
failures again. with performance.stat-prefetch off, the error goes away.
But, I can see only ctime changes. Wondering whether this is related to
ctime translator or an issue in  md-cache. Note that md-cache caches stats
from codepaths which don't result in stat updation in kernel too. So, it
could be either,
* a bug in md-cache
* or a bug where in those codepaths wrong/changed stat was sent.

I'll probe the first hypothesis. @Pranith/@Ravi,

What do you think about second hypothesis?

regards,
Raghavendra

>
> regards,
> Raghavendra
>
> On Fri, Jul 20, 2018 at 2:59 PM, Lian, George (NSB - CN/Hangzhou) <
> george.l...@nokia-sbell.com> wrote:
>
>> Hi,
>>
>>
>>
>> Sorry, there seems still have issue.
>>
>>
>>
>> We use “dd” application of linux tools instead of my demo program, and
>> if the file is not exist before dd, the issue still be there.
>>
>>
>>
>> The test command is
>>
>> rm -rf /mnt/test/file.txt ; dd if=/dev/zero of=/mnt/test/file.txt bs=512
>> count=1 oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz
>>
>> 1) If we set md-cache-timeout to 0, the issue will not happen
>>
>> 2) If we set md-cache-timeout to 1, the issue will 100% reproduced!
>> (with new patch you mentioned in the mail)
>>
>>
>>
>> Please see detail test result as the below:
>>
>>
>>
>> bash-4.4# gluster v set export md-cache-timeout 0
>>
>> volume set: failed: Volume export does not exist
>>
>> bash-4.4# gluster v set test md-cache-timeout 0
>>
>> volume set: success
>>
>> bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1
>> oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz
>> /mnt/test/file.txt;stat /mnt/test/file.txt^C
>>
>> bash-4.4# rm /mnt/test/file.txt
>>
>> bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1
>> oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz
>> /mnt/test/file.txt;stat /mnt/test/file.txt
>>
>> 1+0 records in
>>
>> 1+0 records out
>>
>> 512 bytes copied, 0.00932571 s, 54.9 kB/s
>>
>>   File: /mnt/test/file.txt
>>
>>   Size: 512 Blocks: 1  IO Block: 131072 regular file
>>
>> Device: 33h/51d Inode: 9949244856126716752  Links: 1
>>
>> Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
>>
>> Access: 2018-07-13 17:55:02.75600 +
>>
>> Modify: 2018-07-13 17:55:02.76400 +
>>
>> Change: 2018-07-13 17:55:02.76800 +
>>
>> Birth: -
>>
>> tar: Removing leading `/' from member names
>>
>> /mnt/test/file.txt
>>
>>   File: /mnt/test/file.txt
>>
>>   Size: 512 Blocks: 1  IO Block: 131072 regular file
>>
>> Device: 33h/51d Inode: 9949244856126716752  Links: 1
>>
>> Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
>>
>> Access: 2018-07-13 17:55:02.77600 +
>>
>> Modify: 2018-07-13 17:55:02.76400 +
>>
>> Change: 2018-07-13 17:55:02.76800 +
>>
>> Birth: -
>>
>> bash-4.4# gluster v set test md-cache-timeout 1
>>
>> volume set: success
>>
>> bash-4.4# rm /mnt/test/file.txt
>>
>> bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1
>> oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz
>> /mnt/test/file.txt;stat /mnt/test/file.txt
>>
>> 1+0 records in
>>
>> 1+0 records out
>>
>> 512 bytes copied, 0.0107589 s, 47.6 kB/s
>>
>>   File: /mnt/test/file.txt
>>
>>   Size: 512 Blocks: 1  IO Block: 131072 regular file
>>
>> Device: 33h/51d Inode: 13569976446871695205  Links: 1
>>
>> Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
>>
>> Access: 2018-07-13 17:55:11.54800 +
>>
>> Modify: 2018-07-13 17:55:11.56000 +
>>
>> Change: 2018-07-13 17:55:11.56000 +
>>
>> Birth: -
>>
>> tar: Removing leading `/' from member names
>>
>> /mnt/test/file.txt
>>
>> tar: /mnt/test/file.txt: file changed as we read it
>>
>>   File: /mnt/test/file.txt
>>
>>   Size: 512 Blocks: 1  IO Block: 131072 regular file
>>
>> Device: 33h/51d Inode: 13569976446871695205  Links: 1
>>
>> Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
>>
>> Access: 2018-07-13 17:55:11.58000 +
>>
>> Modify: 2018-07-13 17:55:11.56000 +
>>
>> Change: 2018-07-13 17:55:11.56400 +
>>
>> Birth: -
>>
>>
>>
>>
>>
>> Best Regards,
>>
>> George
>>
>> *From:* gluster-devel-boun...@gluster.org [mailto:gluster-devel-bounces@
>> gluster.org] *On Behalf Of *Raghavendra Gowdappa
>> *Sent:* Friday, July 20, 2018 4:01 PM
>> *To:* Lian, George (NSB - CN/Hangzhou) 
>> *Cc:* Zhang, Bingxuan (NSB - CN/Hangzhou) ;
>> Raghavendra G ; Gluster-devel@gluster.org
>>
>> *Subject:* Re: [Gluster-devel] The ctime of 

Re: [Gluster-devel] The ctime of fstat is not correct which lead to "tar" utility error

2018-07-22 Thread Raghavendra Gowdappa
George,

Sorry. I sent you a version of the fix which was stale. Can you try with:
https://review.gluster.org/20549

This patch passes the test case you've given.

regards,
Raghavendra

On Fri, Jul 20, 2018 at 2:59 PM, Lian, George (NSB - CN/Hangzhou) <
george.l...@nokia-sbell.com> wrote:

> Hi,
>
>
>
> Sorry, there seems still have issue.
>
>
>
> We use “dd” application of linux tools instead of my demo program, and if
> the file is not exist before dd, the issue still be there.
>
>
>
> The test command is
>
> rm -rf /mnt/test/file.txt ; dd if=/dev/zero of=/mnt/test/file.txt bs=512
> count=1 oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz
>
> 1) If we set md-cache-timeout to 0, the issue will not happen
>
> 2) If we set md-cache-timeout to 1, the issue will 100% reproduced!
> (with new patch you mentioned in the mail)
>
>
>
> Please see detail test result as the below:
>
>
>
> bash-4.4# gluster v set export md-cache-timeout 0
>
> volume set: failed: Volume export does not exist
>
> bash-4.4# gluster v set test md-cache-timeout 0
>
> volume set: success
>
> bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1
> oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz
> /mnt/test/file.txt;stat /mnt/test/file.txt^C
>
> bash-4.4# rm /mnt/test/file.txt
>
> bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1
> oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz
> /mnt/test/file.txt;stat /mnt/test/file.txt
>
> 1+0 records in
>
> 1+0 records out
>
> 512 bytes copied, 0.00932571 s, 54.9 kB/s
>
>   File: /mnt/test/file.txt
>
>   Size: 512 Blocks: 1  IO Block: 131072 regular file
>
> Device: 33h/51d Inode: 9949244856126716752  Links: 1
>
> Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
>
> Access: 2018-07-13 17:55:02.75600 +
>
> Modify: 2018-07-13 17:55:02.76400 +
>
> Change: 2018-07-13 17:55:02.76800 +
>
> Birth: -
>
> tar: Removing leading `/' from member names
>
> /mnt/test/file.txt
>
>   File: /mnt/test/file.txt
>
>   Size: 512 Blocks: 1  IO Block: 131072 regular file
>
> Device: 33h/51d Inode: 9949244856126716752  Links: 1
>
> Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
>
> Access: 2018-07-13 17:55:02.77600 +
>
> Modify: 2018-07-13 17:55:02.76400 +
>
> Change: 2018-07-13 17:55:02.76800 +
>
> Birth: -
>
> bash-4.4# gluster v set test md-cache-timeout 1
>
> volume set: success
>
> bash-4.4# rm /mnt/test/file.txt
>
> bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1
> oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz
> /mnt/test/file.txt;stat /mnt/test/file.txt
>
> 1+0 records in
>
> 1+0 records out
>
> 512 bytes copied, 0.0107589 s, 47.6 kB/s
>
>   File: /mnt/test/file.txt
>
>   Size: 512 Blocks: 1  IO Block: 131072 regular file
>
> Device: 33h/51d Inode: 13569976446871695205  Links: 1
>
> Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
>
> Access: 2018-07-13 17:55:11.54800 +
>
> Modify: 2018-07-13 17:55:11.56000 +
>
> Change: 2018-07-13 17:55:11.56000 +
>
> Birth: -
>
> tar: Removing leading `/' from member names
>
> /mnt/test/file.txt
>
> tar: /mnt/test/file.txt: file changed as we read it
>
>   File: /mnt/test/file.txt
>
>   Size: 512 Blocks: 1  IO Block: 131072 regular file
>
> Device: 33h/51d Inode: 13569976446871695205  Links: 1
>
> Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
>
> Access: 2018-07-13 17:55:11.58000 +
>
> Modify: 2018-07-13 17:55:11.56000 +
>
> Change: 2018-07-13 17:55:11.56400 +
>
> Birth: -
>
>
>
>
>
> Best Regards,
>
> George
>
> *From:* gluster-devel-boun...@gluster.org [mailto:gluster-devel-bounces@
> gluster.org] *On Behalf Of *Raghavendra Gowdappa
> *Sent:* Friday, July 20, 2018 4:01 PM
> *To:* Lian, George (NSB - CN/Hangzhou) 
> *Cc:* Zhang, Bingxuan (NSB - CN/Hangzhou) ;
> Raghavendra G ; Gluster-devel@gluster.org
>
> *Subject:* Re: [Gluster-devel] The ctime of fstat is not correct which
> lead to "tar" utility error
>
>
>
>
>
>
>
> On Fri, Jul 20, 2018 at 1:22 PM, Lian, George (NSB - CN/Hangzhou) <
> george.l...@nokia-sbell.com> wrote:
>
> >>>We recently identified an issue with stat-prefetch. Fix can be found at:
>
> >>>https://review.gluster.org/#/c/20410/11
>
>
>
> >>>Can you let us know whether this helps?
>
>
>
>
>
> The patch can resolve this issue, I have verified in Gluster 4.2(master
> trunk branch) and Gluster 3.12.3!
>
>
>
> Thanks we'll merge it.
>
>
>
>
>
> Thanks & Best Regards,
>
> George
>
>
>
> *From:* gluster-devel-boun...@gluster.org [mailto:gluster-devel-bounces@
> gluster.org] *On Behalf Of *Raghavendra Gowdappa
> *Sent:* Thursday, July 19, 2018 5:06 PM
> *To:* Lian, George (NSB - CN/Hangzhou) 
> *Cc:* Zhang, Bingxuan (NSB - CN/Hangzhou) ;
> Gluster-devel@gluster.org; Raghavendra G 
> *Subject:* Re: [Gluster-devel] The 

Re: [Gluster-devel] The ctime of fstat is not correct which lead to "tar" utility error

2018-07-20 Thread Lian, George (NSB - CN/Hangzhou)
Hi,

Sorry, there seems still have issue.

We use “dd” application of linux tools instead of my demo program, and if the 
file is not exist before dd, the issue still be there.

The test command is
rm -rf /mnt/test/file.txt ; dd if=/dev/zero of=/mnt/test/file.txt bs=512 
count=1 oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz

1) If we set md-cache-timeout to 0, the issue will not happen

2) If we set md-cache-timeout to 1, the issue will 100% reproduced! (with 
new patch you mentioned in the mail)


Please see detail test result as the below:

bash-4.4# gluster v set export md-cache-timeout 0
volume set: failed: Volume export does not exist
bash-4.4# gluster v set test md-cache-timeout 0
volume set: success
bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1 oflag=sync;stat 
/mnt/test/file.txt;tar -czvf /tmp/abc.gz /mnt/test/file.txt;stat 
/mnt/test/file.txt^C
bash-4.4# rm /mnt/test/file.txt
bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1 oflag=sync;stat 
/mnt/test/file.txt;tar -czvf /tmp/abc.gz /mnt/test/file.txt;stat 
/mnt/test/file.txt
1+0 records in
1+0 records out
512 bytes copied, 0.00932571 s, 54.9 kB/s
  File: /mnt/test/file.txt
  Size: 512 Blocks: 1  IO Block: 131072 regular file
Device: 33h/51d Inode: 9949244856126716752  Links: 1
Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
Access: 2018-07-13 17:55:02.75600 +
Modify: 2018-07-13 17:55:02.76400 +
Change: 2018-07-13 17:55:02.76800 +
Birth: -
tar: Removing leading `/' from member names
/mnt/test/file.txt
  File: /mnt/test/file.txt
  Size: 512 Blocks: 1  IO Block: 131072 regular file
Device: 33h/51d Inode: 9949244856126716752  Links: 1
Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
Access: 2018-07-13 17:55:02.77600 +
Modify: 2018-07-13 17:55:02.76400 +
Change: 2018-07-13 17:55:02.76800 +
Birth: -
bash-4.4# gluster v set test md-cache-timeout 1
volume set: success
bash-4.4# rm /mnt/test/file.txt
bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1 oflag=sync;stat 
/mnt/test/file.txt;tar -czvf /tmp/abc.gz /mnt/test/file.txt;stat 
/mnt/test/file.txt
1+0 records in
1+0 records out
512 bytes copied, 0.0107589 s, 47.6 kB/s
  File: /mnt/test/file.txt
  Size: 512 Blocks: 1  IO Block: 131072 regular file
Device: 33h/51d Inode: 13569976446871695205  Links: 1
Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
Access: 2018-07-13 17:55:11.54800 +
Modify: 2018-07-13 17:55:11.56000 +
Change: 2018-07-13 17:55:11.56000 +
Birth: -
tar: Removing leading `/' from member names
/mnt/test/file.txt
tar: /mnt/test/file.txt: file changed as we read it
  File: /mnt/test/file.txt
  Size: 512 Blocks: 1  IO Block: 131072 regular file
Device: 33h/51d Inode: 13569976446871695205  Links: 1
Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
Access: 2018-07-13 17:55:11.58000 +
Modify: 2018-07-13 17:55:11.56000 +
Change: 2018-07-13 17:55:11.56400 +
Birth: -


Best Regards,
George
From: gluster-devel-boun...@gluster.org 
[mailto:gluster-devel-boun...@gluster.org] On Behalf Of Raghavendra Gowdappa
Sent: Friday, July 20, 2018 4:01 PM
To: Lian, George (NSB - CN/Hangzhou) 
Cc: Zhang, Bingxuan (NSB - CN/Hangzhou) ; 
Raghavendra G ; Gluster-devel@gluster.org
Subject: Re: [Gluster-devel] The ctime of fstat is not correct which lead to 
"tar" utility error



On Fri, Jul 20, 2018 at 1:22 PM, Lian, George (NSB - CN/Hangzhou) 
mailto:george.l...@nokia-sbell.com>> wrote:
>>>We recently identified an issue with stat-prefetch. Fix can be found at:
>>>https://review.gluster.org/#/c/20410/11

>>>Can you let us know whether this helps?


The patch can resolve this issue, I have verified in Gluster 4.2(master trunk 
branch) and Gluster 3.12.3!

Thanks we'll merge it.


Thanks & Best Regards,
George

From: 
gluster-devel-boun...@gluster.org 
[mailto:gluster-devel-boun...@gluster.org]
 On Behalf Of Raghavendra Gowdappa
Sent: Thursday, July 19, 2018 5:06 PM
To: Lian, George (NSB - CN/Hangzhou) 
mailto:george.l...@nokia-sbell.com>>
Cc: Zhang, Bingxuan (NSB - CN/Hangzhou) 
mailto:bingxuan.zh...@nokia-sbell.com>>; 
Gluster-devel@gluster.org; Raghavendra G 
mailto:raghaven...@gluster.com>>
Subject: Re: [Gluster-devel] The ctime of fstat is not correct which lead to 
"tar" utility error



On Thu, Jul 19, 2018 at 2:29 PM, Lian, George (NSB - CN/Hangzhou) 
mailto:george.l...@nokia-sbell.com>> wrote:
Hi, Gluster Experts,

In glusterfs version 3.12.3, There seems a “fstat” issue for ctime after we use 
fsync,
We have a demo execute binary which write some data and then do fsync for this 
file, it named as “tt”,
Then run tar command right after “tt” command, it will always 

Re: [Gluster-devel] The ctime of fstat is not correct which lead to "tar" utility error

2018-07-20 Thread Lian, George (NSB - CN/Hangzhou)
>>>We recently identified an issue with stat-prefetch. Fix can be found at:
>>>https://review.gluster.org/#/c/20410/11

>>>Can you let us know whether this helps?


The patch can resolve this issue, I have verified in Gluster 4.2(master trunk 
branch) and Gluster 3.12.3!

Thanks & Best Regards,
George

From: gluster-devel-boun...@gluster.org 
[mailto:gluster-devel-boun...@gluster.org] On Behalf Of Raghavendra Gowdappa
Sent: Thursday, July 19, 2018 5:06 PM
To: Lian, George (NSB - CN/Hangzhou) 
Cc: Zhang, Bingxuan (NSB - CN/Hangzhou) ; 
Gluster-devel@gluster.org; Raghavendra G 
Subject: Re: [Gluster-devel] The ctime of fstat is not correct which lead to 
"tar" utility error



On Thu, Jul 19, 2018 at 2:29 PM, Lian, George (NSB - CN/Hangzhou) 
mailto:george.l...@nokia-sbell.com>> wrote:
Hi, Gluster Experts,

In glusterfs version 3.12.3, There seems a “fstat” issue for ctime after we use 
fsync,
We have a demo execute binary which write some data and then do fsync for this 
file, it named as “tt”,
Then run tar command right after “tt” command, it will always error with “tar: 
/mnt/test/file1.txt: file changed as we read it”

The command output is list as the below, the source code and volume info 
configuration attached FYI,
This issue will be 100% reproducible! (/mnt/test is mountpoint of glusterfs 
volume “test” , which the volume info is attached in mail)
--
./tt;tar -czvf /tmp/abc.gz /mnt/test/file1.txt
mtime:1531247107.27200
ctime:1531247107.27200
tar: Removing leading `/' from member names
/mnt/test/file1.txt
tar: /mnt/test/file1.txt: file changed as we read it
--

After my investigation, the xattrop for changelog is later than the fsync 
response , this is mean:
In function  “afr_fsync_cbk” will call afr_delayed_changelog_wake_resume (this, 
local->fd, stub);

In our case, it always a pending changelog , so glusterfs save the metadata 
information to stub, and handle pending changelog first,
But the changelog will also change the ctime, from the packet captured by 
tcpdump, the response packet of xattrop will not include the metadata 
information,  and the wake_resume also not handle this metadata changed case.

So in this case, the metadata in mdc_cache is not right, and when cache is 
valid, the application will get WRONG metadata!

For verify my guess, if I change the configuration for this volume
“gluster v set test md-cache-timeout 0” or
“gluster v set export stat-prefetch off”
This issue will be GONE!

We recently identified an issue with stat-prefetch. Fix can be found at:
https://review.gluster.org/#/c/20410/11

Can you let us know whether this helps?



And I restore the configuration to default, which mean stat-prefetch is on and 
md-cache-timeout is 1 second,
I try invalidate the md-cache in source code as the below in function 
mdc_fync_cbk on md-cache.c
The issue also will be GONE!

So GLusterFS Experts,
Could you please verify this issue, and share your comments on my investigation?
And your finally solutions is highly appreciated!

Does the following fix you've posted solves the problem?


changes in function “mdc_fsync_cbk”
int
mdc_fsync_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
   int32_t op_ret, int32_t op_errno,
   struct iatt *prebuf, struct iatt *postbuf, dict_t *xdata)
{
mdc_local_t  *local = NULL;

local = frame->local;

if (op_ret != 0)
goto out;

if (!local)
goto out;

mdc_inode_iatt_set_validate(this, local->fd->inode, prebuf, postbuf,
 _gf_true);
/* new added for ctime issue*/
mdc_inode_iatt_invalidate(this, local->fd->inode);
/* new added end*/
out:
MDC_STACK_UNWIND (fsync, frame, op_ret, op_errno, prebuf, postbuf,
  xdata);

return 0;
}
-
Best Regards,
George

___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] The ctime of fstat is not correct which lead to "tar" utility error

2018-07-20 Thread Raghavendra Gowdappa
On Fri, Jul 20, 2018 at 1:22 PM, Lian, George (NSB - CN/Hangzhou) <
george.l...@nokia-sbell.com> wrote:

> >>>We recently identified an issue with stat-prefetch. Fix can be found at:
>
> >>>https://review.gluster.org/#/c/20410/11
>
>
>
> >>>Can you let us know whether this helps?
>
>
>
>
>
> The patch can resolve this issue, I have verified in Gluster 4.2(master
> trunk branch) and Gluster 3.12.3!
>

Thanks we'll merge it.


>
> Thanks & Best Regards,
>
> George
>
>
>
> *From:* gluster-devel-boun...@gluster.org [mailto:gluster-devel-bounces@
> gluster.org] *On Behalf Of *Raghavendra Gowdappa
> *Sent:* Thursday, July 19, 2018 5:06 PM
> *To:* Lian, George (NSB - CN/Hangzhou) 
> *Cc:* Zhang, Bingxuan (NSB - CN/Hangzhou) ;
> Gluster-devel@gluster.org; Raghavendra G 
> *Subject:* Re: [Gluster-devel] The ctime of fstat is not correct which
> lead to "tar" utility error
>
>
>
>
>
>
>
> On Thu, Jul 19, 2018 at 2:29 PM, Lian, George (NSB - CN/Hangzhou) <
> george.l...@nokia-sbell.com> wrote:
>
> Hi, Gluster Experts,
>
>
>
> In glusterfs version 3.12.3, There seems a “fstat” issue for ctime after
> we use fsync,
>
> We have a demo execute binary which write some data and then do fsync for
> this file, it named as “tt”,
>
> Then run tar command right after “tt” command, it will always error with “tar:
> /mnt/test/file1.txt: file changed as we read it”
>
>
>
> The command output is list as the below, the source code and volume info
> configuration attached FYI,
>
> This issue will be 100% reproducible! (/mnt/test is mountpoint of
> glusterfs volume “test” , which the volume info is attached in mail)
>
> --
>
> ./tt;tar -czvf /tmp/abc.gz /mnt/test/file1.txt
>
> mtime:1531247107.27200
>
> ctime:1531247107.27200
>
> tar: Removing leading `/' from member names
>
> /mnt/test/file1.txt
>
> tar: /mnt/test/file1.txt: file changed as we read it
>
> --
>
>
>
> After my investigation, the xattrop for changelog is later than the fsync
> response , this is mean:
>
> In function  “afr_fsync_cbk” will call afr_delayed_changelog_wake_resume
> (this, local->fd, stub);
>
>
>
> In our case, it always a pending changelog , so glusterfs save the
> metadata information to stub, and handle pending changelog first,
>
> But the changelog will also change the ctime, from the packet captured by
> tcpdump, the response packet of xattrop will not include the metadata
> information,  and the wake_resume also not handle this metadata changed
> case.
>
>
>
> So in this case, the metadata in mdc_cache is not right, and when cache is
> valid, the application will get WRONG metadata!
>
>
>
> For verify my guess, if I change the configuration for this volume
>
> “gluster v set test md-cache-timeout 0” or
>
> “gluster v set export stat-prefetch off”
>
> This issue will be GONE!
>
>
>
> We recently identified an issue with stat-prefetch. Fix can be found at:
>
> https://review.gluster.org/#/c/20410/11
>
>
>
> Can you let us know whether this helps?
>
>
>
>
>
>
>
> And I restore the configuration to default, which mean stat-prefetch is on
> and md-cache-timeout is 1 second,
>
> I try invalidate the md-cache in source code as the below in function
> mdc_fync_cbk on md-cache.c
>
> The issue also will be GONE!
>
>
>
> So GLusterFS Experts,
>
> Could you please verify this issue, and share your comments on my
> investigation?
>
> And your finally solutions is highly appreciated!
>
>
>
> Does the following fix you've posted solves the problem?
>
>
>
>
>
> changes in function “mdc_fsync_cbk”
>
> int
>
> mdc_fsync_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
>
>int32_t op_ret, int32_t op_errno,
>
>struct iatt *prebuf, struct iatt *postbuf, dict_t *xdata)
>
> {
>
> mdc_local_t  *local = NULL;
>
>
>
> local = frame->local;
>
>
>
> if (op_ret != 0)
>
> goto out;
>
>
>
> if (!local)
>
> goto out;
>
>
>
> mdc_inode_iatt_set_validate(this, local->fd->inode, prebuf,
> postbuf,
>
>  _gf_true);
>
> /* new added for ctime issue*/
>
> mdc_inode_iatt_invalidate(this, local->fd->inode);
>
>
> /* new added end*/
>
> out:
>
> MDC_STACK_UNWIND (fsync, frame, op_ret, op_errno, prebuf, postbuf,
>
>   xdata);
>
>
>
> return 0;
>
> }
>
> 
> -
>
> Best Regards,
>
> George
>
>
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] The ctime of fstat is not correct which lead to "tar" utility error

2018-07-19 Thread Lian, George (NSB - CN/Hangzhou)
Hi, Gluster Experts,

In glusterfs version 3.12.3, There seems a “fstat” issue for ctime after we use 
fsync,
We have a demo execute binary which write some data and then do fsync for this 
file, it named as “tt”,
Then run tar command right after “tt” command, it will always error with “tar: 
/mnt/test/file1.txt: file changed as we read it”

The command output is list as the below, the source code and volume info 
configuration attached FYI,
This issue will be 100% reproducible! (/mnt/test is mountpoint of glusterfs 
volume “test” , which the volume info is attached in mail)
--
./tt;tar -czvf /tmp/abc.gz /mnt/test/file1.txt
mtime:1531247107.27200
ctime:1531247107.27200
tar: Removing leading `/' from member names
/mnt/test/file1.txt
tar: /mnt/test/file1.txt: file changed as we read it
--

After my investigation, the xattrop for changelog is later than the fsync 
response , this is mean:
In function  “afr_fsync_cbk” will call afr_delayed_changelog_wake_resume (this, 
local->fd, stub);

In our case, it always a pending changelog , so glusterfs save the metadata 
information to stub, and handle pending changelog first,
But the changelog will also change the ctime, from the packet captured by 
tcpdump, the response packet of xattrop will not include the metadata 
information,  and the wake_resume also not handle this metadata changed case.

So in this case, the metadata in mdc_cache is not right, and when cache is 
valid, the application will get WRONG metadata!

For verify my guess, if I change the configuration for this volume
“gluster v set test md-cache-timeout 0” or
“gluster v set export stat-prefetch off”
This issue will be GONE!


And I restore the configuration to default, which mean stat-prefetch is on and 
md-cache-timeout is 1 second,
I try invalidate the md-cache in source code as the below in function 
mdc_fync_cbk on md-cache.c
The issue also will be GONE!

So GLusterFS Experts,
Could you please verify this issue, and share your comments on my investigation?
And your finally solutions is highly appreciated!

changes in function “mdc_fsync_cbk”
int
mdc_fsync_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
   int32_t op_ret, int32_t op_errno,
   struct iatt *prebuf, struct iatt *postbuf, dict_t *xdata)
{
mdc_local_t  *local = NULL;

local = frame->local;

if (op_ret != 0)
goto out;

if (!local)
goto out;

mdc_inode_iatt_set_validate(this, local->fd->inode, prebuf, postbuf,
 _gf_true);
/* new added for ctime issue*/
mdc_inode_iatt_invalidate(this, local->fd->inode);
/* new added end*/
out:
MDC_STACK_UNWIND (fsync, frame, op_ret, op_errno, prebuf, postbuf,
  xdata);

return 0;
}
-
Best Regards,
George

#include 
#include 
#include 
#include 
#include 
#include 

void main() {
char* fileName = "/mnt/test/file1.txt";
char buf[128];
struct stat st;
struct timeval tv_begin, tv_end;

// create and write a file, then fflush and fsync
FILE* stream = fopen(fileName,"w");
fwrite("0123456789", sizeof(char), 10, stream);
fflush(stream);
fsync(fileno(stream));
//fsync(stream);
fclose(stream);

// last file status change timestamp
stat(fileName, );
printf("mtime:%06d.%06d\n", st.st_mtim.tv_sec, st.st_mtim.tv_nsec);
printf("ctime:%06d.%06d\n", st.st_ctim.tv_sec, st.st_ctim.tv_nsec);

}
bash-4.4# gluster v info test

Volume Name: test
Type: Replicate
Volume ID: 9373eba9-eb84-4618-a54c-f2837345daec
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: rcp:/trunk/brick/test1/sn0
Brick2: rcp:/trunk/brick/test1/sn1
Brick3: rcp:/trunk/brick/test1/sn2 (arbiter)
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
cluster.quorum-type: none
cluster.quorum-reads: no
cluster.favorite-child-policy: mtime
diagnostics.client-log-level: INFO___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] The ctime of fstat is not correct which lead to "tar" utility error

2018-07-19 Thread Raghavendra Gowdappa
On Thu, Jul 19, 2018 at 2:29 PM, Lian, George (NSB - CN/Hangzhou) <
george.l...@nokia-sbell.com> wrote:

> Hi, Gluster Experts,
>
>
>
> In glusterfs version 3.12.3, There seems a “fstat” issue for ctime after
> we use fsync,
>
> We have a demo execute binary which write some data and then do fsync for
> this file, it named as “tt”,
>
> Then run tar command right after “tt” command, it will always error with
> “tar: /mnt/test/file1.txt: file changed as we read it”
>
>
>
> The command output is list as the below, the source code and volume info
> configuration attached FYI,
>
> This issue will be 100% reproducible! (/mnt/test is mountpoint of
> glusterfs volume “test” , which the volume info is attached in mail)
>
> --
>
> ./tt;tar -czvf /tmp/abc.gz /mnt/test/file1.txt
>
> mtime:1531247107.27200
>
> ctime:1531247107.27200
>
> tar: Removing leading `/' from member names
>
> /mnt/test/file1.txt
>
> tar: /mnt/test/file1.txt: file changed as we read it
>
> --
>
>
>
> After my investigation, the xattrop for changelog is later than the fsync
> response , this is mean:
>
> In function  “afr_fsync_cbk” will call afr_delayed_changelog_wake_resume
> (this, local->fd, stub);
>
>
>
> In our case, it always a pending changelog , so glusterfs save the
> metadata information to stub, and handle pending changelog first,
>
> But the changelog will also change the ctime, from the packet captured by
> tcpdump, the response packet of xattrop will not include the metadata
> information,  and the wake_resume also not handle this metadata changed
> case.
>
>
>
> So in this case, the metadata in mdc_cache is not right, and when cache is
> valid, the application will get WRONG metadata!
>
>
>
> For verify my guess, if I change the configuration for this volume
>
> “gluster v set test md-cache-timeout 0” or
>
> “gluster v set export stat-prefetch off”
>
> This issue will be GONE!
>

We recently identified an issue with stat-prefetch. Fix can be found at:
https://review.gluster.org/#/c/20410/11

Can you let us know whether this helps?


>
>
>
> And I restore the configuration to default, which mean stat-prefetch is on
> and md-cache-timeout is 1 second,
>
> I try invalidate the md-cache in source code as the below in function
> mdc_fync_cbk on md-cache.c
>
> The issue also will be GONE!
>
>
>
> So GLusterFS Experts,
>
> Could you please verify this issue, and share your comments on my
> investigation?
>
> And your finally solutions is highly appreciated!
>
>
Does the following fix you've posted solves the problem?


>
> changes in function “mdc_fsync_cbk”
>
> int
>
> mdc_fsync_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
>
>int32_t op_ret, int32_t op_errno,
>
>struct iatt *prebuf, struct iatt *postbuf, dict_t *xdata)
>
> {
>
> mdc_local_t  *local = NULL;
>
>
>
> local = frame->local;
>
>
>
> if (op_ret != 0)
>
> goto out;
>
>
>
> if (!local)
>
> goto out;
>
>
>
> mdc_inode_iatt_set_validate(this, local->fd->inode, prebuf,
> postbuf,
>
>  _gf_true);
>
> /* new added for ctime issue*/
>
> mdc_inode_iatt_invalidate(this, local->fd->inode);
>
>
> /* new added end*/
>
> out:
>
> MDC_STACK_UNWIND (fsync, frame, op_ret, op_errno, prebuf, postbuf,
>
>   xdata);
>
>
>
> return 0;
>
> }
>
> 
> -
>
> Best Regards,
>
> George
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] The ctime of fstat is not correct which lead to "tar" utility error

2018-07-19 Thread Pranith Kumar Karampuri
+Ravi

On Thu, Jul 19, 2018 at 2:29 PM, Lian, George (NSB - CN/Hangzhou) <
george.l...@nokia-sbell.com> wrote:

> Hi, Gluster Experts,
>
>
>
> In glusterfs version 3.12.3, There seems a “fstat” issue for ctime after
> we use fsync,
>
> We have a demo execute binary which write some data and then do fsync for
> this file, it named as “tt”,
>
> Then run tar command right after “tt” command, it will always error with
> “tar: /mnt/test/file1.txt: file changed as we read it”
>
>
>
> The command output is list as the below, the source code and volume info
> configuration attached FYI,
>
> This issue will be 100% reproducible! (/mnt/test is mountpoint of
> glusterfs volume “test” , which the volume info is attached in mail)
>
> --
>
> ./tt;tar -czvf /tmp/abc.gz /mnt/test/file1.txt
>
> mtime:1531247107.27200
>
> ctime:1531247107.27200
>
> tar: Removing leading `/' from member names
>
> /mnt/test/file1.txt
>
> tar: /mnt/test/file1.txt: file changed as we read it
>
> --
>
>
>
> After my investigation, the xattrop for changelog is later than the fsync
> response , this is mean:
>
> In function  “afr_fsync_cbk” will call afr_delayed_changelog_wake_resume
> (this, local->fd, stub);
>
>
>
> In our case, it always a pending changelog , so glusterfs save the
> metadata information to stub, and handle pending changelog first,
>
> But the changelog will also change the ctime, from the packet captured by
> tcpdump, the response packet of xattrop will not include the metadata
> information,  and the wake_resume also not handle this metadata changed
> case.
>
>
>
> So in this case, the metadata in mdc_cache is not right, and when cache is
> valid, the application will get WRONG metadata!
>
>
>
> For verify my guess, if I change the configuration for this volume
>
> “gluster v set test md-cache-timeout 0” or
>
> “gluster v set export stat-prefetch off”
>
> This issue will be GONE!
>
>
>
>
>
> And I restore the configuration to default, which mean stat-prefetch is on
> and md-cache-timeout is 1 second,
>
> I try invalidate the md-cache in source code as the below in function
> mdc_fync_cbk on md-cache.c
>
> The issue also will be GONE!
>
>
>
> So GLusterFS Experts,
>
> Could you please verify this issue, and share your comments on my
> investigation?
>
> And your finally solutions is highly appreciated!
>
>
>
> changes in function “mdc_fsync_cbk”
>
> int
>
> mdc_fsync_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
>
>int32_t op_ret, int32_t op_errno,
>
>struct iatt *prebuf, struct iatt *postbuf, dict_t *xdata)
>
> {
>
> mdc_local_t  *local = NULL;
>
>
>
> local = frame->local;
>
>
>
> if (op_ret != 0)
>
> goto out;
>
>
>
> if (!local)
>
> goto out;
>
>
>
> mdc_inode_iatt_set_validate(this, local->fd->inode, prebuf,
> postbuf,
>
>  _gf_true);
>
> /* new added for ctime issue*/
>
> mdc_inode_iatt_invalidate(this, local->fd->inode);
>
>
> /* new added end*/
>
> out:
>
> MDC_STACK_UNWIND (fsync, frame, op_ret, op_errno, prebuf, postbuf,
>
>   xdata);
>
>
>
> return 0;
>
> }
>
> 
> -
>
> Best Regards,
>
> George
>



-- 
Pranith
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel