Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-11-03 Thread Pranith Kumar Karampuri
On Thu, Nov 3, 2016 at 1:10 PM, Raghavendra G 
wrote:

>
>
> On Thu, Nov 3, 2016 at 10:27 AM, Raghavendra G 
> wrote:
>
>>
>>
>> On Thu, Nov 3, 2016 at 7:16 AM, Lian, George (Nokia - CN/Hangzhou) <
>> george.l...@nokia.com> wrote:
>>
>>> >Yes. I was assuming that the previous results were tested with:
>>> >1. write-behind on with the fix
>>> >2. quick-read and readdir-ahead off
>>> # gluster volume info log
>>>
>>> performance.quick-read: off
>>> performance.readdir-ahead: off
>>> performance.stat-prefetch: on
>>> performance.write-behind: on
>>>
>>>
>>> with the above configuration and write-behind.so with patch 2, the "tail
>>> truncated" issue still be there.
>>>
>>> # tail -f syslog >/dev/null
>>> tail: syslog: file truncated
>>> tail: syslog: file truncated
>>>
>>> FYI,
>>>
>>
>> Thanks George. I'll take a look.
>>
>
> Can you please test with following configuration?
>

George,
Is there any way we can re-create this issue, for example by generating a
workload similar to yours? It seems to catch good bugs in the caching layer, so
it would help to automate this case. If you have the steps, we can also find
the fix sooner, because each test/verify iteration would be faster.
Let us know.
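
A check of this kind could be automated roughly as follows. This is only a sketch under assumptions, not the workload from this thread: the helper names (append_lines, poll_size, report_shrinks) are made up for illustration, and poll_size assumes GNU coreutils "stat -c %s". The idea is that one process keeps appending to a file on the mount while another polls the reported size; for an append-only file, a size that goes backwards is the same symptom that makes "tail -f" print "file truncated".

```shell
#!/bin/sh
# Hypothetical reproducer sketch; helper names and paths are assumptions.

# append_lines FILE COUNT: append COUNT short lines to FILE, one write
# per line, to mimic a syslog-style appender.
append_lines() {
    file=$1; count=$2; i=0
    while [ "$i" -lt "$count" ]; do
        printf 'line %s\n' "$i" >> "$file"
        i=$((i + 1))
    done
}

# poll_size FILE COUNT: print the size reported by stat, COUNT times.
# Assumes GNU coreutils stat (-c %s prints the size in bytes).
poll_size() {
    file=$1; count=$2; n=0
    while [ "$n" -lt "$count" ]; do
        stat -c %s "$file"
        n=$((n + 1))
    done
}

# report_shrinks: read one size per line on stdin and print a message
# each time a size is smaller than the one before it.
report_shrinks() {
    prev=0
    while read -r size; do
        if [ "$size" -lt "$prev" ]; then
            echo "size went backwards: $prev -> $size"
        fi
        prev=$size
    done
}
```

On a test mount this might be run as "append_lines /mnt/syslog 100000 &" followed by "poll_size /mnt/syslog 100000 | report_shrinks", where /mnt/syslog is only an example path. Any output would then point to a stale size being returned.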


> 1. write-behind on with my fix
> 2. readdir-ahead and quick-read off
> 3. performance.stat-prefetch on
> 4. performance.force-readdirp off
> 5. dht.force-readdirp off
> 6. Also mount glusterfs with option "use-readdirp=no"
>
> [root@booradley glusterfs]# mount -t glusterfs -o use-readdirp=no booradley:/newptop /mnt
>
> [root@booradley glusterfs]# ps ax | grep -i mnt
> 14418 ?        Ssl    0:00 /usr/local/sbin/glusterfs --use-readdirp=no --volfile-server=booradley --volfile-id=/newptop /mnt
>
> [root@booradley glusterfs]# gluster volume set newptop performance.write-behind on
> volume set: success
>
> [root@booradley glusterfs]# gluster volume set newptop performance.quick-read off
> volume set: success
>
> [root@booradley glusterfs]# gluster volume set newptop performance.stat-prefetch on
> volume set: success
>
> [root@booradley glusterfs]# gluster volume set newptop performance.force-readdirp off
> volume set: success
>
> [root@booradley glusterfs]# gluster volume set newptop dht.force-readdirp off
> volume set: success
>
> [root@booradley glusterfs]# gluster volume set newptop performance.readdir-ahead off
> volume set: success
>
> [root@booradley glusterfs]# gluster volume info newptop
>
> Volume Name: newptop
> Type: Distribute
> Volume ID: 092756e1-e095-4e05-9f14-3e9a6aed908c
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1
> Transport-type: tcp
> Bricks:
> Brick1: booradley:/home/export/newptop
> Options Reconfigured:
> dht.force-readdirp: off
> performance.force-readdirp: off
> performance.stat-prefetch: on
> performance.write-behind: on
> performance.quick-read: off
> transport.address-family: inet
> performance.readdir-ahead: off
> nfs.disable: on
>
>
>
>>
>>>
>>> Best Regards,
>>> George
>>>
>>>
>>> -Original Message-
>>> From: Raghavendra Gowdappa [mailto:rgowd...@redhat.com]
>>> Sent: Wednesday, November 02, 2016 5:41 PM
>>> To: Lian, George (Nokia - CN/Hangzhou) 
>>> Cc: Raghavendra G ; Gluster-devel@gluster.org;
>>> Zizka, Jan (Nokia - CZ/Prague) ; Zhang, Bingxuan
>>> (Nokia - CN/Hangzhou) 
>>> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than
>>> the really size of the syslog file
>>>
>>>
>>>
>>> - Original Message -
>>> > From: "George Lian (Nokia - CN/Hangzhou)" 
>>> > To: "Raghavendra Gowdappa" 
>>> > Cc: "Raghavendra G" ,
>>> Gluster-devel@gluster.org, "Jan Zizka (Nokia - CZ/Prague)"
>>> > , "Bingxuan Zhang (Nokia - CN/Hangzhou)" <
>>> bingxuan.zh...@nokia.com>
>>> > Sent: Wednesday, November 2, 2016 1:38:44 PM
>>> > Subject: RE: [Gluster-devel] Issue about the size of fstat is less
>>> than the really size of the syslog file
>>> >
>>> > Yes, I confirm use the Patch 2.
>>> >
>>> > One update: the issue is occurred when readdir-ahead off and
>>> write-behind on.
>>> > Seems gone when write-behind and readdir-ahead and quick-read all off.
>>> > Not verified with readdir-ahead and quick-read both off and
>>> write-behind on
>>> > till now.
>>> >
>>> > Need I test it with write-behind on and readdir-ahead and quick-read
>>> both
>>> > off?
>>>
>>> Yes. I was assuming that the previous results were tested with:
>>> 1. write-behind on with the fix
>>> 2. quick-read and readdir-ahead off
>>>
>>> If not, test results with this configuration will help.
>>>
>>> >
>>> > Best Regards,
>>> > George
>>> >
>>> > -Original Message-
>>> > From: Raghavendra Gowdappa [mailto:rgowd...@redhat.com]
>>> > Sent: Wednesday, November 02, 2016 4:04 PM
>>> > To: Lian, George (Nokia - CN/Hangzhou) 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-11-03 Thread Raghavendra G
On Thu, Nov 3, 2016 at 10:27 AM, Raghavendra G 
wrote:

>
>
> On Thu, Nov 3, 2016 at 7:16 AM, Lian, George (Nokia - CN/Hangzhou) <
> george.l...@nokia.com> wrote:
>
>> >Yes. I was assuming that the previous results were tested with:
>> >1. write-behind on with the fix
>> >2. quick-read and readdir-ahead off
>> # gluster volume info log
>>
>> performance.quick-read: off
>> performance.readdir-ahead: off
>> performance.stat-prefetch: on
>> performance.write-behind: on
>>
>>
>> With the above configuration and write-behind.so with patch 2, the "tail
>> truncated" issue is still there.
>>
>> # tail -f syslog >/dev/null
>> tail: syslog: file truncated
>> tail: syslog: file truncated
>>
>> FYI,
>>
>
> Thanks George. I'll take a look.
>

Can you please test with the following configuration?

1. write-behind on with my fix
2. readdir-ahead and quick-read off
3. performance.stat-prefetch on
4. performance.force-readdirp off
5. dht.force-readdirp off
6. Also mount glusterfs with option "use-readdirp=no"

[root@booradley glusterfs]# mount -t glusterfs -o use-readdirp=no booradley:/newptop /mnt

[root@booradley glusterfs]# ps ax | grep -i mnt
14418 ?        Ssl    0:00 /usr/local/sbin/glusterfs --use-readdirp=no --volfile-server=booradley --volfile-id=/newptop /mnt

[root@booradley glusterfs]# gluster volume set newptop performance.write-behind on
volume set: success

[root@booradley glusterfs]# gluster volume set newptop performance.quick-read off
volume set: success

[root@booradley glusterfs]# gluster volume set newptop performance.stat-prefetch on
volume set: success

[root@booradley glusterfs]# gluster volume set newptop performance.force-readdirp off
volume set: success

[root@booradley glusterfs]# gluster volume set newptop dht.force-readdirp off
volume set: success

[root@booradley glusterfs]# gluster volume set newptop performance.readdir-ahead off
volume set: success

[root@booradley glusterfs]# gluster volume info newptop

Volume Name: newptop
Type: Distribute
Volume ID: 092756e1-e095-4e05-9f14-3e9a6aed908c
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: booradley:/home/export/newptop
Options Reconfigured:
dht.force-readdirp: off
performance.force-readdirp: off
performance.stat-prefetch: on
performance.write-behind: on
performance.quick-read: off
transport.address-family: inet
performance.readdir-ahead: off
nfs.disable: on



>
>>
>> Best Regards,
>> George
>>
>>
>> -Original Message-
>> From: Raghavendra Gowdappa [mailto:rgowd...@redhat.com]
>> Sent: Wednesday, November 02, 2016 5:41 PM
>> To: Lian, George (Nokia - CN/Hangzhou) 
>> Cc: Raghavendra G ; Gluster-devel@gluster.org;
>> Zizka, Jan (Nokia - CZ/Prague) ; Zhang, Bingxuan
>> (Nokia - CN/Hangzhou) 
>> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than
>> the really size of the syslog file
>>
>>
>>
>> - Original Message -
>> > From: "George Lian (Nokia - CN/Hangzhou)" 
>> > To: "Raghavendra Gowdappa" 
>> > Cc: "Raghavendra G" ,
>> Gluster-devel@gluster.org, "Jan Zizka (Nokia - CZ/Prague)"
>> > , "Bingxuan Zhang (Nokia - CN/Hangzhou)" <
>> bingxuan.zh...@nokia.com>
>> > Sent: Wednesday, November 2, 2016 1:38:44 PM
>> > Subject: RE: [Gluster-devel] Issue about the size of fstat is less than
>> the really size of the syslog file
>> >
>> > Yes, I confirm use the Patch 2.
>> >
>> > One update: the issue is occurred when readdir-ahead off and
>> write-behind on.
>> > Seems gone when write-behind and readdir-ahead and quick-read all off.
>> > Not verified with readdir-ahead and quick-read both off and
>> write-behind on
>> > till now.
>> >
>> > Need I test it with write-behind on and readdir-ahead and quick-read
>> both
>> > off?
>>
>> Yes. I was assuming that the previous results were tested with:
>> 1. write-behind on with the fix
>> 2. quick-read and readdir-ahead off
>>
>> If not, test results with this configuration will help.
>>
>> >
>> > Best Regards,
>> > George
>> >
>> > -Original Message-
>> > From: Raghavendra Gowdappa [mailto:rgowd...@redhat.com]
>> > Sent: Wednesday, November 02, 2016 4:04 PM
>> > To: Lian, George (Nokia - CN/Hangzhou) 
>> > Cc: Raghavendra G ; Gluster-devel@gluster.org;
>> > Zizka, Jan (Nokia - CZ/Prague) ; Zhang, Bingxuan
>> (Nokia
>> > - CN/Hangzhou) 
>> > Subject: Re: [Gluster-devel] Issue about the size of fstat is less than
>> the
>> > really size of the syslog file
>> >
>> >
>> >
>> > - Original Message -
>> > > From: "George Lian (Nokia - CN/Hangzhou)" 
>> > > To: "Raghavendra Gowdappa" 
>> > > Cc: "Raghavendra G" ,
>> Gluster-devel@gluster.org,
>> > > "Jan Zizka 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-11-02 Thread Lian, George (Nokia - CN/Hangzhou)
Hi,

Sorry, it seems the issue is still there with readdir-ahead off.

# tail -f syslog >syslog.2.log
tail: syslog: file truncated
tail: syslog: file truncated
tail: syslog: file truncated
tail: syslog: file truncated
tail: syslog: file truncated







tail: syslog: file truncated

The last line is a new "file truncated" message that appeared after setting
the readdir-ahead option as follows.


# gluster volume set log performance.readdir-ahead off
volume set: success
[root@SN-0(RNC-124) /root]
# gluster volume info log

Volume Name: log
Type: Replicate
Volume ID: 9d987a23-1d64-4857-980a-548f1dd0d1e6
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 169.254.0.12:/mnt/bricks/log/brick
Brick2: 169.254.0.13:/mnt/bricks/log/brick
Options Reconfigured:
performance.readdir-ahead: off

BRs
George

-----Original Message-----
From: gluster-devel-boun...@gluster.org 
[mailto:gluster-devel-boun...@gluster.org] On Behalf Of Raghavendra Gowdappa
Sent: Tuesday, November 01, 2016 9:29 PM
To: Raghavendra G 
Cc: Gluster-devel@gluster.org; Zizka, Jan (Nokia - CZ/Prague) 
; Zhang, Bingxuan (Nokia - CN/Hangzhou) 

Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the 
really size of the syslog file



----- Original Message -----
> From: "Raghavendra G" 
> To: "Pranith Kumar Karampuri" 
> Cc: Gluster-devel@gluster.org, "Jan Zizka (Nokia - CZ/Prague)" 
> , "Bingxuan Zhang (Nokia -
> CN/Hangzhou)" 
> Sent: Tuesday, November 1, 2016 6:55:31 PM
> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the 
> really size of the syslog file
> 
> Hi George,
> 
> You can try your tests in two scenarios:
> 
> 1. Switch off following performance translators:
> a. write-behind
> b. quick-read
> c. readdir-ahead
> 
> Once you see that the issue goes away in the above scenario, you can apply
> write-behind fix [1] and turn write-behind on.
> 
> We have a fix in quick-read at [2].
> 
> You can also try to test with the quick-read fix and turning quick-read on.
> However, since your issue is stale stat I doubt quick-read is not the

s/quick-read is not the culprit/quick-read is the culprit/

> culprit here (The issue with quick-read is stale data).
> 
> We see that readdir-ahead is not updating/flushing the stats stored along
> with dentries when write or file modifications happen. That seem to be the
> issue with readdir-ahead. For now, during both of the above two scenarios,
> please keep readdir-ahead off.
> 
> [1] http://review.gluster.org/15757/2
> [2] http://review.gluster.org/#/c/15759/2
> 
> regards,
> Raghavendra
> 
> On Tue, Nov 1, 2016 at 7:55 AM, Pranith Kumar Karampuri < pkara...@redhat.com
> > wrote:
> 
> On Tue, Nov 1, 2016 at 7:32 AM, Lian, George (Nokia - CN/Hangzhou) <
> george.l...@nokia.com > wrote:
> 
> Hi,
> 
> 
> 
> I will test it with your patches and update to you when I have result.
> 
> hi George,
> Please use http://review.gluster.org/#/c/15757/2 i.e. second version of
> Raghavendra's patch. I tested it and it worked fine. We are still trying to
> figure out quick-read and readdir-ahead as I type this mail.
> 
> Thanks a lots
> 
> 
> 
> Best Regards,
> 
> George
> 
> 
> 
> From: Pranith Kumar Karampuri [mailto: pkara...@redhat.com ]
> Sent: Monday, October 31, 2016 11:23 AM
> To: Lian, George (Nokia - CN/Hangzhou) < george.l...@nokia.com >
> Cc: Raghavendra Gowdappa < rgowd...@redhat.com >; Zhang, Bingxuan (Nokia -
> CN/Hangzhou) < bingxuan.zh...@nokia.com >; Gluster-devel@gluster.org ;
> Zizka, Jan (Nokia - CZ/Prague) < jan.zi...@nokia.com >
> 
> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the
> really size of the syslog file
> 
> Removing i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com , it is causing
> mail delivery problems for me.
> 
> 
> George,
> 
> 
> Raghavendra and I made some progress on this issue. We were in parallel
> working on another issue which is similar where elastic search indices are
> getting corrupted because of wrong stat sizes in our opinion. So I have been
> running different translator stacks in identifying the problematic xlators
> which are leading to indices corruption.
> 
> 
> We found the list to be 1) Write-behind, 2) Quick-read, 3) Readdir-ahead.
> Raghavendra and I just had a chat and we are suspecting that lack of
> lookup/readdirp implementation in write-behind could be the reason for this
> problem. Similar problems may exist in other two xlators too. But we are
> working on write-behind with priority.
> 
> 
> Our theory is this:
> 
> 
> If we do a 4KB write for example and it is cached in write-behind and we do a
> lookup on the file/do a readdirp on the directory with this file we send out
> wrong stat value to the kernel. There are different caches between 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-11-02 Thread Lian, George (Nokia - CN/Hangzhou)
Hi,

With those 3 options turned off, the issue has not appeared in about 3 hours;
otherwise, it occurs about every 10 minutes.

Best Regards,
George

-----Original Message-----
From: Raghavendra Gowdappa [mailto:rgowd...@redhat.com] 
Sent: Wednesday, November 02, 2016 1:07 PM
To: Lian, George (Nokia - CN/Hangzhou) 
Cc: Raghavendra G ; Gluster-devel@gluster.org; Zizka, 
Jan (Nokia - CZ/Prague) ; Zhang, Bingxuan (Nokia - 
CN/Hangzhou) 
Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the 
really size of the syslog file

Can you try with the following xlators turned off?

1. write-behind
2. readdir-ahead
3. quick-read

regards,
Raghavendra

----- Original Message -----
> From: "George Lian (Nokia - CN/Hangzhou)" 
> To: "Raghavendra Gowdappa" , "Raghavendra G" 
> 
> Cc: Gluster-devel@gluster.org, "Jan Zizka (Nokia - CZ/Prague)" 
> , "Bingxuan Zhang (Nokia -
> CN/Hangzhou)" 
> Sent: Wednesday, November 2, 2016 10:26:08 AM
> Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the 
> really size of the syslog file
> 
> Hi,
> 
> Sorry, it seems the issue still there with readdir-ahead off.
> 
> # tail -f syslog >syslog.2.log
> tail: syslog: file truncated
> tail: syslog: file truncated
> tail: syslog: file truncated
> tail: syslog: file truncated
> tail: syslog: file truncated
> 
> 
> 
> 
> 
> 
> 
> tail: syslog: file truncated
> 
> The last line is new truncated error message after the following set with
> option readdir-ahead.
> 
> 
> # gluster volume set log performance.readdir-ahead off
> volume set: success
> [root@SN-0(RNC-124) /root]
> # gluster volume info log
> 
> Volume Name: log
> Type: Replicate
> Volume ID: 9d987a23-1d64-4857-980a-548f1dd0d1e6
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: 169.254.0.12:/mnt/bricks/log/brick
> Brick2: 169.254.0.13:/mnt/bricks/log/brick
> Options Reconfigured:
> performance.readdir-ahead: off
> 
> BRs
> George
> 
> -Original Message-
> From: gluster-devel-boun...@gluster.org
> [mailto:gluster-devel-boun...@gluster.org] On Behalf Of Raghavendra Gowdappa
> Sent: Tuesday, November 01, 2016 9:29 PM
> To: Raghavendra G 
> Cc: Gluster-devel@gluster.org; Zizka, Jan (Nokia - CZ/Prague)
> ; Zhang, Bingxuan (Nokia - CN/Hangzhou)
> 
> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the
> really size of the syslog file
> 
> 
> 
> - Original Message -
> > From: "Raghavendra G" 
> > To: "Pranith Kumar Karampuri" 
> > Cc: Gluster-devel@gluster.org, "Jan Zizka (Nokia - CZ/Prague)"
> > , "Bingxuan Zhang (Nokia -
> > CN/Hangzhou)" 
> > Sent: Tuesday, November 1, 2016 6:55:31 PM
> > Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the
> > really size of the syslog file
> > 
> > Hi George,
> > 
> > You can try your tests in two scenarios:
> > 
> > 1. Switch off following performance translators:
> > a. write-behind
> > b. quick-read
> > c. readdir-ahead
> > 
> > Once you see that the issue goes away in the above scenario, you can apply
> > write-behind fix [1] and turn write-behind on.
> > 
> > We have a fix in quick-read at [2].
> > 
> > You can also try to test with the quick-read fix and turning quick-read on.
> > However, since your issue is stale stat I doubt quick-read is not the
> 
> s/quick-read is not the culprit/quick-read is the culprit/
> 
> > culprit here (The issue with quick-read is stale data).
> > 
> > We see that readdir-ahead is not updating/flushing the stats stored along
> > with dentries when write or file modifications happen. That seem to be the
> > issue with readdir-ahead. For now, during both of the above two scenarios,
> > please keep readdir-ahead off.
> > 
> > [1] http://review.gluster.org/15757/2
> > [2] http://review.gluster.org/#/c/15759/2
> > 
> > regards,
> > Raghavendra
> > 
> > On Tue, Nov 1, 2016 at 7:55 AM, Pranith Kumar Karampuri <
> > pkara...@redhat.com
> > > wrote:
> > 
> > 
> > 
> > 
> > 
> > On Tue, Nov 1, 2016 at 7:32 AM, Lian, George (Nokia - CN/Hangzhou) <
> > george.l...@nokia.com > wrote:
> > 
> > 
> > 
> > 
> > 
> > Hi,
> > 
> > 
> > 
> > I will test it with your patches and update to you when I have result.
> > 
> > hi George,
> > Please use http://review.gluster.org/#/c/15757/2 i.e. second version of
> > Raghavendra's patch. I tested it and it worked fine. We are still trying to
> > figure out quick-read and readdir-ahead as I type this mail.
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > Thanks a lots
> > 
> > 
> > 
> > Best Regards,
> > 
> > George
> > 
> > 
> > 
> > From: Pranith Kumar Karampuri [mailto: 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-11-02 Thread Raghavendra Gowdappa


----- Original Message -----
> From: "George Lian (Nokia - CN/Hangzhou)" 
> To: "Raghavendra Gowdappa" 
> Cc: "Raghavendra G" , Gluster-devel@gluster.org, 
> "Jan Zizka (Nokia - CZ/Prague)"
> , "Bingxuan Zhang (Nokia - CN/Hangzhou)" 
> 
> Sent: Wednesday, November 2, 2016 1:38:44 PM
> Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the 
> really size of the syslog file
> 
> Yes, I confirm that I used patch 2.
> 
> One update: the issue occurs when readdir-ahead is off and write-behind is on.
> It seems to be gone when write-behind, readdir-ahead and quick-read are all off.
> I have not yet verified it with readdir-ahead and quick-read both off and
> write-behind on.
> 
> Do I need to test it with write-behind on and readdir-ahead and quick-read both
> off?

Yes. I was assuming that the previous results were tested with:
1. write-behind on with the fix
2. quick-read and readdir-ahead off

If not, test results with this configuration will help.

> 
> Best Regards,
> George
> 
> -Original Message-
> From: Raghavendra Gowdappa [mailto:rgowd...@redhat.com]
> Sent: Wednesday, November 02, 2016 4:04 PM
> To: Lian, George (Nokia - CN/Hangzhou) 
> Cc: Raghavendra G ; Gluster-devel@gluster.org;
> Zizka, Jan (Nokia - CZ/Prague) ; Zhang, Bingxuan (Nokia
> - CN/Hangzhou) 
> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the
> really size of the syslog file
> 
> 
> 
> - Original Message -
> > From: "George Lian (Nokia - CN/Hangzhou)" 
> > To: "Raghavendra Gowdappa" 
> > Cc: "Raghavendra G" , Gluster-devel@gluster.org,
> > "Jan Zizka (Nokia - CZ/Prague)"
> > , "Bingxuan Zhang (Nokia - CN/Hangzhou)"
> > 
> > Sent: Wednesday, November 2, 2016 1:29:13 PM
> > Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the
> > really size of the syslog file
> > 
> > Hi,
> > 
> > When those 3 options turn off, the issue seems gone in about 3 hours,
> > otherwise, the issue will be occurred about every 10 minutes.
> 
> That's a good news. IIRC, you mentioned that you saw the issue with just
> write-behind on, with fix applied (readdir-ahead and quick-read off). Can
> you please confirm you had patcset 2 of http://review.gluster.org/15757?
> patchset 1 had some issues that I corrected in 2.
> 
> > 
> > Best Regards,
> > George
> > 
> > -Original Message-
> > From: Raghavendra Gowdappa [mailto:rgowd...@redhat.com]
> > Sent: Wednesday, November 02, 2016 1:07 PM
> > To: Lian, George (Nokia - CN/Hangzhou) 
> > Cc: Raghavendra G ; Gluster-devel@gluster.org;
> > Zizka, Jan (Nokia - CZ/Prague) ; Zhang, Bingxuan
> > (Nokia
> > - CN/Hangzhou) 
> > Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the
> > really size of the syslog file
> > 
> > Can you try with following xlators turned off?
> > 
> > 1. write-behind
> > 2. readdir-ahead
> > 3. quick-read
> > 
> > regards,
> > Raghavendra
> > 
> > - Original Message -
> > > From: "George Lian (Nokia - CN/Hangzhou)" 
> > > To: "Raghavendra Gowdappa" , "Raghavendra G"
> > > 
> > > Cc: Gluster-devel@gluster.org, "Jan Zizka (Nokia - CZ/Prague)"
> > > , "Bingxuan Zhang (Nokia -
> > > CN/Hangzhou)" 
> > > Sent: Wednesday, November 2, 2016 10:26:08 AM
> > > Subject: RE: [Gluster-devel] Issue about the size of fstat is less than
> > > the
> > > really size of the syslog file
> > > 
> > > Hi,
> > > 
> > > Sorry, it seems the issue still there with readdir-ahead off.
> > > 
> > > # tail -f syslog >syslog.2.log
> > > tail: syslog: file truncated
> > > tail: syslog: file truncated
> > > tail: syslog: file truncated
> > > tail: syslog: file truncated
> > > tail: syslog: file truncated
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > tail: syslog: file truncated
> > > 
> > > The last line is new truncated error message after the following set with
> > > option readdir-ahead.
> > > 
> > > 
> > > # gluster volume set log performance.readdir-ahead off
> > > volume set: success
> > > [root@SN-0(RNC-124) /root]
> > > # gluster volume info log
> > > 
> > > Volume Name: log
> > > Type: Replicate
> > > Volume ID: 9d987a23-1d64-4857-980a-548f1dd0d1e6
> > > Status: Started
> > > Number of Bricks: 1 x 2 = 2
> > > Transport-type: tcp
> > > Bricks:
> > > Brick1: 169.254.0.12:/mnt/bricks/log/brick
> > > Brick2: 169.254.0.13:/mnt/bricks/log/brick
> > > Options Reconfigured:
> > > performance.readdir-ahead: off
> > > 
> > > BRs
> > > George
> > > 
> > > -Original Message-
> > > From: 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-11-02 Thread Raghavendra Gowdappa


----- Original Message -----
> From: "George Lian (Nokia - CN/Hangzhou)" 
> To: "Raghavendra Gowdappa" 
> Cc: "Raghavendra G" , Gluster-devel@gluster.org, 
> "Jan Zizka (Nokia - CZ/Prague)"
> , "Bingxuan Zhang (Nokia - CN/Hangzhou)" 
> 
> Sent: Wednesday, November 2, 2016 1:29:13 PM
> Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the 
> really size of the syslog file
> 
> Hi,
> 
> When those 3 options turn off, the issue seems gone in about 3 hours,
> otherwise, the issue will be occurred about every 10 minutes.

That's good news. IIRC, you mentioned that you saw the issue with just
write-behind on, with the fix applied (readdir-ahead and quick-read off). Can you
please confirm you had patchset 2 of http://review.gluster.org/15757? Patchset 1
had some issues that I corrected in patchset 2.

> 
> Best Regards,
> George
> 
> -Original Message-
> From: Raghavendra Gowdappa [mailto:rgowd...@redhat.com]
> Sent: Wednesday, November 02, 2016 1:07 PM
> To: Lian, George (Nokia - CN/Hangzhou) 
> Cc: Raghavendra G ; Gluster-devel@gluster.org;
> Zizka, Jan (Nokia - CZ/Prague) ; Zhang, Bingxuan (Nokia
> - CN/Hangzhou) 
> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the
> really size of the syslog file
> 
> Can you try with following xlators turned off?
> 
> 1. write-behind
> 2. readdir-ahead
> 3. quick-read
> 
> regards,
> Raghavendra
> 
> - Original Message -
> > From: "George Lian (Nokia - CN/Hangzhou)" 
> > To: "Raghavendra Gowdappa" , "Raghavendra G"
> > 
> > Cc: Gluster-devel@gluster.org, "Jan Zizka (Nokia - CZ/Prague)"
> > , "Bingxuan Zhang (Nokia -
> > CN/Hangzhou)" 
> > Sent: Wednesday, November 2, 2016 10:26:08 AM
> > Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the
> > really size of the syslog file
> > 
> > Hi,
> > 
> > Sorry, it seems the issue still there with readdir-ahead off.
> > 
> > # tail -f syslog >syslog.2.log
> > tail: syslog: file truncated
> > tail: syslog: file truncated
> > tail: syslog: file truncated
> > tail: syslog: file truncated
> > tail: syslog: file truncated
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > tail: syslog: file truncated
> > 
> > The last line is new truncated error message after the following set with
> > option readdir-ahead.
> > 
> > 
> > # gluster volume set log performance.readdir-ahead off
> > volume set: success
> > [root@SN-0(RNC-124) /root]
> > # gluster volume info log
> > 
> > Volume Name: log
> > Type: Replicate
> > Volume ID: 9d987a23-1d64-4857-980a-548f1dd0d1e6
> > Status: Started
> > Number of Bricks: 1 x 2 = 2
> > Transport-type: tcp
> > Bricks:
> > Brick1: 169.254.0.12:/mnt/bricks/log/brick
> > Brick2: 169.254.0.13:/mnt/bricks/log/brick
> > Options Reconfigured:
> > performance.readdir-ahead: off
> > 
> > BRs
> > George
> > 
> > -Original Message-
> > From: gluster-devel-boun...@gluster.org
> > [mailto:gluster-devel-boun...@gluster.org] On Behalf Of Raghavendra
> > Gowdappa
> > Sent: Tuesday, November 01, 2016 9:29 PM
> > To: Raghavendra G 
> > Cc: Gluster-devel@gluster.org; Zizka, Jan (Nokia - CZ/Prague)
> > ; Zhang, Bingxuan (Nokia - CN/Hangzhou)
> > 
> > Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the
> > really size of the syslog file
> > 
> > 
> > 
> > - Original Message -
> > > From: "Raghavendra G" 
> > > To: "Pranith Kumar Karampuri" 
> > > Cc: Gluster-devel@gluster.org, "Jan Zizka (Nokia - CZ/Prague)"
> > > , "Bingxuan Zhang (Nokia -
> > > CN/Hangzhou)" 
> > > Sent: Tuesday, November 1, 2016 6:55:31 PM
> > > Subject: Re: [Gluster-devel] Issue about the size of fstat is less than
> > > the
> > > really size of the syslog file
> > > 
> > > Hi George,
> > > 
> > > You can try your tests in two scenarios:
> > > 
> > > 1. Switch off following performance translators:
> > > a. write-behind
> > > b. quick-read
> > > c. readdir-ahead
> > > 
> > > Once you see that the issue goes away in the above scenario, you can
> > > apply
> > > write-behind fix [1] and turn write-behind on.
> > > 
> > > We have a fix in quick-read at [2].
> > > 
> > > You can also try to test with the quick-read fix and turning quick-read
> > > on.
> > > However, since your issue is stale stat I doubt quick-read is not the
> > 
> > s/quick-read is not the culprit/quick-read is the culprit/
> > 
> > > culprit here (The issue with quick-read is stale data).
> > > 
> > > We see that readdir-ahead is not updating/flushing the stats stored along
> > > with dentries when write or file 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-11-02 Thread Lian, George (Nokia - CN/Hangzhou)
Hi,

OK, I will test with readdir-ahead off now and will update you with the result.

BTW, if there is an issue in readdir-ahead, when will you fix it?

Thanks & Best Regards,
George

-----Original Message-----
From: Raghavendra Gowdappa [mailto:rgowd...@redhat.com] 
Sent: Wednesday, November 02, 2016 11:40 AM
To: Lian, George (Nokia - CN/Hangzhou) 
Cc: Pranith Kumar Karampuri ; Zhang, Bingxuan (Nokia - 
CN/Hangzhou) ; Gluster-devel@gluster.org; Zizka, Jan 
(Nokia - CZ/Prague) 
Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the 
really size of the syslog file



----- Original Message -----
> From: "George Lian (Nokia - CN/Hangzhou)" 
> To: "Pranith Kumar Karampuri" 
> Cc: "Raghavendra Gowdappa" , "Bingxuan Zhang (Nokia - 
> CN/Hangzhou)" ,
> Gluster-devel@gluster.org, "Jan Zizka (Nokia - CZ/Prague)" 
> 
> Sent: Wednesday, November 2, 2016 8:52:32 AM
> Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the 
> really size of the syslog file
> 
> Hi,
> 
> I used the patch from http://review.gluster.org/#/c/15757/2, but
> the issue still exists.

Was readdir-ahead turned off when you did the test? Please refer to my last 
mail in this thread for more details.

> 
> # tail -f syslog >syslog.2.log
> tail: syslog: file truncated
> 
> One more piece of information: we use GlusterFS version 3.6.9.
> 
> FYI,
> 
> Best Regards,
> George
> From: Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
> Sent: Tuesday, November 01, 2016 10:25 AM
> To: Lian, George (Nokia - CN/Hangzhou) 
> Cc: Raghavendra Gowdappa ; Zhang, Bingxuan (Nokia -
> CN/Hangzhou) ; Gluster-devel@gluster.org; Zizka,
> Jan (Nokia - CZ/Prague) 
> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the
> really size of the syslog file
> 
> 
> 
> On Tue, Nov 1, 2016 at 7:32 AM, Lian, George (Nokia - CN/Hangzhou) wrote:
> Hi,
> 
> I will test it with your patches and update to you when I have result.
> 
> hi George,
>   Please use http://review.gluster.org/#/c/15757/2 i.e. second version of
>   Raghavendra's patch. I tested it and it worked fine. We are still
>   trying to figure out quick-read and readdir-ahead as I type this mail.
> 
> Thanks a lots
> 
> Best Regards,
> George
> 
> From: Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
> Sent: Monday, October 31, 2016 11:23 AM
> To: Lian, George (Nokia - CN/Hangzhou)
> Cc: Raghavendra Gowdappa; Zhang, Bingxuan (Nokia - CN/Hangzhou);
> Gluster-devel@gluster.org; Zizka, Jan (Nokia - CZ/Prague)
> 
> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the
> really size of the syslog file
> 
> Removing i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com, it is causing
> mail delivery problems for me.
> 
> George,
> Raghavendra and I made some progress on this issue. In parallel, we were
> working on another, similar issue where Elasticsearch indices are getting
> corrupted, in our opinion because of wrong stat sizes. So I have been running
> different translator stacks to identify the problematic xlators that lead to
> index corruption.
> 
> We found the list to be 1) write-behind, 2) quick-read, 3) readdir-ahead.
> Raghavendra and I just had a chat, and we suspect that the lack of a
> lookup/readdirp implementation in write-behind could be the reason for this
> problem. Similar problems may exist in the other two xlators too, but we are
> working on write-behind with priority.
> 
> Our theory is this:
> If we do a 4KB write, for example, and it is cached in write-behind, and we
> then do a lookup on the file or a readdirp on the directory containing it, we
> send a wrong stat value to the kernel. There are different caches between the
> kernel and gluster, which may lead to fstat never reaching write-behind. So we
> need to make sure that we don't get into this situation.
> 
> Action items:
> At the moment Raghavendra is working on a patch to implement lookup/readdirp
> in write-behind. I am going to test the same for Elasticsearch. Will it be
> possible for you to test your application against the same patch and
> confirm that the patch fixes the problem?
> 
> On Fri, Oct 28, 2016 at 12:08 PM, Pranith 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-11-01 Thread Raghavendra Gowdappa
Can you try with the following xlators turned off?

1. write-behind
2. readdir-ahead
3. quick-read

regards,
Raghavendra

- Original Message -
> From: "George Lian (Nokia - CN/Hangzhou)" 
> To: "Raghavendra Gowdappa" , "Raghavendra G" 
> 
> Cc: Gluster-devel@gluster.org, "Jan Zizka (Nokia - CZ/Prague)" 
> , "Bingxuan Zhang (Nokia -
> CN/Hangzhou)" 
> Sent: Wednesday, November 2, 2016 10:26:08 AM
> Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the 
> really size of the syslog file
> 
> Hi,
> 
> Sorry, it seems the issue is still there with readdir-ahead off.
> 
> # tail -f syslog >syslog.2.log
> tail: syslog: file truncated
> tail: syslog: file truncated
> tail: syslog: file truncated
> tail: syslog: file truncated
> tail: syslog: file truncated
> 
> 
> 
> 
> 
> 
> 
> tail: syslog: file truncated
> 
> The last line is a new "file truncated" error message that appeared after
> setting the readdir-ahead option as follows.
> 
> 
> # gluster volume set log performance.readdir-ahead off
> volume set: success
> [root@SN-0(RNC-124) /root]
> # gluster volume info log
> 
> Volume Name: log
> Type: Replicate
> Volume ID: 9d987a23-1d64-4857-980a-548f1dd0d1e6
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: 169.254.0.12:/mnt/bricks/log/brick
> Brick2: 169.254.0.13:/mnt/bricks/log/brick
> Options Reconfigured:
> performance.readdir-ahead: off
> 
> BRs
> George
> 
> -Original Message-
> From: gluster-devel-boun...@gluster.org
> [mailto:gluster-devel-boun...@gluster.org] On Behalf Of Raghavendra Gowdappa
> Sent: Tuesday, November 01, 2016 9:29 PM
> To: Raghavendra G 
> Cc: Gluster-devel@gluster.org; Zizka, Jan (Nokia - CZ/Prague)
> ; Zhang, Bingxuan (Nokia - CN/Hangzhou)
> 
> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the
> really size of the syslog file
> 
> 
> 
> - Original Message -
> > From: "Raghavendra G" 
> > To: "Pranith Kumar Karampuri" 
> > Cc: Gluster-devel@gluster.org, "Jan Zizka (Nokia - CZ/Prague)"
> > , "Bingxuan Zhang (Nokia -
> > CN/Hangzhou)" 
> > Sent: Tuesday, November 1, 2016 6:55:31 PM
> > Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the
> > really size of the syslog file
> > 
> > Hi George,
> > 
> > You can try your tests in two scenarios:
> > 
> > 1. Switch off following performance translators:
> > a. write-behind
> > b. quick-read
> > c. readdir-ahead
> > 
> > Once you see that the issue goes away in the above scenario, you can apply
> > write-behind fix [1] and turn write-behind on.
> > 
> > We have a fix in quick-read at [2].
> > 
> > You can also try to test with the quick-read fix and turning quick-read on.
> > However, since your issue is stale stat I doubt quick-read is not the
> 
> s/quick-read is not the culprit/quick-read is the culprit/
> 
> > culprit here (The issue with quick-read is stale data).
> > 
> > We see that readdir-ahead is not updating/flushing the stats stored along
> > with dentries when write or file modifications happen. That seem to be the
> > issue with readdir-ahead. For now, during both of the above two scenarios,
> > please keep readdir-ahead off.
> > 
> > [1] http://review.gluster.org/15757/2
> > [2] http://review.gluster.org/#/c/15759/2
> > 
> > regards,
> > Raghavendra
> > 
> > On Tue, Nov 1, 2016 at 7:55 AM, Pranith Kumar Karampuri <
> > pkara...@redhat.com
> > > wrote:
> > 
> > 
> > 
> > 
> > 
> > On Tue, Nov 1, 2016 at 7:32 AM, Lian, George (Nokia - CN/Hangzhou) <
> > george.l...@nokia.com > wrote:
> > 
> > 
> > 
> > 
> > 
> > Hi,
> > 
> > 
> > 
> > I will test it with your patches and update to you when I have result.
> > 
> > hi George,
> > Please use http://review.gluster.org/#/c/15757/2 i.e. second version of
> > Raghavendra's patch. I tested it and it worked fine. We are still trying to
> > figure out quick-read and readdir-ahead as I type this mail.
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > Thanks a lots
> > 
> > 
> > 
> > Best Regards,
> > 
> > George
> > 
> > 
> > 
> > From: Pranith Kumar Karampuri [mailto: pkara...@redhat.com ]
> > Sent: Monday, October 31, 2016 11:23 AM
> > To: Lian, George (Nokia - CN/Hangzhou) < george.l...@nokia.com >
> > Cc: Raghavendra Gowdappa < rgowd...@redhat.com >; Zhang, Bingxuan (Nokia -
> > CN/Hangzhou) < bingxuan.zh...@nokia.com >; Gluster-devel@gluster.org ;
> > Zizka, Jan (Nokia - CZ/Prague) < jan.zi...@nokia.com >
> > 
> > Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the
> > really size of the syslog file
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > Removing i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com , it is causing
> > mail delivery problems for me.
> > 
> > 
> > George,
> > 
> > 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-11-01 Thread Raghavendra Gowdappa


- Original Message -
> From: "George Lian (Nokia - CN/Hangzhou)" 
> To: "Pranith Kumar Karampuri" 
> Cc: "Raghavendra Gowdappa" , "Bingxuan Zhang (Nokia - 
> CN/Hangzhou)" ,
> Gluster-devel@gluster.org, "Jan Zizka (Nokia - CZ/Prague)" 
> 
> Sent: Wednesday, November 2, 2016 8:52:32 AM
> Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the 
> really size of the syslog file
> 
> Hi,
> 
> As I used the patch from http://review.gluster.org/#/c/15757/2
> The issue still exist,

Was readdir-ahead turned off when you did the test? Please refer to my last 
mail in this thread for more details.

> 
> # tail -f syslog >syslog.2.log
> tail: syslog: file truncated
> 
> One more information, we used GlusterFS version with 3.6.9.
> 
> FYI,
> 
> Best Regards,
> George
> From: Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
> Sent: Tuesday, November 01, 2016 10:25 AM
> To: Lian, George (Nokia - CN/Hangzhou) 
> Cc: Raghavendra Gowdappa ; Zhang, Bingxuan (Nokia -
> CN/Hangzhou) ; Gluster-devel@gluster.org; Zizka,
> Jan (Nokia - CZ/Prague) 
> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the
> really size of the syslog file
> 
> 
> 
> On Tue, Nov 1, 2016 at 7:32 AM, Lian, George (Nokia - CN/Hangzhou) wrote:
> Hi,
> 
> I will test it with your patches and update to you when I have result.
> 
> hi George,
>   Please use http://review.gluster.org/#/c/15757/2 i.e. second version of
>   Raghavendra's patch. I tested it and it worked fine. We are still
>   trying to figure out quick-read and readdir-ahead as I type this mail.
> 
> Thanks a lots
> 
> Best Regards,
> George
> 
> From: Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
> Sent: Monday, October 31, 2016 11:23 AM
> To: Lian, George (Nokia - CN/Hangzhou)
> Cc: Raghavendra Gowdappa; Zhang, Bingxuan (Nokia - CN/Hangzhou);
> Gluster-devel@gluster.org; Zizka, Jan (Nokia - CZ/Prague)
> 
> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the
> really size of the syslog file
> 
> Removing
> i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com,
> it is causing mail delivery problems for me.
> George,
>  Raghavendra and I made some progress on this issue. We were in
>  parallel working on another issue which is similar where elastic
>  search indices are getting corrupted because of wrong stat sizes in
>  our opinion. So I have been running different translator stacks in
>  identifying the problematic xlators which are leading to indices
>  corruption.
>   We found the list to be 1) Write-behind, 2) Quick-read, 3)
>   Readdir-ahead. Raghavendra and I just had a chat and we are
>   suspecting that lack of lookup/readdirp implementation in
>   write-behind could be the reason for this problem. Similar
>   problems may exist in other two xlators too. But we are working on
>   write-behind with priority.
> Our theory is this:
> If we do a 4KB write for example and it is cached in write-behind and we do a
> lookup on the file/do a readdirp on the directory with this file we send out
> wrong stat value to the kernel. There are different caches between kernel
> and gluster which may lead to fstat never coming till write-behind. So we
> need to make sure that we don't get into this situation.
> Action items:
>  At the moment Raghavendra is working on a patch to implement lookup/readdirp
>  in write-behind. I am going to test the same for elastic search. Will it be
>  possible for you to test your application against the same patch and
>  confirm that the patch fixes the problem?
> 
> On Fri, Oct 28, 2016 at 12:08 PM, Pranith Kumar Karampuri wrote:
> hi George,
>It would help if we can identify the bare minimum xlators which are
>contributing to the issue like Raghavendra was mentioning earlier. We
>were wondering if it is possible for you to help us in identifying
>the issue by running the workload on a modified setup? We can suggest
>testing out using custom volfiles so that we can slowly build the
>graph which could be causing this issue. We would like you guys to
>try out this problem with just posix-xlator and fuse and nothing
>else.
> 
> On Thu, Oct 27, 2016 at 1:40 PM, Lian, George 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-11-01 Thread Lian, George (Nokia - CN/Hangzhou)
Hi,

I applied the patch from http://review.gluster.org/#/c/15757/2
but the issue still exists:

# tail -f syslog >syslog.2.log
tail: syslog: file truncated
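
A minimal sketch of why a too-small size shows up as this message. It assumes that a "tail -f" style follower reports truncation whenever fstat returns a size smaller than the one it saw on its previous poll; that matches my understanding of GNU tail, but the code below is plain Python, not tail's implementation. check_follow is a hypothetical helper, and a real ftruncate on a local temp file stands in for the stale stat that the mount is suspected of returning.

```python
import os
import tempfile


def check_follow(fd, last_size):
    """Return (status, size) the way a tail -f style follower might.

    Assumption: any size smaller than the previously seen size is
    treated as truncation, whatever the real cause of the smaller value.
    """
    size = os.fstat(fd).st_size
    if size < last_size:
        return "truncated", size
    if size > last_size:
        return "grew", size
    return "unchanged", size


# Demo on a local temp file; the ftruncate imitates a stat reply that
# reports less data than the follower has already seen.
with tempfile.TemporaryFile() as f:
    fd = f.fileno()
    os.write(fd, b"x" * 8192)
    status_grew, seen = check_follow(fd, 0)
    os.ftruncate(fd, 4096)
    status_trunc, seen = check_follow(fd, seen)
    print(status_grew, status_trunc)
```

The follower cannot tell a genuine truncation from a stat reply that lags behind the data, which is why a stale size is enough to produce the message on a file that only ever grows.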

One more piece of information: we are using GlusterFS version 3.6.9.

FYI,

Best Regards,
George
From: Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
Sent: Tuesday, November 01, 2016 10:25 AM
To: Lian, George (Nokia - CN/Hangzhou)
Cc: Raghavendra Gowdappa; Zhang, Bingxuan (Nokia - CN/Hangzhou);
Gluster-devel@gluster.org; Zizka, Jan (Nokia - CZ/Prague)
Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the 
really size of the syslog file



On Tue, Nov 1, 2016 at 7:32 AM, Lian, George (Nokia - CN/Hangzhou) wrote:
Hi,

I will test it with your patches and update to you when I have result.

hi George,
  Please use http://review.gluster.org/#/c/15757/2 i.e. second version of 
Raghavendra's patch. I tested it and it worked fine. We are still trying to 
figure out quick-read and readdir-ahead as I type this mail.

Thanks a lots

Best Regards,
George

From: Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
Sent: Monday, October 31, 2016 11:23 AM
To: Lian, George (Nokia - CN/Hangzhou)
Cc: Raghavendra Gowdappa; Zhang, Bingxuan (Nokia - CN/Hangzhou);
Gluster-devel@gluster.org; Zizka, Jan (Nokia - CZ/Prague)

Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the 
really size of the syslog file

Removing i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com, it is causing mail
delivery problems for me.
George,
 Raghavendra and I made some progress on this issue. In parallel, we were
working on a similar issue where, in our opinion, elastic search indices are
getting corrupted because of wrong stat sizes. So I have been running
different translator stacks to identify the problematic xlators which are
leading to index corruption.
  We found the list to be 1) Write-behind, 2) Quick-read, 3)
Readdir-ahead. Raghavendra and I just had a chat, and we suspect that the
lack of a lookup/readdirp implementation in write-behind could be the reason
for this problem. Similar problems may exist in the other two xlators too,
but we are working on write-behind with priority.
Our theory is this:
If we do a 4KB write for example and it is cached in write-behind and we do a 
lookup on the file/do a readdirp on the directory with this file we send out 
wrong stat value to the kernel. There are different caches between kernel and 
gluster which may lead to fstat never coming till write-behind. So we need to 
make sure that we don't get into this situation.
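
The theory above can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyBrick and ToyWriteBehind are hypothetical Python classes, not GlusterFS code, and the account_for_pending switch is one possible remedy picked for illustration. The thread does not say how the patch under review actually implements lookup/readdirp in write-behind.

```python
class ToyBrick:
    """Stand-in for the backend file: only flushed bytes count toward its size."""

    def __init__(self):
        self.size = 0

    def write(self, offset, length):
        self.size = max(self.size, offset + length)


class ToyWriteBehind:
    """Hypothetical caching layer; not the GlusterFS write-behind xlator."""

    def __init__(self, brick, account_for_pending):
        self.brick = brick
        self.pending = []  # (offset, length) writes acknowledged but not flushed
        self.account_for_pending = account_for_pending

    def write(self, offset, length):
        # The write is acknowledged without reaching the brick.
        self.pending.append((offset, length))

    def flush(self):
        for offset, length in self.pending:
            self.brick.write(offset, length)
        self.pending.clear()

    def lookup_size(self):
        # Size that a lookup/readdirp reply would hand to the kernel.
        size = self.brick.size
        if self.account_for_pending:
            for offset, length in self.pending:
                size = max(size, offset + length)
        return size


naive = ToyWriteBehind(ToyBrick(), account_for_pending=False)
naive.write(0, 4096)
stale = naive.lookup_size()        # brick has not seen the 4KB write yet

aware = ToyWriteBehind(ToyBrick(), account_for_pending=True)
aware.write(0, 4096)
fresh = aware.lookup_size()        # cached write is reflected in the reply

naive.flush()
after_flush = naive.lookup_size()  # correct again once the write is flushed
print(stale, fresh, after_flush)
```

In the naive case the lookup reply carries a size that is 4KB short until the flush happens, and if the kernel caches that reply, a later fstat may be answered from the cache without ever reaching the layer that knows about the pending write.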
Action items:
 At the moment Raghavendra is working on a patch to implement lookup/readdirp 
in write-behind. I am going to test the same for elastic search. Will it be 
possible for you to test your application against the same patch and confirm 
that the patch fixes the problem?

On Fri, Oct 28, 2016 at 12:08 PM, Pranith Kumar Karampuri wrote:
hi George,
   It would help if we can identify the bare minimum xlators which are 
contributing to the issue like Raghavendra was mentioning earlier. We were 
wondering if it is possible for you to help us in identifying the issue by 
running the workload on a modified setup? We can suggest testing out using 
custom volfiles so that we can slowly build the graph which could be causing 
this issue. We would like you guys to try out this problem with just 
posix-xlator and fuse and nothing else.

On Thu, Oct 27, 2016 at 1:40 PM, Lian, George (Nokia - CN/Hangzhou) wrote:
Hi, Raghavendra,

Could you please give some suggestions for this issue? We have been trying to
find a clue for a long time, but have made no progress :(

Thanks & Best Regards,
George

-Original Message-
From: Lian, George (Nokia - CN/Hangzhou)
Sent: Wednesday, October 19, 2016 4:40 PM
To: 'Raghavendra Gowdappa'
Cc: Gluster-devel@gluster.org; I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS;
Zhang, Bingxuan (Nokia - CN/Hangzhou); Zizka, Jan (Nokia - CZ/Prague)
Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the 
really size of the syslog file

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-11-01 Thread Raghavendra Gowdappa


- Original Message -
> From: "Raghavendra G" 
> To: "Pranith Kumar Karampuri" 
> Cc: Gluster-devel@gluster.org, "Jan Zizka (Nokia - CZ/Prague)" 
> , "Bingxuan Zhang (Nokia -
> CN/Hangzhou)" 
> Sent: Tuesday, November 1, 2016 6:55:31 PM
> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the 
> really size of the syslog file
> 
> Hi George,
> 
> You can try your tests in two scenarios:
> 
> 1. Switch off following performance translators:
> a. write-behind
> b. quick-read
> c. readdir-ahead
> 
> Once you see that the issue goes away in the above scenario, you can apply
> write-behind fix [1] and turn write-behind on.
> 
> We have a fix in quick-read at [2].
> 
> You can also try to test with the quick-read fix and turning quick-read on.
> However, since your issue is stale stat I doubt quick-read is not the

s/quick-read is not the culprit/quick-read is the culprit/

> culprit here (The issue with quick-read is stale data).
> 
> We see that readdir-ahead is not updating/flushing the stats stored along
> with dentries when write or file modifications happen. That seem to be the
> issue with readdir-ahead. For now, during both of the above two scenarios,
> please keep readdir-ahead off.
> 
> [1] http://review.gluster.org/15757/2
> [2] http://review.gluster.org/#/c/15759/2
> 
> regards,
> Raghavendra
> 
> On Tue, Nov 1, 2016 at 7:55 AM, Pranith Kumar Karampuri < pkara...@redhat.com
> > wrote:
> 
> 
> 
> 
> 
> On Tue, Nov 1, 2016 at 7:32 AM, Lian, George (Nokia - CN/Hangzhou) <
> george.l...@nokia.com > wrote:
> 
> 
> 
> 
> 
> Hi,
> 
> 
> 
> I will test it with your patches and update to you when I have result.
> 
> hi George,
> Please use http://review.gluster.org/#/c/15757/2 i.e. second version of
> Raghavendra's patch. I tested it and it worked fine. We are still trying to
> figure out quick-read and readdir-ahead as I type this mail.
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Thanks a lots
> 
> 
> 
> Best Regards,
> 
> George
> 
> 
> 
> From: Pranith Kumar Karampuri [mailto: pkara...@redhat.com ]
> Sent: Monday, October 31, 2016 11:23 AM
> To: Lian, George (Nokia - CN/Hangzhou) < george.l...@nokia.com >
> Cc: Raghavendra Gowdappa < rgowd...@redhat.com >; Zhang, Bingxuan (Nokia -
> CN/Hangzhou) < bingxuan.zh...@nokia.com >; Gluster-devel@gluster.org ;
> Zizka, Jan (Nokia - CZ/Prague) < jan.zi...@nokia.com >
> 
> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the
> really size of the syslog file
> 
> 
> 
> 
> 
> 
> 
> 
> Removing i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com , it is causing
> mail delivery problems for me.
> 
> 
> George,
> 
> 
> Raghavendra and I made some progress on this issue. We were in parallel
> working on another issue which is similar where elastic search indices are
> getting corrupted because of wrong stat sizes in our opinion. So I have been
> running different translator stacks in identifying the problematic xlators
> which are leading to indices corruption.
> 
> 
> We found the list to be 1) Write-behind, 2) Quick-read, 3) Readdir-ahead.
> Raghavendra and I just had a chat and we are suspecting that lack of
> lookup/readdirp implementation in write-behind could be the reason for this
> problem. Similar problems may exist in other two xlators too. But we are
> working on write-behind with priority.
> 
> 
> Our theory is this:
> 
> 
> If we do a 4KB write for example and it is cached in write-behind and we do a
> lookup on the file/do a readdirp on the directory with this file we send out
> wrong stat value to the kernel. There are different caches between kernel
> and gluster which may lead to fstat never coming till write-behind. So we
> need to make sure that we don't get into this situation.
> 
> 
> Action items:
> 
> 
> At the moment Raghavendra is working on a patch to implement lookup/readdirp
> in write-behind. I am going to test the same for elastic search. Will it be
> possible for you to test your application against the same patch and confirm
> that the patch fixes the problem?
> 
> 
> 
> 
> 
> On Fri, Oct 28, 2016 at 12:08 PM, Pranith Kumar Karampuri <
> pkara...@redhat.com > wrote:
> 
> 
> 
> 
> hi George,
> 
> 
> It would help if we can identify the bare minimum xlators which are
> contributing to the issue like Raghavendra was mentioning earlier. We were
> wondering if it is possible for you to help us in identifying the issue by
> running the workload on a modified setup? We can suggest testing out using
> custom volfiles so that we can slowly build the graph which could be causing
> this issue. We would like you guys to try out this problem with just
> posix-xlator and fuse and nothing else.
> 
> 
> 
> 
> 
> On Thu, Oct 27, 2016 at 1:40 PM, Lian, George (Nokia - CN/Hangzhou) <
> george.l...@nokia.com > wrote:
> 
> 
> 
> Hi, Raghavendra,
> 
> Could you please give some suggestion for this issue? we try to find the clue
> 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-11-01 Thread Raghavendra G
Hi George,

You can try your tests in two scenarios:

1. Switch off the following performance translators:
a. write-behind
b. quick-read
c. readdir-ahead

Once you see that the issue goes away in the above scenario, you can apply
the write-behind fix [1] and turn write-behind on.

We have a fix in quick-read at [2].

You can also try to test with the quick-read fix and turning quick-read on.
However, since your issue is stale stat, I doubt quick-read is the
culprit here (the issue with quick-read is stale data).

We see that readdir-ahead is not updating/flushing the stats stored along
with dentries when writes or other file modifications happen. That seems to
be the issue with readdir-ahead. For now, in both of the above scenarios,
please keep readdir-ahead off.
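
A toy model of the readdir-ahead behaviour described above, as a sketch only. ToyReaddirAhead is a hypothetical Python class, not the GlusterFS xlator, and dropping the cached entry on write is just one way the staleness could be avoided; nothing here describes an actual fix.

```python
class ToyReaddirAhead:
    """Hypothetical dentry/stat cache; not the GlusterFS readdir-ahead xlator."""

    def __init__(self, brick_sizes, invalidate_on_write):
        self.brick_sizes = brick_sizes   # name -> real size on the brick
        self.cached = {}                 # name -> size captured at prefetch time
        self.invalidate_on_write = invalidate_on_write

    def prefetch(self):
        # Read the directory ahead of time and keep each entry's stat.
        self.cached = dict(self.brick_sizes)

    def write(self, name, nbytes):
        # Append-style write: the file grows on the brick.
        self.brick_sizes[name] = self.brick_sizes.get(name, 0) + nbytes
        if self.invalidate_on_write:
            self.cached.pop(name, None)  # drop the now-stale stat

    def readdirp_size(self, name):
        # Serve from the prefetched entry if present, else ask the brick.
        if name in self.cached:
            return self.cached[name]
        return self.brick_sizes[name]


stale_cache = ToyReaddirAhead({"syslog": 4096}, invalidate_on_write=False)
stale_cache.prefetch()
stale_cache.write("syslog", 4096)
stale_size = stale_cache.readdirp_size("syslog")

fixed_cache = ToyReaddirAhead({"syslog": 4096}, invalidate_on_write=True)
fixed_cache.prefetch()
fixed_cache.write("syslog", 4096)
fresh_size = fixed_cache.readdirp_size("syslog")
print(stale_size, fresh_size)
```

Without invalidation, the readdirp reply still reports the size captured before the write, which is the kind of stale stat that a follower such as tail would read as a shrinking file.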

[1] http://review.gluster.org/15757/2
[2] http://review.gluster.org/#/c/15759/2

regards,
Raghavendra

On Tue, Nov 1, 2016 at 7:55 AM, Pranith Kumar Karampuri wrote:

>
>
> On Tue, Nov 1, 2016 at 7:32 AM, Lian, George (Nokia - CN/Hangzhou) <
> george.l...@nokia.com> wrote:
>
>> Hi,
>>
>>
>>
>> I will test it with your patches and update to you when I have result.
>>
>
> hi George,
>   Please use http://review.gluster.org/#/c/15757/2 i.e. second
> version of Raghavendra's patch. I tested it and it worked fine. We are
> still trying to figure out quick-read and readdir-ahead as I type this mail.
>
>
>>
>> Thanks a lots
>>
>>
>>
>> Best Regards,
>>
>> George
>>
>>
>>
>> *From:* Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
>> *Sent:* Monday, October 31, 2016 11:23 AM
>> *To:* Lian, George (Nokia - CN/Hangzhou) 
>> *Cc:* Raghavendra Gowdappa ; Zhang, Bingxuan (Nokia
>> - CN/Hangzhou) ; Gluster-devel@gluster.org;
>> Zizka, Jan (Nokia - CZ/Prague) 
>>
>> *Subject:* Re: [Gluster-devel] Issue about the size of fstat is less
>> than the really size of the syslog file
>>
>>
>>
>> Removing i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com, it is causing
>> mail delivery problems for me.
>>
>> George,
>>
>>  Raghavendra and I made some progress on this issue. We were in
>> parallel working on another issue which is similar where elastic search
>> indices are getting corrupted because of wrong stat sizes in our opinion.
>> So I have been running different translator stacks in identifying the
>> problematic xlators which are leading to indices corruption.
>>
>>   We found the list to be 1) Write-behind, 2) Quick-read, 3)
>> Readdir-ahead. Raghavendra and I just had a chat and we are suspecting that
>> lack of lookup/readdirp implementation in write-behind could be the reason
>> for this problem. Similar problems may exist in other two xlators too. But
>> we are working on write-behind with priority.
>>
>> Our theory is this:
>>
>> If we do a 4KB write for example and it is cached in write-behind and we
>> do a lookup on the file/do a readdirp on the directory with this file we
>> send out wrong stat value to the kernel. There are different caches between
>> kernel and gluster which may lead to fstat never coming till write-behind.
>> So we need to make sure that we don't get into this situation.
>>
>> Action items:
>>
>>  At the moment Raghavendra is working on a patch to implement
>> lookup/readdirp in write-behind. I am going to test the same for elastic
>> search. Will it be possible for you to test your application against the
>> same patch and confirm that the patch fixes the problem?
>>
>>
>>
>> On Fri, Oct 28, 2016 at 12:08 PM, Pranith Kumar Karampuri <
>> pkara...@redhat.com> wrote:
>>
>> hi George,
>>
>>It would help if we can identify the bare minimum xlators which
>> are contributing to the issue like Raghavendra was mentioning earlier. We
>> were wondering if it is possible for you to help us in identifying the
>> issue by running the workload on a modified setup? We can suggest testing
>> out using custom volfiles so that we can slowly build the graph which could
>> be causing this issue. We would like you guys to try out this problem with
>> just posix-xlator and fuse and nothing else.
>>
>>
>>
>> On Thu, Oct 27, 2016 at 1:40 PM, Lian, George (Nokia - CN/Hangzhou) <
>> george.l...@nokia.com> wrote:
>>
>> Hi, Raghavendra,
>>
>> Could you please give some suggestion for this issue? we try to find the
>> clue for this issue for a long time, but it has no progress:(
>>
>> Thanks & Best Regards,
>> George
>>
>> -Original Message-
>> From: Lian, George (Nokia - CN/Hangzhou)
>> Sent: Wednesday, October 19, 2016 4:40 PM
>> To: 'Raghavendra Gowdappa' 
>> Cc: Gluster-devel@gluster.org; I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS <
>> i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com>; Zhang, Bingxuan
>> (Nokia - CN/Hangzhou) ; Zizka, Jan (Nokia -
>> CZ/Prague) 
>> Subject: RE: [Gluster-devel] Issue about the size of fstat is less than
>> the really 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-11-01 Thread Lian, George (Nokia - CN/Hangzhou)
Hi,

I will test with your patches and update you when I have results.

Thanks a lot.

Best Regards,
George

From: Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
Sent: Monday, October 31, 2016 11:23 AM
To: Lian, George (Nokia - CN/Hangzhou)
Cc: Raghavendra Gowdappa; Zhang, Bingxuan (Nokia - CN/Hangzhou);
Gluster-devel@gluster.org; Zizka, Jan (Nokia - CZ/Prague)
Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the 
really size of the syslog file

Removing i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com, it is causing mail
delivery problems for me.
George,
 Raghavendra and I made some progress on this issue. In parallel, we were
working on a similar issue where, in our opinion, elastic search indices are
getting corrupted because of wrong stat sizes. So I have been running
different translator stacks to identify the problematic xlators which are
leading to index corruption.
  We found the list to be 1) Write-behind, 2) Quick-read, 3)
Readdir-ahead. Raghavendra and I just had a chat, and we suspect that the
lack of a lookup/readdirp implementation in write-behind could be the reason
for this problem. Similar problems may exist in the other two xlators too,
but we are working on write-behind with priority.
Our theory is this:
If we do a 4KB write, for example, and it is cached in write-behind, and we
then do a lookup on the file or a readdirp on the directory containing it, we
send out a wrong stat value to the kernel. There are different caches between
the kernel and gluster, which may lead to an fstat never reaching
write-behind. So we need to make sure that we don't get into this situation.
Action items:
 At the moment Raghavendra is working on a patch to implement lookup/readdirp 
in write-behind. I am going to test the same for elastic search. Will it be 
possible for you to test your application against the same patch and confirm 
that the patch fixes the problem?

On Fri, Oct 28, 2016 at 12:08 PM, Pranith Kumar Karampuri wrote:
hi George,
   It would help if we can identify the bare minimum xlators which are 
contributing to the issue like Raghavendra was mentioning earlier. We were 
wondering if it is possible for you to help us in identifying the issue by 
running the workload on a modified setup? We can suggest testing out using 
custom volfiles so that we can slowly build the graph which could be causing 
this issue. We would like you guys to try out this problem with just 
posix-xlator and fuse and nothing else.

On Thu, Oct 27, 2016 at 1:40 PM, Lian, George (Nokia - CN/Hangzhou) wrote:
Hi, Raghavendra,

Could you please give some suggestions for this issue? We have been trying to
find a clue for a long time, but have made no progress :(

Thanks & Best Regards,
George

-Original Message-
From: Lian, George (Nokia - CN/Hangzhou)
Sent: Wednesday, October 19, 2016 4:40 PM
To: 'Raghavendra Gowdappa'
Cc: Gluster-devel@gluster.org; I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS;
Zhang, Bingxuan (Nokia - CN/Hangzhou); Zizka, Jan (Nokia - CZ/Prague)
Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the 
really size of the syslog file

Hi, Raghavendra

Just now we tested with the glusterfs log level set to "TRACE" and had an
application trigger "glusterfs" to produce a large log. In that case, with
write-behind and stat-prefetch both OFF, tailing the glusterfs log (such as
mnt-{VOLUME-NAME}.log) still failed with "file truncated".

So that means that when a file sees a huge amount of IO, the issue is still
there even with write-behind and stat-prefetch both OFF.

Best Regards,
George

-Original Message-
From: Raghavendra Gowdappa [mailto:rgowd...@redhat.com]
Sent: Wednesday, October 19, 2016 2:54 PM
To: Lian, George (Nokia - CN/Hangzhou)
Cc: Gluster-devel@gluster.org; I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS;
Zhang, Bingxuan (Nokia - CN/Hangzhou); Zizka, Jan (Nokia - CZ/Prague)
Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the 
really size of the syslog file



- Original Message -
> From: "George 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-31 Thread Pranith Kumar Karampuri
On Tue, Nov 1, 2016 at 7:32 AM, Lian, George (Nokia - CN/Hangzhou) <
george.l...@nokia.com> wrote:

> Hi,
>
>
>
> I will test it with your patches and update to you when I have result.
>

hi George,
  Please use http://review.gluster.org/#/c/15757/2 i.e. the second version
of Raghavendra's patch. I tested it and it worked fine. We are still trying
to figure out quick-read and readdir-ahead as I type this mail.


>
> Thanks a lots
>
>
>
> Best Regards,
>
> George
>
>
>
> *From:* Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
> *Sent:* Monday, October 31, 2016 11:23 AM
> *To:* Lian, George (Nokia - CN/Hangzhou) 
> *Cc:* Raghavendra Gowdappa ; Zhang, Bingxuan (Nokia
> - CN/Hangzhou) ; Gluster-devel@gluster.org;
> Zizka, Jan (Nokia - CZ/Prague) 
>
> *Subject:* Re: [Gluster-devel] Issue about the size of fstat is less than
> the really size of the syslog file
>
>
>
> Removing i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com, it is causing
> mail delivery problems for me.
>
> George,
>
>  Raghavendra and I made some progress on this issue. We were in
> parallel working on another issue which is similar where elastic search
> indices are getting corrupted because of wrong stat sizes in our opinion.
> So I have been running different translator stacks in identifying the
> problematic xlators which are leading to indices corruption.
>
>   We found the list to be 1) Write-behind, 2) Quick-read, 3)
> Readdir-ahead. Raghavendra and I just had a chat and we are suspecting that
> lack of lookup/readdirp implementation in write-behind could be the reason
> for this problem. Similar problems may exist in other two xlators too. But
> we are working on write-behind with priority.
>
> Our theory is this:
>
> If we do a 4KB write for example and it is cached in write-behind and we
> do a lookup on the file/do a readdirp on the directory with this file we
> send out wrong stat value to the kernel. There are different caches between
> kernel and gluster which may lead to fstat never coming till write-behind.
> So we need to make sure that we don't get into this situation.
>
> Action items:
>
>  At the moment Raghavendra is working on a patch to implement
> lookup/readdirp in write-behind. I am going to test the same for elastic
> search. Will it be possible for you to test your application against the
> same patch and confirm that the patch fixes the problem?
>
>
>
> On Fri, Oct 28, 2016 at 12:08 PM, Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
>
> hi George,
>
>It would help if we can identify the bare minimum xlators which are
> contributing to the issue like Raghavendra was mentioning earlier. We were
> wondering if it is possible for you to help us in identifying the issue by
> running the workload on a modified setup? We can suggest testing out using
> custom volfiles so that we can slowly build the graph which could be
> causing this issue. We would like you guys to try out this problem with
> just posix-xlator and fuse and nothing else.
>
>
>
> On Thu, Oct 27, 2016 at 1:40 PM, Lian, George (Nokia - CN/Hangzhou) <
> george.l...@nokia.com> wrote:
>
> Hi, Raghavendra,
>
> Could you please give some suggestion for this issue? we try to find the
> clue for this issue for a long time, but it has no progress:(
>
> Thanks & Best Regards,
> George
>
> -Original Message-
> From: Lian, George (Nokia - CN/Hangzhou)
> Sent: Wednesday, October 19, 2016 4:40 PM
> To: 'Raghavendra Gowdappa' 
> Cc: Gluster-devel@gluster.org; I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS <
> i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com>; Zhang, Bingxuan (Nokia
> - CN/Hangzhou) ; Zizka, Jan (Nokia - CZ/Prague)
> 
> Subject: RE: [Gluster-devel] Issue about the size of fstat is less than
> the really size of the syslog file
>
> Hi, Raghavendra
>
> Just now, we test it with glusterfs log with debug-level "TRACE", and let
> some application trigger "glusterfs" produce large log, in that case, when
> we set write-behind and stat-prefetch both OFF,
> Tail the glusterfs log such like mnt-{VOLUME-NAME}.log, it still failed
> with "file truncated",
>
> So that means if file's IO in huge amount, the issue will still be there
> even write-behind and stat-prefetch both OFF.
>
> Best Regards,
> George
>
> -Original Message-
> From: Raghavendra Gowdappa [mailto:rgowd...@redhat.com]
>
> Sent: Wednesday, October 19, 2016 2:54 PM
> To: Lian, George (Nokia - CN/Hangzhou) 
> Cc: Gluster-devel@gluster.org; I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS <
> i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com>; Zhang, Bingxuan (Nokia
> - CN/Hangzhou) ; Zizka, Jan (Nokia - CZ/Prague)
> 
> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than
> the really size of the syslog file
>
>
>
> - 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-31 Thread Raghavendra Gowdappa


- Original Message -
> From: "Raghavendra Gowdappa" 
> To: "George Lian (Nokia - CN/Hangzhou)" 
> Cc: "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS" 
> , "Bingxuan Zhang (Nokia
> - CN/Hangzhou)" , Gluster-devel@gluster.org, "Jan 
> Zizka (Nokia - CZ/Prague)"
> 
> Sent: Tuesday, November 1, 2016 7:46:47 AM
> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the 
> really size of the syslog file
> 
> 
> 
> - Original Message -
> > From: "George Lian (Nokia - CN/Hangzhou)" 
> > To: "Raghavendra Gowdappa" , "Jan Zizka (Nokia -
> > CZ/Prague)" , "Bingxuan
> > Zhang (Nokia - CN/Hangzhou)" 
> > Cc: "Pranith Kumar Karampuri" ,
> > "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS"
> > ,
> > Gluster-devel@gluster.org
> > Sent: Tuesday, November 1, 2016 6:35:10 AM
> > Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the
> > really size of the syslog file
> > 
> > Hi, Raghavendra,
> > 
> > Thanks a lots for your update!
> > 
> > >IIUC, the "tail issue" can happen if 'tail -f' reads a stat with st_size
> > >lesser than previously read value (and hence the complaint - file
> > >truncated). In this case, even though fstat at T2 doesn't account the
> > >write at T0, it doesn't prove that st_size of fstat at T2 is lesser than
> > >that at any time before T2.
> > 
> > I just mean the st_size of fstat maybe less than the previously read value
> > in that time, and it will lead to the "tail truncated" issue. Do you agree
> > with me?
> 
> Yes. But, in your example there is only one fstat. For this to happen we need
> at least two fstats, and the later st_size should be less than the earlier
> one. Am I missing anything here?
> 
> > 
> > >As to the relative ordering of write at T0 and fstat at T2, POSIX leaves
> > >it undefined. Unless write and fstat happen from same
> > >thread/single-threaded-application there is no requirement for maintaining
> > >that order (If they are issued from same thread fstat should account write
> > >at T0). Also note that it is not mentioned here fstat at T2 is issued
> > >_after_ write at T0 is _complete_. If that is the case, mdc_writev_cbk
> > >would've updated correct stat in cache and fstat would get correct value.
> > >If it is not the case, then there is no well defined order here.
> > 
> > >So, I don't think there is a bug here, unless I've missed out something.
> > 
> > Do you mean the GlusterFS not conflict with the requirement, so that the
> > application like "tail" should consider the case in network file system?
> 
> No. Applications shouldn't do anything different to work on Glusterfs.
> Otherwise it's a bug :). What I am saying is that the issue with 'tail -f'
> might be because of a different bug than the example you gave. In other
> words, the RCA you posted may not be correct. It might be because of issues
> with write-behind (and other xlators) as I posted in other mail.
> 
> Preliminary testing by Pranith showed that Elasticsearch works fine with just
> write-behind. 

with patch http://review.gluster.org/15757 applied.

> So, that's progress. Will keep you posted on our efforts
> at getting Elasticsearch working on Gluster. I have a feeling that it will
> solve your issue (tail -f) too.
> 
> regards,
> Raghavendra
> 
> > 
> > @Jan & @Bingxuan, do you have some comments for the above information?
> > 
> > 
> > Best Regards,
> > George
> > 
> > -Original Message-
> > From: Raghavendra Gowdappa [mailto:rgowd...@redhat.com]
> > Sent: Monday, October 31, 2016 6:35 PM
> > To: Lian, George (Nokia - CN/Hangzhou) 
> > Cc: Pranith Kumar Karampuri ;
> > I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS
> > ; Zhang, Bingxuan (Nokia
> > -
> > CN/Hangzhou) ; Gluster-devel@gluster.org; Zizka,
> > Jan (Nokia - CZ/Prague) 
> > Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the
> > really size of the syslog file
> > 
> > 
> > 
> > - Original Message -
> > > From: "George Lian (Nokia - CN/Hangzhou)" 
> > > To: "Pranith Kumar Karampuri" , "Raghavendra
> > > Gowdappa"
> > > 
> > > Cc: "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS"
> > > , "Bingxuan Zhang
> > > (Nokia
> > > - CN/Hangzhou)" , Gluster-devel@gluster.org,
> > > "Jan
> > > Zizka (Nokia - CZ/Prague)"
> > > 
> > > Sent: Monday, October 31, 2016 2:32:34 PM
> > > Subject: RE: [Gluster-devel] Issue about the size of fstat is less than
> > > the
> > > really size of 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-31 Thread Raghavendra Gowdappa


- Original Message -
> From: "George Lian (Nokia - CN/Hangzhou)" 
> To: "Raghavendra Gowdappa" , "Jan Zizka (Nokia - 
> CZ/Prague)" , "Bingxuan
> Zhang (Nokia - CN/Hangzhou)" 
> Cc: "Pranith Kumar Karampuri" , 
> "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS"
> , Gluster-devel@gluster.org
> Sent: Tuesday, November 1, 2016 6:35:10 AM
> Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the 
> really size of the syslog file
> 
> Hi, Raghavendra,
> 
> Thanks a lots for your update!
> 
> >IIUC, the "tail issue" can happen if 'tail -f' reads a stat with st_size
> >lesser than previously read value (and hence the complaint - file
> >truncated). In this case, even though fstat at T2 doesn't account the write
> >at T0, it doesn't prove that st_size of fstat at T2 is lesser than that at
> >any time before T2.
> 
> I just mean the st_size of fstat maybe less than the previously read value in
> that time, and it will lead to the "tail truncated" issue. Do you agree with
> me?

Yes. But, in your example there is only one fstat. For this to happen we need 
at least two fstats, and the later st_size should be less than the earlier one. 
Am I missing anything here?
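
To make the two-fstat condition concrete, here is a minimal Python sketch of
the comparison being described. It is only an illustration under assumptions:
find_shrinks is a hypothetical helper name, the sizes are made-up numbers, and
the real check inside tail is more involved than this.

```python
def find_shrinks(sizes):
    """Return (index, previous, current) for every observation whose
    st_size is smaller than the one observed just before it."""
    shrinks = []
    for i in range(1, len(sizes)):
        if sizes[i] < sizes[i - 1]:
            shrinks.append((i, sizes[i - 1], sizes[i]))
    return shrinks


# A single fstat can never look like a truncation.
print(find_shrinks([4096]))
# A stale size that is not smaller than the previous one is not reported.
print(find_shrinks([4096, 4096, 8192]))
# Only a later size below an earlier one is reported.
print(find_shrinks([4096, 8192, 4096, 12288]))
```

So a stale st_size alone is not enough; the observed sequence has to go
backwards at least once.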

> 
> >As to the relative ordering of write at T0 and fstat at T2, POSIX leaves it
> >undefined. Unless write and fstat happen from same
> >thread/single-threaded-application there is no requirement for maintaining
> >that order (If they are issued from same thread fstat should account write
> >at T0). Also note that it is not mentioned here fstat at T2 is issued
> >_after_ write at T0 is _complete_. If that is the case, mdc_writev_cbk
> >would've updated correct stat in cache and fstat would get correct value.
> >If it is not the case, then there is no well defined order here.
> 
> >So, I don't think there is a bug here, unless I've missed out something.
> 
> Do you mean the GlusterFS not conflict with the requirement, so that the
> application like "tail" should consider the case in network file system?

No. Applications shouldn't do anything different to work on Glusterfs. 
Otherwise it's a bug :). What I am saying is that the issue with 'tail -f' might 
be because of a different bug than the example you gave. In other words, the 
RCA you posted may not be correct. It might be because of issues with 
write-behind (and other xlators) as I posted in other mail.

Preliminary testing by Pranith showed that Elasticsearch works fine with just 
write-behind. So, that's progress. Will keep you posted on our efforts at 
getting Elasticsearch working on Gluster. I have a feeling that it will solve 
your issue (tail -f) too.

regards,
Raghavendra

> 
> @Jan & @Bingxuan, do you have some comments for the above information?
> 
> 
> Best Regards,
> George
> 
> -Original Message-
> From: Raghavendra Gowdappa [mailto:rgowd...@redhat.com]
> Sent: Monday, October 31, 2016 6:35 PM
> To: Lian, George (Nokia - CN/Hangzhou) 
> Cc: Pranith Kumar Karampuri ;
> I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS
> ; Zhang, Bingxuan (Nokia -
> CN/Hangzhou) ; Gluster-devel@gluster.org; Zizka,
> Jan (Nokia - CZ/Prague) 
> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the
> really size of the syslog file
> 
> 
> 
> - Original Message -
> > From: "George Lian (Nokia - CN/Hangzhou)" 
> > To: "Pranith Kumar Karampuri" , "Raghavendra Gowdappa"
> > 
> > Cc: "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS"
> > , "Bingxuan Zhang (Nokia
> > - CN/Hangzhou)" , Gluster-devel@gluster.org, "Jan
> > Zizka (Nokia - CZ/Prague)"
> > 
> > Sent: Monday, October 31, 2016 2:32:34 PM
> > Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the
> > really size of the syslog file
> > 
> > Hi,
> > 
> > I suppose there seems a defect on mdc_writev_cbk  and mdc_fstat
> > Let’s assume in 2 timestamp which called write and fstat operation in
> > application:
> > T0:  write (process a)
> > T1: read (process b) with the data of T0 of process a.
> > T2: fstat   (process c)
> > In my view, mdc_write is non-block operation and have some lock to protect
> > in
> > > afr xlator,  because mdc_fstat not check the lock in AFR xlator, so
> > mdc_writev_cbk which called “mdc_inode_iatt_set_validate” maybe later than
> > mdc_fstat.
> > Such like
> > T3: fstat result of T2  without the “mdc_inode_iatt_set_validate” of T0
> > when
> > stat-prefetch options is on.
> > T4: “mdc_inode_iatt_set_validate” is called of T0 in mdc_writev_cbk.
> > 
> > Lets’ assume T0 > 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-31 Thread Lian, George (Nokia - CN/Hangzhou)
Hi, Raghavendra,

Thanks a lot for your update!

>IIUC, the "tail issue" can happen if 'tail -f' reads a stat with st_size 
>lesser than previously read value (and hence the complaint - file truncated). 
>In this case, even though fstat at T2 doesn't account the write at T0, it 
>doesn't prove that st_size of fstat at T2 is lesser than that at any time 
>before T2.

I just mean that the st_size returned by fstat may be less than the previously 
read value at that time, which would lead to the "tail truncated" issue. Do you 
agree with me?

>As to the relative ordering of write at T0 and fstat at T2, POSIX leaves it 
>undefined. Unless write and fstat happen from same 
>thread/single-threaded-application there is no requirement for maintaining 
>that order (If they are issued from same thread fstat should account write at 
>T0). Also note that it is not mentioned here fstat at T2 is issued _after_ 
>write at T0 is _complete_. If that is the case, mdc_writev_cbk would've 
>updated correct stat in cache and fstat would get correct value. If it is not 
>the case, then there is no well defined order here.

>So, I don't think there is a bug here, unless I've missed out something.

Do you mean that GlusterFS does not conflict with the requirement, so an 
application like "tail" should handle this case itself on a network file system?

@Jan & @Bingxuan, do you have some comments for the above information?


Best Regards,
George

-Original Message-
From: Raghavendra Gowdappa [mailto:rgowd...@redhat.com] 
Sent: Monday, October 31, 2016 6:35 PM
To: Lian, George (Nokia - CN/Hangzhou) 
Cc: Pranith Kumar Karampuri ; 
I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS 
; Zhang, Bingxuan (Nokia - 
CN/Hangzhou) ; Gluster-devel@gluster.org; Zizka, Jan 
(Nokia - CZ/Prague) 
Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the 
really size of the syslog file



- Original Message -
> From: "George Lian (Nokia - CN/Hangzhou)" 
> To: "Pranith Kumar Karampuri" , "Raghavendra Gowdappa" 
> 
> Cc: "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS" 
> , "Bingxuan Zhang (Nokia
> - CN/Hangzhou)" , Gluster-devel@gluster.org, "Jan 
> Zizka (Nokia - CZ/Prague)"
> 
> Sent: Monday, October 31, 2016 2:32:34 PM
> Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the 
> really size of the syslog file
> 
> Hi,
> 
> I suppose there seems a defect on mdc_writev_cbk  and mdc_fstat
> Let’s assume in 2 timestamp which called write and fstat operation in
> application:
> T0:  write (process a)
> T1: read (process b) with the data of T0 of process a.
> T2: fstat   (process c)
> In my view, mdc_write is non-block operation and have some lock to protect in
> afr xlator,  because mdc_fstat not check the lock in AFR xlator, so
> mdc_writev_cbk which called “mdc_inode_iatt_set_validate” maybe later than
> mdc_fstat.
> Such like
> T3: fstat result of T2  without the “mdc_inode_iatt_set_validate” of T0 when
> stat-prefetch options is on.
> T4: “mdc_inode_iatt_set_validate” is called of T0 in mdc_writev_cbk.
> 
> Lets’ assume T0 in multi-process environment and the load of CPU is high?
> If it is reasonable, then issue of “tail issue” will be happened.

IIUC, the "tail issue" can happen if 'tail -f' reads a stat with st_size lesser 
than previously read value (and hence the complaint - file truncated). In this 
case, even though fstat at T2 doesn't account the write at T0, it doesn't prove 
that st_size of fstat at T2 is lesser than that at any time before T2.

As to the relative ordering of write at T0 and fstat at T2, POSIX leaves it 
undefined. Unless write and fstat happen from same 
thread/single-threaded-application there is no requirement for maintaining that 
order (If they are issued from same thread fstat should account write at T0). 
Also note that it is not mentioned here fstat at T2 is issued _after_ write at 
T0 is _complete_. If that is the case, mdc_writev_cbk would've updated correct 
stat in cache and fstat would get correct value. If it is not the case, then 
there is no well defined order here.

So, I don't think there is a bug here, unless I've missed out something.


> 
> So maybe a fix suggestion is on mdc_fstat operation , we should add an
> operation to check whether the writev operation is ongoing or not, if
> write-operation is ongoing, should goto uncached label in mdc_fstat
> function.
> 
> Could you please confirm the above assumption and suggestion?
> 
> 
> Thanks & Best Regards,
> George
> 
> 
> From: Lian, George (Nokia - CN/Hangzhou)
> Sent: Monday, October 31, 2016 4:25 PM
> To: Pranith Kumar Karampuri ; Raghavendra Gowdappa
> 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-31 Thread Lian, George (Nokia - CN/Hangzhou)
Hi,

How can we enable debug.trace so that we can inspect the debug data of the 
different xlators?
I set “debug.trace on” and “debug.log-file yes”, but that does not seem to work.

One more update on this issue: if we set performance.stat-prefetch to off, 
the issue does not occur. (Our previous test may not have been correct ☺)

Thanks & Best Regards,
George

From: Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
Sent: Friday, October 28, 2016 2:39 PM
To: Lian, George (Nokia - CN/Hangzhou) 
Cc: Raghavendra Gowdappa ; 
I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS 
; Zhang, Bingxuan (Nokia - 
CN/Hangzhou) ; Gluster-devel@gluster.org; Zizka, Jan 
(Nokia - CZ/Prague) 
Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the 
really size of the syslog file

hi George,
   It would help if we can identify the bare minimum xlators which are 
contributing to the issue like Raghavendra was mentioning earlier. We were 
wondering if it is possible for you to help us in identifying the issue by 
running the workload on a modified setup? We can suggest testing out using 
custom volfiles so that we can slowly build the graph which could be causing 
this issue. We would like you guys to try out this problem with just 
posix-xlator and fuse and nothing else.

On Thu, Oct 27, 2016 at 1:40 PM, Lian, George (Nokia - CN/Hangzhou) 
> wrote:
Hi, Raghavendra,

Could you please give some suggestion for this issue? we try to find the clue 
for this issue for a long time, but it has no progress:(

Thanks & Best Regards,
George

-Original Message-
From: Lian, George (Nokia - CN/Hangzhou)
Sent: Wednesday, October 19, 2016 4:40 PM
To: 'Raghavendra Gowdappa' >
Cc: Gluster-devel@gluster.org; 
I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS 
>;
 Zhang, Bingxuan (Nokia - CN/Hangzhou) 
>; Zizka, Jan (Nokia 
- CZ/Prague) >
Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the 
really size of the syslog file

Hi, Raghavendra

Just now, we test it with glusterfs log with debug-level "TRACE", and let some 
application trigger "glusterfs" produce large log, in that case, when we set 
write-behind and stat-prefetch both OFF,
Tail the glusterfs log such like mnt-{VOLUME-NAME}.log, it still failed with 
"file truncated",

So that means if file's IO in huge amount, the issue will still be there even 
write-behind and stat-prefetch both OFF.

Best Regards,
George

-Original Message-
From: Raghavendra Gowdappa 
[mailto:rgowd...@redhat.com]
Sent: Wednesday, October 19, 2016 2:54 PM
To: Lian, George (Nokia - CN/Hangzhou) 
>
Cc: Gluster-devel@gluster.org; 
I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS 
>;
 Zhang, Bingxuan (Nokia - CN/Hangzhou) 
>; Zizka, Jan (Nokia 
- CZ/Prague) >
Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the 
really size of the syslog file



- Original Message -
> From: "George Lian (Nokia - CN/Hangzhou)" 
> >
> To: "Raghavendra Gowdappa" >
> Cc: Gluster-devel@gluster.org, 
> "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS"
> >,
>  "Bingxuan Zhang (Nokia - CN/Hangzhou)"
> >, "Jan Zizka 
> (Nokia - CZ/Prague)" >
> Sent: Wednesday, October 19, 2016 12:05:01 PM
> Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the 
> really size of the syslog file
>
> Hi, Raghavendra,
>
> Thanks a lots for your quickly update!
> In my case, there are so many process(write) is writing to the syslog file,
> it do involve the writer is in the same host and writing in same mount point
> while the tail(reader) is reading it.
>
> The bug I just guess is:
> When a writer write the data with write-behind, it call the call-back
> function " mdc_writev_cbk" and called "mdc_inode_iatt_set_validate" to
> validate the "iatt" data, but with the code I mentioned last mail, it do
> nothing.

mdc_inode_iatt_set_validate has following code


if 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-31 Thread Raghavendra Gowdappa


- Original Message -
> From: "George Lian (Nokia - CN/Hangzhou)" 
> To: "Pranith Kumar Karampuri" , "Raghavendra Gowdappa" 
> 
> Cc: "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS" 
> , "Bingxuan Zhang (Nokia
> - CN/Hangzhou)" , Gluster-devel@gluster.org, "Jan 
> Zizka (Nokia - CZ/Prague)"
> 
> Sent: Monday, October 31, 2016 2:32:34 PM
> Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the 
> really size of the syslog file
> 
> Hi,
> 
> I suppose there seems a defect on mdc_writev_cbk  and mdc_fstat
> Let’s assume in 2 timestamp which called write and fstat operation in
> application:
> T0:  write (process a)
> T1: read (process b) with the data of T0 of process a.
> T2: fstat   (process c)
> In my view, mdc_write is non-block operation and have some lock to protect in
> afr xlator,  because mdc_fstat not check the lock in AFR xlator, so
> mdc_writev_cbk which called “mdc_inode_iatt_set_validate” maybe later than
> mdc_fstat.
> Such like
> T3: fstat result of T2  without the “mdc_inode_iatt_set_validate” of T0 when
> stat-prefetch options is on.
> T4: “mdc_inode_iatt_set_validate” is called of T0 in mdc_writev_cbk.
> 
> Lets’ assume T0 in multi-process environment and the load of CPU is high?
> If it is reasonable, then issue of “tail issue” will be happened.

IIUC, the "tail issue" can happen if 'tail -f' reads a stat with st_size lesser 
than previously read value (and hence the complaint - file truncated). In this 
case, even though fstat at T2 doesn't account the write at T0, it doesn't prove 
that st_size of fstat at T2 is lesser than that at any time before T2.

As to the relative ordering of write at T0 and fstat at T2, POSIX leaves it 
undefined. Unless write and fstat happen from same 
thread/single-threaded-application there is no requirement for maintaining that 
order (If they are issued from same thread fstat should account write at T0). 
Also note that it is not mentioned here fstat at T2 is issued _after_ write at 
T0 is _complete_. If that is the case, mdc_writev_cbk would've updated correct 
stat in cache and fstat would get correct value. If it is not the case, then 
there is no well defined order here.

So, I don't think there is a bug here, unless I've missed out something.
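
The ordering argument can be sketched with a toy model in Python. This is not
Gluster code: ToyStatCache and its methods are hypothetical names, and the
sizes are invented. It only models a stat cache that is refreshed when the
write callback runs, with an fstat racing an in-flight write.

```python
class ToyStatCache:
    """Toy stand-in for a stat cache such as md-cache (not Gluster code)."""

    def __init__(self, size=0):
        self.cached_size = size

    def fstat(self):
        # Served from the cache, as with stat-prefetch on.
        return self.cached_size

    def write_cbk(self, new_size):
        # Runs only when the write completes; refreshes the cached size.
        self.cached_size = new_size


cache = ToyStatCache(size=4096)
before = cache.fstat()   # an fstat seen some time before T0
# T0: a 4 KB write is issued but has not completed yet.
at_t2 = cache.fstat()    # T2: fstat races with the in-flight write
cache.write_cbk(8192)    # T4: write completes, callback updates the cache
after = cache.fstat()

print(before, at_t2, after)
```

In this model the fstat at T2 misses the write at T0, but it is still not
smaller than anything observed earlier, which is the point made above.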


> 
> So maybe a fix suggestion is on mdc_fstat operation , we should add an
> operation to check whether the writev operation is ongoing or not, if
> write-operation is ongoing, should goto uncached label in mdc_fstat
> function.
> 
> Could you please confirm the above assumption and suggestion?
> 
> 
> Thanks & Best Regards,
> George
> 
> 
> From: Lian, George (Nokia - CN/Hangzhou)
> Sent: Monday, October 31, 2016 4:25 PM
> To: Pranith Kumar Karampuri ; Raghavendra Gowdappa
> 
> Cc: I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS
> ; Zhang, Bingxuan (Nokia -
> CN/Hangzhou) ; Gluster-devel@gluster.org; Zizka,
> Jan (Nokia - CZ/Prague) 
> Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the
> really size of the syslog file
> 
> Hi,
> 
> How can we enable debug.trace so that we can inspect the debug data on
> different xlator?
> I just set “debug.trace on” and “debug.log-file yes” seems not work now.
> 
> And one more update for this issue, if we set performance.stat-prefetch to
> off, the issue will not be occurred. (our previous test maybe not correct☺ )
> 
> Thanks & Best Regards,
> George
> 
> From: Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
> Sent: Friday, October 28, 2016 2:39 PM
> To: Lian, George (Nokia - CN/Hangzhou)
> >
> Cc: Raghavendra Gowdappa >;
> I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS
> >;
> Zhang, Bingxuan (Nokia - CN/Hangzhou)
> >;
> Gluster-devel@gluster.org; Zizka, Jan
> (Nokia - CZ/Prague) >
> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the
> really size of the syslog file
> 
> hi George,
>It would help if we can identify the bare minimum xlators which are
>contributing to the issue like Raghavendra was mentioning earlier. We
>were wondering if it is possible for you to help us in identifying
>the issue by running the workload on a modified setup? We can suggest
>testing out using custom volfiles so that we can slowly build the
>graph which could be causing this issue. We would like 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-31 Thread Lian, George (Nokia - CN/Hangzhou)
Hi,

I suspect there is a defect in mdc_writev_cbk and mdc_fstat.
Let’s assume the application issues the following operations at these 
timestamps:
T0: write (process a)
T1: read (process b), returning the data written at T0 by process a.
T2: fstat (process c)
In my view, mdc_write is a non-blocking operation and is protected by a lock in 
the AFR xlator. Because mdc_fstat does not check that lock in the AFR xlator, 
mdc_writev_cbk, which calls “mdc_inode_iatt_set_validate”, may run later than 
mdc_fstat.
For example:
T3: the fstat issued at T2 returns its result without the 
“mdc_inode_iatt_set_validate” of T0, when the stat-prefetch option is on.
T4: “mdc_inode_iatt_set_validate” for T0 is called in mdc_writev_cbk.

Lets’ assume T0; Raghavendra Gowdappa 

Cc: I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS 
; Zhang, Bingxuan (Nokia - 
CN/Hangzhou) ; Gluster-devel@gluster.org; Zizka, Jan 
(Nokia - CZ/Prague) 
Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the 
really size of the syslog file

Hi,

How can we enable debug.trace so that we can inspect the debug data on 
different xlator?
I just set “debug.trace on” and “debug.log-file yes” seems not work now.

And one more update for this issue, if we set performance.stat-prefetch to off, 
the issue will not be occurred. (our previous test maybe not correct☺ )

Thanks & Best Regards,
George

From: Pranith Kumar Karampuri [mailto:pkara...@redhat.com]
Sent: Friday, October 28, 2016 2:39 PM
To: Lian, George (Nokia - CN/Hangzhou) 
>
Cc: Raghavendra Gowdappa >; 
I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS 
>;
 Zhang, Bingxuan (Nokia - CN/Hangzhou) 
>; 
Gluster-devel@gluster.org; Zizka, Jan (Nokia 
- CZ/Prague) >
Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the 
really size of the syslog file

hi George,
   It would help if we can identify the bare minimum xlators which are 
contributing to the issue like Raghavendra was mentioning earlier. We were 
wondering if it is possible for you to help us in identifying the issue by 
running the workload on a modified setup? We can suggest testing out using 
custom volfiles so that we can slowly build the graph which could be causing 
this issue. We would like you guys to try out this problem with just 
posix-xlator and fuse and nothing else.

On Thu, Oct 27, 2016 at 1:40 PM, Lian, George (Nokia - CN/Hangzhou) 
> wrote:
Hi, Raghavendra,

Could you please give some suggestion for this issue? we try to find the clue 
for this issue for a long time, but it has no progress:(

Thanks & Best Regards,
George

-Original Message-
From: Lian, George (Nokia - CN/Hangzhou)
Sent: Wednesday, October 19, 2016 4:40 PM
To: 'Raghavendra Gowdappa' >
Cc: Gluster-devel@gluster.org; 
I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS 
>;
 Zhang, Bingxuan (Nokia - CN/Hangzhou) 
>; Zizka, Jan (Nokia 
- CZ/Prague) >
Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the 
really size of the syslog file

Hi, Raghavendra

Just now, we test it with glusterfs log with debug-level "TRACE", and let some 
application trigger "glusterfs" produce large log, in that case, when we set 
write-behind and stat-prefetch both OFF,
Tail the glusterfs log such like mnt-{VOLUME-NAME}.log, it still failed with 
"file truncated",

So that means if file's IO in huge amount, the issue will still be there even 
write-behind and stat-prefetch both OFF.

Best Regards,
George

-Original Message-
From: Raghavendra Gowdappa 
[mailto:rgowd...@redhat.com]
Sent: Wednesday, October 19, 2016 2:54 PM
To: Lian, George (Nokia - 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-30 Thread Raghavendra Gowdappa
Thanks Pranith for the update.

Patch for fixing write-behind to not send stale stat can be found at:
http://review.gluster.org/#/c/15757/1

If possible, please test it and let us know the results. Your results are very 
important for us to resolve the issue.
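
For anyone who wants a quick check besides their real workload, below is a
rough Python sketch of an append-and-poll test. It is an assumption on my
part that this resembles the syslog case closely enough; append_and_watch is
a hypothetical helper, it uses a single writer thread, and I have not
confirmed that it reproduces the issue on a Gluster mount.

```python
import os
import tempfile
import threading
import time


def append_and_watch(path, writes=200, chunk=b"x" * 4096):
    """Append to `path` from a writer thread while polling fstat.

    Returns a list of (previous_size, current_size) pairs, one for every
    poll in which st_size went backwards."""
    done = threading.Event()

    def writer():
        try:
            with open(path, "ab") as f:
                for _ in range(writes):
                    f.write(chunk)
                    f.flush()
        finally:
            done.set()  # always release the polling loop

    shrinks = []
    fd = os.open(path, os.O_RDONLY | os.O_CREAT, 0o644)
    try:
        t = threading.Thread(target=writer)
        t.start()
        prev = os.fstat(fd).st_size
        while not done.is_set():
            cur = os.fstat(fd).st_size
            if cur < prev:
                shrinks.append((prev, cur))
            prev = cur
            time.sleep(0.001)
        t.join()
    finally:
        os.close(fd)
    return shrinks


if __name__ == "__main__":
    # Demo on a local temporary directory; point the path at a file on
    # the Gluster mount to exercise the client stack instead.
    with tempfile.TemporaryDirectory() as d:
        print(append_and_watch(os.path.join(d, "syslog-test")))
```

An empty list means no poll ever saw st_size decrease; any entry is the kind
of observation that would make 'tail -f' report "file truncated".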

regards,
Raghavendra

- Original Message -
> From: "Pranith Kumar Karampuri" 
> To: "George Lian (Nokia - CN/Hangzhou)" 
> Cc: "Raghavendra Gowdappa" , "Bingxuan Zhang (Nokia - 
> CN/Hangzhou)" ,
> Gluster-devel@gluster.org, "Jan Zizka (Nokia - CZ/Prague)" 
> 
> Sent: Monday, October 31, 2016 8:53:07 AM
> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the 
> really size of the syslog file
> 
> Removing i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com, it is causing
> mail delivery problems for me.
> 
> George,
>  Raghavendra and I made some progress on this issue. We were in
> parallel working on another issue which is similar where elastic search
> indices are getting corrupted because of wrong stat sizes in our opinion.
> So I have been running different translator stacks in identifying the
> problematic xlators which are leading to indices corruption.
>   We found the list to be 1) Write-behind, 2) Quick-read, 3)
> Readdir-ahead. Raghavendra and I just had a chat and we are suspecting that
> lack of lookup/readdirp implementation in write-behind could be the reason
> for this problem. Similar problems may exist in other two xlators too. But
> we are working on write-behind with priority.
> 
> Our theory is this:
> If we do a 4KB write for example and it is cached in write-behind and we do
> a lookup on the file/do a readdirp on the directory with this file we send
> out wrong stat value to the kernel. There are different caches between
> kernel and gluster which may lead to fstat never coming till write-behind.
> So we need to make sure that we don't get into this situation.
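
The theory above can be illustrated with a toy Python model. It is not the
write-behind xlator and says nothing about how the patch works: ToyBackend
and ToyWriteBehind are hypothetical classes, writes are treated as plain
appends, and lookup is simply passed through to the backend.

```python
class ToyBackend:
    """Stands in for the brick: it only knows about flushed writes."""

    def __init__(self):
        self.size = 0

    def write(self, nbytes):
        self.size += nbytes

    def stat(self):
        return self.size


class ToyWriteBehind:
    """Toy model of a write-caching layer; not the Gluster xlator."""

    def __init__(self, backend):
        self.backend = backend
        self.pending = []  # writes acknowledged but not yet flushed

    def write(self, nbytes):
        # Acknowledged to the application immediately, kept in the cache.
        self.pending.append(nbytes)

    def flush(self):
        for nbytes in self.pending:
            self.backend.write(nbytes)
        self.pending.clear()

    def lookup(self):
        # No special handling: the backend has not seen cached writes.
        return self.backend.stat()


wb = ToyWriteBehind(ToyBackend())
wb.write(4096)         # 4KB write, cached in the write-behind layer
stale = wb.lookup()    # size sent up to the kernel misses the write
wb.flush()
fresh = wb.lookup()    # size once the cached write reached the backend

print(stale, fresh)
```

If the kernel caches the stale value, a later fstat may never reach the
caching layer at all, which is the situation described in the theory.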
> 
> Action items:
>  At the moment Raghavendra is working on a patch to implement
> lookup/readdirp in write-behind. I am going to test the same for elastic
> search. Will it be possible for you to test your application against the
> same patch and confirm that the patch fixes the problem?
> 
> 
> On Fri, Oct 28, 2016 at 12:08 PM, Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
> 
> > hi George,
> >It would help if we can identify the bare minimum xlators which are
> > contributing to the issue like Raghavendra was mentioning earlier. We were
> > wondering if it is possible for you to help us in identifying the issue by
> > running the workload on a modified setup? We can suggest testing out using
> > custom volfiles so that we can slowly build the graph which could be
> > causing this issue. We would like you guys to try out this problem with
> > just posix-xlator and fuse and nothing else.
> >
> > On Thu, Oct 27, 2016 at 1:40 PM, Lian, George (Nokia - CN/Hangzhou) <
> > george.l...@nokia.com> wrote:
> >
> >> Hi, Raghavendra,
> >>
> >> Could you please give some suggestion for this issue? we try to find the
> >> clue for this issue for a long time, but it has no progress:(
> >>
> >> Thanks & Best Regards,
> >> George
> >>
> >> -Original Message-
> >> From: Lian, George (Nokia - CN/Hangzhou)
> >> Sent: Wednesday, October 19, 2016 4:40 PM
> >> To: 'Raghavendra Gowdappa' 
> >> Cc: Gluster-devel@gluster.org; I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS <
> >> i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com>; Zhang, Bingxuan
> >> (Nokia - CN/Hangzhou) ; Zizka, Jan (Nokia -
> >> CZ/Prague) 
> >> Subject: RE: [Gluster-devel] Issue about the size of fstat is less than
> >> the really size of the syslog file
> >>
> >> Hi, Raghavendra
> >>
> >> Just now, we test it with glusterfs log with debug-level "TRACE", and let
> >> some application trigger "glusterfs" produce large log, in that case, when
> >> we set write-behind and stat-prefetch both OFF,
> >> Tail the glusterfs log such like mnt-{VOLUME-NAME}.log, it still failed
> >> with "file truncated",
> >>
> >> So that means if file's IO in huge amount, the issue will still be there
> >> even write-behind and stat-prefetch both OFF.
> >>
> >> Best Regards,
> >> George
> >>
> >> -Original Message-
> >> From: Raghavendra Gowdappa [mailto:rgowd...@redhat.com]
> >> Sent: Wednesday, October 19, 2016 2:54 PM
> >> To: Lian, George (Nokia - CN/Hangzhou) 
> >> Cc: Gluster-devel@gluster.org; I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS <
> >> i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com>; Zhang, Bingxuan
> >> (Nokia - CN/Hangzhou) ; Zizka, Jan (Nokia -
> >> CZ/Prague) 
> >> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than
> >> the really size of the syslog file
> >>
> >>
> >>
> >> - Original Message -

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-28 Thread Pranith Kumar Karampuri
Hi George,
   It would help if we could identify the bare minimum set of xlators that
contribute to the issue, as Raghavendra mentioned earlier. Would it be
possible for you to help us narrow this down by running the workload on a
modified setup? We can suggest custom volfiles to test with, so that we can
build up the graph step by step until the issue appears. To start, we would
like you to try to reproduce the problem with just the posix xlator and
fuse, and nothing else.


Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-27 Thread Lian, George (Nokia - CN/Hangzhou)
Hi, Raghavendra,

Could you please give some suggestions on this issue? We have been trying to 
find the cause for a long time, but have made no progress :(

Thanks & Best Regards,
George


Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-19 Thread Lian, George (Nokia - CN/Hangzhou)
Hi, Raghavendra,

How can I confirm whether the "stat" value comes from the glusterfs cache or 
the kernel cache?
The issue can now be reproduced easily even with stat-prefetch and write-behind 
both off.
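
For reference, the symptom can be watched for outside of tail with a small checker. The sketch below is only an illustration under stated assumptions: it repeats the comparison tail makes (the size reported by fstat versus the bytes already read) and counts the rounds in which the reported size goes backwards. The names check_size, drain and watch_file are hypothetical helpers, not part of tail or glusterfs. It cannot tell by itself which cache served a stale value; running it once on a normal mount and once on a mount with the "attribute-timeout=0" and "entry-timeout=0" options mentioned in this thread would be one way to compare.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Outcome of comparing the fstat() size with the bytes already consumed. */
enum size_check { SIZE_OK = 0, SIZE_WENT_BACKWARDS = 1, SIZE_CHECK_ERROR = -1 };

/* Same test as tail: a regular file looks "truncated" when fstat()
 * reports fewer bytes than the reader has already read. */
static enum size_check check_size(int fd, off_t already_read, off_t *reported)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return SIZE_CHECK_ERROR;
    if (reported)
        *reported = st.st_size;
    if (S_ISREG(st.st_mode) && st.st_size < already_read)
        return SIZE_WENT_BACKWARDS;
    return SIZE_OK;
}

/* Read from the current offset to EOF; return the new byte total, -1 on error. */
static off_t drain(int fd, off_t already_read)
{
    char buf[8192];
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0)
        already_read += n;
    return n < 0 ? -1 : already_read;
}

/* Poll `path` for `rounds` iterations, sleeping `interval_sec` between them.
 * Returns how many times the reported size went backwards, or -1 on error. */
static int watch_file(const char *path, int rounds, unsigned interval_sec)
{
    off_t consumed = 0;
    int events = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    for (int i = 0; i < rounds; i++) {
        off_t reported = 0;
        enum size_check rc = check_size(fd, consumed, &reported);

        if (rc == SIZE_CHECK_ERROR) {
            close(fd);
            return -1;
        }
        if (rc == SIZE_WENT_BACKWARDS) {
            events++;
            fprintf(stderr, "round %d: fstat size %lld < bytes read %lld\n",
                    i, (long long)reported, (long long)consumed);
        }
        consumed = drain(fd, consumed);
        if (consumed < 0) {
            close(fd);
            return -1;
        }
        if (interval_sec)
            sleep(interval_sec);
    }
    close(fd);
    return events;
}
```

Calling watch_file("/mnt/log/master/syslog", 600, 1) from a small main() would poll the file once per second for ten minutes; any non-zero return value means fstat reported less than had already been read at least once.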

Best Regards,
George


Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-19 Thread Lian, George (Nokia - CN/Hangzhou)
Hi, Raghavendra,

When we disable md-cache (gluster volume set log performance.md-cache-timeout 
0), the issue seems to go away.
(Why can't we disable it with "gluster volume set log performance.md-cache off"?)

So I suspect even more strongly that the code I excerpted in my last mail has 
an issue in this case.
Could you please share your comments?

Thanks & Best Regards,
George

-Original Message-
From: Lian, George (Nokia - CN/Hangzhou) 
Sent: Friday, October 14, 2016 1:44 PM
To: 'Raghavendra Gowdappa' 
Cc: Gluster-devel@gluster.org; I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS 
; Zhang, Bingxuan (Nokia - 
CN/Hangzhou) ; Zizka, Jan (Nokia - CZ/Prague) 

Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the 
really size of the syslog file

Hi, Raghavendra,

Our GlusterFS version is 3.6.9. I also checked the latest code on the main 
branch, and the function "mdc_inode_iatt_set_validate" is almost the same. 
The code of this function, shown below, contains an inline "TODO" comment. 
Does it mean that when the write-behind feature is enabled, the "iatt" field in 
the callback will be NULL, so that inode_invalidate will not be called, and 
therefore the file size will not be updated?
Is this the root cause of the "tail" application failing with "file truncated"?

        LOCK (&mdc->lock);
        {
                if (!iatt || !iatt->ia_ctime) {
                        mdc->ia_time = 0;
                        goto unlock;
                }

                /*
                 * Invalidate the inode if the mtime or ctime has changed
                 * and the prebuf doesn't match the value we have cached.
                 * TODO: writev returns with a NULL iatt due to
                 * performance/write-behind, causing invalidation on writes.
                 */
                if (IA_ISREG(inode->ia_type) &&
                    ((iatt->ia_mtime != mdc->md_mtime) ||
                     (iatt->ia_ctime != mdc->md_ctime)))
                        if (!prebuf || (prebuf->ia_ctime != mdc->md_ctime) ||
                            (prebuf->ia_mtime != mdc->md_mtime))
                                inode_invalidate(inode);

                mdc_from_iatt (mdc, iatt);

                time (&mdc->ia_time);
        }

Best Regards,
George
-Original Message-
From: Raghavendra Gowdappa [mailto:rgowd...@redhat.com] 
Sent: Thursday, October 13, 2016 8:58 PM
To: Lian, George (Nokia - CN/Hangzhou) 
Cc: Gluster-devel@gluster.org; I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS 
; Zhang, Bingxuan (Nokia - 
CN/Hangzhou) ; Zizka, Jan (Nokia - CZ/Prague) 

Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the 
really size of the syslog file



- Original Message -
> From: "George Lian (Nokia - CN/Hangzhou)" 
> To: Gluster-devel@gluster.org
> Cc: "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS" 
> , "Bingxuan Zhang (Nokia
> - CN/Hangzhou)" , "Jan Zizka (Nokia - CZ/Prague)" 
> 
> Sent: Thursday, October 13, 2016 2:33:53 PM
> Subject: [Gluster-devel] Issue about the size of fstat is less than the 
> really size of the syslog file
> 
> Hi, Dear Expert,
> We use glusterfs as a network filesystem and store the syslog file on it;
> clients on different hosts may write to the syslog file via the “glusterfs”
> mount point.
> We have now hit an issue: when we “tail” the syslog file, it occasionally
> fails with the error “file truncated”.
> Studying and tracing the “tail” source code, we found that it fails in the
> following code:
>     if (S_ISREG (mode) && stats.st_size < f[i].size)
>       {
>         error (0, 0, _("%s: file truncated"), quotef (name));
>         /* Assume the file was truncated to 0,
>            and therefore output all "new" data.  */
>         xlseek (fd, 0, SEEK_SET, name);
>         f[i].size = 0;
>       }
> When stats.st_size < f[i].size, that is, when the size reported by fstat is
> smaller than what “tail” has already read, “tail” reports “file truncated”.
> We also used “strace” to trace the tail application; the relevant part of
> the strace log is below:
> nanosleep({1, 0}, NULL) = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=192543105, ...}) = 0
> nanosleep({1, 0}, NULL) = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=192543105, ...}) = 0
> nanosleep({1, 0}, NULL) = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=192543105, ...}) = 0
> nanosleep({1, 0}, NULL) = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=192544549, ...}) = 0
> read(3, " Data … -"..., 8192) = 1444
> read(3, " Data.. "..., 8192) = 720
> read(3, "", 8192) = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=192544789, ...}) = 0
> write(1, “DATA…..” ) = 2164
> write(2, "tail: ", 6tail: ) = 6
> write(2, "/mnt/log/master/syslog: file tru"..., 38/mnt/log/master/syslog:

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-19 Thread Lian, George (Nokia - CN/Hangzhou)
Hi, Raghavendra,

Thanks a lot for your quick update!
In my case, many processes are writing to the syslog file, and this does 
include writers on the same host, writing through the same mount point, while 
the tail (reader) is reading it.

My guess at the bug is: 
When a writer writes data with write-behind enabled, the callback function 
"mdc_writev_cbk" calls "mdc_inode_iatt_set_validate" to validate the "iatt" 
data, but with the code I mentioned in my last mail, it does nothing.
At the same time, the reader (tail) reads the "iatt" data, and if the cache 
timeout has not yet expired, it gets the "iatt" data without the latest 
change.

Do you think this is a possible bug?

Thanks & Best Regards,
George

-Original Message-
From: Raghavendra Gowdappa [mailto:rgowd...@redhat.com] 
Sent: Wednesday, October 19, 2016 2:06 PM
To: Lian, George (Nokia - CN/Hangzhou) 
Cc: Gluster-devel@gluster.org; I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS 
; Zhang, Bingxuan (Nokia - 
CN/Hangzhou) ; Zizka, Jan (Nokia - CZ/Prague) 

Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the 
really size of the syslog file



- Original Message -
> From: "George Lian (Nokia - CN/Hangzhou)" 
> To: "Raghavendra Gowdappa" 
> Cc: Gluster-devel@gluster.org, "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS"
> , "Bingxuan Zhang (Nokia - 
> CN/Hangzhou)"
> , "Jan Zizka (Nokia - CZ/Prague)" 
> 
> Sent: Wednesday, October 19, 2016 10:51:24 AM
> Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the 
> really size of the syslog file
> 
> Hi, Raghavendra,
> 
> When we disable md-cache(gluster volume set log performance.md-cache-timeout
> 0),  the issue seems gone.
> (we can't disable with " gluster volume set log performance.md-cache off"
> why?)

Please use
#gluster volume set log performance.stat-prefetch off

> 
> So I double confuse that the code I abstract in last mail maybe have some
> issue for this case.
> Could you please share your comments?

Please find my comments below.

> 
> Thanks & Best Regards,
> George
> 
> -Original Message-
> From: Lian, George (Nokia - CN/Hangzhou)
> Sent: Friday, October 14, 2016 1:44 PM
> To: 'Raghavendra Gowdappa' 
> Cc: Gluster-devel@gluster.org; I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS
> ; Zhang, Bingxuan (Nokia -
> CN/Hangzhou) ; Zizka, Jan (Nokia - CZ/Prague)
> 
> Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the
> really size of the syslog file
> 
> Hi, Raghavendra,
> 
> Our version of GlusterFS is 3.6.9, and I also check the newest code of main
> branch, the function of " mdc_inode_iatt_set_validate" is almost same, from
> the following code of this function,
> We could see a "TODO" comments inline, does it mean if we enhance
> write-behind feature, the "iatt" field in callback will be NULL, so that
> inode_invalidate will not be called? So the size of file will not update
> since "write behind" enabled ?
> Is it the root cause for "tail" application failed with "file truncated"
> issue ?
> 
>         LOCK (&mdc->lock);
>         {
>                 if (!iatt || !iatt->ia_ctime) {
>                         mdc->ia_time = 0;
>                         goto unlock;
>                 }
> 
>                 /*
>                  * Invalidate the inode if the mtime or ctime has changed
>                  * and the prebuf doesn't match the value we have cached.
>                  * TODO: writev returns with a NULL iatt due to
>                  * performance/write-behind, causing invalidation on writes.
>                  */

The issue described in this comment is hit only when writes are done. But in 
your use case, "tail" is the only application running on the mount (if I am not 
mistaken, the writer is running on a different mount point). So I doubt you are 
hitting this issue. However, you say that the issue goes away when 
write-behind/md-cache is turned off, which points to some interaction between 
md-cache and write-behind causing it. I need more time to look into this 
issue. Can you file a bug on this?

>                 if (IA_ISREG(inode->ia_type) &&
>                     ((iatt->ia_mtime != mdc->md_mtime) ||
>                      (iatt->ia_ctime != mdc->md_ctime)))
>                         if (!prebuf || (prebuf->ia_ctime != mdc->md_ctime) ||
>                             (prebuf->ia_mtime != mdc->md_mtime))
>                                 inode_invalidate(inode);
> 
>                 mdc_from_iatt (mdc, iatt);
> 
>                 time (&mdc->ia_time);
>         }
> 
> Best Regards,
> George
> -Original Message-
> From: Raghavendra Gowdappa 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-19 Thread Lian, George (Nokia - CN/Hangzhou)
Hi, Raghavendra

Just now we tested with the glusterfs log at debug level "TRACE", and had an 
application trigger "glusterfs" to produce a large amount of log output. In 
that case, with write-behind and stat-prefetch both OFF, tailing a glusterfs 
log such as mnt-{VOLUME-NAME}.log still fails with "file truncated".

So when the I/O volume on the file is huge, the issue is still there even with 
write-behind and stat-prefetch both OFF.

Best Regards,
George


Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-19 Thread Raghavendra Gowdappa


- Original Message -
> From: "George Lian (Nokia - CN/Hangzhou)" 
> To: "Raghavendra Gowdappa" 
> Cc: Gluster-devel@gluster.org, "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS"
> , "Bingxuan Zhang (Nokia - 
> CN/Hangzhou)"
> , "Jan Zizka (Nokia - CZ/Prague)" 
> 
> Sent: Wednesday, October 19, 2016 12:05:01 PM
> Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the 
> really size of the syslog file
> 
> Hi, Raghavendra,
> 
> Thanks a lots for your quickly update!
> In my case, there are so many process(write) is writing to the syslog file,
> it do involve the writer is in the same host and writing in same mount point
> while the tail(reader) is reading it.
> 
> The bug I just guess is:
> When a writer write the data with write-behind, it call the call-back
> function " mdc_writev_cbk" and called "mdc_inode_iatt_set_validate" to
> validate the "iatt" data, but with the code I mentioned last mail, it do
> nothing.

mdc_inode_iatt_set_validate has the following code:

        if (!iatt || !iatt->ia_ctime) {
                mdc->ia_time = 0;
                goto unlock;
        }

This means a NULL iatt sets mdc->ia_time to 0. As a result, subsequent 
lookup/stat calls are NOT served from md-cache; instead, the stat is served 
from the backend bricks. So, I don't see an issue here.

However, a NULL iatt does differ from a valid iatt (one that differs from the 
value stored in md-cache) in one respect: the latter results in a call to 
inode_invalidate. This invalidation propagates to the kernel, and all dentry 
and page cache entries corresponding to the file are purged. So, I suspect that 
the stale stat you saw was served from the kernel cache (not from glusterfs). 
If that is the case, the mount options "attribute-timeout=0" and 
"entry-timeout=0" should have helped.

I am still at a loss to identify the root cause (RCA) of this issue.
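
The two cases described above can be modelled in isolation. The following is a simplified sketch, not the md-cache source: struct model_iatt, struct model_mdc and both functions are stand-ins invented for illustration, only the mtime/ctime fields are copied, and the validity check is an assumption about how a zero timestamp is treated, based on the statement that later stats are then not served from md-cache.

```c
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Simplified stand-ins for the GlusterFS structures (assumed fields only). */
struct model_iatt { long ia_mtime; long ia_ctime; };
struct model_mdc  { long md_mtime; long md_ctime; time_t ia_time; };

enum mdc_action {
    MDC_CACHE_DROPPED,            /* NULL/empty iatt: timestamp reset to 0 */
    MDC_UPDATED,                  /* cache refreshed, no invalidation      */
    MDC_UPDATED_AND_INVALIDATED   /* cache refreshed, kernel invalidated   */
};

/* Mirrors the branch structure of the excerpt quoted in this thread. */
static enum mdc_action
model_iatt_set_validate(struct model_mdc *mdc, bool is_regular_file,
                        const struct model_iatt *prebuf,
                        const struct model_iatt *iatt, time_t now)
{
    bool invalidate = false;

    if (!iatt || !iatt->ia_ctime) {
        mdc->ia_time = 0;         /* cached stat is no longer trusted */
        return MDC_CACHE_DROPPED;
    }

    /* Times changed and prebuf does not match what was cached. */
    if (is_regular_file &&
        (iatt->ia_mtime != mdc->md_mtime || iatt->ia_ctime != mdc->md_ctime)) {
        if (!prebuf || prebuf->ia_ctime != mdc->md_ctime ||
            prebuf->ia_mtime != mdc->md_mtime)
            invalidate = true;    /* stands in for inode_invalidate() */
    }

    mdc->md_mtime = iatt->ia_mtime;
    mdc->md_ctime = iatt->ia_ctime;
    mdc->ia_time = now;
    return invalidate ? MDC_UPDATED_AND_INVALIDATED : MDC_UPDATED;
}

/* Assumed validity rule: a zero timestamp or an expired one is a miss. */
static bool model_cache_is_valid(const struct model_mdc *mdc, time_t now,
                                 time_t timeout)
{
    return mdc->ia_time != 0 && now < mdc->ia_time + timeout;
}
```

In this model a NULL iatt only drops the cached entry, while a valid iatt that differs from the cached times (with no matching prebuf) is the only path that triggers the kernel-side invalidation.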


> And in same time, the reader(tail) read the "iatt" data, but in case of the
> cache-time is not timeout, it will return the "iatt" data without the last
> change.
> 
> Do your think it is a possible bug?
> 
> Thanks & Best Regards,
> George
> 
> -Original Message-
> From: Raghavendra Gowdappa [mailto:rgowd...@redhat.com]
> Sent: Wednesday, October 19, 2016 2:06 PM
> To: Lian, George (Nokia - CN/Hangzhou) 
> Cc: Gluster-devel@gluster.org; I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS
> ; Zhang, Bingxuan (Nokia -
> CN/Hangzhou) ; Zizka, Jan (Nokia - CZ/Prague)
> 
> Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the
> really size of the syslog file
> 
> 
> 
> - Original Message -
> > From: "George Lian (Nokia - CN/Hangzhou)" 
> > To: "Raghavendra Gowdappa" 
> > Cc: Gluster-devel@gluster.org, "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS"
> > , "Bingxuan Zhang (Nokia
> > - CN/Hangzhou)"
> > , "Jan Zizka (Nokia - CZ/Prague)"
> > 
> > Sent: Wednesday, October 19, 2016 10:51:24 AM
> > Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the
> > really size of the syslog file
> > 
> > Hi, Raghavendra,
> > 
> > When we disable md-cache(gluster volume set log
> > performance.md-cache-timeout
> > 0),  the issue seems gone.
> > (we can't disable with " gluster volume set log performance.md-cache off"
> > why?)
> 
> Please use
> #gluster volume set log performance.stat-prefetch off
> 
> > 
> > So I double confuse that the code I abstract in last mail maybe have some
> > issue for this case.
> > Could you please share your comments?
> 
> Please find my comments below.
> 
> > 
> > Thanks & Best Regards,
> > George
> > 
> > -Original Message-
> > From: Lian, George (Nokia - CN/Hangzhou)
> > Sent: Friday, October 14, 2016 1:44 PM
> > To: 'Raghavendra Gowdappa' 
> > Cc: Gluster-devel@gluster.org; I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS
> > ; Zhang, Bingxuan (Nokia
> > -
> > CN/Hangzhou) ; Zizka, Jan (Nokia - CZ/Prague)
> > 
> > Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the
> > really size of the syslog file
> > 
> > Hi, Raghavendra,
> > 
> > Our version of GlusterFS is 3.6.9, and I also check the newest code of main
> > branch, the function of " mdc_inode_iatt_set_validate" is almost same, from
> > the following code of this function,
> > We could see a "TODO" comments inline, does it mean if we enhance
> > write-behind feature, the "iatt" field in callback will be NULL, so that
> > inode_invalidate will not be called? So the size of file will not update
> > since "write 

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-19 Thread Raghavendra Gowdappa


- Original Message -
> From: "George Lian (Nokia - CN/Hangzhou)" 
> To: "Raghavendra Gowdappa" 
> Cc: Gluster-devel@gluster.org, "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS"
> , "Bingxuan Zhang (Nokia - 
> CN/Hangzhou)"
> , "Jan Zizka (Nokia - CZ/Prague)" 
> 
> Sent: Wednesday, October 19, 2016 10:51:24 AM
> Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the 
> really size of the syslog file
> 
> Hi, Raghavendra,
> 
> When we disable md-cache(gluster volume set log performance.md-cache-timeout
> 0),  the issue seems gone.
> (we can't disable with " gluster volume set log performance.md-cache off"
> why?)

Please use
#gluster volume set log performance.stat-prefetch off

> 
> So I suspect even more that the code I excerpted in my last mail may have an
> issue in this case.
> Could you please share your comments?

Please find my comments below.

> 
> Thanks & Best Regards,
> George
> 
> -Original Message-
> From: Lian, George (Nokia - CN/Hangzhou)
> Sent: Friday, October 14, 2016 1:44 PM
> To: 'Raghavendra Gowdappa' 
> Cc: Gluster-devel@gluster.org; I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS
> ; Zhang, Bingxuan (Nokia -
> CN/Hangzhou) ; Zizka, Jan (Nokia - CZ/Prague)
> 
> Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the
> really size of the syslog file
> 
> Hi, Raghavendra,
> 
> Our version of GlusterFS is 3.6.9. I also checked the latest code on the main
> branch, and the function "mdc_inode_iatt_set_validate" is almost the same
> there. In the following code from this function we can see an inline "TODO"
> comment. Does it mean that, with the write-behind feature enabled, the "iatt"
> field in the callback will be NULL, so that inode_invalidate will not be
> called, and the size of the file will therefore not be updated while
> "write behind" is enabled?
> Is this the root cause of the "tail" application failing with "file
> truncated"?
> 
> LOCK (&mdc->lock);
> {
>         if (!iatt || !iatt->ia_ctime) {
>                 mdc->ia_time = 0;
>                 goto unlock;
>         }
> 
>         /*
>          * Invalidate the inode if the mtime or ctime has changed
>          * and the prebuf doesn't match the value we have cached.
>          * TODO: writev returns with a NULL iatt due to
>          * performance/write-behind, causing invalidation on writes.
>          */

The issue described in this comment is hit only when writes are done. In your 
use case, however, "tail" is the only application running on this mount (if I 
am not wrong, the writer is running on a different mount point), so I doubt you 
are hitting it. Still, you say the issue goes away when either write-behind or 
md-cache is turned off, which points to some interaction between md-cache and 
write-behind. I need more time to look into this. Can you file a bug for it?

>         if (IA_ISREG(inode->ia_type) &&
>             ((iatt->ia_mtime != mdc->md_mtime) ||
>              (iatt->ia_ctime != mdc->md_ctime)))
>                 if (!prebuf || (prebuf->ia_ctime != mdc->md_ctime) ||
>                     (prebuf->ia_mtime != mdc->md_mtime))
>                         inode_invalidate(inode);
> 
>         mdc_from_iatt (mdc, iatt);
> 
>         time (&mdc->ia_time);
> }
> 
> Best Regards,
> George

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-14 Thread Lian, George (Nokia - CN/Hangzhou)
Hi, Raghavendra,

Our version of GlusterFS is 3.6.9. I also checked the latest code on the main 
branch, and the function "mdc_inode_iatt_set_validate" is almost the same 
there. In the following code from this function we can see an inline "TODO" 
comment. Does it mean that, with the write-behind feature enabled, the "iatt" 
field in the callback will be NULL, so that inode_invalidate will not be 
called, and the size of the file will therefore not be updated while 
"write behind" is enabled?
Is this the root cause of the "tail" application failing with "file truncated"?

LOCK (&mdc->lock);
{
        if (!iatt || !iatt->ia_ctime) {
                mdc->ia_time = 0;
                goto unlock;
        }

        /*
         * Invalidate the inode if the mtime or ctime has changed
         * and the prebuf doesn't match the value we have cached.
         * TODO: writev returns with a NULL iatt due to
         * performance/write-behind, causing invalidation on writes.
         */
        if (IA_ISREG(inode->ia_type) &&
            ((iatt->ia_mtime != mdc->md_mtime) ||
             (iatt->ia_ctime != mdc->md_ctime)))
                if (!prebuf || (prebuf->ia_ctime != mdc->md_ctime) ||
                    (prebuf->ia_mtime != mdc->md_mtime))
                        inode_invalidate(inode);

        mdc_from_iatt (mdc, iatt);

        time (&mdc->ia_time);
}
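
To make the question concrete, the condition in the excerpt above can be 
restated as a standalone predicate. This is only a sketch under assumptions: 
fake_iatt, fake_cache and should_invalidate are hypothetical stand-ins, not the 
md-cache types, and it models only the quoted lines, not the rest of the 
function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two struct iatt fields the excerpt reads. */
struct fake_iatt {
    long ia_mtime;
    long ia_ctime;
};

/* Hypothetical stand-in for the cached md_mtime / md_ctime values. */
struct fake_cache {
    long md_mtime;
    long md_ctime;
};

/*
 * Returns true when the quoted excerpt would reach inode_invalidate().
 * is_regular models IA_ISREG(inode->ia_type).
 */
static bool should_invalidate(bool is_regular,
                              const struct fake_iatt *prebuf,
                              const struct fake_iatt *iatt,
                              const struct fake_cache *mdc)
{
    assert(mdc != NULL);

    /* Early exit in the excerpt: the cache time is reset, nothing is invalidated. */
    if (iatt == NULL || iatt->ia_ctime == 0)
        return false;

    if (!is_regular)
        return false;

    /* The post-operation attributes differ from what is cached ... */
    bool post_changed = iatt->ia_mtime != mdc->md_mtime ||
                        iatt->ia_ctime != mdc->md_ctime;
    /* ... and the pre-operation attributes do not match the cache either. */
    bool pre_mismatch = prebuf == NULL ||
                        prebuf->ia_ctime != mdc->md_ctime ||
                        prebuf->ia_mtime != mdc->md_mtime;

    return post_changed && pre_mismatch;
}
```

Read this way, a NULL iatt takes the early exit: the excerpt sets mdc->ia_time 
to 0 and never reaches inode_invalidate. Whether that explains the "tail" 
failure is exactly the open question.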

Best Regards,
George

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-13 Thread Lian, George (Nokia - CN/Hangzhou)
Hi, Raghavendra, 

Thanks a lot for your update!

We will test it according to your instructions and will update you when we have 
results.

Thanks & Best Regards,
George

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-13 Thread Raghavendra Gowdappa


- Original Message -
> From: "George Lian (Nokia - CN/Hangzhou)" 
> To: Gluster-devel@gluster.org
> Cc: "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS" 
> , "Bingxuan Zhang (Nokia
> - CN/Hangzhou)" , "Jan Zizka (Nokia - CZ/Prague)" 
> 
> Sent: Thursday, October 13, 2016 2:33:53 PM
> Subject: [Gluster-devel] Issue about the size of fstat is less than the 
> really size of the syslog file
> 
> Hi, dear experts,
> We use GlusterFS as a network filesystem and store our syslog file on it.
> Clients on different hosts may write to the syslog file through a “glusterfs”
> mount point.
> We have run into an issue: when we “tail” the syslog file, it occasionally
> fails with the error “file truncated”.
> Tracing through the “tail” source code, we found that the error comes from
> the following code:
> if ( S_ISREG (mode) && stats.st_size < f[i].size )
> {
> error (0, 0, _("%s: file truncated"), quotef (name));
> /* Assume the file was truncated to 0,
> and therefore output all "new" data. */
> xlseek (fd, 0, SEEK_SET, name);
> f[i].size = 0;
> }
> The condition stats.st_size < f[i].size means that the size reported by fstat
> is less than what “tail” has already read, which leads to “file truncated”.
> We also traced the tail application with “strace”; the relevant part of the
> log is below:
> nanosleep({1, 0}, NULL) = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=192543105, ...}) = 0
> nanosleep({1, 0}, NULL) = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=192543105, ...}) = 0
> nanosleep({1, 0}, NULL) = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=192543105, ...}) = 0
> nanosleep({1, 0}, NULL) = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=192544549, ...}) = 0
> read(3, " Data … -"..., 8192) = 1444
> read(3, " Data.. "..., 8192) = 720
> read(3, "", 8192) = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=192544789, ...}) = 0
> write(1, “DATA…..” ) = 2164
> write(2, "tail: ", 6tail: ) = 6
> write(2, "/mnt/log/master/syslog: file tru"..., 38/mnt/log/master/syslog:
> file truncated) = 38
> As the strace log above shows, tail has read 1444 + 720 = 2164 bytes,
> but fstat reports growth of only 192544789 – 192543105 = 1684 bytes, which is
> less than 2164, so “tail” reports “file truncated”.
> If we turn off the “write-behind” feature, the issue can no longer be
> reproduced.

That seems strange. No writes are happening on the fd/inode that tail is 
reading and stat-ing, so I would not expect write-behind to be involved here. I 
suspect that one of md-cache, read-ahead or io-cache is causing the issue. Can 
you:

1. Turn off md-cache, read-ahead, io-cache xlators
2. mount glusterfs with --attribute-timeout=0
3. set write-behind on

and rerun the tests? If you don't hit the issue, you can experiment by turning 
the md-cache, read-ahead and io-cache translators on and off to find the 
minimal set of xlators that must be turned off to avoid the issue (with 
write-behind on).
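
When rerunning the tests, it may help to watch the size reported by fstat 
directly rather than relying on the message from tail. The following is only a 
sketch, not part of GlusterFS or coreutils: size_watch, size_went_backwards and 
watch are hypothetical names, and it assumes the file is only ever appended to, 
so any decrease in the reported size is suspicious.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper state: the largest size fstat() has reported so far. */
struct size_watch {
    long long max_seen;
};

/*
 * fstat() the descriptor once. Returns true if the reported size is smaller
 * than one reported earlier. Returns false otherwise, including when fstat()
 * itself fails. max_seen is only ever raised, never lowered.
 */
static bool size_went_backwards(int fd, struct size_watch *w)
{
    struct stat sb;

    assert(w != NULL);
    if (fstat(fd, &sb) != 0)
        return false;
    if ((long long)sb.st_size < w->max_seen)
        return true;
    w->max_seen = (long long)sb.st_size;
    return false;
}

/* Poll the descriptor `polls` times, one second apart; count anomalies. */
static int watch(int fd, int polls)
{
    struct size_watch w = { 0 };
    int anomalies = 0;

    for (int i = 0; i < polls; i++) {
        if (size_went_backwards(fd, &w)) {
            fprintf(stderr, "size went backwards (max seen %lld)\n",
                    w.max_seen);
            anomalies++;
        }
        sleep(1);
    }
    return anomalies;
}
```

Opening the syslog file on the glusterfs mount and passing the descriptor to 
watch() for the duration of a test run would count how often the size appears 
to shrink under each xlator combination.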

regards,
Raghavendra

> So we think it may be a cache consistency issue introduced for performance
> reasons, but we still have a question:
> The syslog file is only ever written in “append” mode, so its size should
> never shrink. When a client reads the file, why can’t “fstat” return the real
> size, matching the cached data?
> From our investigation so far, we suspect that the current implementation of
> “glusterfs” has a bug in “fstat” when caching is on.
> Your comments would be highly appreciated!
> Thanks & Best Regards
> George
> 

[Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file

2016-10-13 Thread Lian, George (Nokia - CN/Hangzhou)
Hi, dear experts,

We use GlusterFS as a network filesystem and store our syslog file on it. 
Clients on different hosts may write to the syslog file through a "glusterfs" 
mount point.

We have run into an issue: when we "tail" the syslog file, it occasionally 
fails with the error "file truncated".
Tracing through the "tail" source code, we found that the error comes from the 
following code:
if (S_ISREG (mode) && stats.st_size < f[i].size)
{
  error (0, 0, _("%s: file truncated"), quotef (name));
  /* Assume the file was truncated to 0,
 and therefore output all "new" data.  */
  xlseek (fd, 0, SEEK_SET, name);
  f[i].size = 0;
}

The condition stats.st_size < f[i].size means that the size reported by fstat 
is less than what "tail" has already read, which leads to "file truncated". We 
also traced the tail application with "strace"; the relevant part of the log is 
below:

nanosleep({1, 0}, NULL) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=192543105, ...}) = 0
nanosleep({1, 0}, NULL) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=192543105, ...}) = 0
nanosleep({1, 0}, NULL) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=192543105, ...}) = 0
nanosleep({1, 0}, NULL) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=192544549, ...}) = 0
read(3, "Data ...-"..., 8192) = 1444
read(3, "Data.."..., 8192) = 720
read(3, "", 8192)   = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=192544789, ...}) = 0
write(1, "DATA.") = 2164
write(2, "tail: ", 6tail: )   = 6
write(2, "/mnt/log/master/syslog: file tru"..., 38/mnt/log/master/syslog: file 
truncated) = 38

As the strace log above shows, tail has read 1444 + 720 = 2164 bytes,
but fstat reports growth of only 192544789 - 192543105 = 1684 bytes, which is 
less than 2164, so "tail" reports "file truncated".
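
The arithmetic can be replayed with a small model of the bookkeeping in the 
tail snippet above. This is a sketch, not the coreutils code: follow_state, 
account_read, looks_truncated and replay_strace are hypothetical names, and it 
assumes tail had consumed exactly 192543105 bytes before the file grew.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-file state tail keeps while following. */
struct follow_state {
    long long size;   /* bytes consumed so far, like f[i].size */
};

/* Account for one read() that returned n bytes. */
static void account_read(struct follow_state *st, long long n)
{
    assert(n >= 0);
    st->size += n;
}

/* Same comparison as the tail snippet: reported size below consumed size. */
static bool looks_truncated(const struct follow_state *st, long long st_size)
{
    return st_size < st->size;
}

/* Replays the strace numbers; returns true if tail would complain. */
static bool replay_strace(void)
{
    struct follow_state st = { 192543105LL };  /* assumed starting offset */

    account_read(&st, 1444);
    account_read(&st, 720);                    /* st.size is now 192545269 */
    return looks_truncated(&st, 192544789LL);  /* size from the last fstat */
}
```

With these numbers the consumed size ends up 480 bytes ahead of the size from 
the last fstat, so the comparison fires even though the file never shrank.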

If we turn off the "write-behind" feature, the issue can no longer be 
reproduced.
So we think it may be a cache consistency issue introduced for performance 
reasons, but we still have a question:

The syslog file is only ever written in "append" mode, so its size should never 
shrink. When a client reads the file, why can't "fstat" return the real size, 
matching the cached data?
From our investigation so far, we suspect that the current implementation of 
"glusterfs" has a bug in "fstat" when caching is on.

Your comments would be highly appreciated!

Thanks & Best Regards

George




