RE: some concern about the fix of " tail: consistently output all data for truncated files"

Lian, George (Nokia - CN/Hangzhou) Thu, 10 Nov 2016 08:04:39 -0800

Hi,

The change looks fine for the issue we encountered. Thanks a lot for your 
support!

> Note I still think that's a bad glusterfs bug as the
> client system should update its attribute cache when
> reading data to the local system.

I agree with you. We will keep communicating with the GlusterFS community, and 
meanwhile we will try to find the defect ourselves.

I still have a concern about the condition used to detect truncation: when a 
truncation is detected, tail just seeks to the head of the file. 
Does tail not consider the case where only part of the file is truncated 
(e.g. truncate(fd, 10K))?
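
To make the concern concrete, here is a minimal sketch of the check as I 
understand it. The helper names looks_truncated and handle_truncation are 
hypothetical and are not taken from tail.c; the sketch only assumes that 
truncation is detected by st_size dropping below the amount already output, 
and that the reaction is a seek to the head.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not from tail.c: a file looks truncated when its
   current size is smaller than the number of bytes already output.  */
static bool
looks_truncated (off_t st_size, off_t bytes_already_output)
{
  return st_size < bytes_already_output;
}

/* If FD refers to a regular file that looks truncated, rewind to the
   head and return the new offset (0).  Otherwise return the offset
   unchanged.  Return -1 if fstat or lseek fails.  */
static off_t
handle_truncation (int fd, off_t bytes_already_output)
{
  struct stat st;

  if (fstat (fd, &st) != 0)
    return -1;

  if (S_ISREG (st.st_mode)
      && looks_truncated (st.st_size, bytes_already_output))
    return lseek (fd, 0, SEEK_SET);

  return bytes_already_output;
}
```

With this logic, a file that was followed up to 20K and then cut to 10K is 
rewound to offset 0, so the surviving 10K is output again in full.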

Thanks & Best Regards,
George

-----Original Message-----
From: Pádraig Brady [mailto:[email protected]] 
Sent: Wednesday, November 09, 2016 10:12 PM
To: Lian, George (Nokia - CN/Hangzhou) <[email protected]>; 
[email protected]
Cc: Zhang, Bingxuan (Nokia - CN/Hangzhou) <[email protected]>; Li, 
Deqian (Nokia - CN/Hangzhou) <[email protected]>; Zizka, Jan (Nokia - 
CZ/Prague) <[email protected]>; Bao, Xiaohui (Nokia - CN/Hangzhou) 
<[email protected]>
Subject: Re: some concern about the fix of " tail: consistently output all data 
for truncated files"

On 09/11/16 01:35, Lian, George (Nokia - CN/Hangzhou) wrote:
> Hi, 
>> What network file system type is this?
> 
> The file system is GlusterFS from Red Hat.
> 
>> This stale st_size behavior, giving a smaller value _after_ a read, seems 
>> quite problematic to lots of apps though, not just tail(1).
> I agree, but I still suppose most applications will get st_size first and then 
> seek and read, which will not go beyond the size of the file.
> 
> We have also submitted the issue to the GlusterFS community, but so far they 
> can't find the root cause in glusterfs.
> 
> I still have a complaint about the "tail" application: even if there is an 
> issue in glusterfs, 
> "tail" eats all the disk space (through continuous pseudo-truncation of a 
> large syslog file). I suggest "tail" could make some change to prevent it.

How about something like this?
I.e., be careful not to read more than st_size.

This should avoid the incoherency of a stat() giving
smaller st_size _after_ having read() a larger amount.
Note I still think that's a bad glusterfs bug as the
client system should update its attribute cache when
reading data to the local system.

Now this has the disadvantage of possibly splitting data
over more ==> file headers <== than strictly needed,
but I've also limited to remote files here, so that's
ok I think.

Note the old code (before v8.24) that ignored all existing
data in a truncated file would have also output repeated data
in this case I think, as it did a seek() back to the smaller
st_size. I suppose that was not so problematic in terms of
disk space at least, but it was still confusing.

cheers,
Pádraig.

diff --git a/src/tail.c b/src/tail.c
index 96982ed..f002db6 100644
--- a/src/tail.c
+++ b/src/tail.c
@@ -1222,7 +1222,10 @@ tail_forever (struct File_spec *f, size_t n_files, double

           bytes_read = dump_remainder (name, fd,
                                        (f[i].blocking
-                                        ? COPY_A_BUFFER : COPY_TO_EOF));
+                                        ? COPY_A_BUFFER :
+                                          S_ISREG (mode) && f[i].remote
+                                          ? stats.st_size - f[i].size :
+                                            COPY_TO_EOF));
           any_input |= (bytes_read != 0);
           f[i].size += bytes_read;
         }
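
For readers who want the idea outside the context of tail.c, the following is 
a minimal standalone sketch of "do not read more than st_size". The helper 
names remote_read_limit and copy_at_most are hypothetical and are not 
functions in coreutils; the sketch only assumes that the caller tracks how 
many bytes it has already output and has a recent st_size from fstat().

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of tail.c: number of bytes that may be
   copied without reading past the size reported by the last fstat().  */
static uintmax_t
remote_read_limit (off_t st_size, off_t bytes_already_output)
{
  if (st_size <= bytes_already_output)
    return 0;
  return (uintmax_t) (st_size - bytes_already_output);
}

/* Copy at most LIMIT bytes from IN_FD to OUT_FD and return the number
   of bytes written.  Stops early on EOF or on a read/write error.  */
static uintmax_t
copy_at_most (int in_fd, int out_fd, uintmax_t limit)
{
  char buf[4096];
  uintmax_t copied = 0;

  while (copied < limit)
    {
      size_t want = sizeof buf;
      if (limit - copied < want)
        want = (size_t) (limit - copied);

      ssize_t n = read (in_fd, buf, want);
      if (n <= 0)
        break;

      /* write() may be partial, so loop until the chunk is flushed.  */
      size_t off = 0;
      while (off < (size_t) n)
        {
          ssize_t w = write (out_fd, buf + off, (size_t) n - off);
          if (w <= 0)
            return copied + off;
          off += (size_t) w;
        }
      copied += (uintmax_t) n;
    }

  return copied;
}
```

A caller would pass remote_read_limit (st.st_size, size_so_far) as the limit, 
so that data appended after the fstat() is left for the next iteration 
instead of pushing the recorded size beyond the st_size just observed.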
