On 09.11.2016 at 09:00, Zizka, Jan (Nokia - CZ/Prague) wrote:

>> -----Original Message-----
>> From: Zhang, Bingxuan (Nokia - CN/Hangzhou)
>> Sent: Wednesday, November 09, 2016 8:51 AM
>> To: Zizka, Jan (Nokia - CZ/Prague) <[email protected]>; Lian, George
>> (Nokia - CN/Hangzhou) <[email protected]>; Pádraig Brady
>> <[email protected]>; [email protected]
>> Cc: Li, Deqian (Nokia - CN/Hangzhou) <[email protected]>; Bao, Xiaohui
>> (Nokia - CN/Hangzhou) <[email protected]>
>> Subject: RE: some concern about the fix of "tail: consistently output all
>> data for truncated files"
>>
>>> Can you tell any real use case where the changed tail behaviour would fail
>>> and print old content as you describe? I mean some real use case, not the
>>> behaviour caused by the GlusterFS bug.
>>
>> Not found in a real environment, but we can design a program that does this:
>> A program writes a log file and wants to always keep its first 1K bytes.
>> When the file reaches its limit (e.g. 10K bytes), it truncates its content
>> to 1KB, then starts to write content again.
>>
>> In this case, with the new version, the beginning 1KB of data will be
>> printed by tail every time the truncation happens.
>
> Yes, I'm sure one can always find some artificial case, but can you think of
> any real use case? Because I could not think of any kind of real use case.
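To make the two behaviours under discussion concrete, here is a minimal simulation sketch, not the actual tail code. It assumes a follower that polls the file, remembers the offset it has printed up to, and detects a truncation only by seeing a size smaller than that offset. The function name `follow`, the policy names `restart` and `seek_end`, and the 4-byte `HEAD` standing in for the 1KB header are all made up for illustration, and unlike tail the sketch starts printing at offset 0.

```python
def follow(snapshots, on_shrink):
    """Simulate a follower over successive file contents.

    snapshots: list of bytes, the file content at each poll.
    on_shrink: "restart" re-reads from offset 0 (the new behaviour as
               described in this thread); "seek_end" jumps to the new end
               of the file (the old behaviour as described in this thread).
    Returns everything the follower would have printed.
    """
    out = bytearray()
    offset = 0
    for content in snapshots:
        if len(content) < offset:  # file got shorter: truncation detected
            offset = 0 if on_shrink == "restart" else len(content)
        out += content[offset:]
        offset = len(content)
    return bytes(out)


# Jimmy's constructed case: the writer keeps a fixed header and truncates
# back to it when the file reaches its limit.
keep_header = [b"HEADaaaa", b"HEADaaaabbbb", b"HEAD", b"HEADcccc"]
print(follow(keep_header, "restart"))   # header is printed a second time
print(follow(keep_header, "seek_end"))  # header is printed once

# Jan's case: the file is overwritten with new, shorter content.
overwritten = [b"line1\nline2\n", b"new\n"]
print(follow(overwritten, "restart"))   # new content is printed
print(follow(overwritten, "seek_end"))  # new content is lost
```

Under these assumptions each policy fails on exactly one of the two cases, which is the trade-off both sides are describing.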
I used `tail -f` in the past to feed the logfile of IBM's Tivoli Storage
Manager to a remote syslog. ITSM can truncate the logfile by keeping only
the last e.g. 8 days (no rotation), so the file gets shorter at some point.

(Nowadays I have implemented this in syslog-ng directly: it reads the files
and forwards them to a remote syslog-ng server. And yes: syslog-ng also
outputs the first part of the file again when it gets truncated. But as I
only look at it in case of a problem, that wasn't a reason for me to switch
back again.)

-- Reuti

(A sophisticated behavior would be to memorize the already output lines
and, when the file gets shorter, to scan for a block of at least N matching
lines to synchronize again: no double output, no missing lines.)

> Moreover, what may happen with the old design is that, in case of file
> rotation, that part of the data will be missing in tail output. And that
> is a real use case.
>
> Jan
>
>> Br, Jimmy
>>
>> -----Original Message-----
>> From: Zizka, Jan (Nokia - CZ/Prague)
>> Sent: Wednesday, November 09, 2016 3:41 PM
>> To: Zhang, Bingxuan (Nokia - CN/Hangzhou) <[email protected]>;
>> Lian, George (Nokia - CN/Hangzhou) <[email protected]>; Pádraig
>> Brady <[email protected]>; [email protected]
>> Cc: Li, Deqian (Nokia - CN/Hangzhou) <[email protected]>; Bao, Xiaohui
>> (Nokia - CN/Hangzhou) <[email protected]>
>> Subject: RE: some concern about the fix of "tail: consistently output all
>> data for truncated files"
>>
>>> -----Original Message-----
>>> From: Zhang, Bingxuan (Nokia - CN/Hangzhou)
>>> Sent: Wednesday, November 09, 2016 8:19 AM
>>> To: Zizka, Jan (Nokia - CZ/Prague) <[email protected]>; Lian, George
>>> (Nokia - CN/Hangzhou) <[email protected]>; Pádraig Brady
>>> <[email protected]>; [email protected]
>>> Cc: Li, Deqian (Nokia - CN/Hangzhou) <[email protected]>; Bao, Xiaohui
>>> (Nokia - CN/Hangzhou) <[email protected]>
>>> Subject: RE: some concern about the fix of "tail: consistently output all
>>> data for truncated files"
>>>
>>> Hi,
>>>
>>> Let's not mix 2 problems here.
>>
>> yes and I was not mixing the two :)
>>
>>> 1. glusterfs problem => We'll continue the investigation.
>>>
>>> 2. tail problem: let's discuss it separately from the glusterfs bug, just
>>>    from its own design.
>>>    New version: when the file size is found to be reduced, print content
>>>    from 0 to the reduced_size.
>>>    Old version: when the file size is found to be reduced, stay at the
>>>    end of the reduced size and wait for new content.
>>> Both ways have their limitations; neither of them is perfect or precise.
>>> Here I just want to say that the older version is better than the new
>>> version in my understanding.
>>> According to the man page, the '-f' option is designed to print a file
>>> that is in append mode, not a file on which a truncate might happen.
>>> "tail" should focus on what is added, not on the data from the part of
>>> the file that was already printed.
>>
>> yes exactly. And in case the file is truncated or replaced, tail has to
>> assume it now holds new content which was added.
>>
>> Can you tell any real use case where the changed tail behaviour would fail
>> and print old content as you describe? I mean some real use case, not the
>> behaviour caused by the GlusterFS bug.
>>
>> Jan
>>
>>> =============================
>>> # man tail
>>> TAIL(1)                     User Commands                     TAIL(1)
>>>
>>> NAME
>>>        tail - output the last part of files
>>> ...
>>>        -f, --follow[={name|descriptor}]
>>>               output appended data as the file grows;
>>> ...
>>> =============================
>>>
>>> Br, Jimmy
>>>
>>> -----Original Message-----
>>> From: Zizka, Jan (Nokia - CZ/Prague)
>>> Sent: Wednesday, November 09, 2016 3:08 PM
>>> To: Zhang, Bingxuan (Nokia - CN/Hangzhou) <[email protected]>;
>>> Lian, George (Nokia - CN/Hangzhou) <[email protected]>; Pádraig
>>> Brady <[email protected]>; [email protected]
>>> Cc: Li, Deqian (Nokia - CN/Hangzhou) <[email protected]>; Bao, Xiaohui
>>> (Nokia - CN/Hangzhou) <[email protected]>
>>> Subject: RE: some concern about the fix of "tail: consistently output all
>>> data for truncated files"
>>>
>>>> -----Original Message-----
>>>> From: Zhang, Bingxuan (Nokia - CN/Hangzhou)
>>>> Sent: Wednesday, November 09, 2016 6:36 AM
>>>> To: Lian, George (Nokia - CN/Hangzhou) <[email protected]>; Pádraig
>>>> Brady <[email protected]>; [email protected]
>>>> Cc: Li, Deqian (Nokia - CN/Hangzhou) <[email protected]>; Zizka, Jan
>>>> (Nokia - CZ/Prague) <[email protected]>; Bao, Xiaohui (Nokia -
>>>> CN/Hangzhou) <[email protected]>
>>>> Subject: RE: some concern about the fix of "tail: consistently output all
>>>> data for truncated files"
>>>>
>>>> Hi,
>>>>
>>>> I wonder about the original requirement of "tail": what is the purpose
>>>> of this tool?
>>>> Referred to:
>>>>     tail - output the last part of files
>>>>
>>>> Here, when "tail" finds that some file's length has become smaller, does
>>>> it really need to print old content?
>>>
>>> but tail cannot know if that is old content. The truncate detection was
>>> added there to overcome the problem when someone overwrites the file
>>> being tailed, in which case it should indeed start dumping the file from
>>> the beginning.
>>>
>>>> My opinion is that ignoring that old content is the better alternative.
>>>
>>> OK but how would you do that as tail doesn't know that it is old content ...
>>>
>>>> It is possible that the "old content" is newly written (e.g. truncate
>>>> to 0, then write small content).
>>>> It is also possible that the "old content" is really old (e.g. truncate
>>>> to a small size).
>>>>
>>>> So "tail" can do perfect design here to trace every piece of data
>>>> written to the file.
>>>> But it should focus on only the data to the last with current reality.
>>>>
>>>> So my opinion is that "revert to previous design" is a better choice
>>>> than the current one.
>>>> What do you think?
>>>
>>> If the change is reverted then you will get regressions in the cases for
>>> which this was added, so that is definitely not an option.
>>>
>>> What should be fixed is GlusterFS, instead of trying to make workarounds
>>> for its misbehaviour. As Pádraig also noted:
>>>
>>>> This stale st_size behavior, giving a smaller value _after_ a read,
>>>> seems quite problematic to lots of apps though, not just tail(1).
>>>
>>> this will affect other applications and tools, not only tail. If you make
>>> some kind of workaround in tail for this and GlusterFS is not fixed, then
>>> this problem will stay hidden and will hit some other application sooner
>>> or later.
>>>
>>> Jan
>>>
>>>> Br, Jimmy
>>>>
>>>> -----Original Message-----
>>>> From: Lian, George (Nokia - CN/Hangzhou)
>>>> Sent: Wednesday, November 09, 2016 9:36 AM
>>>> To: Pádraig Brady <[email protected]>; [email protected]
>>>> Cc: Zhang, Bingxuan (Nokia - CN/Hangzhou) <[email protected]>;
>>>> Li, Deqian (Nokia - CN/Hangzhou) <[email protected]>; Zizka, Jan (Nokia -
>>>> CZ/Prague) <[email protected]>; Bao, Xiaohui (Nokia - CN/Hangzhou)
>>>> <[email protected]>
>>>> Subject: RE: some concern about the fix of "tail: consistently output all
>>>> data for truncated files"
>>>>
>>>> Hi,
>>>>> What network file system type is this?
>>>>
>>>> The file system is GlusterFS from Red Hat.
>>>>
>>>>> This stale st_size behavior, giving a smaller value _after_ a read,
>>>>> seems quite problematic to lots of apps though, not just tail(1).
>>>> I agree, but I still suppose more application will do get st_size first
>>>> then do seek and read which will not over the size of file.
>>>>
>>>> We also have submit the issue to GlusterFS community, but till now, they
>>>> can't find the root cause in glusterfs.
>>>>
>>>> I still complain to "tail application", even if there has some issue on
>>>> glusterfs, but "tail" eat all the space of the disk (by continues
>>>> pseudo-truncate for a large syslog file), I suggest "tail" could do some
>>>> change to prevent it.
>>>>
>>>> Thanks & Best Regards,
>>>> George
>>>>
>>>> -----Original Message-----
>>>> From: Pádraig Brady [mailto:[email protected]]
>>>> Sent: Tuesday, November 08, 2016 7:29 PM
>>>> To: Lian, George (Nokia - CN/Hangzhou) <[email protected]>;
>>>> [email protected]
>>>> Cc: Zhang, Bingxuan (Nokia - CN/Hangzhou) <[email protected]>;
>>>> Li, Deqian (Nokia - CN/Hangzhou) <[email protected]>; Zizka, Jan (Nokia -
>>>> CZ/Prague) <[email protected]>; Bao, Xiaohui (Nokia - CN/Hangzhou)
>>>> <[email protected]>
>>>> Subject: Re: some concern about the fix of "tail: consistently output all
>>>> data for truncated files"
>>>>
>>>> On 08/11/16 02:50, Lian, George (Nokia - CN/Hangzhou) wrote:
>>>>> Hi,
>>>>>>> Add one more suggestion, if we have not a perfect solution to
>>>>>>> consider all the case of truncate, could we add an option to tail,
>>>>>>> such like tail -no-truncate
>>>>>>> If tail run with this option, than application not consider any
>>>>>>> truncate case.
>>>>>>>
>>>>>>> For example, I suppose syslog output file will not have any truncate
>>>>>>> case in our environment, then the tail could use the option to avoid
>>>>>>> the mis-truncated case?
>>>>>
>>>>>> Note for case 2) above, we only update fspec->size _after_ the read,
>>>>>> so I'm not sure how practical the race with reading a _smaller_
>>>>>> st_size after that is?
>>>>>> I.E. the heuristic is fairly good I think,
>>>>>> so an option may be overkill.
>>>>>> We'd have to see a demonstrable issue to consider such an option.
>>>>>
>>>>> We have an issue now when tailing a syslog file which is stored on a
>>>>> network-based file system. An automated case needs to tail the syslog
>>>>> for about one hour to get the syslog of that period;
>>>>> in that one-hour period the unexpected file truncation issue happened 6
>>>>> times, so the output of tail contains the full syslog file 6 times, and
>>>>> the output file is so huge that it eats all of the disk.
>>>>> The network-based file system may not be so easy to change to meet the
>>>>> current implementation of the "tail" application.
>>>>> So I need your help :)
>>>>>
>>>>> And what do you mean by demonstrable? The issue we encounter is easy to
>>>>> reproduce; maybe the file system is not as strict as the ext4 file
>>>>> system,
>>>>> but I still suggest the "tail" application could make some change to
>>>>> adapt to this kind of network-based file system?
>>>>
>>>> It's important info that you have seen the issue.
>>>> What network file system type is this?
>>>> We might just revert this change if the issue is widespread enough.
>>>>
>>>> This stale st_size behavior, giving a smaller value _after_ a read,
>>>> seems quite problematic to lots of apps though, not just tail(1).
>>>>
>>>> thanks,
>>>> Pádraig.
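As an illustration of the resynchronization idea from the reply at the top of this message (memorize the lines already output and, after the file shrinks, look for a block of at least N matching lines), here is a rough sketch. It is not how tail or syslog-ng work. The names `resync_index` and `lines_to_print` are hypothetical, it takes the first matching block it finds, and it falls back to printing the whole file when no block matches; all three are assumptions made only for this example.

```python
def resync_index(printed, current, n):
    """Find where to resume output after the file got shorter.

    printed: list of lines already output (most recent last).
    current: list of lines now in the file.
    n:       number of consecutive lines that must match.
    Returns the index in `current` of the first line not yet printed,
    or None if the last n printed lines are not found as a block.
    """
    if n <= 0 or len(printed) < n:
        return None
    block = printed[-n:]
    # Forward scan: the first occurrence wins, so with repeated lines this
    # errs on the side of duplicate output rather than missing lines.
    for end in range(n, len(current) + 1):
        if current[end - n:end] == block:
            return end
    return None


def lines_to_print(printed, current, n):
    """Lines to output after a shrink; the whole file if resync fails."""
    idx = resync_index(printed, current, n)
    return current if idx is None else current[idx:]


# The head of the file was trimmed (old days dropped) and two lines added.
printed = ["d1 a", "d1 b", "d2 c", "d2 d", "d3 e"]
current = ["d2 d", "d3 e", "d3 f", "d4 g"]
print(lines_to_print(printed, current, 2))  # only the two unseen lines
```

The choice of N trades robustness for coverage: a larger N makes a false match on repeated lines less likely, but the resync fails if the truncation leaves fewer than N already-printed lines in the file.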
