Re: Log rotation issue with runit

2018-12-30 Thread Jonathan de Boyne Pollard

Dmitry Bogatov:

After some investigation I found code, that caused issue, but it seems 
that it was written with some purpose, yet I fail to understand that 
purpose.



What you need are manuals that tell you about this mechanism. (-:

* http://jdebp.eu./Softwares/nosh/guide/cyclog.html#COMPATIBILITY

* http://skarnet.org./software/s6/s6-log.html#Logdirs

* http://jdebp.eu./Softwares/djbwares/guide/multilog.html#ROTATION

* http://smarden.org/runit/svlogd.8.html#sect4



Re: Log rotation issue with runit

2018-12-29 Thread Steve Litt
On Sat, 29 Dec 2018 18:33:06 +
Dmitry Bogatov  wrote:

> [2018-12-27 08:07] Steve Litt 
> > > [ Dmitry Bogatov ]
> > > No, it is reproducible. See end of bug thread.  
> > 
> > Hi Dmitry,
> > 
> > Just so we're all on the same page, do you mean the end of
> > https://bugs.debian.org/916230 ? Could you please provide a message
> > number? I briefly looked over https://bugs.debian.org/916230, paying
> > particular attention to the later half, and could see no
> > reproduction sequence articulated. But sometimes I miss things.  
> 
> Sure. Message 17:
> 
>  Offending .u file is created by rename(2) call at line 532, in
>  logdir_open() function. It happens, when 'current' file exists,
>  non-executable and non-empty.
>  [...]

OK, so you can reproduce it with

echo bogus > current; chmod a-x current

That's pretty bizarre logic. I can understand why an empty current
would be overwritten --- what do you lose. But why only if it's
non-executable? Are they using the executable bit as some sort of flag
to keep processes from overwriting each others' writes to the file? I
once wrote a program in which, when an invocation of my program opened a
file, it set the file to read-write, and other invocations of my program
would decline to try to open a read-write version of the file. Bizarre,
but it worked, and no process clobbered the other.

I guess what I'm asking is this: Are you sure the original poster's
(OP's) .u files were caused by the rename call when non-empty non-exec
current exists, or is he experiencing a different reproduction sequence.

I still wouldn't mess with the existing code. I don't think anyone here
is positive of the purpose of the logic you described, and therefore
what side effects would happen if it were modified. The OP didn't have
enough .u files to really inconvenience him, and there are other ways of
getting rid of .u files. And the OP could switch to s6.
 
SteveT

Steve Litt 
December 2018 featured book: Rapid Learning for the 21st Century
http://www.troubleshooters.com/rl21


Re: Log rotation issue with runit

2018-12-29 Thread Guillermo
El sáb., 29 dic. 2018 a las 15:33, Dmitry Bogatov escribió:
>
> > > And this issue happens not only on crash, it happens after every
> > > termination of svlogd, due any signal. I would agree that SIGKILL is
> > > crash, but issue reproduces with SIGINT and SIGTERM.
> >
> > SIGTERM should make svlogd exit cleanly, are you sure? It does for me,
> > and when restarted, does not create any .u file (using Gentoo's runit,
> > which is upstream runit plus a minor patch to the makefile).
>
> Wierd. I tried again, and failed to reproduced issue with SIGTERM this
> time.

Good. This is the expected behaviour of SIGTERM.

> What do you mean, `supervision tree not not being teared down properly'?

First of all, I should have written "not being torn down properly" :P
With "properly" I mean with all runsv and runsvdir processes exiting
cleanly.  Either by sending runsvdir a SIGHUP signal, which makes it
take care of its runsv children, of by sending runsvdir a SIGTERM
signal and then making all runsv processes exit one by one with 'sv
exit' (or SIGTERM, which is equivalent). Each runsv will then close
the writing end of the logger's pipe before exiting, if there is one,
making the logger finalize the 'current' file, set its permissions to
0744, and exit as well (because it detects EOF in its stdin). So next
time svlogd is started, no .u file. I know this to work as described
on Gentoo.

If runit is the init system, process 1 sends a SIGTERM signal to the
runsvdir process that /etc/runit/2 replaces itself with during the
shutdown sequence, so it is the task of /etc/runit/3 to make
supervised processes exit, as well as their runsv supervisors.
Debian's runit-2.1.2-19 'stock' /etc/runit/3 file is this one, right?

* 
https://salsa.debian.org/runit-team/runit/raw/a45cbc4ec73e1e293d372147ad5ab3f1d2a9/debian/contrib/3

Here, the 'sv exit /etc/service/*' that follows 'sv force-stop
/etc/service/*' takes care of making the runsv processes exit. The bug
reporter's /etc/runit/3 is a modified one, so that's what I would look
next.

G.


Re: Log rotation issue with runit

2018-12-29 Thread Dmitry Bogatov


[2018-12-27 20:39] Guillermo 
> El jue., 27 dic. 2018 a las 6:36, Dmitry Bogatov escribió:
> >
> > And this issue happens not only on crash, it happens after every
> > termination of svlogd, due any signal. I would agree that SIGKILL is
> > crash, but issue reproduces with SIGINT and SIGTERM.
> 
> SIGTERM should make svlogd exit cleanly, are you sure? It does for me,
> and when restarted, does not create any .u file (using Gentoo's runit,
> which is upstream runit plus a minor patch to the makefile).

Wierd. I tried again, and failed to reproduced issue with SIGTERM this
time. So seems that problem is that svlogd is killed by some other
signal. It is going to be quite a long debugging session.

> > /etc/runit/3 changed [not included]
> 
> What does this mean? Is this Debian with runit as the init system and
> a modified /etc/runit/3 instead of the one supplied with the Debian
> package?

Yes.

> I'm going to make a guess here and say that the supervision tree is
> not being teared down properly, resulting in svlogd processes not
> exiting cleanly and producing lots of .u files, compounded with the
> fact that they are never deleted during rotations because of the fix
> to bug #878476...

What do you mean, `supervision tree not not being teared down properly'?


Re: Log rotation issue with runit

2018-12-29 Thread Dmitry Bogatov
[2018-12-27 08:07] Steve Litt 
> > [ Dmitry Bogatov ]
> > No, it is reproducible. See end of bug thread.
> 
> Hi Dmitry,
> 
> Just so we're all on the same page, do you mean the end of
> https://bugs.debian.org/916230 ? Could you please provide a message
> number? I briefly looked over https://bugs.debian.org/916230, paying
> particular attention to the later half, and could see no reproduction
> sequence articulated. But sometimes I miss things.

Sure. Message 17:

 Offending .u file is created by rename(2) call at line 532, in
 logdir_open() function. It happens, when 'current' file exists,
 non-executable and non-empty.
 [...]


Re: Log rotation issue with runit

2018-12-27 Thread Jonathan de Boyne Pollard

Alex Efros:

I'm not sure is it good idea to include .u files in usual rotation 
process at all, and especially handle them just like .s files. If 
several crashes happens for some reason in a short period of time this 
will result in deletion of all log files except last (say) 10 .u 
files, which are usually small and may contain just one line. 
Replacing 10MB of last logs with 10 last log lines doesn't sounds like 
a nice idea.


Then cap the log directory by total file size rather than file count.  
M. Bercot and I both provide tools with this feature.  "I want to keep 
the last 10MiB of logs." is exactly what it is for.


* http://jdebp.eu./Softwares/nosh/guide/cyclog.html

* http://skarnet.org./software/s6/s6-log.html



Re: Log rotation issue with runit

2018-12-27 Thread Jonathan de Boyne Pollard

Laurent Bercot:
If loggers kept appending to "current" instead, it could have corrupt 
information in the middle of a "current" file, which would them be 
archived as a .s (intact file), which wouldn't be good. 


Examples:

* https://unix.stackexchange.com/questions/237321/

* https://askubuntu.com/questions/356981/


Re: Log rotation issue with runit

2018-12-27 Thread Guillermo
El jue., 27 dic. 2018 a las 10:47, Steve Litt escribió:
>
> [...] Simply run a program, invoked by cron,
> that deletes .u files over a certain age, as long as it doesn't delete
> the latest .u file.

Daemontools-style loggers like svlogd delete files themselves during a
rotation for a reason,..

> [...] If Debian doesn't yet have an s6 package,
> perhaps a packager could be found, and a Debian s6 (and s6-rc) package
> can be made.

It does. In buster and sid. Alas, no s6-rc (yet).

G.


Re: Log rotation issue with runit

2018-12-27 Thread Guillermo
El jue., 27 dic. 2018 a las 6:36, Dmitry Bogatov escribió:
>
> And this issue happens not only on crash, it happens after every
> termination of svlogd, due any signal. I would agree that SIGKILL is
> crash, but issue reproduces with SIGINT and SIGTERM.

SIGTERM should make svlogd exit cleanly, are you sure? It does for me,
and when restarted, does not create any .u file (using Gentoo's runit,
which is upstream runit plus a minor patch to the makefile).

IMO, the bug report shows way too many .u files, this should be a rare
occurence. Ideally, there should be none. And according to the
timestamps, at least two or sometimes all three loggers are producing
a .u file at about the same time, as if they had been started
together.

> Init: runit (via /run/runit.stopit)
> [...]
> Versions of packages runit recommends:
> ii  runit-init  2.1.2-19
> [...]
> -- Configuration Files:
> /etc/runit/3 changed [not included]

What does this mean? Is this Debian with runit as the init system and
a modified /etc/runit/3 instead of the one supplied with the Debian
package?

I'm going to make a guess here and say that the supervision tree is
not being teared down properly, resulting in svlogd processes not
exiting cleanly and producing lots of .u files, compounded with the
fact that they are never deleted during rotations because of the fix
to bug #878476...

G.


Re: Log rotation issue with runit

2018-12-27 Thread Laurent Bercot

Why would it be wrong to just keep appending to `current' instead of
moving it to `.u' file? (see my patch at end of bug thread)


On a crash, it is possible that a file you're writing to gets
corrupted. For the integrity of your logs, it's better to move that
potentially corrupted file to another place and start over.
If loggers kept appending to "current" instead, it could have
corrupt information in the middle of a "current" file, which would
them be archived as a .s (intact file), which wouldn't be good.
Please don't change that behaviour - it's correct.

Loggers should not be repeatedly crashing and wiping out your
existing logs by excessive rotations. If you have a repeatedly crashing
logger, something is wrong in your installation (or the way you built
the logger). However, if you cannot trust your installation, a
workaround is to use s6-log, which is very similar to svlogd and
understands a "S" directive, meaning "total size of archived files in
the directory". Using a large "n" value (maximum number of archived
files) and an appropriate "S" value will keep all your logs even in
the case of repeated crashes - which will not happen with s6-log.

--
Laurent



Re: Log rotation issue with runit

2018-12-27 Thread Dmitry Bogatov


[2018-12-26 00:05] Steve Litt 
> part   text/plain1664
> On Tue, 25 Dec 2018 13:39:17 +
> Dmitry Bogatov  wrote:
> 
> > Hello!
> > 
> > I am Debian maintainer of `runit' supervision suite. Recently, I
> > received bug report [^1] about stray .u logfiles. After some
> > investigation I found code, that caused issue, but it seems that it
> > was written with some purpose, yet I fail to understand that purpose.
> > 
> > Author Gerrit Pape directed me here. Any clarification would be
> > welcome! Please keep bug in thread.
> > 
> > [^1] https://bugs.debian.org/916230
> [...]
>
> This seems to be an intermittent problem (not reproducible at will),
> and very hard to reproduce by anyone but the bug reporter. If the
> preceding sentence is true, it's possible it will never be solved. In
> such a situation, a workaround becomes a legitimate tactic.

No, it is reproducible. See end of bug thread.


Re: Log rotation issue with runit

2018-12-27 Thread Dmitry Bogatov


[2018-12-26 03:58] Alex Efros 
> Hi!
>
> I'm not sure is it good idea to include .u files in usual rotation process
> at all, and especially handle them just like .s files. If several crashes
> happens for some reason in a short period of time this will result in
> deletion of all log files except last (say) 10 .u files, which are usually
> small and may contain just one line. Replacing 10MB of last logs with 10
> last log lines doesn't sounds like a nice idea.
>
> Possible "right" solution will be to keep same amount of last .u files as
> configured for .s files, i.e. if we've configured to keep last 10 files
> then we may have at most double amount (10 .u files and 10 .s files).
> (I didn't checked mentioned patches, so maybe they already works this way.)

Why would it be wrong to just keep appending to `current' instead of
moving it to `.u' file? (see my patch at end of bug thread)

And this issue happens not only on crash, it happens after every
termination of svlogd, due any signal. I would agree that SIGKILL is
crash, but issue reproduces with SIGINT and SIGTERM.


Re: Log rotation issue with runit

2018-12-25 Thread Alex Efros
Hi!

I'm not sure is it good idea to include .u files in usual rotation process
at all, and especially handle them just like .s files. If several crashes
happens for some reason in a short period of time this will result in
deletion of all log files except last (say) 10 .u files, which are usually
small and may contain just one line. Replacing 10MB of last logs with 10
last log lines doesn't sounds like a nice idea.

Possible "right" solution will be to keep same amount of last .u files as
configured for .s files, i.e. if we've configured to keep last 10 files
then we may have at most double amount (10 .u files and 10 .s files).
(I didn't checked mentioned patches, so maybe they already works this way.)

-- 
WBR, Alex.


Re: Log rotation issue with runit

2018-12-25 Thread Guillermo
> […] Both .u and .s files are supposed to
> be deleted when they are the oldest file in the logging directory, to
> honor 'u' lines in 'config' files, […]

Sorry, that should have read: to honor 'n' lines in 'config' files.

G.


Re: Log rotation issue with runit

2018-12-25 Thread Guillermo
El mar., 25 dic. 2018 a las 10:39, Dmitry Bogatov escribió:
>
> Hello!
>
> I am Debian maintainer of `runit' supervision suite. Recently, I
> received bug report [^1] about stray .u logfiles. After some
> investigation I found code, that caused issue, but it seems that it was
> written with some purpose, yet I fail to understand that purpose.
>
> Author Gerrit Pape directed me here. Any clarification would be welcome!
> Please keep bug in thread.
>
> [^1] https://bugs.debian.org/916230

It was likely written on purpose. A 'current' file that doesn't have
the execute by owner permission set is not considered to completely
processed, and becomes a .u file instead of being appended to. As has
been said, it can happen if svlogd does not exit cleanly. Although it
doesn't seem to be mentioned in svlogd's man page, this comes from
daemontools' multilog program, which inspired svlogd:

* https://cr.yp.to/daemontools/multilog.html (section "Automatically
rotated logs")

On the other hand, it seems to me that the patch used for resolution
of bug #878476 prevents all .u files from being deleted, so they would
indeed accumulate indefinitely. Both .u and .s files are supposed to
be deleted when they are the oldest file in the logging directory, to
honor 'u' lines in 'config' files, and this is only a problem when the
one about to be fed to a processor happens to be the one chosen for
deletion because of a wonky clock. This issue was brought up earlier
in this mailing list, and a different patch was suggested:

* https://www.mail-archive.com/supervision@list.skarnet.org/msg00667.html

This one seems to do the right check, maybe it should have been applied instead?



G.


Re: Log rotation issue with runit

2018-12-25 Thread Alex Efros
Hi!

On Tue, Dec 25, 2018 at 01:39:17PM +, Dmitry Bogatov wrote:
> Recently, I received bug report [^1] about stray .u logfiles. After
> some investigation I found code, that caused issue, but it seems that it
> was written with some purpose, yet I fail to understand that purpose.
> 
> [^1] https://bugs.debian.org/916230

.u files are created after svlogd crash (actually, when svlogd starts
again after crash). I don't think I've ever seen actual crash of svlogd,
so usually it means system was rebooted because of kernel OOPS or pressing
reset button or power loss or someone has killed svlogd with SIGKILL.

I believe this isn't a bug, but this behaviour isn't documented in svlogd
man page. To me it looks useful to be able to detect such crash-reboots
and what happens just before by looking for/in .u files.

-- 
WBR, Alex.