Re: Log rotation issue with runit
Dmitry Bogatov: After some investigation I found code, that caused issue, but it seems that it was written with some purpose, yet I fail to understand that purpose. What you need are manuals that tell you about this mechanism. (-: * http://jdebp.eu./Softwares/nosh/guide/cyclog.html#COMPATIBILITY * http://skarnet.org./software/s6/s6-log.html#Logdirs * http://jdebp.eu./Softwares/djbwares/guide/multilog.html#ROTATION * http://smarden.org/runit/svlogd.8.html#sect4
Re: Log rotation issue with runit
On Sat, 29 Dec 2018 18:33:06 + Dmitry Bogatov wrote: > [2018-12-27 08:07] Steve Litt > > > [ Dmitry Bogatov ] > > > No, it is reproducible. See end of bug thread. > > > > Hi Dmitry, > > > > Just so we're all on the same page, do you mean the end of > > https://bugs.debian.org/916230 ? Could you please provide a message > > number? I briefly looked over https://bugs.debian.org/916230, paying > > particular attention to the later half, and could see no > > reproduction sequence articulated. But sometimes I miss things. > > Sure. Message 17: > > Offending .u file is created by rename(2) call at line 532, in > logdir_open() function. It happens, when 'current' file exists, > non-executable and non-empty. > [...] OK, so you can reproduce it with echo bogus > current; chmod a-x current That's pretty bizarre logic. I can understand why an empty current would be overwritten --- what do you lose. But why only if it's non-executable? Are they using the executable bit as some sort of flag to keep processes from overwriting each others' writes to the file? I once wrote a program in which, when an invocation of my program opened a file, it set the file to read-write, and other invocations of my program would decline to try to open a read-write version of the file. Bizarre, but it worked, and no process clobbered the other. I guess what I'm asking is this: Are you sure the original poster's (OP's) .u files were caused by the rename call when non-empty non-exec current exists, or is he experiencing a different reproduction sequence. I still wouldn't mess with the existing code. I don't think anyone here is positive of the purpose of the logic you described, and therefore what side effects would happen if it were modified. The OP didn't have enough .u files to really inconvenience him, and there are other ways of getting rid of .u files. And the OP could switch to s6. SteveT Steve Litt December 2018 featured book: Rapid Learning for the 21st Century http://www.troubleshooters.com/rl21
Re: Log rotation issue with runit
El sáb., 29 dic. 2018 a las 15:33, Dmitry Bogatov escribió: > > > > And this issue happens not only on crash, it happens after every > > > termination of svlogd, due any signal. I would agree that SIGKILL is > > > crash, but issue reproduces with SIGINT and SIGTERM. > > > > SIGTERM should make svlogd exit cleanly, are you sure? It does for me, > > and when restarted, does not create any .u file (using Gentoo's runit, > > which is upstream runit plus a minor patch to the makefile). > > Wierd. I tried again, and failed to reproduced issue with SIGTERM this > time. Good. This is the expected behaviour of SIGTERM. > What do you mean, `supervision tree not not being teared down properly'? First of all, I should have written "not being torn down properly" :P With "properly" I mean with all runsv and runsvdir processes exiting cleanly. Either by sending runsvdir a SIGHUP signal, which makes it take care of its runsv children, of by sending runsvdir a SIGTERM signal and then making all runsv processes exit one by one with 'sv exit' (or SIGTERM, which is equivalent). Each runsv will then close the writing end of the logger's pipe before exiting, if there is one, making the logger finalize the 'current' file, set its permissions to 0744, and exit as well (because it detects EOF in its stdin). So next time svlogd is started, no .u file. I know this to work as described on Gentoo. If runit is the init system, process 1 sends a SIGTERM signal to the runsvdir process that /etc/runit/2 replaces itself with during the shutdown sequence, so it is the task of /etc/runit/3 to make supervised processes exit, as well as their runsv supervisors. Debian's runit-2.1.2-19 'stock' /etc/runit/3 file is this one, right? * https://salsa.debian.org/runit-team/runit/raw/a45cbc4ec73e1e293d372147ad5ab3f1d2a9/debian/contrib/3 Here, the 'sv exit /etc/service/*' that follows 'sv force-stop /etc/service/*' takes care of making the runsv processes exit. The bug reporter's /etc/runit/3 is a modified one, so that's what I would look next. G.
Re: Log rotation issue with runit
[2018-12-27 20:39] Guillermo > El jue., 27 dic. 2018 a las 6:36, Dmitry Bogatov escribió: > > > > And this issue happens not only on crash, it happens after every > > termination of svlogd, due any signal. I would agree that SIGKILL is > > crash, but issue reproduces with SIGINT and SIGTERM. > > SIGTERM should make svlogd exit cleanly, are you sure? It does for me, > and when restarted, does not create any .u file (using Gentoo's runit, > which is upstream runit plus a minor patch to the makefile). Wierd. I tried again, and failed to reproduced issue with SIGTERM this time. So seems that problem is that svlogd is killed by some other signal. It is going to be quite a long debugging session. > > /etc/runit/3 changed [not included] > > What does this mean? Is this Debian with runit as the init system and > a modified /etc/runit/3 instead of the one supplied with the Debian > package? Yes. > I'm going to make a guess here and say that the supervision tree is > not being teared down properly, resulting in svlogd processes not > exiting cleanly and producing lots of .u files, compounded with the > fact that they are never deleted during rotations because of the fix > to bug #878476... What do you mean, `supervision tree not not being teared down properly'?
Re: Log rotation issue with runit
[2018-12-27 08:07] Steve Litt > > [ Dmitry Bogatov ] > > No, it is reproducible. See end of bug thread. > > Hi Dmitry, > > Just so we're all on the same page, do you mean the end of > https://bugs.debian.org/916230 ? Could you please provide a message > number? I briefly looked over https://bugs.debian.org/916230, paying > particular attention to the later half, and could see no reproduction > sequence articulated. But sometimes I miss things. Sure. Message 17: Offending .u file is created by rename(2) call at line 532, in logdir_open() function. It happens, when 'current' file exists, non-executable and non-empty. [...]
Re: Log rotation issue with runit
Alex Efros: I'm not sure is it good idea to include .u files in usual rotation process at all, and especially handle them just like .s files. If several crashes happens for some reason in a short period of time this will result in deletion of all log files except last (say) 10 .u files, which are usually small and may contain just one line. Replacing 10MB of last logs with 10 last log lines doesn't sounds like a nice idea. Then cap the log directory by total file size rather than file count. M. Bercot and I both provide tools with this feature. "I want to keep the last 10MiB of logs." is exactly what it is for. * http://jdebp.eu./Softwares/nosh/guide/cyclog.html * http://skarnet.org./software/s6/s6-log.html
Re: Log rotation issue with runit
Laurent Bercot: If loggers kept appending to "current" instead, it could have corrupt information in the middle of a "current" file, which would them be archived as a .s (intact file), which wouldn't be good. Examples: * https://unix.stackexchange.com/questions/237321/ * https://askubuntu.com/questions/356981/
Re: Log rotation issue with runit
El jue., 27 dic. 2018 a las 10:47, Steve Litt escribió: > > [...] Simply run a program, invoked by cron, > that deletes .u files over a certain age, as long as it doesn't delete > the latest .u file. Daemontools-style loggers like svlogd delete files themselves during a rotation for a reason,.. > [...] If Debian doesn't yet have an s6 package, > perhaps a packager could be found, and a Debian s6 (and s6-rc) package > can be made. It does. In buster and sid. Alas, no s6-rc (yet). G.
Re: Log rotation issue with runit
El jue., 27 dic. 2018 a las 6:36, Dmitry Bogatov escribió: > > And this issue happens not only on crash, it happens after every > termination of svlogd, due any signal. I would agree that SIGKILL is > crash, but issue reproduces with SIGINT and SIGTERM. SIGTERM should make svlogd exit cleanly, are you sure? It does for me, and when restarted, does not create any .u file (using Gentoo's runit, which is upstream runit plus a minor patch to the makefile). IMO, the bug report shows way too many .u files, this should be a rare occurence. Ideally, there should be none. And according to the timestamps, at least two or sometimes all three loggers are producing a .u file at about the same time, as if they had been started together. > Init: runit (via /run/runit.stopit) > [...] > Versions of packages runit recommends: > ii runit-init 2.1.2-19 > [...] > -- Configuration Files: > /etc/runit/3 changed [not included] What does this mean? Is this Debian with runit as the init system and a modified /etc/runit/3 instead of the one supplied with the Debian package? I'm going to make a guess here and say that the supervision tree is not being teared down properly, resulting in svlogd processes not exiting cleanly and producing lots of .u files, compounded with the fact that they are never deleted during rotations because of the fix to bug #878476... G.
Re: Log rotation issue with runit
On Thu, 27 Dec 2018 09:36:28 + Dmitry Bogatov wrote: > [2018-12-26 03:58] Alex Efros > > Hi! > > > > I'm not sure is it good idea to include .u files in usual rotation > > process at all, and especially handle them just like .s files. If > > several crashes happens for some reason in a short period of time > > this will result in deletion of all log files except last (say) > > 10 .u files, which are usually small and may contain just one line. > > Replacing 10MB of last logs with 10 last log lines doesn't sounds > > like a nice idea. > > > > Possible "right" solution will be to keep same amount of last .u > > files as configured for .s files, i.e. if we've configured to keep > > last 10 files then we may have at most double amount (10 .u files > > and 10 .s files). (I didn't checked mentioned patches, so maybe > > they already works this way.) > > Why would it be wrong to just keep appending to `current' instead of > moving it to `.u' file? (see my patch at end of bug thread) > > And this issue happens not only on crash, it happens after every > termination of svlogd, due any signal. I would agree that SIGKILL is > crash, but issue reproduces with SIGINT and SIGTERM. The right solution is to leave the runit code as-is, until both the exact mechanism and intent of the existing code is understood, and the exact reproduction sequence *in the wild* is completely understood. Until then, any change to runit code would be bandaging the symptom rather than fixing the root cause. And we all know that symptom-bandaging leads to side effects, usually bad ones. The guy who jumpers across a circuit breaker because the circuit breaker flips every few days just might burn down his house. The Bug Filer (BF) disliked that there were too many .u files. That symptom can be fixed in a way that's a couple orders of magnitude less likely to produce side effects. Simply run a program, invoked by cron, that deletes .u files over a certain age, as long as it doesn't delete the latest .u file. This does pretty much what the BF needs, leaves recent .u files available for diagnostics, and doesn't touch any runit code. From what Gerrit wrote you, runit is now pretty much unmaintained, so unless someone steps forward ready, willing and able to master the entire runit code base and architecture, runit code probably shouldn't be messed with. I'll be glad to write the program to be run from the cron job. I can have it for you within 4 days, and I'll put any free software license on it that you/Debian wants. There's another alternative for the BF and anyone else inconvenienced by this symptom: They can switch to s6. In my opinion, runit and s6 are very close cousins, at least compared to the likes of systemd, sysvinit, OpenRC, Busybox init and the like. Unlike runit, s6 is constantly maintained, so any bugs can be solved by the actual author rather than distro packagers. If Debian doesn't yet have an s6 package, perhaps a packager could be found, and a Debian s6 (and s6-rc) package can be made. Thank you so much for packaging runit for Debian. Runit is a wonderful supervisor/init that provides a wonderful alternative to the ancient init, the massively entangled monolithic init, and the init that can't even respawn (or couldn't as of 3 years ago). Keep up the good work. SteveT Steve Litt December 2018 featured book: Rapid Learning for the 21st Century http://www.troubleshooters.com/rl21
Re: Log rotation issue with runit
On Thu, 27 Dec 2018 09:36:27 + Dmitry Bogatov wrote: > [2018-12-26 00:05] Steve Litt > > part text/plain1664 > > On Tue, 25 Dec 2018 13:39:17 + > > Dmitry Bogatov wrote: > > > > > Hello! > > > > > > I am Debian maintainer of `runit' supervision suite. Recently, I > > > received bug report [^1] about stray .u logfiles. After some > > > investigation I found code, that caused issue, but it seems that > > > it was written with some purpose, yet I fail to understand that > > > purpose. > > > > > > Author Gerrit Pape directed me here. Any clarification would be > > > welcome! Please keep bug in thread. > > > > > > [^1] https://bugs.debian.org/916230 > > [...] > > > > This seems to be an intermittent problem (not reproducible at will), > > and very hard to reproduce by anyone but the bug reporter. If the > > preceding sentence is true, it's possible it will never be solved. > > In such a situation, a workaround becomes a legitimate tactic. > > No, it is reproducible. See end of bug thread. Hi Dmitry, Just so we're all on the same page, do you mean the end of https://bugs.debian.org/916230 ? Could you please provide a message number? I briefly looked over https://bugs.debian.org/916230, paying particular attention to the later half, and could see no reproduction sequence articulated. But sometimes I miss things. Thanks, SteveT Steve Litt December 2018 featured book: Rapid Learning for the 21st Century http://www.troubleshooters.com/rl21
Re: Log rotation issue with runit
Why would it be wrong to just keep appending to `current' instead of moving it to `.u' file? (see my patch at end of bug thread) On a crash, it is possible that a file you're writing to gets corrupted. For the integrity of your logs, it's better to move that potentially corrupted file to another place and start over. If loggers kept appending to "current" instead, it could have corrupt information in the middle of a "current" file, which would them be archived as a .s (intact file), which wouldn't be good. Please don't change that behaviour - it's correct. Loggers should not be repeatedly crashing and wiping out your existing logs by excessive rotations. If you have a repeatedly crashing logger, something is wrong in your installation (or the way you built the logger). However, if you cannot trust your installation, a workaround is to use s6-log, which is very similar to svlogd and understands a "S" directive, meaning "total size of archived files in the directory". Using a large "n" value (maximum number of archived files) and an appropriate "S" value will keep all your logs even in the case of repeated crashes - which will not happen with s6-log. -- Laurent
Re: Log rotation issue with runit
[2018-12-26 00:05] Steve Litt > part text/plain1664 > On Tue, 25 Dec 2018 13:39:17 + > Dmitry Bogatov wrote: > > > Hello! > > > > I am Debian maintainer of `runit' supervision suite. Recently, I > > received bug report [^1] about stray .u logfiles. After some > > investigation I found code, that caused issue, but it seems that it > > was written with some purpose, yet I fail to understand that purpose. > > > > Author Gerrit Pape directed me here. Any clarification would be > > welcome! Please keep bug in thread. > > > > [^1] https://bugs.debian.org/916230 > [...] > > This seems to be an intermittent problem (not reproducible at will), > and very hard to reproduce by anyone but the bug reporter. If the > preceding sentence is true, it's possible it will never be solved. In > such a situation, a workaround becomes a legitimate tactic. No, it is reproducible. See end of bug thread.
Re: Log rotation issue with runit
[2018-12-26 03:58] Alex Efros > Hi! > > I'm not sure is it good idea to include .u files in usual rotation process > at all, and especially handle them just like .s files. If several crashes > happens for some reason in a short period of time this will result in > deletion of all log files except last (say) 10 .u files, which are usually > small and may contain just one line. Replacing 10MB of last logs with 10 > last log lines doesn't sounds like a nice idea. > > Possible "right" solution will be to keep same amount of last .u files as > configured for .s files, i.e. if we've configured to keep last 10 files > then we may have at most double amount (10 .u files and 10 .s files). > (I didn't checked mentioned patches, so maybe they already works this way.) Why would it be wrong to just keep appending to `current' instead of moving it to `.u' file? (see my patch at end of bug thread) And this issue happens not only on crash, it happens after every termination of svlogd, due any signal. I would agree that SIGKILL is crash, but issue reproduces with SIGINT and SIGTERM.
Re: Log rotation issue with runit
On Tue, 25 Dec 2018 13:39:17 + Dmitry Bogatov wrote: > Hello! > > I am Debian maintainer of `runit' supervision suite. Recently, I > received bug report [^1] about stray .u logfiles. After some > investigation I found code, that caused issue, but it seems that it > was written with some purpose, yet I fail to understand that purpose. > > Author Gerrit Pape directed me here. Any clarification would be > welcome! Please keep bug in thread. > > [^1] https://bugs.debian.org/916230 It's vital that no code be changed until the exact nature of the problem is understood and reproducible at will. The extra files reported in the bug aren't nearly especially inconvenient when compared to possible problems when modifying code that the code's author isn't maintaining on a regular basis. It would be interesting to see the dates and contents of the files: That might shed some light on what's happening. One person in this thread mentioned that .u files shouldn't be rotated and deleted at all. If that's the case, a fairly simple cron job could be made to delete all .u files older than a certain age except the current/latest .u file. That would probably fix the inconvenience endured by the bug reporter, without applying any premature fixes. This seems to be an intermittent problem (not reproducible at will), and very hard to reproduce by anyone but the bug reporter. If the preceding sentence is true, it's possible it will never be solved. In such a situation, a workaround becomes a legitimate tactic. SteveT Steve Litt December 2018 featured book: Rapid Learning for the 21st Century http://www.troubleshooters.com/rl21
Re: Log rotation issue with runit
Hi! I'm not sure is it good idea to include .u files in usual rotation process at all, and especially handle them just like .s files. If several crashes happens for some reason in a short period of time this will result in deletion of all log files except last (say) 10 .u files, which are usually small and may contain just one line. Replacing 10MB of last logs with 10 last log lines doesn't sounds like a nice idea. Possible "right" solution will be to keep same amount of last .u files as configured for .s files, i.e. if we've configured to keep last 10 files then we may have at most double amount (10 .u files and 10 .s files). (I didn't checked mentioned patches, so maybe they already works this way.) -- WBR, Alex.
Re: Log rotation issue with runit
> […] Both .u and .s files are supposed to > be deleted when they are the oldest file in the logging directory, to > honor 'u' lines in 'config' files, […] Sorry, that should have read: to honor 'n' lines in 'config' files. G.
Re: Log rotation issue with runit
El mar., 25 dic. 2018 a las 10:39, Dmitry Bogatov escribió: > > Hello! > > I am Debian maintainer of `runit' supervision suite. Recently, I > received bug report [^1] about stray .u logfiles. After some > investigation I found code, that caused issue, but it seems that it was > written with some purpose, yet I fail to understand that purpose. > > Author Gerrit Pape directed me here. Any clarification would be welcome! > Please keep bug in thread. > > [^1] https://bugs.debian.org/916230 It was likely written on purpose. A 'current' file that doesn't have the execute by owner permission set is not considered to completely processed, and becomes a .u file instead of being appended to. As has been said, it can happen if svlogd does not exit cleanly. Although it doesn't seem to be mentioned in svlogd's man page, this comes from daemontools' multilog program, which inspired svlogd: * https://cr.yp.to/daemontools/multilog.html (section "Automatically rotated logs") On the other hand, it seems to me that the patch used for resolution of bug #878476 prevents all .u files from being deleted, so they would indeed accumulate indefinitely. Both .u and .s files are supposed to be deleted when they are the oldest file in the logging directory, to honor 'u' lines in 'config' files, and this is only a problem when the one about to be fed to a processor happens to be the one chosen for deletion because of a wonky clock. This issue was brought up earlier in this mailing list, and a different patch was suggested: * https://www.mail-archive.com/supervision@list.skarnet.org/msg00667.html This one seems to do the right check, maybe it should have been applied instead? G.
Re: Log rotation issue with runit
Hi! On Tue, Dec 25, 2018 at 01:39:17PM +, Dmitry Bogatov wrote: > Recently, I received bug report [^1] about stray .u logfiles. After > some investigation I found code, that caused issue, but it seems that it > was written with some purpose, yet I fail to understand that purpose. > > [^1] https://bugs.debian.org/916230 .u files are created after svlogd crash (actually, when svlogd starts again after crash). I don't think I've ever seen actual crash of svlogd, so usually it means system was rebooted because of kernel OOPS or pressing reset button or power loss or someone has killed svlogd with SIGKILL. I believe this isn't a bug, but this behaviour isn't documented in svlogd man page. To me it looks useful to be able to detect such crash-reboots and what happens just before by looking for/in .u files. -- WBR, Alex.
Log rotation issue with runit
Hello! I am Debian maintainer of `runit' supervision suite. Recently, I received bug report [^1] about stray .u logfiles. After some investigation I found code, that caused issue, but it seems that it was written with some purpose, yet I fail to understand that purpose. Author Gerrit Pape directed me here. Any clarification would be welcome! Please keep bug in thread. [^1] https://bugs.debian.org/916230