Re: cruft(-ng) and dh-cruft: handling and registering of dynamic files
Hi,

On Sun, 23 Oct 2022 at 04:24, Paul Wise wrote:

> Thank you for your work on this, being able to register files generated
> at install time by maintainer scripts or even at runtime by system
> maintenance tools to particular packages is a very useful feature for
> keeping all the files on a system more easily managed.

The "cpigs" command now has a new "-C" command-line switch to output the
ownership of all system files (static + volatile) in a single .csv.

I think this is something quite basic that can fill many needs, but it
simply did not exist before.

"apt-file" could be adapted to also transparently cache this information.
End users of this tool would get better results without having to change
their habits.

$ apt-file search /etc/subgid
[nothing]

$ cpigs -c | grep subgid
/etc/subgid;base-passwd;f;1;19
/etc/subgid-;base-passwd;f;1;0

The plan is to keep this .csv output stable, whatever changes in the
upstream dataflow, which is now a mix of dpkg + alternatives + custom
fallback scripts that know and replicate how UCF, logrotate, initramfs,
grub, systemd and sysvinit manage volatile files inside their
postinst/postrm.

> I do worry about users removing files that they don't understand, based
> on feedback by cpigs/cruft-ng, but they do that already so... :)

I have seen some complaints about this online, and I agree... to me the
original "cruft" tool looks more like an unfinished QA tool akin to
piuparts than an end-user tool.

> An ncdu or mc style interface (or plugins for those) to view cruft on a
> system sounds very useful in addition to the data export.

It's implemented, but the ncdu data model does not allow inserting the
matched package name for the volatile files. It's still nice to use if
you need to quickly identify where the big volatile files are piling up
and take action. Already done in real life.

Greetings
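The .csv records shown in this message can be consumed with a few lines of
Python. This is only a sketch: it assumes the first two ";"-separated
columns are the path and the owning package (as the sample suggests) and
that the last column is a size in bytes, which the message does not
confirm. The function names are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# The two sample records copied from the message above.
SAMPLE = """\
/etc/subgid;base-passwd;f;1;19
/etc/subgid-;base-passwd;f;1;0
"""


def owners_by_path(text):
    """Map each path to its owning package (ASSUMED: columns 1 and 2)."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    return {row[0]: row[1] for row in reader if len(row) >= 2}


def total_last_column_by_package(text):
    """Sum the last column per package, ASSUMING it is a size in bytes."""
    totals = defaultdict(int)
    for row in csv.reader(io.StringIO(text), delimiter=";"):
        if len(row) >= 5:
            totals[row[1]] += int(row[-1])
    return dict(totals)


if __name__ == "__main__":
    print(owners_by_path(SAMPLE)["/etc/subgid"])
    print(total_last_column_by_package(SAMPLE))
```

In practice the text would come from the tool's output saved to a file
rather than from an embedded string.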
Re: cruft(-ng) and dh-cruft: handling and registering of dynamic files
On Sun, 2022-10-23 at 01:08 +0200, Alexandre Detiste wrote:

> This DebHelper works this way:
> * the "debian/cruft" list merely registers the glob patterns,
> * and the "debian/purge" list also adds an "rm -rf" stanza in postrm/purge.
>
> As a bonus there's now also a new "cpigs" command, working akin to
> "dpigs" from Debian Goodies to list the biggest volatile data producers.

Thank you for your work on this. Being able to register files generated
at install time by maintainer scripts, or even at runtime by system
maintenance tools, to particular packages is a very useful feature for
keeping all the files on a system more easily managed.

Potentially it could also prompt users before removing packages that
have registered data that won't be removed on purge. For example, if a
package creates a dir in /srv at the sysadmin's request to host a
website, removing the package could warn about the directory. Or
removing postgres with databases present could warn about those.

I do worry about users removing files that they don't understand, based
on feedback by cpigs/cruft-ng, but they do that already so... :)

> The plan now is to have a new option that dumps the whole
> matching result database as .json with individual file size
> for jq consumption or in my case Jupyter;
> this instead of implementing older requests (#291823 #487458 #527285).

An ncdu or mc style interface (or plugins for those) to view cruft on a
system sounds very useful in addition to the data export.

-- 
bye,
pabs

https://wiki.debian.org/PaulWise
cruft(-ng) and dh-cruft: handling and registering of dynamic files
Hi,

I have been working on the cruft/cruft-ng package since 2014; there were
a few setbacks over the years, like the mlocate -> plocate and UsrMerge
transitions, but it's alive and kicking, helping to find random lost
files left behind by other packages and to file bugs against those from
time to time to get these glitches resolved.

Recently I've been working a lot on it because I realized it would be
the perfect solution to audit the disk space usage problems I'm facing
at work.

So I somewhat whipped up what I remembered from my own proposal
https://wiki.debian.org/Cruft/purge
and now have for myself a working "dh-cruft" that I can use to register
dynamic files owned by some private .deb. Here "dh-cruft" is a must: I
don't want to pollute Debian with some random external data from
downstream.

This DebHelper works this way:
* the "debian/cruft" list merely registers the glob patterns,
* and the "debian/purge" list also adds an "rm -rf" stanza in postrm/purge.

As a bonus there's now also a new "cpigs" command, working akin to
"dpigs" from Debian Goodies to list the biggest volatile data producers.

The plan now is to have a new option that dumps the whole matching
result database as .json with individual file sizes, for consumption by
jq or, in my case, Jupyter; this instead of implementing older requests
(#291823 #487458 #527285).

I know it's a very old unresolved subject that has been lurking forever
here, but maybe it's the right time to look at it with a fresh view.

My proposal for next steps:
* gather your comments here
* some review of dh-cruft (I don't know Perl)
* get it in the NEW queue soon
* have interested packages take part; for now cruft-ng ships its own
  homegrown fallback database
* (later): merge dh-cruft into DebHelper when it's basically "done"
* (much much later): migrate some logic from DH to dpkg itself, with a
  more declarative packaging style; cruft-ng is already linked with
  the static library libdpkg and is bound to progress at the same pace.
* there is still a performance problem in cruft-ng that I wish to improve.
  Basic profiling can be done by setting the ELAPSED=1 env var.

Greetings,

Alexandre Detiste

./cpigs 30
496720816 apt
68957680 npm
61846660 linux-image-5.19.0-1-amd64 (the initrd)
61787431 linux-image-5.19.0-2-amd64
53131401 dlocate
36229735 aptitude
19621198 dpkg
17896745 plocate
13559874 jupyter-nbextension-jupyter-js-widgets
11982526 udev
11870208 openjdk-11-jre-headless
7257544 debconf
5704857 smartmontools
5685370 ttf-mscorefonts-installer
5086033 linux-image-5.18.0-4-amd64 -> rc state
4933502 grub-common
3550208 qgis
3523931 fontconfig
3421312 ucf
3231839 shared-mime-info
3063016 locales
2266947 libreoffice-common (files seen from explain/ucf)
1901483 grub-pc-bin
1565651 logrotate
1258042 man-db
1107968 ALTERNATIVES (I thought these were only symlinks ?)
783313 popularity-contest
763776 unattended-upgrades (du -b /var/log/unattended-upgrades/760422)
657496 breeze-icon-theme
625345 PYEXCEL (some pip3 automation)
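The idea in this message of registering glob patterns per package and then
ranking packages by the volatile data they own can be illustrated in
Python. This is a toy sketch, not how cruft-ng or dh-cruft are actually
implemented: the patterns, paths and sizes below are invented, and
Python's fnmatch semantics (where "*" also matches "/") may differ from
the matching rules the real tools use.

```python
from collections import defaultdict
from fnmatch import fnmatchcase

# HYPOTHETICAL registrations: package -> glob patterns it claims.
PATTERNS = {
    "logrotate": ["/var/lib/logrotate/*"],
    "man-db": ["/var/cache/man/*"],
}

# HYPOTHETICAL (path, size in bytes) pairs found on disk.
FILES = [
    ("/var/lib/logrotate/status", 1500),
    ("/var/cache/man/index.db", 4000),
    ("/var/cache/man/fr/index.db", 1000),
    ("/srv/unknown.bin", 10),
]


def owner_of(path, patterns):
    """Return the first package whose glob matches path, else None."""
    for package, globs in patterns.items():
        if any(fnmatchcase(path, glob) for glob in globs):
            return package
    return None


def biggest_producers(files, patterns, limit=30):
    """Sum sizes per owning package and rank them, largest first."""
    totals = defaultdict(int)
    for path, size in files:
        package = owner_of(path, patterns)
        if package is not None:  # unmatched files stay unattributed
            totals[package] += size
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


if __name__ == "__main__":
    # Print in the same "size package" layout as the listing above.
    for package, size in biggest_producers(FILES, PATTERNS):
        print(size, package)
```

Files that match no registered pattern, like the /srv entry here, are the
ones a cruft-style report would flag as unowned.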