[ Thanks for the Cc, I'm indeed not subscribed to -i18n. ]

Christian PERRIER <[email protected]> (2013-10-01):
> So, it seems that:
> - the virtual machine doesn't have that much memory (2GB)
> - it doesn't have much swap
> - clamd is eating a lot of memory
>
> clamd seems to be running for 17 days, about a week after we started
> to have some issues with statistics.
>
> If I had root access to this machine, I would:
> - restart clamd
> - add more swap
> - eventually add more memory

The machine is quite limited in RAM, yeah, and maybe clamd shouldn't be
using that much memory, but restarting clamd or adding swap might only
work around the issue for a given number of days (see analysis below).

> Anyway, I applied your patch and we'll see what happens

I might have mentioned on #debian-i18n, or maybe only to David, that my
test had been running for a while when I posted my patches, and it made
it to the end. :)

Now, looking into what happens:
 - dl10n-check looks at source packages, and creates a $deb object by
   reading the tarball with parse_tarball; its type is
   Debian::Pkg::DebSrc, built on top of Debian::Pkg::Tar, which is an
   *in-memory* tar processor; see its description:

     "This package is the base class for all C<Debian::Pkg> classes.
     Unlike most tar processors, this one does perform all operations
     in memory, but retrieves only specified files, so it should not
     consume too much memory if you are specific enough."

 - its implementation consists of opening the file for decompression
   through: "{gzip,bzip2,xz} -dc $file |". That one explodes for 0a-data
   with its 450 MB xz archive (1.1 GB uncompressed, not fitting into 2
   GB RAM!), and error handling is poor.
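
To illustrate the "retrieves only specified files" idea without holding the whole archive in memory, here is a minimal sketch in Python (not the Perl of Debian::Pkg::Tar, and not its actual implementation). The function name read_matching_files, the max_bytes cap and the example pattern are assumptions made for this illustration only. It reads the compressed tarball as a forward-only stream and keeps just the members that match.

```python
import fnmatch
import tarfile


def read_matching_files(tarball_path, patterns, max_bytes=1024 * 1024):
    """Return {member name: bytes} for regular files matching any pattern.

    Hypothetical helper: members larger than max_bytes are skipped so that
    a single huge file cannot exhaust memory.
    """
    found = {}
    # "r|*" opens the archive as a forward-only stream with transparent
    # gzip/bzip2/xz detection; only the current member is ever buffered.
    with tarfile.open(tarball_path, mode="r|*") as archive:
        for member in archive:
            if not member.isfile():
                continue
            if not any(fnmatch.fnmatch(member.name, p) for p in patterns):
                continue
            if member.size > max_bytes:
                continue
            found[member.name] = archive.extractfile(member).read()
    return found
```

With such a loop, memory use is bounded by the size of the matching files rather than by the 1.1 GB of uncompressed data, at the cost of still decompressing the whole stream once.
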

I'm wondering whether the following wouldn't be better:
 - use the nifty "dpkg-source -x" to inspect the source package; at the
   moment, it doesn't seem to support dpkg-deb's --fsys-tarfile which
   could have been used to pipe contents to tar, where filtering would
   happen. Since "dpkg-source -x" is merely a wrapper for "extract" in
   the Dpkg::Source::Package module, I guess one could implement an
   option in that module, which would pass it on to the relevant
   package format handler (Dpkg::Source::Package::*), to only unpack
   the files which would be specified.
 - since the various search_* subs in dl10n-check contain some file
   patterns, all those could be passed to the said option, so that
   dpkg-source -x only deals with the files one cares about.
 - that also means you get support for all dpkg formats for free
   (think multitarballs for 3.0 formats).
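
The selective unpacking suggested above could look roughly like the following Python sketch. It is not dpkg-source and does not use Dpkg::Source::Package; the function unpack_matching and its parameters are hypothetical, and it ignores what a real format handler does (stripping the top-level directory, applying patches, and so on). It only shows that one pattern list can be applied across several tarballs while writing nothing but the matching files to disk.

```python
import fnmatch
import os
import shutil
import tarfile


def unpack_matching(tarballs, patterns, dest_dir):
    """Unpack only regular files matching `patterns` from each tarball.

    Hypothetical helper: returns the list of paths that were written.
    """
    written = []
    dest_root = os.path.realpath(dest_dir)
    for tarball in tarballs:
        # Stream each archive; compression is detected automatically.
        with tarfile.open(tarball, mode="r|*") as archive:
            for member in archive:
                if not member.isfile():
                    continue
                if not any(fnmatch.fnmatch(member.name, p) for p in patterns):
                    continue
                target = os.path.realpath(os.path.join(dest_root, member.name))
                # Refuse member names that would land outside dest_dir.
                if os.path.commonpath([dest_root, target]) != dest_root:
                    continue
                os.makedirs(os.path.dirname(target), exist_ok=True)
                with open(target, "wb") as out:
                    # Chunked copy, so a large match is never fully buffered.
                    shutil.copyfileobj(archive.extractfile(member), out)
                written.append(target)
    return written
```

A caller would pass every tarball of the source package (for instance the orig and debian tarballs of a 3.0 (quilt) package) together with the patterns collected from the search_* subs.
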
Not sure I'm going to be the one trying to PoC-ify it though. :/
Mraw,
KiBi.

