On Mon, Jan 03, 2022 at 08:34:46PM +0100, Steinar H. Gunderson wrote:
> I've noticed that during upgrades to bullseye, the man-db trigger now seems to
> take a very long time; in fact, it frequently seems to be a significant time of
> the total time of the full-upgrade (depending, of course, on network speeds).
> 
> After digging a bit, I think I've figured out why I haven't seen this before;
> it's become a _lot_ slower recently, so it wasn't on the radar before. It seems
> that due to a combination of the architecture chosen for these operations
> (a pipeline system, seemingly forking off subprocesses and running lots of
> syscalls in the process)

This part is already covered in https://bugs.debian.org/696503.  I admit
that this was last updated ten years ago; I really need to get back to
that and get it to work one way or another. :-/

> and a new sandbox model (based on seccomp, which needs a lot of setup
> for each new subprocess and adds considerable overhead to each
> syscall),

This is an interesting observation.  I do think that seccomp is a valuable
defence in general given the amount of rather ad-hoc parsing that's
going on, but I hadn't noticed that it made such a big difference to
mandb performance.  (About 3ms per subprocess here purely for setup, I
think.)

The seccomp sandbox is mainly so that man-db can more safely operate on
manual pages shipped by packaging systems that expect to run code under
confinement and thus can ship relatively untrusted code (such as snapd).
If we limited the mandb invocation in the postinst to only
/usr/share/man, we could reasonably assume that all pages there were
installed by dpkg, and those don't need the same level of care since any
package installed by dpkg can run code directly as root anyway.  It
would then be reasonable to run it under MAN_DISABLE_SECCOMP=1.

This limitation would make apropos a bit less useful for third-party
packages installed by dpkg that ship manual pages under something like
/opt/*/man, but those are relatively rare, and the existing cron job /
systemd timer will catch up later so it wouldn't be fatal to make such
cases a little less ergonomic.
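
A minimal sketch of the restricted invocation discussed above, assuming
the trigger simply runs mandb against /usr/share/man with the sandbox
switched off.  This is not the actual man-db postinst code: the helper
names are hypothetical and the choice of --quiet is an assumption.  Only
MAN_DISABLE_SECCOMP=1 and the /usr/share/man argument come from the
discussion.

```python
import os
import subprocess


def restricted_mandb_command(man_dir="/usr/share/man"):
    """Build the command line and environment for a mandb run limited to
    a single hierarchy, with the seccomp sandbox disabled.

    Hypothetical helper for illustration only.
    """
    env = dict(os.environ)
    # Pages under /usr/share/man are assumed to be installed by dpkg, so
    # the sandbox is turned off for this run.
    env["MAN_DISABLE_SECCOMP"] = "1"
    # Assumption: run quietly; the explicit path keeps mandb away from
    # other hierarchies such as /opt/*/man.
    cmd = ["mandb", "--quiet", man_dir]
    return cmd, env


def run_restricted_mandb(man_dir="/usr/share/man"):
    """Run the restricted mandb invocation and return its exit status."""
    cmd, env = restricted_mandb_command(man_dir)
    return subprocess.run(cmd, env=env, check=False).returncode
```

Hierarchies outside /usr/share/man are left untouched by such a run and
would only be picked up by the periodic cron job / systemd timer.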

> As an example, a complete mandb run (mandb -c) on my laptop takes 3 minutes and
> 39 seconds. (Of course, the man-db trigger is incremental, but on a full-upgrade,
> many man pages are likely to be updated.) If I set MAN_DISABLE_SECCOMP=1,
> it drops to 1:41. If I recompile without HAVE_LIBSECCOMP, it's down to 51 seconds!

The difference between MAN_DISABLE_SECCOMP=1 and building without
libseccomp looks like a relatively simple fix, at least:

  https://gitlab.com/cjwatson/man-db/-/commit/50200d151dfedb9d5064ec7008c09f6cf0f5ee24

Results on my system for "mandb -c /usr/share/man", roughly confirming
your findings:

  status quo:                            5m08s
  MAN_DISABLE_SECCOMP=1:                 2m12s
  MAN_DISABLE_SECCOMP=1 plus 50200d151d: 1m17s
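
To put the figures above on a common scale, the following sketch
converts each timing to seconds and computes the speedup relative to the
status quo.  The timings are copied from the table; the helper name and
the structure are only illustrative.

```python
import re


def to_seconds(stamp):
    """Convert a timing such as '5m08s' into whole seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised timing: {stamp!r}")
    return int(match.group(1)) * 60 + int(match.group(2))


# Timings for "mandb -c /usr/share/man", as reported in the table above.
timings = {
    "status quo": "5m08s",
    "MAN_DISABLE_SECCOMP=1": "2m12s",
    "MAN_DISABLE_SECCOMP=1 plus 50200d151d": "1m17s",
}

baseline = to_seconds(timings["status quo"])
speedups = {
    label: baseline / to_seconds(value) for label, value in timings.items()
}

for label, value in timings.items():
    print(f"{label}: {to_seconds(value)}s, {speedups[label]:.2f}x")
```

On these numbers, disabling the sandbox is a bit over 2.3 times faster
than the status quo, and adding the commit brings that to 4 times.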

> I don't honestly know what this database is for, but my guess is that it is
> for the apropos command, which seems rare. (At least nothing obviously bad
> happens to my man usage if I “rm -r /var/cache/man/*”, but apropos stops working.)
> Is it possible to either speed up man-db so that it takes less time to build
> its database, or otherwise perhaps split apropos out into a separate
> not-installed-by-default package, so that normal installations do not need to
> take this installation hit? Or maybe move man-db updating to cron?

I really don't want to make apropos less useful.  I'll have another go
at tackling the core problem here; but the fixes described above should
at least get us back to merely the situation we've had for years, rather
than the (as you say) much worse behaviour recently.

Given #696503, would you mind if I repurposed this bug to be just about
fixing the seccomp-related performance regression?  I think the plan
above will let me deal with just that part of it relatively quickly, and
the fixes would be small enough to be viable candidates for stable.

-- 
Colin Watson (he/him)                              [cjwat...@debian.org]
