Control: tags -1 pending

Colin Watson:
> On Sun, Sep 01, 2019 at 06:22:00AM +0000, Niels Thykier wrote:
>> Colin Watson:
>>> I think I might actually extend manconv instead; it already does a
>>> certain amount of what you need here and just needs autodetection of
>>> input encoding and the multiple-files interface.
>>>
>>> manconv is currently installed in man-db's libexecdir, but I could
>>> easily move it onto $PATH.  Since it isn't currently on $PATH, that
>>> would provide you with an easy way to test whether this new interface is
>>> supported (I could also add "manconv --has-bulk" or something, but I
>>> don't think it's necessary in this case).
>>
>> SGTM. :)
>
> For internal code organisation reasons it ended up being easier to add a
> new "man-recode" tool instead.
>
> Could you please try the tmp/recode-tool branch of
> https://git.savannah.gnu.org/cgit/man-db.git ?  To build it, something
> like this should work:
>
>   sudo apt build-dep man-db
>   ./bootstrap
>   ./configure --prefix=/usr --libexecdir=\${libdir}
>     --with-config-file=/etc/manpath.config --enable-mb-groff
>     --enable-silent-rules --with-db=gdbm
>   make -j4
>   make -j4 check
>
> You should then be able to run src/man-recode.
>
> Initial performance testing from my end: to convert all the pages in
> manpages-pl to UTF-8, it takes about 0.6 seconds.  This is cheating
> slightly because it takes a short cut in the case where the pages
> already appear to be in UTF-8; so if I instead tell it to convert to
> ISO-8859-2, it takes about 6.3 seconds.  Compared to about 122 seconds
> (without parallelisation) with "man -l --recode UTF-8", I think that's
> probably good enough.
>
> Thanks,
>
Hi Colin,

I got around to testing your branch and the tool seems to work as
advertised.  Indeed, performance is a lot better with this tool than
with "man --recode".

I have committed support to master for debhelper transparently using
man-recode when present instead of "man --recode" (closing this bug).

Please upload man-db with man-recode to Debian unstable at your
convenience. :)

Thanks,
~Niels

