Re: [gentoo-dev] omitting redirecting man pages from compression

2010-09-24 Thread Enrico Weigelt
* Peter Volkov wrote:
> On Sun, 19/09/2010 at 19:43 -0400, Mike Frysinger wrote:
> > many man pages exist merely as a redirect to another man page:
> > $ xzcat /usr/share/man/man1/zcat.1.xz
> > .so man1/gzip.1
> > 
> > compressing these tiny files (always?) results in a larger file.  that means we
> > aren't saving space, and we're adding overhead at runtime.
> 
> Isn't it better to skip compression on all tiny files (not only man
> pages)? In that case some other functions will need to be updated too
> (e.g. ecompress --suffix)...

Maybe it would be even better to mount a compressing filesystem
on /usr/share/man and /usr/share/info (or perhaps even the whole
/usr/share?), drop the explicit compression altogether, and
replace the redirect files with symlinks?


cu
-- 
--
 Enrico Weigelt, metux IT service -- http://www.metux.de/

 phone:  +49 36207 519931  email: weig...@metux.de
 mobile: +49 151 27565287  icq:   210169427 skype: nekrad666
--
 Embedded-Linux / Portierung / Opensource-QM / Verteilte Systeme
--



Re: [gentoo-dev] omitting redirecting man pages from compression

2010-09-21 Thread Mike Frysinger
On Tuesday, September 21, 2010 05:26:36 James Cloos wrote:
> Ulrich Mueller writes:
> > If we take the second route, then maybe it should be a more general
> > solution, i.e. exclude all tiny files (man page or not) from
> > compression?
> 
> First, from a user’s perspective, not compressing small files is a good
> thing.  Man pages perhaps most of all, given makewhatis, et al.
> (Think of all the C₁₂ which won’t be un-sequestered quite so soon. ☺ ;^)
> 
> Ideally, there would be some way to configure, per filesystem and/or per
> directory, what constitutes a small file.  If the fs uses fixed-size
> blocks then anything already smaller than one block needn’t be compressed.
> OTOH, if the fs supports partial block file packing, then a smaller
> threshold may be better.

probably not a bad idea, but i'm going to attempt the other route and avoid 
the whole issue (automatically turn .so into symlinks).  feel free to pursue 
this in the related EAPI bug ;).
-mike


signature.asc
Description: This is a digitally signed message part.


Re: [gentoo-dev] omitting redirecting man pages from compression

2010-09-21 Thread James Cloos
> "UM" == Ulrich Mueller writes:

UM> If we take the second route, then maybe it should be a more general
UM> solution, i.e. exclude all tiny files (man page or not) from
UM> compression?

First, from a user’s perspective, not compressing small files is a good
thing.  Man pages perhaps most of all, given makewhatis, et al.
(Think of all the C₁₂ which won’t be un-sequestered quite so soon. ☺ ;^)

Ideally, there would be some way to configure, per filesystem and/or per
directory, what constitutes a small file.  If the fs uses fixed-size
blocks then anything already smaller than one block needn’t be compressed.
OTOH, if the fs supports partial block file packing, then a smaller
threshold may be better.

Even for large files, if the compression fails to save any blocks then
it may be better to leave it uncompressed.
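
The block argument above can be put into a few lines of arithmetic. This is only a sketch under stated assumptions: the helper names blocks_used and worth_compressing are hypothetical, the filesystem is assumed to use fixed-size blocks with no partial-block packing, and the sizes and block size are passed in by the caller.

```sh
#!/bin/sh
# Sketch: decide whether compression frees at least one filesystem block.
# Assumes fixed-size blocks and no tail packing; names are hypothetical.

# blocks_used SIZE BLOCKSIZE: blocks needed to hold SIZE bytes (ceiling division)
blocks_used() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# worth_compressing ORIG_SIZE COMPRESSED_SIZE BLOCKSIZE
# Succeeds only if the compressed file occupies fewer blocks than the original.
worth_compressing() {
    [ "$(blocks_used "$2" "$3")" -lt "$(blocks_used "$1" "$3")" ]
}

# Example: a 100-byte file that shrinks to 60 bytes still occupies one
# 4096-byte block, so nothing is gained.
if worth_compressing 100 60 4096; then
    echo "compress"
else
    echo "leave uncompressed"
fi
```

On a filesystem that packs partial blocks, the same check could be run with a smaller effective block size, which is where a configurable threshold would come in.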

That said, some backup strategies may be better served by compressing
all but the smallest files.

Good heuristics for the default compress-or-don’t threshold should cover
most systems, but the ability to easily override the default is desirable.

-JimC
-- 
James Cloos  OpenPGP: 1024D/ED7DAEA6



Re: [gentoo-dev] omitting redirecting man pages from compression

2010-09-19 Thread Ulrich Mueller
> On Mon, 20 Sep 2010, Mike Frysinger wrote:

>> Isn't it better to skip compression on all tiny files (not only man
>> pages)? In that case some other functions will need to be updated
>> too (e.g. ecompress --suffix)...

> perhaps, but i think it should only be done on automatic dirs like
> docs/info/man.

It's not really an issue outside of /usr/share/man. On my system, I
count about 5600 tiny (smaller than 100 chars uncompressed) files in
man, 170 in doc, and none in info.
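
A count like this could be gathered roughly as follows. It is a sketch, not the command actually used: count_tiny_files is a hypothetical helper, and it measures on-disk size, so on a tree that is already compressed it reports compressed sizes rather than the uncompressed ones counted above. File names containing newlines would be miscounted.

```sh
#!/bin/sh
# Sketch: count regular files under a directory that are smaller than a
# given number of bytes. count_tiny_files is a hypothetical helper name.
count_tiny_files() {
    # $1: directory to scan, $2: size limit in bytes (exclusive)
    # $(( )) strips any padding that some wc implementations emit.
    echo $(( $(find "$1" -type f -size "-${2}c" | wc -l) ))
}

for dir in /usr/share/man /usr/share/doc /usr/share/info; do
    if [ -d "$dir" ]; then
        echo "$dir: $(count_tiny_files "$dir" 100)"
    fi
done
```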

Ulrich



Re: [gentoo-dev] omitting redirecting man pages from compression

2010-09-19 Thread Mike Frysinger
On Monday, September 20, 2010 01:59:33 Peter Volkov wrote:
> On Sun, 19/09/2010 at 19:43 -0400, Mike Frysinger wrote:
> > many man pages exist merely as a redirect to another man page:
> > $ xzcat /usr/share/man/man1/zcat.1.xz
> > .so man1/gzip.1
> > 
> > compressing these tiny files (always?) results in a larger file.  that means we
> > aren't saving space, and we're adding overhead at runtime.
> 
> Isn't it better to skip compression on all tiny files (not only man
> pages)? In that case some other functions will need to be updated too
> (e.g. ecompress --suffix)...

perhaps, but i think it should only be done on automatic dirs like 
docs/info/man.  as for the --suffix thing, where would that be an issue ?  
people already know they should never rely on these dirs being compressed with 
a predictable format.
-mike




Re: [gentoo-dev] omitting redirecting man pages from compression

2010-09-19 Thread Ulrich Mueller
> On Sun, 19 Sep 2010, Mike Frysinger wrote:

> many man pages exist merely as a redirect to another man page:
> $ xzcat /usr/share/man/man1/zcat.1.xz
> .so man1/gzip.1

> compressing these tiny files (always?) results in a larger file.  that means we
> aren't saving space, and we're adding overhead at runtime.

> two options which we can do transparently:
>   - rewrite the .so man pages into symlinks
>   - omit them from compression

> the latter is pretty easy (see below). any preferences on which
> route to take though as the former shouldn't be too hard either ...

With "controllable compression" in EAPI 4, /usr/share/man will no
longer be special in any way. (Currently, the part of prepman that
your patch changes won't even be reached in EAPI 4.)

If we take the second route, then maybe it should be a more general
solution, i.e. exclude all tiny files (man page or not) from
compression?

But I think that rewriting the .so files into symlinks would be
cleaner.
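
For illustration, the symlink route might look something like the sketch below. It is not the portage implementation: so_to_symlink is a hypothetical helper, it assumes uncompressed pages that sit exactly one directory below the man root (manN/page.N), and it assumes the .so target is given relative to that root, as in the zcat example.

```sh
#!/bin/sh
# Sketch: replace a redirect-only man page (".so manN/target.N") with a
# relative symlink. so_to_symlink is a hypothetical name, not portage code.
so_to_symlink() {
    # $1: man root (e.g. /usr/share/man), $2: page path relative to that root
    root=$1
    page=$2
    # Only act on files that consist of a single line.
    [ "$(( $(wc -l < "$root/$page") ))" -eq 1 ] || return 1
    # Extract the target of the ".so" request, if that is what the line is.
    target=$(sed -n 's/^\.so[[:space:]]\{1,\}\(.*\)$/\1/p' "$root/$page")
    [ -n "$target" ] && [ -f "$root/$target" ] || return 1
    # The target is relative to the man root, so climb out of manN/ first.
    ln -sf "../$target" "$root/$page"
}

# Usage (on a scratch copy, not a live system):
#   so_to_symlink /tmp/man-copy man1/zcat.1
```

Files that are not single-line .so redirects, or whose target does not exist, are left untouched and the function returns non-zero.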

Ulrich



Re: [gentoo-dev] omitting redirecting man pages from compression

2010-09-19 Thread Peter Volkov
On Sun, 19/09/2010 at 19:43 -0400, Mike Frysinger wrote:
> many man pages exist merely as a redirect to another man page:
> $ xzcat /usr/share/man/man1/zcat.1.xz
> .so man1/gzip.1
> 
> compressing these tiny files (always?) results in a larger file.  that means we
> aren't saving space, and we're adding overhead at runtime.

Isn't it better to skip compression on all tiny files (not only man
pages)? In that case some other functions will need to be updated too
(e.g. ecompress --suffix)...

-- 
Peter.




Re: [gentoo-dev] omitting redirecting man pages from compression

2010-09-19 Thread Mike Frysinger
On Sunday, September 19, 2010 19:50:57 Zac Medico wrote:
> On 09/19/2010 04:43 PM, Mike Frysinger wrote:
> > many man pages exist merely as a redirect to another man page:
> > $ xzcat /usr/share/man/man1/zcat.1.xz
> > .so man1/gzip.1
> > 
> > compressing these tiny files (always?) results in a larger file.  that means we
> > aren't saving space, and we're adding overhead at runtime.
> > 
> > two options which we can do transparently:
> > - rewrite the .so man pages into symlinks
> > - omit them from compression
> > 
> > the latter is pretty easy (see below).  any preferences on which route to
> > take though as the former shouldn't be too hard either ...
> 
> It feels like an insignificant optimization to me, but I don't feel
> strongly either way.

~19% of the man pages on my system appear to be forwarding files (glorified 
symlinks).  in my case, that's almost 3000 files.  considering things like 
`makewhatis` need to decompress & read all of these, i think the difference is 
worth addressing.
-mike




Re: [gentoo-dev] omitting redirecting man pages from compression

2010-09-19 Thread Zac Medico
On 09/19/2010 04:43 PM, Mike Frysinger wrote:
> many man pages exist merely as a redirect to another man page:
> $ xzcat /usr/share/man/man1/zcat.1.xz
> .so man1/gzip.1
> 
> compressing these tiny files (always?) results in a larger file.  that means we
> aren't saving space, and we're adding overhead at runtime.
> 
> two options which we can do transparently:
>   - rewrite the .so man pages into symlinks
>   - omit them from compression
> 
> the latter is pretty easy (see below).  any preferences on which route to take
> though as the former shouldn't be too hard either ...

It feels like an insignificant optimization to me, but I don't feel
strongly either way.
-- 
Thanks,
Zac



[gentoo-dev] omitting redirecting man pages from compression

2010-09-19 Thread Mike Frysinger
many man pages exist merely as a redirect to another man page:
$ xzcat /usr/share/man/man1/zcat.1.xz
.so man1/gzip.1

compressing these tiny files (always?) results in a larger file.  that means we
aren't saving space, and we're adding overhead at runtime.
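
The size claim is easy to check on a scratch file. The sketch below is an illustration only: is_larger_compressed is a hypothetical helper, and gzip stands in for xz on the assumption that it is the more widely installed of the two; the fixed header and trailer of either format outweigh a one-line file.

```sh
#!/bin/sh
# Sketch: check whether compressing a file makes it larger.
# gzip is used here instead of xz (assumption: gzip is installed).
is_larger_compressed() {
    # $1: path to an uncompressed file
    # $(( )) normalizes wc output, which some implementations pad.
    orig=$(( $(wc -c < "$1") ))
    comp=$(( $(gzip -c "$1" | wc -c) ))
    [ "$comp" -gt "$orig" ]
}

tmp=$(mktemp)
printf '.so man1/gzip.1\n' > "$tmp"
if is_larger_compressed "$tmp"; then
    echo "compression makes it larger"
else
    echo "compression saves space"
fi
rm -f "$tmp"
```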

two options which we can do transparently:
- rewrite the .so man pages into symlinks
- omit them from compression

the latter is pretty easy (see below).  any preferences on which route to take
though as the former shouldn't be too hard either ...

--- a/bin/ebuild-helpers/ecompressdir
+++ b/bin/ebuild-helpers/ecompressdir
@@ -13,6 +13,7 @@ case $1 in
     --ignore)
         shift
         for skip in "$@" ; do
+            skip=${skip#${D}}
             [[ -d ${D}${skip} || -f ${D}${skip} ]] \
                 && touch "${D}${skip}.ecompress.skip"
         done
--- a/bin/ebuild-helpers/prepman
+++ b/bin/ebuild-helpers/prepman
@@ -27,6 +27,10 @@ for subdir in "${mandir}"/man* "${mandir}"/*/man* ; do
     [[ -d ${subdir} ]] && really_is_mandir=1 && break
 done
 
-[[ ${really_is_mandir} == 1 ]] && exec ecompressdir --queue "${mandir#${D}}"
+if [[ ${really_is_mandir} == 1 ]] ; then
+    ecompressdir --queue "${mandir#${D}}" || exit 1
+    # compressing small files just adds overhead
+    find "${mandir}" -type f '!' -size +100c -print0 | ${XARGS} -0 ecompressdir --ignore
+fi
 
 exit 0
-mike

