On Wed, Aug 5, 2009 at 5:28 AM, Roland Mainz <[email protected]> wrote:
>
> Hi!
>
> ----
>
> For a while I have been thinking about how we can handle space problems
> on the LiveCD without a performance penalty caused by compression on
> normal systems.
> Random thoughts on the problem:
> 1. We need to conserve space on the LiveCD as much as possible.
> 2. We mainly try to use compression, but decompression needs lots of
>    CPU time.
> 3. Filesystem-based compression is usually inefficient compared to
>    application-specific compression (for example, the FreeType2 library
>    supports transparent compression of font files, which is usually more
>    efficient than using ZFS with GZIP, e.g. memory no longer needed is
>    free'd and doesn't use up filesystem cache pages).
> 4. Other methods may be worth considering; for example we have
>    /usr/bin/shcomp as a way to precompile shell scripts into binaries
>    (which saves CPU time and memory, improves performance, and reduces
>    startup time).
>
> These items lead to an old idea:
> What about allowing one file entry in the IPS database to be represented
> by multiple different "mutations" of a file, i.e. one and the same
> source file is "mutated" into different formats depending on platform
> attributes?
>
> Example 1:
> We have a font file called "songti.ttf" (a large Chinese font) whose
> size is only a problem when we're low on disk space:
> 1. The source file is called "songti.ttf".
> 2. The "mutation" for LiveCD targets would be gzip-compressed, e.g.
>    "songti.ttf.gz".
> 3. The "mutation" for embedded systems with a minimum of disk space
>    would be bzip2-compressed, e.g. "songti.ttf.bz2".
>
> Example 2:
> There are _lots_ of shell scripts in /lib/svc/method/ which could be
> greatly reduced in size by compiling them.
> A quick comparison shows this:
> $ du -ks compiled normal
> 420     compiled
> 1464    normal
> i.e. there is a ~1MB difference between the two directories.
>
> The concept of adding "file mutations" unfortunately has one flaw:
> - If the file name changes (as in example 1, which is unavoidable since
>   the FreeType2 code mainly looks at the file name and not at file
>   signatures), other files (in the fonts case the "fonts.dir" files)
>   need to be changed too, which creates dependency trees or even
>   graphs ...
>
> ... comments/rants/ideas/etc. welcome ...
>
> ----
>
> Bye,
> Roland
>
> --
>   __ .  . __
>  (o.\ \/ /.o) [email protected]
>   \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
>   /O /==\ O\  TEL +49 641 3992797
>  (;O/ \/ \O;)
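Roland's Example 1 can be sketched with plain gzip and bzip2 from the command line. This is a generic illustration only; "sample.ttf" is a compressible dummy file generated on the spot, not the actual songti.ttf:

```shell
# Sketch of the two "mutations" from Example 1 above: one gzip-compressed
# variant for the LiveCD and one bzip2-compressed variant for embedded
# targets.  "sample.ttf" is a stand-in file generated here, not a real font.
yes "repetitive glyph table data for compression demo" | head -n 2000 > sample.ttf
gzip  -9 -c sample.ttf > sample.ttf.gz    # LiveCD "mutation"
bzip2 -9 -c sample.ttf > sample.ttf.bz2   # embedded "mutation"
ls -l sample.ttf sample.ttf.gz sample.ttf.bz2
```

Both compressed variants end up far smaller than the source for repetitive data like this; for real TrueType fonts the ratio is less dramatic but still substantial.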
Hi Roland,

some interesting thoughts indeed! However, there are already different
options, as you know. In terms of LiveMEDIA I think the best of these
are clofi and Dcfs.

.:*~*:._.:*~*:._.:*~*:._.:*~*:._.:*~*:._.:*~*:._.:*~*:._.:*~*:._.:*~*:.
!!!CREDITS TO MOINAK GHOSH and BELENIX!!!
http://moinakg.wordpress.com/page/2/
http://moinakg.wordpress.com/2008/07/23/compression-in-ramdisk-dcfs/
also to !!!ALEXANDER EREMIN and MILAX!!!
.:*~*:._.:*~*:._.:*~*:._.:*~*:._.:*~*:._.:*~*:._.:*~*:._.:*~*:._.:*~*:.

How is application-based (de-)compression more efficient than Dcfs? I
think memory (de-)allocation and (un-)caching should be equivalent from
a pure performance-efficiency standpoint. What gives Dcfs the lead is
the fact that it doesn't introduce logical fragmentation (different
flavors of a file, with equal and/or distinct file names, etc.). Dcfs
and clofi are designed around http://en.wikipedia.org/wiki/Zlib, with a
pluggable, command-line-specifiable compression algorithm option:
"-C {gzip | gzip-N | lzma}".

Next point: when you are discussing LiveMEDIA (especially read-only
media), how is IPS actually involved with its "mutated files"
capability? This would only be relevant to normal hdd installations,
not to LiveMEDIA.

Next+1: you mentioned
RM> more efficient (e.g. memory no longer needed is free'ed and doesn't use
RM> up filesystem cache pages) than using ZFS with GZIP)
While I'm no genius Jeff Bonwick and didn't design or implement ZFS,
and hence cannot make a statement concerning ZFS's efficiency (although
it "feels" good on my systems), ZFS is completely out of the question
here. Already in 2007 Moinak wrote and talked about ZFS being a bad
choice for LiveCDs, due to the overhead it adds.
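The "gzip-N" part of that option (choosing a compression level) can be illustrated with stock gzip; this is a generic sketch of the level trade-off, not Dcfs or clofi themselves:

```shell
# Generic illustration of the "gzip-N" idea: higher levels spend more
# CPU time for a (usually) smaller result.  Plain gzip is used here,
# standing in for the pluggable algorithm option of Dcfs/clofi.
yes "repetitive ramdisk payload line" | head -n 5000 > payload.dat
gzip -1 -c payload.dat > payload.fast.gz   # cheapest level
gzip -9 -c payload.dat > payload.best.gz   # most expensive level
wc -c payload.dat payload.fast.gz payload.best.gz
```

On a read-mostly medium like a LiveCD the decompression cost matters more than the compression cost, which is why spending extra CPU once at mastering time (level 9, or lzma) is usually the right trade.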
That overhead comes from functionality which is essential and grand for
hdd file systems, yet which is simply not used on read-only
http://en.wikipedia.org/wiki/ISO_9660 file systems; there you would pay
a price for unused, redundant functionality such as ZFS's copy-on-write
transactional object model. Therefore neither BeleniX, MilaX, Schillix,
Nexenta, nor Sun ever used ZFS for any LiveCD: neither on the CD's
root, nor for the micro-/mini-root ramdisk images (ufs!), nor for usr.

Last but not least: compiling shell scripts may increase execution
performance, so in some cases it might be advisable. But then you would
usually *add* the compiled versions alongside the ascii versions, with
a distinct file name suffix, and keep the ascii versions intact (and
therefore you would actually increase space consumption). The same can
be said for most scripting languages. IMHO, exclusively shipping
compiled shell scripts would violate the nature of shell scripts being
exactly this: scriptable.

Regards,
Martin

_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss
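For readers who haven't used it: shcomp is the ksh93 script compiler Roland's Example 2 refers to. A minimal sketch, guarded because shcomp ships with ksh93 and may not be installed everywhere; the script name is invented for the demo:

```shell
# Minimal shcomp sketch: compile a throwaway ksh script to ksh93 byte
# code and compare sizes.  Guarded, since shcomp may not be installed.
cat > method_demo.sh <<'EOF'
#!/bin/ksh
echo "starting demo service"
EOF
if command -v shcomp >/dev/null 2>&1; then
    shcomp method_demo.sh method_demo.shbin   # byte-code output
    ls -l method_demo.sh method_demo.shbin    # compare the two sizes
else
    echo "shcomp not available; skipping compile step"
fi
```

The size win Roland measured with du comes from many such scripts; as Martin notes, the compiled form is no longer human-editable, which is the trade-off being debated.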
