Space optimisations: OpenJDK runtime .jar

2010-11-06 Thread Louis Simard
Matthias,

In a posting from 'More LiveCD space optimizations' [1], you asked if
the Java VM would open recompressed .jar files as quickly as the
originals. I wrote a benchmark to assess this, by using the largest
.jar there is: the Java library itself! :)

I've split off this benchmark result to its own e-mail thread because
Java is not on the LiveCD, and it has more of a special interest to
folks running servers. But it's still a space optimisation.

Attached is a benchmark program; please download it if you want to
reproduce this benchmark on your machine.

* * * EXECUTIVE SUMMARY  * * *

In the OpenJDK package, most of the Java class library is implemented
as a single file, rt.jar, and that file is uncompressed. Compressing
it with 'advzip -z4 rt.jar' saves 32,184 KiB out of 60,064 KiB
(53.5%).

Loading 894 classes from the Java library became 32 milliseconds (16%)
slower. I believe this class count is representative of what an
application server such as JBoss or GlassFish loads.

* * * END OF EXECUTIVE SUMMARY * * *

-- Preparing for the benchmark --

sudo apt-get install openjdk-6-jre advancecomp
cp /usr/lib/jvm/java-6-openjdk/jre/lib/rt.jar rt.jar; sync
cp rt.jar rt-recompressed.jar
advzip -z4 rt-recompressed.jar

# [At this stage, rt.jar is 60064 KiB and rt-recompressed.jar is 27880 KiB]
# Make sure you've downloaded the attachment and saved it to
# $PWD/ClassLoadTest.java.gz before running the rest.

gunzip ClassLoadTest.java.gz && javac -g:none -source 1.6 -target 1.6 ClassLoadTest.java
for i in `seq 1 10`; do java ClassLoadTest; done
sudo cp rt-recompressed.jar /usr/lib/jvm/java-6-openjdk/jre/lib/rt.jar
for i in `seq 1 10`; do java ClassLoadTest; done

# Cleanup
sudo cp rt.jar /usr/lib/jvm/java-6-openjdk/jre/lib/rt.jar
rm ClassLoadTest.java ClassLoadTest.class rt.jar rt-recompressed.jar; sync

-- Methodology --

I ran this test on a computer with 4 GB of RAM and a dual-core 2.6 GHz
AMD processor, held at 2.6 GHz with the Performance frequency
selector. The test does not use threads, so it runs on a single core
at 2.6 GHz.

The Java program outputs the number of nanoseconds it takes the Java
VM to load all of the classes named in the 'loads' variable and all of
their dependencies (to see the number of classes loaded, use 'java
-XX:+TraceClassLoading ClassLoadTest | grep Loaded | wc -l'). For the
results below, I have removed the fastest and the slowest run before
calculating the average time.

-- Results --

Original rt.jar, nanoseconds: 192534058, 197527724, 194842272,
196013535, 188327043 [fastest], 191867030, 200329576, 202827722
[slowest], 195906590, 199961018

Recompressed rt.jar, nanoseconds: 208085265 [fastest], 230182021,
218891572, 221611904, 231577539, 232149021, 228110533, 230476584,
232480308 [slowest], 229873766

Original rt.jar: 196 milliseconds
Recompressed rt.jar: 228 milliseconds

Speed regressed by 32 milliseconds (16%) to load 894 classes.
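
The averaging described under Methodology (drop the fastest and the
slowest run, then take the mean of the rest) can be sketched in shell
as below. This is only an illustration and not part of the attached
benchmark; the function name trimmed_mean_ms is made up here.

```shell
# trimmed_mean_ms: read one nanosecond timing per line on stdin, drop the
# single fastest and single slowest run, and print the mean of the
# remaining runs in whole milliseconds (rounded to nearest).
trimmed_mean_ms() {
    sort -n | awk '
        { t[NR] = $1 }
        END {
            if (NR < 3) { print "need at least 3 runs" > "/dev/stderr"; exit 1 }
            # After sort -n, line 1 is the fastest and line NR the slowest.
            for (i = 2; i < NR; i++) sum += t[i]
            printf "%.0f\n", sum / (NR - 2) / 1e6
        }'
}
```

Fed the ten "Original rt.jar" timings above, one per line, it prints
196; fed the ten recompressed timings, it prints 228.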

[1] 
https://lists.ubuntu.com/archives/ubuntu-devel-discuss/2010-October/012183.html
--
Louis Simard
Conspicuous absence of digital signature here


ClassLoadTest.java.gz
Description: GNU Zip compressed data
-- 
Ubuntu-devel-discuss mailing list
Ubuntu-devel-discuss@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel-discuss


Re: LiveCD optimisations

2010-11-01 Thread Louis Simard
Hi Martin,

Thanks for the notification.

While you're working on PNG optimisations in the build scripts, I have
something to ask you.

There has been discussion on the More LiveCD space optimisations
thread [1] of using AdvanceCOMP to further reduce the size of PNG
files (even after OptiPNG, PNG files can get recompressed further!).
There has also been discussion of using jpegoptim to losslessly
recompress JPEG files, and AdvanceCOMP for ZIP/JAR and gzip files.

AdvanceCOMP is packaged in maverick universe as advancecomp.
jpegoptim is packaged in maverick universe as jpegoptim.

Could these programs be added to the build scripts, or would that be
discouraged since they're in universe? Would these optimisations be
a case for inclusion into main?

Regarding this:
 I'll package scour, and add it to cdbs gnome.mk with some test cases.

Thanks for this. Scour also has a fair amount of unit tests and other
test cases that you could use.

If you need to communicate with Scour for packaging adjustments, bugs
or gaps in documentation, don't hesitate to file bugs and/or patches
against Scour, or e-mail me. I've been co-maintainer of Scour since
June 2010; although I can't do releases, I can commit to the trunk.

[1] 
https://lists.ubuntu.com/archives/ubuntu-devel-discuss/2010-October/012181.html

Regards,
--
Louis Simard
Conspicuous absence of digital signature here



Re: More LiveCD space optimizations

2010-10-10 Thread Louis Simard
Sorry for the 4th post in a row, but I added a script that uses
AdvanceCOMP to recompress the .gz files that aren't man pages, and I
had to share my findings.

AdvanceCOMP? | ISO size (B) | Install (KiB)
-------------+--------------+--------------
          No |  711,032,832 |     2,474,660
         Yes |  707,821,568 |     2,469,568
-------------+--------------+--------------
     Savings |    3,211,264 |         5,092

The script is attached.

Due to ext4 extent allocation and the order of the files on the CD,
the reordered CD made by 98make-disc boots faster, but its installed
size is 180 MB bigger, so this new mksquashfs ordering (in
98make-disc) is a tradeoff. This new ordering is not used in the
actual CD building process, though I filed a bug for it [1].

Should I revert to the default ordering done by mksquashfs or start
using ext3 installations to compensate, for testing?

- Louis

[1] https://bugs.launchpad.net/ubuntu/+source/livecd-rootfs/+bug/589629


92gz-optimisation-experimental
Description: Binary data


Re: More LiveCD space optimizations

2010-10-08 Thread Louis Simard
2010-10-08 09:54 GMT Matthias Klose d...@ubuntu.com:
 In the past we did see wasted space:

  - Packages which should not be on the CD.  Some things should not be
   on the CD at all.  Looking at the current live CD log, a typical
   candidate for this would be tcl8.4. Why is it there, and how can
   it be avoided?

foo2zjs made APT install that package.

$ aptitude why tcl8.4
i   tk8.4 Depends tcl8.4 (= 8.4.16)
$ aptitude why tk8.4
i   foo2zjs Recommends tk8.4

I'm sure there will be other examples, though I'm not as familiar with
the LiveCD's packages as you guys at Canonical.



Re: More LiveCD space optimizations

2010-10-08 Thread Louis Simard
Apologies for the previous attachment, it didn't have the addition for
man-page symbolic links.

I attach the proper one this time.

- Louis


ubuntu-opt.tar.gz
Description: GNU Zip compressed data


Re: More LiveCD space optimizations

2010-10-07 Thread Louis Simard
2010-10-07 16:29 GMT Martin Owens docto...@gmail.com:
 On Fri, 2010-10-08 at 00:07 +0800, John McCabe-Dansted wrote:
 Strangely, even running advzip -z -0
 images_human.zip shrinks it by 3%, and even shrinks the corresponding
 images_human.zip.gz file

 That's not strange, that's just entropic packing principles. You've got
 a bunch of assumptions that can be made about data and a bunch of
 compression iterations, each make assumptions about the nature of the
 data and some are fitting together better.

 I'm keen on this work since saving space allows for all sorts of
 goodies. Did we save space with any of the SVG cleaning or did that need
 to be brought up to the packaging level?

 Martin,



Back in May, the preliminary testing I did on the LiveCD's .svg files
resulted in the finding that using Scour on them saved about 7 MB [1].
Of course, not only the LiveCD's packages use .svg files, and it would
be important to get that to other packages as well, for download
times/bandwidth use, if for no other reason. Perhaps rendering speed
would increase too, in SVG's case, but the other file formats
discussed in this thread have different characteristics.

So it needed to be brought up at the packaging level [1]. Scour will
probably itself need to be packaged too, to be included as
build-depends for packages that have SVG files (which is a lot of
application packages, since most have an SVG icon) to work well with
'apt-get source'.

[1] https://lists.ubuntu.com/archives/ubuntu-devel-discuss/2010-May/011505.html



Re: More LiveCD space optimizations

2010-10-07 Thread Louis Simard
* LONG MESSAGE WARNING *
While I've tried to reduce the quotes and quote nesting as much as I
could, this message is still long. It is still important to read, when
you have time.

2010-10-07 16:07 GMT John McCabe-Dansted gma...@gmail.com:
 On Thu, Oct 7, 2010 at 10:05 AM, Louis Simard louis.sim...@gmail.com wrote:
 snipped

 I think this will be discussed at UDS-N, see:
 http://archives.free.net.ph/message/20101004.065026.e553efd1.en.html

Awesome! Should a digest of this conversation be posted to
ubuntu-devel once we're done, with the discussion continuing on
ubuntu-devel-discuss for now?

 2010-10-06 16:08 GMT John McCabe-Dansted gma...@gmail.com:
 [...] I note that we can save further space by:

 1) Using advdef on the png files in addition to optipng. This is what
 optimizegraphics does, and this shrinks the pngs on the Maverick RC
 liveCD from about 100.1MB to 85.3MB providing a saving of 14.8MB.

 We could test each file [after using advpng on them]
 to ensure the image is identical, perhaps
 using pngtopnm, and md5sum. This would be especially important for
 jpegrescan/jpgcrush, which is at version 0.0.0-1.

Good idea. I may be able to integrate this test into my script as an option.
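
A minimal sketch of such a check, assuming pngtopnm (netpbm) and
md5sum are installed. The function names and the PNG_DECODER override
are made up for illustration and are not taken from either script. As
far as I know, pngtopnm writes only the colour channels by default, so
transparency would need a second pass with its -alpha option.

```shell
# png_pixel_sum FILE: print an MD5 of the decoded pixel data of FILE.
# PNG_DECODER is a hypothetical override hook; it defaults to pngtopnm.
png_pixel_sum() {
    tmp=$(mktemp) || return 1
    if "${PNG_DECODER:-pngtopnm}" "$1" > "$tmp"; then
        md5sum < "$tmp" | cut -d' ' -f1
        rm -f "$tmp"
    else
        # Decoder failed: report it instead of hashing partial output.
        rm -f "$tmp"
        return 1
    fi
}

# same_pixels A B: exit 0 if both decode to identical pixel data,
# 1 if they differ, 2 if either file could not be decoded.
same_pixels() {
    sum_a=$(png_pixel_sum "$1") || return 2
    sum_b=$(png_pixel_sum "$2") || return 2
    [ "$sum_a" = "$sum_b" ]
}
```

A script could then keep a recompressed file only when
'same_pixels original.png recompressed.png' succeeds.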

 2) Recompressing gz files with advdef. Using advdef, we can shrink the
 gz files from 89.5MB to 84.8MB, [...] a saving of 4.7MB.

 [...] I did use 7zip's Deflate compressor to recompress a
 .zip file of OpenOffice.org's from 5.9 MB to 5.4 MB. [...]

 You mean images_human.zip?

Yes, thanks. :) I had forgotten the name.

 I have a hunch that compressing that file
 wouldn't actually save space on the liveCD as I can gzip it down to
 3.9MB. It may be better to leave it as an uncompressed zip, and let
 squashfs deal with it.

Per that Performance - Disk footprint thread from ubuntu-devel
[brainstorm], we may actually want to also care about the installed
size, and use the 7zip recompression. While it's not going to be
*perfectly optimal*, reducing both the CD footprint and the installed
size by 0.5 MB using 7zip sounds better than reducing the CD footprint
by 2 MB, but increasing the installed size by more than 2 MB. And if
you managed to re-gzip the zip, squashfs will also manage to re-lzma
the zip for more savings and still a decent installed size. You should
test this again with lzma, I think.

 Recompressing the pngs contained in the zip
 sounds worthwhile though. Strangely, even running advzip -z -0
 images_human.zip shrinks it by 3%, and even shrinks the corresponding
 images_human.zip.gz file

I believe you there, only because the original situation has a
deflated container (png) within another deflated container (zip).
Counter-intuitive, but something to consider.

 Also, there are 12MB of jar files, which are basically zip files. We
 can also shrink those by 5MB or so with advzip, but that doesn't seem
 to shrink a .tgz of them so it may not shrink the liveCD. Since zip
 files compress file by file, we may be able to save space on the
 liveCD by running advzip -z -0 on them. That would expand them to
 24MB, but reduces the size of a .tgz of them to 4.6MB, possibly saving
 space on the liveCD if squashfs is similarly efficient.

Later post by Matthias Klose
 same for jar files. are these extracted as fast as without your changes by the
 jvm? if not, then these should be left alone (and afaik there shouldn't be any
 jar files on the live CD).

Aha! I completely forgot .jar files. The OpenJDK package itself may
become much smaller after this, because of the huge runtime rt.jar.
Must test and benchmark this!

I believe OpenOffice.org is a huge user of Java, so there would be
.jar files on the LiveCD from that too.

 A further 10MB could be saved by recompressing the gz files as lzma.
 At what LZMA compression level? Default (7) or --best (9)?
 --best

I just want to add that blanket recompression of gzip files as lzma
with --best could be harmful, but with small files it's probably OK.
LZMA uses a huge dictionary to do its work, which needs to be
allocated even on the decompressing side, and --best may overrun the
memory of low-end computers on larger files.

 Also, if we want to take replacing deflate with lzma to extremes, we
 could replace the deflate compression in the png files with lzma. A
 command that does this is advpng -z -0 $f && lzma --best $f. I found
 that this could save 18.7MB. However, it may also degrade performance
 slightly, but I doubt it would be too significant on modern CPUs.
 Running unlzma on all 66MB of the .png.lzma files takes:
 real    1m2.666s
 user    0m6.540s
 sys     0m5.610s

 I think the user/sys are the relevant ones, and taking 12s to read
 every png doesn't seem too bad. The main thing is that I doubt that it
 would work out of the box.

 If we use lzma in the squashfs, just deflating them all with advpng -z
 -0 could reduce the liveCD size. Probably wouldn't help the installed
 size though.

Indeed.

 There are a over a dozen different types of file to be tested (and
 there may be more than

Re: More LiveCD space optimizations

2010-10-06 Thread Louis Simard
Hey :)

Thanks for the interest in this optimisation! Unfortunately I wasn't
pushy enough in my thread from May-June and it wasn't included in the
Maverick LiveCD. A pending question is what to do to include the
recompressed files into the archive's packages [1].

2010-10-06 16:08 GMT John McCabe-Dansted gma...@gmail.com:
 In May, Louis Simard proposed re-encoding PNG files and SVG files to
 reduce their size [Quoted 1]. I note that we can save further space by:

 1) Using advdef on the png files in addition to optipng. This is what
 optimizegraphics does, and this shrinks the pngs on the Maverick RC
 liveCD from about 100.1MB to 85.3MB providing a saving of 14.8MB.

So it does; I didn't know about that. The man page for advpng warns
that it is only supported for AdvanceMAME-generated PNG files, so I
was skeptical, but it does shave off about 4% more file size on
average with 'advpng -z4'.

 2) Recompressing gz files with advdef. Using advdef, we can shrink the
 gz files from 89.5MB to 84.8MB, and provides a saving of 4.7MB.

That's an interesting optimisation; I didn't really know about it
either. However, I did use 7zip's Deflate compressor to recompress a
.zip file of OpenOffice.org's from 5.9 MB to 5.4 MB. The method was
rather crude, but it did the job:

mkdir extracted
cd extracted
unzip ../file.zip
7z a -tzip -mx=9 -mfb=258 ../file.repack.zip *
cd ..
rm -r extracted

 3) Recompressing jpeg files with jpegrescan. This only saves 0.5MB,
 but implementing this would add just a couple more lines of code, and
 jpegrescan does not lose any picture quality [Quoted 2].

jpegoptim indeed performs lossless optimisation of JPEG files by
editing Huffman tables, and it's used as the basis of jpegrescan.
However, jpegoptim doesn't make non-progressive files progressive, as
I understand jpegrescan does. This may make jpegoptim's optimisations
more transparent to applications that, for some reason, can't decode
progressive JPEGs and thus have non-progressive JPEGs in their
packages. However, most applications should be using libjpeg anyway,
so perhaps this point is moot.


 Together these should shrink the liveCD by over 20MB. This is without
 even considering the .xml and .svg optimizations Louis proposed.

 A further 10MB could be saved by recompressing the gz files as lzma.

At what LZMA compression level? Default (7) or --best (9)?

 This seems reasonable as lzma has reasonable decompression times (e.g.
 7ms to decompress a largish manpage like lsof).

7 ms? What's your CPU? :)

 Since the liveCD is
 compressed anyway, it seems that if a file is compressed with gzip, it
 is worth compressing with lzma.  The command man already seems to
 have lzma support, but we'd want to test each application to ensure
 that it functions correctly when its .gz files are replaced with lzma
 files. We could also selectively recompress the gz files, as some .gz
 files are actually smaller (by about 40 bytes) than the corresponding
 lzma file.

I hadn't considered this type of transcoding for the LiveCD. We may
want to test for ourselves which programs accept .lzma files in their
directories in addition to .gz. Will you do it, shall I, or shall we
both do it? Also, is anyone else interested?

Your point about files being compressed anyway is kind of interesting:
both Deflate and LZMA recompress very poorly, so saving bytes by
switching from one to the other makes sense. That would not shrink the
*installed size* of these man pages much, though, because of default 4
KB blocks for ext[2-4].
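
To illustrate the block-size point: a rough sketch, assuming the
filesystem simply rounds every file up to whole blocks (it ignores
anything cleverer a filesystem might do). The function name
on_disk_bytes is made up for this example.

```shell
# on_disk_bytes SIZE [BLOCK]: bytes a file of SIZE bytes occupies when
# space is allocated in whole blocks. BLOCK defaults to 4096.
on_disk_bytes() {
    size=$1
    block=${2:-4096}
    # Integer ceiling of size/block, multiplied back into bytes.
    echo $(( (size + block - 1) / block * block ))
}
```

With hypothetical sizes, a man page that shrinks from 5,000 bytes as
.gz to 4,200 bytes as .lzma occupies 8,192 bytes on disk either way;
the installed size only drops when a file crosses a 4 KB boundary.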


 Given that recoding SVG files can save 7MB [Quoted 1], simply recoding files
 could free up 37MB for the Natty LiveCD (and presumably also reduce
 the average size of debs in the repos by about 5%).

 [Quoted 1] 
 http://www.mail-archive.com/ubuntu-devel-discuss@lists.ubuntu.com/msg11337.html
 [Quoted 2] http://news.ycombinator.com/item?id=803839

 I attach the script I used to check how much space would be saved.
 This is purely for reproduction of my results, it is not integrated
 into Louis's script.

Do you want me to add to my script any of the optimisations discussed
in your email? They are: Using AdvanceCOMP to recompress .png images
and gzipped files; using either of jpegoptim or jpegrescan to
losslessly recompress .jpg images; transcoding man pages from .gz to
.lzma. I'm not going to add untested optimisations yet, such as
transcoding *all* .gz files to .lzma.

I'm still very interested in this, despite the lack of posting about
the subject in the last 4 months! I've just been waiting for the guys
at Debian to advise me on how to best integrate these optimisations
into packages. Perhaps I should just devise a set of suitable
build-depends additions (optipng, advancecomp, jpegoptim) and makefile
rules for .png/.jpg/.gz, then file a single bug report on all of the
packages that would benefit the most from optimisations? That way,
package maintainers could opt in rather easily.

- Louis

[1] https://lists.ubuntu.com

Re: Apache2 in default Ubuntu install

2010-08-13 Thread Louis Simard
At 2010-08-13 08:39 GMT, Joshua Timberman jos...@opscode.com wrote:
 Hello!

 On Aug 13, 2010, at 12:37 AM, Micah Gersten wrote:

 Because sensible defaults are necessary.  You get your choice of Apache
 or something else.  If you selected another httpd on install and php5
 dragged in apache, that might qualify as a bug.  If you selected
 nothing, well you get the sensible default which is Apache.


 Why not have the depends something like:

 Depends: ... apache2 | lighttpd | nginx | otherhttpserverphpworkswith ...

 So that if one of the others is installed, php5 doesn't try to install 
 apache2?

 --
 Opscode, Inc
 Joshua Timberman, Technical Evangelist
 C: 720.334.RUBY E: jos...@opscode.com



Or, perhaps more future-proof, have php5-cgi Depend on a new virtual
package called http-server or something, and make
apache/nginx/lighttpd/... Provide http-server.

However, I don't know how one would go about making the php5-cgi
package install the proper library (like libapache2-mod-php5) for the
specific httpd that's actually installed, upon installing php5-cgi
itself... except via Suggests/Recommends + the user opting to install
the proper one afterwards :)



Re: Where to find this

2010-07-20 Thread Louis Simard
At 2010-07-20 14:19 GMT, aakash pandey ap2373...@gmail.com wrote:
 I need a non brown ambiance theme with full black panels .

This question looks completely off-topic for the ubuntu-devel list.
You should ask it on ubuntu-users instead.

Regards,
Louis



Re: LiveCD optimisations

2010-05-22 Thread Louis Simard
At 2010-05-21 04:41 GMT, Martin Owens docto...@gmail.com wrote:
 Are there no more things that could be optimised? For instance does
 using xmllint with --noblanks on the 12496 xml files save any space?

My testing with XML files is done now, and here are the results! (And
the modified script, attached to this email)

'xmllint --noblanks --nsclean FILE' gives savings of 3 MB
(pre-squashfs). It actually *enlarges* some files containing non-7-bit
characters, such as gconf-tree in French (by a bit, due to accented
chars), Greek and Japanese (by a lot, due to every single text-node
character being entity'd).

'xmllint --noblanks --nsclean --encode utf-8 FILE' gives savings of 10
MB. It shrinks even the French, Greek and Japanese files.

On the squashfs'd side, this gives modest savings of 0.79 MB.
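
Since the first invocation above enlarged some files, one possible
guard is to keep xmllint's output only when it is non-empty and
strictly smaller than the original. This is a sketch, not what the
attached script does; the function names are made up. It checks size
only and does nothing about applications that dislike the rewritten
XML.

```shell
# replace_if_smaller ORIGINAL CANDIDATE: move CANDIDATE over ORIGINAL only
# when CANDIDATE is non-empty and strictly smaller; otherwise delete
# CANDIDATE and return 1.
replace_if_smaller() {
    if [ -s "$2" ] && [ "$(wc -c < "$2")" -lt "$(wc -c < "$1")" ]; then
        mv -- "$2" "$1"
    else
        rm -f -- "$2"
        return 1
    fi
}

# shrink_xml FILE: try the UTF-8 xmllint invocation from this message.
shrink_xml() {
    xmllint --noblanks --nsclean --encode utf-8 "$1" > "$1.opt" 2>/dev/null
    replace_if_smaller "$1" "$1.opt"
}
```

Note that mv gives the file the ownership and permissions of the
candidate, so a real script would have to restore those.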

HOWEVER: The optimisations made card games (Klondike etc.) unplayable,
as no cards appear, due to the change in
/usr/share/gnome-games-common/cards/gnomangelo_bitmap.svg. Gbrainy
started crashing when a new game of verbal analogies was started, due
to xmllint's addition of an <?xml?> tag in
/usr/share/games/gbrainy/verbal_analogies.xml. Nautilus lost its
toolbars, icons and right-click menu. The help viewer (System / Help
and Support) complains that every file is not a well-formed XML
document. So perhaps XML optimisations aren't so good? :(

1379 HTML files could be optimised too, but they might get hopelessly
mangled by xmllint - is there a utility for that?

136 JPEG files... well, those are lossy :)

379 GIF files... some are in HTML docs and could be replaced with
PNGs, but that can't be done automatically, and so the Ubuntu Doc team
would have to get involved. (the images are so small it's probably not
worth it, except to get away from the LZW patent...) There are
spinner/throbber animations in .gif format in some packages (Gwibber,
Rhythmbox), as well as animated clipart in OpenOffice, which probably
can't get replaced.

1 TIFF image: 
/usr/share/app-install/icons/_usr_lib_GNUstep_Applications_GTAMSAnalyzer.app_Resources_largeApp.tif
- This could become a PNG too.


ubuntu-optimisations.sh
Description: Bourne shell script


Re: LiveCD optimisations

2010-05-22 Thread Louis Simard
At 2010-05-22 09:06 GMT, Didier Roche didro...@ubuntu.com wrote:
 (is 0.79 MB containing the whole optimization or just the xml one?)
 [snip]
 Not sure it's worth the risk if the real size gain in the iso is only
 0.79 MB.

0.79 MB (on the squashfs) is for XML files only, and is HIGHLY
error-prone due to applications either not recognising whitespace
properly, not recognising the <?xml?> declaration, not recognising the
encoding, parsing using a homegrown parser or regex that relies on
indentation/pretty-printing, or just being finicky.

But the Ubuntu LiveCD still stands to gain an easy 11 MB, even without
that 0.79 MB. :)

5 MB (on the squashfs) for PNG images: completely safe conversion,
because OptiPNG doesn't have bad bugs.

6 MB (on the squashfs) for SVG images: completely safe conversion
except for the card games, which need ID names to separate the cards
from the one SVG file. All application icons and user interface button
icons etc. are okay.

Regards,
- Louis



Re: LiveCD optimisations

2010-05-22 Thread Louis Simard
At 2010-05-22 10:59 GMT, Dmitrijs Ledkovs dmitrij.led...@ubuntu.com wrote:
 Is it due to them using GMarkup instead of libxml to parse XML's?

 If yes, it's a bug in glib then =) It would be cool to compress XMLs as
 much as possible. After all, people should be getting the source
 packages to edit those, and apps should parse XMLs just fine without
 spaces and with/without the <?xml?> tag.

I think optimisations on the XMLs should be called off for now. All of
the XMLs in /usr/share/gnome/help are getting their ENTITY
declarations duplicated from /usr/share/gnome/help/libs/global.ent
with xmllint... this is probably the cause of the help viewer not
working. As for applications using GMarkup, that's entirely possible;
the dependency list for Gbrainy, for instance, has libglib2.0-cil and
libmono-system2.0-cil which probably implement Mono's System.Xml
namespace. However, Yelp depends on docbook-xml and libxml2.

I can confirm that my XML optimisations were what broke Gbrainy and
Nautilus: both work again when these files are excluded from the script:

/usr/share/branding/gnome-games-common/cards/gnomangelo_bitmap.svg
/usr/share/nautilus/ (recursive) *.xml
/usr/share/games/gbrainy/verbal_analogies.xml

The Scour optimisation should also be called off, but only for the cards.



Re: LiveCD optimisations

2010-05-21 Thread Louis Simard
At 2010-05-21 04:41 GMT, Martin Owens docto...@gmail.com wrote:
 Hey Louis,

Hey Martin, thanks for the reply!

 Sounds great and looks like a pretty good script, I have some comments:

 You may be able to make it a little faster by using the find results in
 one line like this:

 find / -type f -name '*.svg' -print0 | xargs -0 -I FILE sh -c
 '/tmp/scour/scour.py --enable-id-stripping --indent=none -i FILE -o
 FILE-opt && test -s FILE-opt && mv FILE-opt FILE || rm FILE-opt'

I had considered using sh -c to execute the Scouring and renaming,
yes, but didn't know how to go about detecting empty files except with
another 'find'. Thanks for telling me about test -s :)
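
The one-liner quoted above could also be written as a small function,
which avoids pasting file names into an sh -c string. This is a
sketch; the function names and the SVG_OPTIMISER hook are made up
here, and the scour.py path and options are simply the ones from the
quoted command.

```shell
# scour_wrapper IN OUT: run Scour with the options from the quoted command.
scour_wrapper() {
    /tmp/scour/scour.py --enable-id-stripping --indent=none -i "$1" -o "$2"
}

# scour_in_place FILE...: optimise each file and keep the result only if the
# optimiser succeeded and wrote a non-empty file (test -s).
# SVG_OPTIMISER is a hypothetical hook naming a command that takes IN OUT.
scour_in_place() {
    for f in "$@"; do
        if "${SVG_OPTIMISER:-scour_wrapper}" "$f" "$f-opt" && test -s "$f-opt"; then
            mv -- "$f-opt" "$f"
        else
            rm -f -- "$f-opt"
        fi
    done
}
```

Saved as a script whose last line is 'scour_in_place "$@"', it could
be driven by a single find, for example with
-name '*.svg' -exec ./script {} +.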

 Although if you can get all that into a script file, so much the better
 so it's not all on one line. But at least it's not doing a find 3 times
 for the same files.

True. This is a case of optimising the optimiser, which I consider a
micro-optimisation because the later invocations of 'find' are highly
likely to have the needed disk blocks in RAM - but every little bit
helps, just like with these image files. (Speaking of which, Scour.py
imports the Psyco JIT if it's available, but it doesn't help that
much. It makes the Python code itself run faster, yes, but at the cost
of greater startup time for each Scour.py instance, and most files are
optimised in 0.06 second anyway.)

 Do you need to chroot into the file system to perform these steps,
 considering that you're downloading code to do it (with bzr, which isn't
 installed on the CD)? Would it not be good to perform these steps
 outside of the squashfs and iso file system?

 For instance I got resolve issues when it tried to do the apt update.

I probably don't. That was part of a script that allowed me to
customise more things, such as updating packages (which I needed to
chroot for), removing the desktop background, updating Linux and all
that; I just trimmed it down for this email. I'll move the chroot
processing to the host.

 Are there no more things that could be optimised? For instance does
 using xmllint with --noblanks on the 12496 xml files save any space?

Will test this shortly. I hadn't thought of that yet, and I'm
flabbergasted by the number of XML files! Seeing as SVG files are also
XML files, and Scour.py seems to pretty-print XML even with
--indent=none, that might save even more, actually.

 Finally... should some of these optimisations work their way upstream so
 all packages have optimised files, smaller downloads, smarter mirror
 storage etc?

Of course! :) Working with upstreams would avoid keeping debdiffs
around for the optimised files in Ubuntu repositories, and will help
other distributions too.

I'll attach a modified script to my next email with more testing
results regarding XML.

Regards,
- Louis



LiveCD optimisations explained

2010-05-21 Thread Louis Simard
At 2010-05-21 14:48 GMT, Phillip Susi ps...@cfl.rr.com wrote:
 When attaching scripts please make sure they are attached with an inline
 disposition so they are readily reviewable while reading the email
 instead of having to save them and open them in another text editor.

Err... While I know what you want me to do (you want
Content-Disposition: inline), I don't know how to do that in the Gmail
web interface. Perhaps I'll set up Mozilla Thunderbird, if it can do
that :-)

 [C]ould you explain a bit what you mean by optimizations?  You can
 of course, use a higher lossy compression on the png images, but that
 lowers their quality, which I think is not a desirable tradeoff.

The optimisations I describe would be completely lossless, barring
bugs in the software used to carry out these optimisations.

- For PNG: the data used to store some images on the CD is not
compressed to the highest level. OptiPNG takes those files and tries
to recompress them to the highest level, while ensuring that every
pixel's color value ends up being the same.

- For SVG: the data used to store ALL images on the CD is not optimal
for rendering purposes. Inkscape metadata, Sodipodi metadata, ID names
for elements that end up unused, gradients defined dozens of times,
etc., are bloating the files. Scour.py takes those files and removes
this bloat, while ensuring that the new versions render identically to
the original. However, since Inkscape's metadata ends up removed, it
could be more difficult for users to open these new files in Inkscape.

- For XML, as described by Martin Owens: xmllint would remove
everything superfluous from all files on the CD, while ensuring that
the data is parsed identically. I haven't tested this yet except on
one file from the CD (squashfs -
/var/lib/gconf/defaults/%gconf-tree.xml), but that file went from
2,095,034 bytes to 1,779,376 (a savings of 315,658). There's more hope
yet.

Regards,
- Louis



LiveCD optimisations

2010-05-20 Thread Louis Simard
Greetings ubuntu-devel-discuss :)

I have a proposal for you, and I'll present it simply with the 5 W's.

-- WHAT? --

Optimise the PNG images and SVG files on the Ubuntu LiveCD.
Optimise the Ubuntu LiveCD by putting start-up files and programs near
the end of the CD.

-- WHY? --

Optimising the PNG images saves 5.5 MB on the filesystem.squashfs.
Optimising the SVG files saves an additional 7 MB. This is a total of
12.5 MB which could be used to pack more software or another language
pack or two onto the LiveCD.
Optimising the CD to put files at the end allows it to boot marginally
faster (about 10 seconds on my benchmarks), start applications faster,
and allows the CD drive on a user's computer to run quieter while
using his/her applications, as reading near the end (edge of the disc)
requires slower spinning.
These changes will give prospective users a better view of Ubuntu
right from the LiveCD. There might also be additional benefits to
having smaller PNG and SVG images, such as saving space on a user's
hard disk when installed. The uncompressed (pre-squashfs) savings for
the SVG images is 18 MB.

-- WHEN? --

Now! :) Just kidding. As soon as possible would be nice. Maybe even
for the next Ubuntu version, codename Maverick Meerkat!

-- WHO? --

Ubuntu developers. But don't go thinking that you'll do all the grunt
work of testing these optimisations for yourself! (See HOW? below)

-- HOW? --

Attached to this email is a bash script I've made to perform all of
these optimisations on any Canonical-supported Ubuntu 10.04
LiveCD image, almost automatically. (After optimisations are done, you
can check the state of the LiveCD in a bash shell from within it. The
rest is fully automatic.)

The real savings would come from optimising the PNG and SVG images
right in the packages themselves, not just the LiveCD. Given a
directory containing PNG and SVG images, the part of my script dealing
with OptiPNG and Scour.py can automatically optimise these files. The
best candidate for such a Scouring would be ubuntu-docs, as it has
tons of PNG images. Most application packages have an SVG icon or two
as well.

Thanks for your time!
- Louis


ubuntu-optimisations.sh
Description: Bourne shell script
-- 
Ubuntu-devel-discuss mailing list
Ubuntu-devel-discuss@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel-discuss