Re: LiveCD optimisations
At 2010-05-21 04:41 GMT, Martin Owens docto...@gmail.com wrote:

> Hey Louis,

Hey Martin, thanks for the reply!

> Sounds great and looks like a pretty good script. I have some comments:
>
> You may be able to make it a little faster by using the find results in one line, like this:
>
> find / -type f -name '*.svg' -print0 | xargs -0 -I FILE sh -c '/tmp/scour/scour.py --enable-id-stripping --indent=none -i FILE -o FILE-opt && test -s FILE-opt && mv FILE-opt FILE || rm FILE-opt'

I had considered using sh -c to execute the Scouring and renaming, yes, but didn't know how to go about detecting empty files except with another 'find'. Thanks for telling me about test -s :)

> Although if you can get all that into a script file, so much the better so it's not all on one line. But at least it's not doing a find 3 times for the same files.

True. This is a case of optimising the optimiser, which I consider a micro-optimisation because the later invocations of 'find' are highly likely to have the needed disk blocks in RAM. But every little bit helps, just like with these image files.

(Speaking of which, Scour.py imports the Psyco JIT if it's available, but it doesn't help that much. It makes the Python code itself run faster, yes, but at the cost of greater startup time for each Scour.py instance, and most files are optimised in 0.06 seconds anyway.)

> Do you need to chroot into the file system to perform these steps? Considering that you're downloading code to do it (with bzr, which isn't installed on the CD), would it not be good to perform these steps outside of the squashfs and ISO file system? For instance I got resolve issues when it tried to do the apt update.

I probably don't. That was part of a script that allowed me to customise more things, such as updating packages (which I needed to chroot for), removing the desktop background, updating Linux and all that; I just trimmed it down for this email. I'll move the chroot processing to the host.

> Are there no more things that could be optimised? For instance, does using xmllint with --noblanks on the 12496 XML files save any space?

Will test this shortly. I hadn't thought of that yet, and I'm flabbergasted by the number of XML files! Seeing as SVG files are also XML files, and Scour.py seems to pretty-print XML even with --indent=none, that might save even more, actually.

> Finally... should some of these optimisations work their way upstream so all packages have optimised files, smaller downloads, smarter mirror storage etc?

Of course! :) Working with upstreams would avoid keeping debdiffs around for the optimised files in Ubuntu repositories, and will help other distributions too.

I'll attach a modified script to my next email with more testing results regarding XML.

Regards,
- Louis

--
Ubuntu-devel-discuss mailing list
Ubuntu-devel-discuss@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel-discuss
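A minimal sketch of the "script file" form Martin suggests, under stated assumptions: the Scour path and flags are the ones quoted in the thread, while the function names (optimise_in_place, scour_svg, optimise_svgs_under) and the "command input output" calling convention are hypothetical, introduced only for illustration. It makes a single find pass and keeps the original whenever the optimiser fails or produces an empty file.

```shell
#!/bin/sh
# Run an optimiser on one file; keep its output only if it is non-empty.
# usage: optimise_in_place FILE COMMAND [ARGS...]
# COMMAND is invoked as: COMMAND [ARGS...] INPUT OUTPUT (hypothetical convention)
optimise_in_place() {
    src=$1
    shift
    tmp="$src-opt"
    if "$@" "$src" "$tmp" && test -s "$tmp"; then
        mv -- "$tmp" "$src"      # non-empty result: replace the original
    else
        rm -f -- "$tmp"          # failure or empty output: keep the original
        return 1
    fi
}

# Adapter giving Scour the (input, output) convention used above.
# Path and flags are taken from the thread.
scour_svg() {
    /tmp/scour/scour.py --enable-id-stripping --indent=none -i "$1" -o "$2"
}

# One find pass over a directory tree.
# Note: reading names line by line breaks on file names containing newlines.
optimise_svgs_under() {
    find "$1" -type f -name '*.svg' -print | while IFS= read -r svg; do
        optimise_in_place "$svg" scour_svg || echo "kept original: $svg" >&2
    done
}
```

Calling optimise_svgs_under on the unpacked squashfs root from the host would also avoid the chroot discussed above; that directory is left to the caller.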
Re: LiveCD optimisations
On 5/20/2010 8:35 PM, Louis Simard wrote:

> Greetings ubuntu-devel-discuss :)
>
> I have a proposal for you, and I'll present it simply with the 5 W's.

[snip]

When attaching scripts, please make sure they are attached with an inline disposition so they are readily reviewable while reading the email, instead of having to save them and open them in another text editor.

Also, could you explain a bit what you mean by optimizations? You can, of course, use a higher lossy compression on the PNG images, but that lowers their quality, which I think is not a desirable tradeoff.
LiveCD optimisations explained
At 2010-05-21 14:48 GMT, Phillip Susi ps...@cfl.rr.com wrote:

> When attaching scripts please make sure they are attached with an inline disposition so they are readily reviewable while reading the email instead of having to save them and open them in another text editor.

Err... While I know what you want me to do (you want Content-Disposition: inline), I don't know how to do that in the Gmail web interface. Perhaps I'll set up Mozilla Thunderbird, if it can do that :-)

> [C]ould you explain a bit what you mean by optimizations? You can of course, use a higher lossy compression on the png images, but that lowers their quality, which I think is not a desirable tradeoff.

The optimisations I describe would be completely lossless, barring bugs in the software used to carry out these optimisations.

- For PNG: the data used to store some images on the CD is not compressed to the highest level. OptiPNG takes those files and tries to recompress them to the highest level, while ensuring that every pixel's colour value ends up being the same.

- For SVG: the data used to store ALL images on the CD is not optimal for rendering purposes. Inkscape metadata, Sodipodi metadata, ID names for elements that end up unused, gradients defined dozens of times, etc., are bloating the files. Scour.py takes those files and removes this bloat, while ensuring that the new versions render identically to the original. However, since Inkscape's metadata ends up removed, it could be more difficult for users to open these new files in Inkscape.

- For XML, as described by Martin Owens: xmllint would remove everything superfluous from all files on the CD, while ensuring that the data is parsed identically. I haven't tested this yet except on one file from the CD (squashfs - /var/lib/gconf/defaults/%gconf-tree.xml), but that file went from 2,095,034 bytes to 1,779,376 (a savings of 315,658).

There's more hope yet.

Regards,
- Louis
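As a rough illustration of the lossless steps listed above, here is a small sketch. The function names (report_savings, strip_xml_blanks, recompress_png) are hypothetical; the optipng and xmllint options used (-o7, -quiet, --noblanks, --output) are standard ones, but whether they match the exact invocation behind the figures in this thread is an assumption. The last line only redoes the arithmetic for the gconf file quoted above.

```shell
#!/bin/sh
# Print the bytes saved and the integer percentage of the original size.
# usage: report_savings BEFORE_BYTES AFTER_BYTES
report_savings() {
    before=$1
    after=$2
    saved=$((before - after))
    echo "$saved bytes saved ($((saved * 100 / before))%)"
}

# Drop ignorable whitespace from one XML file.
# The original is kept if xmllint fails or writes an empty file.
strip_xml_blanks() {
    xmllint --noblanks --output "$1.tmp" "$1" \
        && test -s "$1.tmp" \
        && mv -- "$1.tmp" "$1" \
        || rm -f -- "$1.tmp"
}

# Recompress one PNG in place at OptiPNG's highest optimisation level.
recompress_png() {
    optipng -quiet -o7 "$1"
}

# Figures quoted in the thread for %gconf-tree.xml:
report_savings 2095034 1779376    # prints: 315658 bytes saved (15%)
```

The two tool wrappers are defined but not called, so the sketch runs even where optipng and xmllint are not installed.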
Re: LiveCD optimisations explained
On 5/21/2010 1:40 PM, Louis Simard wrote:

> Err... While I know what you want me to do (you want Content-Disposition: inline), I don't know how to do that in the Gmail web interface. Perhaps I'll set up Mozilla Thunderbird, if it can do that :-)

Heh, yeah, I've struggled with this on Thunderbird too, which is why I usually end up submitting patches via something like mime-construct or some other command-line MIME editor where I can force it to use Content-Disposition: inline.

> - For PNG: the data used to store some images on the CD is not compressed to the highest level. OptiPNG takes those files and tries to recompress them to the highest level, while ensuring that every pixel's color value ends up being the same.

I believe that PNG applies a lossy compression first, then gzips the result. It sounds like you are saying that the gzip is done with -3 instead of -9, so you ungzip it and recompress with -9. Is that more or less correct? If so, that sounds pretty good, but like you mentioned before, it should be done upstream rather than only for the livecd.

> - For SVG: the data used to store ALL images on the CD is not optimal for rendering purposes. Inkscape metadata, Sodipodi metadata, ID names for elements that end up unused, gradients defined dozens of times, etc., are bloating the files. Scour.py takes those files and removes this bloat, while ensuring that the new versions render identically to the original. However, since Inkscape's metadata ends up removed, it could be more difficult for users to open these new files in Inkscape.

Sounds good, and it would also be good to do upstream instead of just for the livecd.

> - For XML, as described by Martin Owens: xmllint would remove everything superfluous from all files on the CD, while ensuring that the data is parsed identically. I haven't tested this yet except on one file from the CD (squashfs - /var/lib/gconf/defaults/%gconf-tree.xml), but that file went from 2,095,034 bytes to 1,779,376 (a savings of 315,658).
>
> There's more hope yet.

I noticed the bloated gconf XML files a few years back myself and brought it up on the devel list. IIRC I saw even more wasted space than you mention here, due to 10, 20, even 30 characters of whitespace indenting each line. This adds a lot of bloat that I don't like to have on an installed system, but once the files are compressed into the squashfs for the livecd, the whitespace drops out, so there wasn't much concern about it. At one point I tried just converting the whitespace into hard tabs and saved quite a bit of space, while preserving the indentation for human editing.
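The point that indentation "drops out" once compressed can be checked with a small experiment. This is only a sketch under assumptions: it uses gzip -9 as a stand-in for the squashfs compressor, and a synthetic file with 30 spaces of indentation per line rather than the real gconf tree; the function names are hypothetical.

```shell
#!/bin/sh
# Emit 500 XML-like lines, each indented with 30 spaces (synthetic sample).
make_sample() {
    i=0
    while [ "$i" -lt 500 ]; do
        printf '%30s<entry name="key%d">%d</entry>\n' '' "$i" "$i"
        i=$((i + 1))
    done
}

# Size of a file in bytes.
size() {
    wc -c < "$1" | tr -d ' '
}

# Print two numbers: the bytes that indentation costs uncompressed,
# and the bytes it costs after gzip -9.
compare_indent_cost() {
    dir=$(mktemp -d)
    make_sample > "$dir/indented.xml"
    sed 's/^ *//' "$dir/indented.xml" > "$dir/flat.xml"   # strip leading spaces
    gzip -9 -c "$dir/indented.xml" > "$dir/indented.xml.gz"
    gzip -9 -c "$dir/flat.xml" > "$dir/flat.xml.gz"
    raw_cost=$(( $(size "$dir/indented.xml") - $(size "$dir/flat.xml") ))
    gz_cost=$(( $(size "$dir/indented.xml.gz") - $(size "$dir/flat.xml.gz") ))
    rm -rf "$dir"
    echo "$raw_cost $gz_cost"
}

compare_indent_cost
```

The first number is the uncompressed cost of the indentation (30 bytes on each of 500 lines); the second is what remains of that cost after compression, which depends on the gzip implementation and is therefore not quoted here.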
Re: LiveCD optimisations
On 21 May 2010 01:35, Louis Simard louis.sim...@gmail.com wrote:

> -- WHAT? --
>
> Optimise the PNG images and SVG files on the Ubuntu LiveCD.
> Optimise the Ubuntu LiveCD by putting start-up files and programs near the end of the CD.

-- Implementation --

1) Should this go into the deb-package mangler run by Soyuz?

2) Or should this be implemented as a debhelper addon / cdbs, carried as a no-op Ubuntu patch and then, if successful (all the quirks are worked out), pushed to Debian?