On 03/01/2013 01:10:36 AM, Isaac Dunham wrote:
I'm looking into adding an unxz based on xz-embedded, which is public domain.

Cool!

I noticed this recently (due to the busybox thread about it) and was pondering the same myself. I downloaded the git repo but am not going to have time to look at it any time soon, so I'm happy somebody else is taking a look at it. :)

However, I'm wondering about some things.
Basically, I get the impression that some (most? all?) of the compile-time options may not be reasonable.

Toybox's primary design goal is simplicity. Complexity is a limited resource that we spend on implementing features, increasing speed, and reducing size, but everything we do has to be worth the complexity cost.

1) xz allows several filters to improve compression of executables (BCJ filters). Should all of these be turned on unconditionally, or should it be user-selectable? The native BCJ filter for each arch is probably necessary for compatibility reasons, but I'm wondering about alternative ones (e.g., should we enable sparc BCJ filters everywhere?)

On kernel.org there are tar.gz files, tar.bz2 files, and tar.xz files. Our decompressor has to handle all of that.
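
As a rough illustration of what "handle all of that" could look like at the front end, here is a minimal sketch that tells the three formats apart by their leading magic bytes. The enum and function names are made up for this example (they are not toybox or xz-embedded identifiers); only the magic values themselves come from the gzip, bzip2, and xz formats.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration only. */
enum fmt { FMT_UNKNOWN, FMT_GZIP, FMT_BZIP2, FMT_XZ };

/* Guess the compression format from the first bytes of the input. */
enum fmt sniff_format(const unsigned char *buf, size_t len)
{
  /* xz streams start with the six bytes FD 37 7A 58 5A 00. */
  static const unsigned char xz_magic[6] = {0xFD, '7', 'z', 'X', 'Z', 0x00};

  if (len >= 6 && !memcmp(buf, xz_magic, 6)) return FMT_XZ;
  /* bzip2 streams start with "BZh" followed by a block size digit. */
  if (len >= 3 && !memcmp(buf, "BZh", 3)) return FMT_BZIP2;
  /* gzip members start with 1F 8B. */
  if (len >= 2 && buf[0] == 0x1F && buf[1] == 0x8B) return FMT_GZIP;
  return FMT_UNKNOWN;
}
```

A caller would read the first few bytes, call sniff_format(), and dispatch to the matching decompressor, falling back to treating the data as uncompressed when it gets FMT_UNKNOWN.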

On the compression side, we've got a quick streaming compressor already (gzip) which gets the low-hanging fruit of compression and is going to be faster than anything else (fits in L1 cache a lot of the time), so I believe the main advantage of xz is _better_ compression? (Correct me if I'm wrong here, I don't use it much...)

I agree that 8 gazillion knobs isn't really what toybox is good at.

2) I assume that CRC64 support should be unconditional. Upstream recently added CRC64, but it's optional there.

Compatibility with existing and future data files is the important thing.
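
For a sense of how little code unconditional CRC64 costs, here is a minimal bitwise sketch of the CRC64 variant used by .xz files (reflected ECMA-182 polynomial, initial value and final XOR of all ones). The function name crc64_xz is made up for this example, and this is not the xz-embedded implementation; a real decoder would more likely use a lookup table for speed.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC64 as used by the xz format: reflected ECMA-182 polynomial,
 * initial value and final XOR both all ones. Hypothetical name. */
uint64_t crc64_xz(const unsigned char *buf, size_t len)
{
  uint64_t crc = ~(uint64_t)0;

  while (len--) {
    crc ^= *buf++;
    /* Process one bit at a time: shift right, and XOR in the polynomial
     * whenever the bit shifted out was set. */
    for (int i = 0; i < 8; i++)
      crc = (crc >> 1) ^ (0xC96C5795D7870F42ULL & -(crc & 1));
  }
  return ~crc;
}
```

The standard check input "123456789" should give 0x995DC9BBDF1939FA for this CRC variant, which makes a handy self-test.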

3) Should unsupported integrity checks be ignored, cause an error, or should this be a compile-time option?

On the compressor side or on the decompressor side?

On the decompressor side I'd probably just ignore them. We're going to have at least crc32, right? And then tar will internally have some basic "this is not a valid tar file" check...

I'm assuming that even if we can't check, we should still decompress.

Doing the best we can to work with the input we're given, yes.
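
A minimal sketch of the "ignore what we can't verify" policy, assuming CRC32 and CRC64 are the only checks compiled in. Names are hypothetical. The check IDs (0x00 none, 0x01 CRC32, 0x04 CRC64, 0x0A SHA-256) and the per-ID field sizes follow the .xz file format; skipping an unknown check still requires knowing how many bytes it occupies after each block.

```c
/* Hypothetical names for illustration only. */
enum check_action { CHECK_VERIFY, CHECK_SKIP };

/* Size in bytes of the check field for a given check ID (0..15):
 * ID 0 has no field, IDs 1-3 use 4 bytes, 4-6 use 8, 7-9 use 16,
 * 10-12 use 32, and 13-15 use 64. */
unsigned check_field_size(unsigned id)
{
  return id ? 4u << ((id - 1) / 3) : 0;
}

/* Verify the checks we implement (assumed here: CRC32 and CRC64);
 * step over everything else and keep decompressing. */
enum check_action check_policy(unsigned id)
{
  return (id == 0x01 || id == 0x04) ? CHECK_VERIFY : CHECK_SKIP;
}
```

With this policy a SHA-256 protected file still decompresses: the decoder reads check_field_size(0x0A) bytes after each block and discards them instead of erroring out.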

Also, (assuming that at least one of the above should be configurable) should the xz library part be configurable separately from the unxz command? This is mainly relevant if you plan to use it to decompress for tar et al.

Hmmm... that's the kind of thing we can clean up later (don't have to decide right now). Just do the xz command(s) and I'll wire it up to tar when I get around to doing tar. :)

(It's quite possible the right thing for tar is to just shell out to xz from the $PATH and pipe stuff through an external command, and if that command is internal then fork() and xexec() will do the right thing anyway. The reason this is the right thing is both simplicity of implementation and because SMP is pretty ubiquitous these days and two processes are SMP-friendly. If somebody wants to wire this into a u-boot with no scheduler, they can do it themselves.)
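
The pipe-through-a-child idea can be sketched with plain POSIX calls as below. This is an illustration, not toybox code: spawn_filter is a made-up name, execvp() stands in for toybox's xexec(), and it assumes stdin and stdout are open in the parent so the pipe descriptors land above 1.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] from $PATH with its stdin reading from in_fd.
 * Returns a descriptor the parent can read the child's stdout from,
 * or -1 on failure. The child's pid is stored in *pid. */
int spawn_filter(char *const argv[], int in_fd, pid_t *pid)
{
  int out[2];

  if (pipe(out)) return -1;
  *pid = fork();
  if (*pid < 0) {
    close(out[0]);
    close(out[1]);
    return -1;
  }
  if (!*pid) {
    /* Child: wire up stdin/stdout, drop the extra descriptors, exec. */
    dup2(in_fd, 0);
    dup2(out[1], 1);
    if (in_fd != 0) close(in_fd);
    close(out[0]);
    close(out[1]);
    execvp(argv[0], argv);
    _exit(127);
  }
  /* Parent: keep only the read end of the pipe. */
  close(out[1]);
  return out[0];
}
```

A tar implementation along these lines would open the archive, call spawn_filter() with something like {"xz", "-dc", 0}, read the tar stream from the returned descriptor, and waitpid() on the child when done. The decompressor and tar then run as two processes, which is where the SMP friendliness comes from.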

Is there a way to conditionally compile code in lib/?

Not yet. In theory the gc-sections stuff is dropping out unused code, so it gets built but not included into the final binary.
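
To make the gc-sections point concrete, here is a small sketch. The file and function names are hypothetical, and the build lines in the comment use the usual GCC/binutils flags rather than toybox's actual build script. With each function placed in its own section, the linker can discard the one nothing references.

```c
/* Build sketch (file names are hypothetical):
 *   cc -ffunction-sections -fdata-sections -c lib_demo.c
 *   cc -Wl,--gc-sections main.o lib_demo.o -o demo
 *
 * -ffunction-sections gives each function its own .text.<name> section;
 * --gc-sections lets the linker drop sections nothing refers to.
 */

/* Referenced by the program, so its section is kept. */
int used_helper(int x)
{
  return x * 2;
}

/* Compiled, but if no caller exists anywhere in the link its section
 * can be discarded from the final binary. */
int unused_helper(int x)
{
  return x + 1000;
}
```

Both functions still get compiled every time, which is the build-time cost mentioned above; only the size of the final binary benefits.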

In practice, I probably need to redo the build system because the gcc guys decided that their compiler was just too horrible to make build-at-once mode actually work, so they save the intermediate parse results into special ELF sections and then unload the actual code generation onto the linker, which is called link time optimization and is a horrible solution. So the "cc *.c" approach I've been doing doesn't take advantage of SMP and won't because the gcc developers are incompetent, and I need to work around them (or see if llvm is better).

So for right now, don't worry about it. Just add the file to lib and if the build gets uncomfortably slow I'll improve it later.

Rob
_______________________________________________
Toybox mailing list
[email protected]
http://lists.landley.net/listinfo.cgi/toybox-landley.net