On Tuesday, February 27th, 2024 at 20:01, Rob Landley <[email protected]> wrote: > On 2/27/24 19:23, Oliver Webb wrote: > > > The below patch does a bunch cleanup on xzcat.c. > > It resolves the ifdefs that were in the code (But everything they were > > checking was defined so they didn't do anything) > > Adds Tests for errors (And while we are changing the test suite, converts > > testing -> testcmd) > > Rearange/Rewrite large comments to be smaller (and sometimes C99!). > > It puts main at the bottom and do_xzcat above it, the way code is usually > > arranged.. > > Removes some function prototypes and the inline keyword (useless in modern > > C since it doesn't force anything) > > Removes "!= 0" (And turns "x == 0" to "!x") > > Some types replaced with their LP64 counterparts (uint8_t to char, etc) > > > > Yes, I know that this is large for a patch, but a lot of that is shuffling > > code/comments. None of the logic has been > > changed. When all is said and done, this patch shrinks xzcat.c by over 10% > > > Applied (out of the list's spam filter, "message body too big"), but you've > introduced a glitch (missing flush maybe). When I tested it against the xz > files > I had lying around via "xzcat $FILE | tar tv", one file failed with your patch > and succeeded before it. I copied it to my web server so you can take a look: > > wget https://landley.net/musl-cross-make.tar.xz > > (Yeah, it's 200 megs. Sorry about that. That was the first one that went > "boing".)
$ ./xzcat musl-cross-make.tar.xz | ./count -l > /dev/null 229834752 bytes, 219Mb, 153Mb/s, 0m01s $ xzcat musl-cross-make.tar.xz | ./count -l > /dev/null 229836800 bytes, 219Mb, 509Mb/s, 0m00s Hmmm, there seems to be missing bytes at the end $ ./xzcat musl-cross-make.tar.xz | xxd > fi 229834752 bytes, 219Mb, 153Mb/s, 0m01s $ xzcat musl-cross-make.tar.xz | xxd > fi2 229836800 bytes, 219Mb, 509Mb/s, 0m00s14364673,14364800d14364672 $ diff fi fi2 < 0db30000: 2022 2469 220a 2020 646f 6e65 0a66 690a "$i". done.fi. < 0db30010: 0000 0000 0000 0000 0000 0000 0000 0000 ................ [zeros until end of file] putting a fflush at the end of reading a file doesn't fix anything, it's losing exactly half a page so not flushing would also be my guess > Also, I note that there is an "upstream" for this code, which may have changed > since I merged this (many moons ago), and any upstream changes are probably > worth a look: > > https://git.tukaani.org/xz-embedded.git > > The one in pending is an old version of that glued together into one file > with a > config header, NEWTOY() and command_main() wrapper. (Yes, the upstream is > public > domain! Extractor only, not compressor. I haven't found a public domain xz > compressor yet.) > > I'm also unclear on the difference between xz and 7zip, and the linux kernel's > "general setup menu" also offers lzma, lzo, lz4, and zstd as initramfs > compression options. I just know kernel.org has tar.xz files, and > https://linuxfromscratch.org/lfs/view/12.0/chapter03/packages.html has .gz, > .bz2, and .xz tarballs, hence those are the three formats toybox can > decompress. > > It should be able to create gz and zip, which is also what > https://github.com/landley/toybox/tags offers to create. (Both of those are > deflate, I know how and wrote one in java once long ago. I just got > sidetracked > on when to do dictionary resets to match the hashes of existing tarballs. The > answer is probably "nuts to your white mice" and just do it every 250k or > something, and oh well the hashes don't match but you get the data back and > that's the important thing. But first I wanted to grab a bunch of existing .gz > tarballs and CHECK when they do dictionary resets, because maybe it IS already > something simple like that.) > > Rob > > P.S. Alas, I don't make a habit of extensively reading incompatibly licensed > code before writing a 0BSD version, so this kind of question can linger a > while > on the todo list when I could theoretically just go track it down in the > source... - Oliver Webb <[email protected]> _______________________________________________ Toybox mailing list [email protected] http://lists.landley.net/listinfo.cgi/toybox-landley.net
