On Sun, Jan 14, 2018 at 10:29 AM, Rob Landley <[email protected]> wrote: > On 01/08/2018 07:27 PM, enh wrote: >> but, yeah, i should have said so in TODO comments. (i thought i did >> comment the former, but that may have been lost in the toybox >> obfuscated c competition rewrite.) > > I should explain what all those changes are for: > > First hunk: make -12345 switch off the others so it's "last wins" > instead of "lowest wins". > > Second hunk: Move zlib.h #include after #include toys.h because > #includes pull in other #includes and some of them care about #defines > happening first so I generally have local includes after the #include > toybox wherever possible to avoid perturbing the existing include sequence. > > Next hunk: gzerror_exit() became gzerror_msg() so that "gunzip one.gz > two.gz three.gz" could continue after errors because that's what the > other one does. > > I factored out the zlib_inflate() call into its own function so I can > have builtin vs zlib versions as a config option (there's already a > builtin deflate in compress.c that I want this to be able to use, I > should probably ove it into ). > > The other reason is the zlib api it's currently using passes through > uncompressed data instead of erroring out. (I.E. on ubuntu "zcat README" > says "gzip: README: not in gzip format" and ours cats it, which means it > will ALSO cat bunzip2 or xz data rather than detecting you've chosen the > wrong compression format. That's a significant behavioral difference > because it's using a wrapper around inflate(). I want to switch it to > use zlib's inflate so it can detect errors, it's covered in > https://www.zlib.net/zlib_how.html but I haven't done that yet.) > > The changes to do_gunzip() are also half-finished, I checked in what I > had when I ran out of time to work on it. (Sorry.) But here's a few of > the highlights: > > - FILE *out; > + int out_fd=0; > - out = xfdopen(out_fd, "w"); > > The previous code was using a FILE * wrapper around filehandles, I made > it just use filehandles consistently rather than mix the two. > > The old code was using FLAG_c in three places and the new stuff is using > FLAG_c in one place. the old code was doing strdup() to create temporary > copies of all the filenames, but only the output filename was ever > getting modified and only in one codepath. I switched from having a > separate "both_files" variable to having out_name be NULL when we didn't > create an output file. > > The new code tries to take advantage of the fact loopfiles() > automatically knows that when you have no arguments, you should read > from stdin and the name is "-". (The comment about stderr is wrong, my > bad. As I said, unfinished.) > > The old code had a lot of "-" being magic: > > - } else if (!strcmp(arg, "-")) { > - // "-" means stdin; assume output to stdout. > - // TODO: require -f to read compressed data from tty? > - in_name = strdup("-"); > - out_name = strdup("-"); > - } else error_exit("unknown suffix: %s", arg); > - if (toys.optflags&FLAG_c) { > - free(out_name); > - out_name = strdup("-"); > > But since out_name is only really ever used in xcreate() (which doesn't > know that "-" means stdout), and in_name doesn't get modified, this is > all unnecessary. There were xopen() changes last year so it never > returns stdin/stdout/stderr by default (even if the command is run with > those filehandles closed, see the notstdio() function in lib/xwrap.c > which moves the filehandle out of that range), so we can reliably check > the fd and know 0 means stdin and 1 means stdout. So just do that. > > The new code uses -f to read from a tty regardless of whether it's stdin > or not. This is because: > > $ sudo gunzip /dev/ttyS0 > gzip: /dev/ttyS0 is not a directory or a regular file - ignored > > Is something you might conceivably want to do, and if we've already > _got_ a -f that means that, why not have it apply? > > A lot of changes like: > > - } else error_exit("unknown suffix: %s", arg); > - both_files = strcmp(in_name, "-") && strcmp(out_name, "-"); > - if (both_files) xstat(in_name, &sb); > > becoming > > + if (!out_fd) { > + // "gunzip x.gz" will decompress "x.gz" to "x". > + if ((len = strlen(arg))<4 || strcmp(arg+len-3, ".gz")) { > + error_msg("no .gz: %s", arg); > + return; > + } > + if (!stat(arg, &sb)) { > + perror_msg("%s", arg); > + return; > + } > > Are so it can warn and continue instead of error exiting. Yes it's > longer, but it's what the command should do. > > The "unknown suffix: %s" becoming "no .gz" is that design.html bit about > not expecting more than a dozen words of english out of Xiaomi's > userbase, although the old one printing the name it barfed on is good > and I should add that back. "no .gz: %s" > > Ok, next function: I inlined do_gz() into gzip_main() not just because > it's 6 fewer lines but because if the call to loopfiles() performs the > test it happens once instead of happening every argument. > > I could have left the flag modifying like it was but I thought a for > loop from 0 to 9 in the actual level variable, with setting a default > value if it fell off the end, was more straightforward. (If nothing else > it avoids the need for a temporary variable.) The fact I'm then > reversing the result at the end undermines that though. If you really > care I can put it back. > > We're not checking FLAG_c here, that's part of the consolidation of > FLAG_c stuff to the single place it's tested. > > And there's the inlining of the test into loopfiles(). > > I need to retest all of this stuff, but with it using the old gzip api > that gets zcat README wrong, I already have tests fail for the wrong > reasons and will have to completely retest after the conversion anyway. > (And then again when I add the deflate() code from compress.c, which > should probably go into lib/ even though bunzip2 didn't because zip.c > also wants to use it and we're not pulling in bzlib.so, so this _is_ > kind of special. It's probably the only symmetrical compression and > decompression toybox will have, the other two just decompress and that's > probablyf ine.) > > This is also where I looked at the test suite and said "comparing the > gzip output against the gunzip output, we should really have canned > files for it to check against the way bunzip2 does" and that's part of > the same todo item. > > So I need to finish it, and didn't have time then. But I didn't want it > to be yet another "I've done 80% of a coding pass, let this fester in my > tree for 6 months and then get reverted" thing like I've got so many of, > so I went ahead and hecked it in because it's in pending. Sorry I forgot > that Android is using stuff out of pending. My bad for falling this far > behind and _having_ stuff android is using still be in pending.
i think the fix is to have more, smaller commits rather than trying to do everything in one commit. and to always run the tests before committing... > I'm also sorry I didn't have time to do this level of writeup about it > at the time, but my day job has been cutting into my hobby time for > quite a while now. I take my volunteer responsibilities very seriously, > but work still has dibs on my cycles. > > And yes, this command needs a follow-up cleanup pass adding comments and > focusing on "is this bit readable" enough. I try to do that at the time > but you always need to do later passes for 1) readability, 2) > security/bugcheck, 3) functionality and standards compliance, 4) test > coverage. And those passes take place _after_ the code is otherwise > "done" (or at least checkpointed). And this hasn't gotten to that point > yet, which is why it's still in pending. > > Rob _______________________________________________ Toybox mailing list [email protected] http://lists.landley.net/listinfo.cgi/toybox-landley.net
