Now let's look at traditional encoding. (Same bat-file, same bat-changeset. Oh _wow_ I'm dating myself with that reference, although in my defense I only saw it in reruns.)

uu_3bytes() does the same "shift input into an int, output encoded 6 bit character" but the encoding is just add 32 to the value. (That's the space character, and the next 63 characters after that are all printable, so...) Just a really quick cleanup pass on this function: remove curly brackets around single lines, and replace the assignment into out[] with an xputc().
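To make the "add 32" encoding concrete, here's a minimal sketch. It is not the actual toybox function: the name sketch_uu_3bytes() and the choice to write into a caller-supplied buffer (instead of calling xputc()) are assumptions, made so the result is easy to inspect.

```c
#include <stddef.h>

// Hypothetical helper, not the toybox code: pack up to 3 input bytes
// into an int, then emit four characters, each a 6-bit value plus 32.
static void sketch_uu_3bytes(const unsigned char *in, size_t len, char *out)
{
  unsigned x = 0;
  int i;

  // Shift in only the bytes we were given; missing ones stay zero.
  for (i = 0; i < 3; i++) {
    x <<= 8;
    if ((size_t)i < len) x |= in[i];
  }

  // Emit the four 6-bit groups, most significant first.
  for (i = 0; i < 4; i++) out[i] = (char)(32 + ((x >> (18 - 6*i)) & 0x3f));
}
```

Feeding it the three bytes "Cat" should produce the four characters "0V%T". (Some uuencode implementations emit a backtick instead of a space for a zero value; this sketch sticks with the plain add-32 described above.)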

uu_line(): take out the special case to print something for a length 0 line (the standard doesn't require it). Instead wrap the whole thing in if (len > 0) like b64_line() does. The big xprintf() went away because uu_3bytes() is outputting stuff itself.

Traditional uuencoded lines start with the length of the line, so the tuples are always 3 bytes encoded as 4 characters, and that initial length tells you when to ignore bits of the end. That goes inside the if() statement, along with a simple for() loop calling uu_3bytes() on every 3 bytes of input until we're done. (We don't care about falling off the end because we assume the input buffer is big enough, and whatever trailing garbage potentially winds up in those last couple bytes won't get decoded at the far end due to the length saying not to.)

Hmmm, although now that I think about it this implies that encoding the same file twice could produce slightly different results, even though they decode to the same thing. But this would only actually happen if a signal handler called us and crapped out the stack non-deterministically, and modern linux actually has a separate signal stack, so it can't actually happen. Otherwise buf[] contains either the previous line or zero, deterministically. (That's black magic enough I'm tempted to throw a memset() in there, but it's not worth the extra code.)

Oh wait, we pass along len all the way to uu_3bytes(), so it'll only shift/load the bytes we give it into the integer the output is produced from. That integer is zero initialized, so the rest of it stays zeroed. So never mind, already handled. :)
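Here's a rough sketch of how the length prefix, the 3-byte loop, and the zero padding fit together. This is not the toybox source: sketch_uu_line() is a hypothetical name, it writes to a buffer rather than stdout, and the trailing newline is an assumption about what the real function emits.

```c
#include <stddef.h>

// Hypothetical stand-in for uu_line(): encode len bytes as one line.
// out needs room for 1 + 4*((len+2)/3) + 2 chars. Returns chars written.
static size_t sketch_uu_line(const unsigned char *in, size_t len, char *out)
{
  size_t i, n = 0;
  int j;

  if (len > 0) {
    // The line starts with its decoded length, encoded the same way.
    out[n++] = (char)(32 + len);

    for (i = 0; i < len; i += 3) {
      unsigned x = 0;

      // Only load bytes inside len, so stale buffer contents past
      // the end never reach the output.
      for (j = 0; j < 3; j++) {
        x <<= 8;
        if (i + j < len) x |= in[i + j];
      }
      for (j = 0; j < 4; j++)
        out[n++] = (char)(32 + ((x >> (18 - 6*j)) & 0x3f));
    }
    out[n++] = '\n';
  }
  out[n] = 0;

  return n;
}
```

With this shape, two buffers that agree on the first len bytes and differ afterwards encode to identical lines, which is the determinism worry from above going away.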

Where was I?

uuencode_uu() does the same inbuf/outbuf setup using toybuf, and again the output buffer went away and we can replace both with a single input buffer on the stack. The size of the xread was 45 bytes, and the spec says:

The maximum number of octets to be encoded on each line shall be 45.

So that's actually already correct. Adjust the whitespace. (I tend to do spaces after commas, after statements that aren't function names, like if () vs func(), and around assignment characters. Habit I picked up somewhere; it's more important to be consistent than right with whitespace. I sometimes cheat and remove spaces to fit in 80 chars, but space after commas is less important than space after if or before curly brackets...)
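Going back to the 45 byte limit for a moment, the arithmetic works out neatly: 45 bytes is 15 groups of 3, so 60 encoded characters plus the length character, and 45 + 32 is 77, which is 'M'. A tiny sketch of that math (both function names are made up for illustration, they're not in toybox):

```c
// Hypothetical helpers for illustration only.

// Characters on one encoded line, not counting the newline:
// one length character plus four characters per 3-byte group.
static int sketch_uu_line_chars(int len)
{
  return len > 0 ? 1 + 4*((len + 2)/3) : 0;
}

// The length character is just the byte count plus 32.
static char sketch_uu_len_char(int len)
{
  return (char)(32 + len);
}
```

That's why every full line in a traditional uuencoded file starts with 'M' and is 61 characters long.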

Finally we get to uuencode_main():

The variable declarations got redone based on the needs of the code in the function, so let's skip that except to note that I renamed encode_filename to name because it was an unnecessarily long local variable name. (A three line function needs less descriptive names than a twenty line function. I try not to use a name like "k" if the scope it lives in is longer than 10 lines or so, but I do note that "i" as a loop index is tradithional! (Lightning strike! Yeah, discworld reference.))

So: toybuf gets filled with a base64 table via a loop. (I could do that only for -m but didn't bother: compared to the exec, a for loop initializing a 64-entry table with code that fits in a cache line is trivial.)
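For reference, filling a base64 table with a loop can look something like the sketch below. This isn't necessarily how the toybox code does it: sketch_fill_b64() is a hypothetical name, and it fills a caller-supplied 64 byte array rather than toybuf.

```c
// Hypothetical sketch: fill a 64-entry table with the standard base64
// alphabet (A-Z, a-z, 0-9, '+', '/') instead of using a string constant.
static void sketch_fill_b64(char *table)
{
  int i;

  for (i = 0; i < 26; i++) {
    table[i] = (char)('A' + i);
    table[i + 26] = (char)('a' + i);
  }
  for (i = 0; i < 10; i++) table[i + 52] = (char)('0' + i);
  table[62] = '+';
  table[63] = '/';
}
```

The encoder then just indexes the table with each 6-bit value, where the traditional format would add 32 instead.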

The remaining cleanups are whitespace and changing the name of encode_filename to just name.

And that's pass 1!

Rob