On 12/27/16, Evan Gates <[email protected]> wrote:
> On Tue, Dec 27, 2016 at 5:55 AM, Laslo Hunhold <[email protected]> wrote:
>> well-spotted! Still, it's _very_ counterintuitive to call the flag
>> "-c". Instead of adding a non-portable m-flag, it would even sound
>> better to me to add a b-flag for byte-offsets.
Yes, it's a bit counter-intuitive, but conflicting with POSIX for this
alone seems like a really bad idea. I always consult POSIX when writing
shell scripts to ensure that they will run on any conforming system. If
sbase decided that the option character was not the best choice and
changed its meaning, then reasonable, valid, and portable scripts may
start operating unexpectedly with no indication as to why.

Also, wc(1) (even sbase's implementation) uses -c to refer to bytes and
-m to refer to characters. It wouldn't be self-consistent to make tail
use -b for bytes and -c for characters. (Just to clarify, I also think
it would be a really bad idea to make wc use -b for bytes and -c for
characters.)

>> It all depends on how many scripts rely on this behaviour. Can you give
>> an example?

Sure. gcc's build system uses tail to skip the first 16 bytes of the
binaries to check that stage2 and stage3 are the same. Granted, it does
use the non-standard syntax tail +16c, and I don't know that there are
any bytes in there with the high bit set, but still, tail *does* get
invoked on binary files, and treating the byte offsets as characters
will break things in strange ways that are difficult to debug.

>> I thought cut(1) was the tool of choice for extracting
>> headers and such things.

How do you use cut(1) to strip the first 512 bytes of a binary file? It
operates on lines.

> I think deviating from POSIX here is a bad idea. Every deviation from
> POSIX means that our tools cannot be used in another situation and
> pushes prospective users away. If the user wants characters instead of
> bytes we have tools to do that, don't surprise the user by doing
> something different than every other implementation.
>
> P.S. I too found -c confusing the first time I expected utf8
> characters, but remembering these tools were created with ascii in
> mind, I think of -c as char and it all works out...

Agreed.
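
For reference, here is a minimal sketch of the byte-offset use case discussed above, assuming a POSIX-conforming tail where -c counts bytes. The helper name strip_header and the file names in the usage comment are hypothetical and only for illustration; they are not part of sbase or of gcc's build system.

```shell
# strip_header FILE N
# Print FILE to stdout without its first N bytes.
# Assumes a POSIX tail: "-c +K" starts output at byte K (1-based),
# so skipping N bytes means starting at byte N + 1.
# strip_header is a hypothetical helper name, not part of sbase.
strip_header() {
	tail -c "+$(($2 + 1))" "$1"
}

# Example: drop a 512-byte header from a binary image.
# strip_header disk.img 512 > payload.bin
```

If tail interpreted the -c offset as characters, the cut point would shift whenever the skipped region contained multi-byte sequences, which is the kind of hard-to-debug breakage described above.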
