> tail: Process bytes with -c option, and add -m option for runes
>
> POSIX says that -c specifies a number of bytes, not characters.
> This flag is commonly used by scripts that operate on binary files to
> do things like extract a header. Treating the offsets as character
> offsets will break things in mysterious ways.
>
> Instead, add a -m option (chosen to match `wc -m`, which also
> operates on characters) to handle character offsets.

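To make the distinction in the quoted message concrete, here is a minimal sketch in Python, not the tool's actual code. The helper names `tail_bytes` and `tail_chars` are hypothetical, and the input is assumed to be valid UTF-8. It shows how the same count selects different data depending on whether it is read as bytes (the POSIX meaning of -c) or as characters (the proposed -m):

```python
def tail_bytes(data: bytes, n: int) -> bytes:
    """Return the last n bytes, i.e. the byte-offset reading of -c."""
    return data[-n:] if n > 0 else b""


def tail_chars(data: bytes, n: int) -> str:
    """Return the last n characters, i.e. the character-offset reading (-m).

    Assumption: the input is valid UTF-8; real tools must also cope with
    invalid sequences and other locales.
    """
    text = data.decode("utf-8")
    return text[-n:] if n > 0 else ""


# "caf\u00e9\n" is 5 characters but 6 bytes, because U+00E9 encodes as 2 bytes.
sample = "caf\u00e9\n".encode("utf-8")

print(tail_bytes(sample, 2))  # cuts through the middle of the 2-byte character
print(tail_chars(sample, 2))  # keeps the character whole
```

With a count of 2, the byte version returns the second half of the encoded character plus the newline, while the character version returns the whole character plus the newline. For binary input there is no meaningful character boundary at all, which is why byte offsets matter for header extraction.
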
FWIW, the decision was that tail is intended to be a text tool, hence it acts on characters. The POSIX rationale explains that the option was kept as bytes so as not to burden implementations too much; if you want to copy bytes, use dd.

That said, your patch offers an alternative, so it's good. After a quick look, I don't see any bug reports suggesting a -m flag for tail; maybe this is something we should do.

-- Quentin
