On Tue,  6 Dec 2016 02:17:03 -0800
Michael Forney <mfor...@mforney.org> wrote:

Hey Michael,

> POSIX says that -c specifies a number of bytes, not characters. This
> flag is commonly used by scripts that operate on binary files to do
> things like extract a header. Treating the offsets as character
> offsets will break things in mysterious ways.
> 
> Instead, add a -m option (chosen to match `wc -m`, which also operates
> on characters) to handle character offsets.
> ---
> I'm tempted to just delete the character functionality instead of
> introducing a new non-standard option. I can see the use of tail with
> codepoints, but we definitely need to make -c work on bytes so that we
> don't break scripts.
> 
> I'm also open to changing the option flag to something else. I just
> chose -m because that's what wc uses for characters.

Well spotted! Still, it's _very_ counterintuitive to call a byte-offset
flag "-c". Instead of adding a non-portable -m flag, I would rather add
a -b flag for byte offsets.

It all depends on how many scripts rely on this behaviour. Can you give
an example? I thought cut(1) was the tool of choice for extracting
headers and such things.
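
To make the byte/character distinction concrete, here is a minimal
Python sketch, assuming UTF-8 input. The helpers tail_bytes and
tail_chars are hypothetical names for illustration and are not taken
from sbase; they only contrast the two offset semantics.

```python
# Illustrative sketch only: hypothetical helpers, not sbase code.

def tail_bytes(data: bytes, n: int) -> bytes:
    """Return the last n bytes (the byte semantics POSIX gives tail -c)."""
    return data[-n:] if n > 0 else b""


def tail_chars(data: bytes, n: int) -> str:
    """Return the last n characters, assuming data is valid UTF-8."""
    text = data.decode("utf-8")  # raises UnicodeDecodeError on binary input
    return text[-n:] if n > 0 else ""


# "naïve café" is 10 characters but 12 bytes in UTF-8,
# because "ï" and "é" each take two bytes.
sample = "naïve café".encode("utf-8")

last_three_bytes = tail_bytes(sample, 3)   # b"f\xc3\xa9", i.e. "fé"
last_three_chars = tail_chars(sample, 3)   # "afé"

# Arbitrary binary data is not valid UTF-8, so only the byte
# interpretation is well defined for it.
binary = bytes([0x89, 0x50, 0xFF, 0xFE])
binary_tail = tail_bytes(binary, 2)        # b"\xff\xfe"
```

The same count of 3 selects different data under the two
interpretations, and the character variant cannot handle the binary
input at all, which is the kind of breakage the patch description warns
about.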

Cheers

Laslo

-- 
Laslo Hunhold <d...@frign.de>
