On Fri, Jan 02, 2015 at 02:11:34AM +0200, Kaspars Bankovskis wrote:
> On Thu, Jan 01, 2015 at 10:28:44PM +0001, Jason McIntyre wrote:
> > it's not exactly that we updated wc knowing that it was not posix
> > conformant. i think the general explanation is that the current
> > implementation of obsd treats characters and bytes the same, but they
> > might not on other systems, and then all the multibyte stuff that kills
> > my brain.
> 
> Well, the current implementation of wc in OpenBSD doesn't handle
> characters at all; the -m option is simply an alias for -c.
> 
> As for multibyte stuff (and portability), I've checked wc on Linux
> (coreutils), FreeBSD and OSX, and they all handle multibyte
> characters when wc is invoked with -m. For example, if I give the
> A-macron character as input, it's 1 for -m but 2 for -c on those
> systems. On OpenBSD it will always be 2. We have two flags for the same
> action, and one of them works differently on the majority of currently
> deployed unixes.
> 
> I might be wrong, but doesn't "m" stand for multibyte, in contrast to
> "c" for (byte-sized) char?
> 
> > if i'm totally wrong about that, someone correct me.
> > 
> > we could make that clearer in the man page, but i think we decided it
> > was best to do it the way we have now.
> 
> The current text of the man page gives the wrong impression that the -m
> flag does something it doesn't, especially the part about "number of
> bytes being replaced by number of characters" (nothing is replaced there).
> 
> I'm not saying that multibyte character counting should be implemented
> in OpenBSD's wc. But at least the man page should indicate that -m is
> a useless flag.
> 
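
The byte/character difference described in the quoted message can be sketched as follows. This is only an illustration, not how any wc is implemented: it assumes the input is UTF-8, and count_bytes and count_chars are hypothetical names standing in for what -c and a multibyte-aware -m would report.

```python
# Illustration only: count_bytes and count_chars are hypothetical helpers,
# and the input is assumed to be valid UTF-8.

def count_bytes(data: bytes) -> int:
    """Number of bytes, analogous to what wc -c reports."""
    return len(data)


def count_chars(data: bytes, encoding: str = "utf-8") -> int:
    """Number of decoded characters (code points), analogous to what a
    multibyte-aware wc -m reports."""
    return len(data.decode(encoding))


# A-macron is U+0100; its UTF-8 encoding is the two bytes 0xC4 0x80.
a_macron = "\u0100".encode("utf-8")

print(count_bytes(a_macron), count_chars(a_macron))
```

An implementation that aliases -m to -c would report the first number for both flags.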

let's wait till after the weekend and see if any developers state that
it's meant to be like this (or not). in the meantime i'll try and look
through my mails and see if i can dig up any past conversation about it.

jmc
