Re: wc(1) and -m flag

Jason McIntyre Thu, 01 Jan 2015 14:29:55 -0800

On Thu, Jan 01, 2015 at 03:17:48PM +0200, Kaspars Bankovskis wrote:
> Correct me if I'm wrong, but it seems to me that the manual for wc(1)
> has been updated to comply with the standard while the implementation
> has not. -m and -c both set dochar=1, and I can't see any difference in
> further execution. They're not "mutually exclusive" either (contrary to
> the manual), since they do the same thing.
> 
> The -m option seems to be implemented in FreeBSD and NetBSD, where they
> differentiate between counting bytes and multi-byte characters. I'm not
> sure whether it's needed/wanted here, so I'm sending only a diff that
> removes the misleading information.
>


it's not exactly that we updated wc knowing that it was not posix
conformant. i think the general explanation is that the current
implementation in obsd treats characters and bytes the same, but other
systems might not, and then there's all the multibyte stuff that kills
my brain.
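
to make the byte/character distinction concrete, here is a minimal
sketch of how the two counts can differ on multibyte input. it is not
taken from wc.c or from any other system's wc: the helper names
count_bytes and count_utf8_chars are hypothetical, and the sketch
hardcodes utf-8 (assuming well-formed input) rather than consulting the
locale the way a real -m implementation would be expected to.

```c
#include <stddef.h>

/*
 * Hypothetical helper: the byte count, i.e. what -c reports.
 * The buffer contents do not matter, only its length.
 */
static size_t
count_bytes(const char *buf, size_t len)
{
	(void)buf;
	return len;
}

/*
 * Hypothetical helper: count characters in a buffer that is ASSUMED
 * to hold well-formed UTF-8. Continuation bytes have the bit pattern
 * 10xxxxxx, so every byte that is not a continuation byte starts a
 * new character.
 */
static size_t
count_utf8_chars(const char *buf, size_t len)
{
	size_t i, n = 0;

	for (i = 0; i < len; i++)
		if (((unsigned char)buf[i] & 0xC0) != 0x80)
			n++;
	return n;
}
```

for pure ascii input both helpers return the same number, which is the
situation described above where characters and bytes are treated the
same. the counts only diverge once the input contains multibyte
sequences.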

if i'm totally wrong about that, someone correct me.

we could make that clearer in the man page, but i think we decided it
was best to do it the way we have now.

posix displays -c and -m as exclusive, so that's probably why the
synopsis shows them that way. i guess it encourages portability.

jmc

> 
> Index: wc.1
> ===================================================================
> RCS file: /cvs/src/usr.bin/wc/wc.1,v
> retrieving revision 1.23
> diff -u -p -u -r1.23 wc.1
> --- wc.1      15 Nov 2014 13:55:25 -0000      1.23
> +++ wc.1      1 Jan 2015 13:15:35 -0000
> @@ -37,11 +37,10 @@
>  .Os
>  .Sh NAME
>  .Nm wc
> -.Nd word, line, and byte or character count
> +.Nd word, line, and byte count
>  .Sh SYNOPSIS
>  .Nm wc
> -.Op Fl c | m
> -.Op Fl hlw
> +.Op Fl chlmw
>  .Op Ar
>  .Sh DESCRIPTION
>  The
> @@ -72,8 +71,8 @@ using powers of 2 for sizes (K=1024, M=1
>  The number of lines in each input file
>  is written to the standard output.
>  .It Fl m
> -The number of characters in each input file
> -is written to the standard output.
> +Equivalent to
> +.Fl c
>  .It Fl w
>  The number of words in each input file
>  is written to the standard output.
> @@ -85,11 +84,6 @@ only reports the information requested b
>  The default action is equivalent to the flags
>  .Fl clw
>  having been specified.
> -The
> -.Fl c
> -and
> -.Fl m
> -options are mutually exclusive.
>  .Pp
>  If no file names are specified, the standard input is used
>  and a file name is not output.
> @@ -103,13 +97,7 @@ input file of the form:
>  lines         words  bytes   file_name
>  .Ed
>  .Pp
> -If the
> -.Fl m
> -option is specified,
> -the number of bytes is replaced by
> -the number of characters in the listing above.
>  The counts for lines, words, and bytes
> -.Pq or characters
>  are integers separated by spaces.
>  .Sh EXIT STATUS
>  .Ex -std wc
> @@ -130,3 +118,8 @@ A
>  .Nm
>  utility appeared in
>  .At v1 .
> +.Sh BUGS
> +The
> +.Fl m
> +flag can't be used for counting multi-byte characters as can be found in
> +other implementations.
> Index: wc.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/wc/wc.c,v
> retrieving revision 1.16
> diff -u -p -u -r1.16 wc.c
> --- wc.c      27 Nov 2013 13:32:02 -0000      1.16
> +++ wc.c      1 Jan 2015 12:16:40 -0000
> @@ -57,7 +57,7 @@ main(int argc, char *argv[])
>  
>       setlocale(LC_ALL, "");
>  
> -     while ((ch = getopt(argc, argv, "lwchm")) != -1)
> +     while ((ch = getopt(argc, argv, "chlmw")) != -1)
>               switch(ch) {
>               case 'l':
>                       doline = 1;
> @@ -75,7 +75,7 @@ main(int argc, char *argv[])
>               case '?':
>               default:
>                       (void)fprintf(stderr,
> -                         "usage: %s [-c | -m] [-hlw] [file ...]\n",
> +                         "usage: %s [-chlmw] [file ...]\n",
>                           __progname);
>                       exit(1);
>               }
>
