Jeremie Courreges-Anglas <[email protected]> writes:

> Ingo Schwarze <[email protected]> writes:
>
>> Hi,
>
> Hi Ingo,
>
>> two general remarks:
>>
>>  1) The head(1) utility is supposed to handle text files.  Our
>>     manual page doesn't mention that technicality - in general, our
>>     manuals avoid excessive technicality in favour of readability -
>>     but POSIX is explicit:
>>       "Input files shall be text files, but the line length
>>        is not restricted to {LINE_MAX} bytes."
>>     So, the -c option is badly designed.  It requires a text file
>>     as input, but usually produces something on output that will
>>     no longer be a text file, because the output won't usually end
>>     in a newline character.  Adding a trailing newline in case a
>>     line is cut in the middle would also be a bad idea.  Cutting
>>     at the last line break before the character count would have
>>     been better design, but it's not an option for us because it's
>>     not what everyone else does.  Note that tail(1) does not suffer
>>     from the same ailment.  tail -c does not turn text files into
>>     non-text files, it does preserve the trailing newline.  Of
>>     course, tail(1) -c cuts characters in half, so it isn't stellar
>>     design either...
>>
>>  2) In a text utility, it feels quite strange to add an option
>>     counting bytes in 2016 when there is no way to count characters
>>     or display columns.  But given that's what everyone else does,
>>     so be it.  If the GNU folks ever notice this, we'll very
>>     probably get option inflation here.  But we can deal with
>>     that when it happens.
>>
>> Dmitrij D. Czarkoff wrote on Thu, Mar 10, 2016 at 09:27:05AM +0100:
>>> Jeremie Courreges-Anglas said:
>>
>>>> The situation is a bit muddy. :)
>>>> 1. GNU head obeys the last command line option
>>>> 2. FreeBSD errors out if both -c and -n are specified
>>>> 3. NetBSD always follows -c if it has been specified, probably mixing -c
>>>>    and -n was overlooked
>>>> 4. busybox is a bit more broken:
>>>> 
>>>>   $ printf '%s\n' a b c d e | busybox head -c 2 -n 5
>>>>   a
>>>>   b
>>>>   c$
>>>> 
>>>>   ie if -c is passed it always specifies the byte-counting behavior, but
>>>>   the actual byte count can be modified by subsequent -n options...
>>>> 
>>>> I prefer 1. 'cause I see no reason to do 2.
>>
>> There are three good reasons for 2.:
>>
>>  1)
>>
>>> FWIW our tail(1) does 2, so IMO head should as well.
>>
>> I agree with Dmitrij, for two more reasons in addition to the already
>> quite good one Dmitrij mentions - which, by the way, is not just us,
>> but POSIX, too, in the case of tail(1).
>
> GNU tail, busybox tail, FreeBSD tail and NetBSD tail all allow mixing -c
> and -n options.

My bad, I mixed things up: FreeBSD and NetBSD tail(1) behave just like
ours.  As this has been mentioned before I propose not to change its
behavior.

> Our tail(1) is the one that is different here, so it's
> not clear to me that the right move is to follow FreeBSD regarding
> head(1) and to differ from them regarding tail(1).
>
> To repeat myself, the addition of this rather silly option is supposed
> to reduce differences from other implementations so that we can stop
> wasting time about it.
>
>>  2) POSIX says:  http://pubs.opengroup.org/onlinepubs/9699919799/
>>
>>     Guideline 11:
>>     The order of different options relative to one another should
>>     not matter, unless the options are documented as mutually-exclusive
>>     and such an option is documented to override any incompatible
>>     options preceding it.
>>
>>     So, Jeremie, as it stands, your diff is outright wrong.  You
>>     document the two options to conflict (by marking them with "|"
>>     in the SYNOPSIS) but implement them as silently overriding
>>     each other.  That's NOT ok.
>
> Yeah, that came to my mind earlier today.  So, to make things clear: if
> those options were explicitly documented as mutually exclusive *and*
> the last option passed were documented to win, that point would be
> addressed.  Right?

Here's a diff that further tweaks the manpage, wording stolen from
df(1).

Index: head.1
===================================================================
RCS file: /cvs/src/usr.bin/head/head.1,v
retrieving revision 1.23
diff -u -p -r1.23 head.1
--- head.1      25 Oct 2015 21:50:32 -0000      1.23
+++ head.1      11 Mar 2016 17:42:24 -0000
@@ -37,7 +37,7 @@
 .Nd display first few lines of files
 .Sh SYNOPSIS
 .Nm head
-.Op Fl Ar count | Fl n Ar count
+.Op Fl c Ar count | Fl Ar count | Fl n Ar count
 .Op Ar
 .Sh DESCRIPTION
 The
@@ -56,6 +56,12 @@ is omitted, it defaults to 10.
 .Pp
 The options are as follows:
 .Bl -tag -width Ds
+.It Fl c Ar count
+Copy the first
+.Ar count
+bytes of each input file to the standard output.
+.Ar count
+must be a positive decimal integer.
 .It Fl Ar count | Fl n Ar count
 Copy the first
 .Ar count
@@ -63,6 +69,14 @@ lines of each input file to the standard
 .Ar count
 must be a positive decimal integer.
 .El
+.Pp
+It is not an error to specify more than one of the mutually exclusive
+options
+.Fl c
+and
+.Fl n .
+Where more than one of these options is specified, the last option given
+overrides the others.
 .Pp
 If more than one file is specified,
 .Nm
Index: head.c
===================================================================
RCS file: /cvs/src/usr.bin/head/head.c,v
retrieving revision 1.20
diff -u -p -r1.20 head.c
--- head.c      9 Oct 2015 01:37:07 -0000       1.20
+++ head.c      10 Mar 2016 02:32:49 -0000
@@ -40,7 +40,7 @@
 static void usage(void);
 
 /*
- * head - give the first few lines of a stream or of each of a set of files
+ * head - give the first few bytes/lines of a stream or of each of a set of files
  *
  * Bill Joy UCB August 24, 1977
  */
@@ -51,9 +51,9 @@ main(int argc, char *argv[])
        FILE    *fp;
        long    cnt;
        int     ch, firsttime;
-       long    linecnt = 10;
+       long    count = 10;
        char    *p = NULL;
-       int     status = 0;
+       int     dobytes = 0, status = 0;
 
        if (pledge("stdio rpath", NULL) == -1)
                err(1, "pledge");
@@ -66,13 +66,18 @@ main(int argc, char *argv[])
                argv++;
        }
 
-       while ((ch = getopt(argc, argv, "n:")) != -1) {
+       while ((ch = getopt(argc, argv, "c:n:")) != -1) {
                switch (ch) {
+               case 'c':
+                       dobytes = 1;
+                       p = optarg;
+                       break;
                case 'n':
+                       dobytes = 0;
                        p = optarg;
                        break;
                default:
-                       usage();        
+                       usage();
                }
        }
        argc -= optind, argv += optind;
@@ -80,9 +85,10 @@ main(int argc, char *argv[])
        if (p) {
                const char *errstr;
 
-               linecnt = strtonum(p, 1, LONG_MAX, &errstr);
+               count = strtonum(p, 1, LONG_MAX, &errstr);
                if (errstr)
-                       errx(1, "line count %s: %s", errstr, p);
+                       errx(1, "%s count %s: %s", dobytes ? "byte" : "line",
+                           errstr, p);
        }
 
        for (firsttime = 1; ; firsttime = 0) {
@@ -105,10 +111,16 @@ main(int argc, char *argv[])
                        }
                        ++argv;
                }
-               for (cnt = linecnt; cnt && !feof(fp); --cnt)
-                       while ((ch = getc(fp)) != EOF)
-                               if (putchar(ch) == '\n')
-                                       break;
+               if (dobytes) {
+                       for (cnt = count; cnt && !feof(fp); --cnt)
+                               if ((ch = getc(fp)) != EOF)
+                                       putchar(ch);
+               } else {
+                       for (cnt = count; cnt && !feof(fp); --cnt)
+                               while ((ch = getc(fp)) != EOF)
+                                       if (putchar(ch) == '\n')
+                                               break;
+               }
                fclose(fp);
        }
        /*NOTREACHED*/
@@ -118,6 +130,6 @@ main(int argc, char *argv[])
 static void
 usage(void)
 {
-       fputs("usage: head [-count | -n count] [file ...]\n", stderr);
+       fputs("usage: head [-c count | -count | -n count] [file ...]\n", stderr);
        exit(1);
 }
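
For illustration only, the "last option wins" rule implemented in the
head.c hunk above can be isolated into a small stand-alone sketch.  The
names struct head_mode and parse_mode() are hypothetical and do not
appear in the diff; only the getopt(3) string "c:n:" and the dobytes
flag are taken from it.  The sketch assumes it is called once per
process, so it does not reset getopt(3) state.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical container for the result; not part of the diff. */
struct head_mode {
	int		 dobytes;	/* 1 if the last of -c/-n was -c */
	const char	*arg;		/* operand of the last -c/-n, or NULL */
};

/*
 * Parse -c and -n the way the diff does: each occurrence overwrites
 * both the mode and the pending count string, so the last one wins.
 * Returns the index of the first operand, or -1 on an unknown option.
 */
static int
parse_mode(int argc, char *argv[], struct head_mode *m)
{
	int ch;

	assert(m != NULL);
	m->dobytes = 0;
	m->arg = NULL;

	while ((ch = getopt(argc, argv, "c:n:")) != -1) {
		switch (ch) {
		case 'c':
			m->dobytes = 1;
			m->arg = optarg;
			break;
		case 'n':
			m->dobytes = 0;
			m->arg = optarg;
			break;
		default:
			return -1;
		}
	}
	return optind;
}
```

With the argument list from the busybox example quoted earlier
("-c 2 -n 5"), this sketch ends up in line mode with a count string of
"5", because -n comes last.  The count is deliberately left unparsed
here, as in the diff, where strtonum(3) validates it after the option
loop.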


-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE
