Jeremie Courreges-Anglas <[email protected]> writes:

> Ingo Schwarze <[email protected]> writes:
>
>> Hi,
>
> Hi Ingo,
>
>> two general remarks:
>>
>>  1) The head(1) utility is supposed to handle text files.  Our
>>     manual page doesn't mention that technicality - in general, our
>>     manuals avoid excessive technicality in favour of readability -
>>     but POSIX is explicit:
>>       "Input files shall be text files, but the line length
>>       is not restricted to {LINE_MAX} bytes."
>>     So, the -c option is badly designed.  It requires a text file
>>     as input, but usually produces something on output that will
>>     no longer be a text file, because the output won't usually end
>>     in a newline character.  Adding a trailing newline in case a
>>     line is cut in the middle would also be a bad idea.  Cutting
>>     at the last line break before the character count would have
>>     been better design, but it's not an option for us because it's
>>     not what everyone else does.  Note that tail(1) does not suffer
>>     from the same ailment.  tail -c does not turn text files into
>>     non-text files, it does preserve the trailing newline.  Of
>>     course, tail(1) -c cuts characters in half, so it isn't stellar
>>     design either...
>>
>>  2) In a text utility, it feels quite strange to add an option
>>     counting bytes in 2016 when there is no way to count characters
>>     or display columns.  But given that's what everyone else does,
>>     so be it.  If the GNU folks ever notice this, we'll very
>>     probably get option inflation here.  But we can deal with
>>     that when it happens.
>>
>> Dmitrij D. Czarkoff wrote on Thu, Mar 10, 2016 at 09:27:05AM +0100:
>>> Jeremie Courreges-Anglas said:
>>
>>>> The situation is a bit muddy. :)
>>>> 1. GNU head obeys the last command line option
>>>> 2. FreeBSD errors out if both -c and -n are specified
>>>> 3. NetBSD always follows -c if it has been specified, probably mixing -c
>>>>    and -n was overlooked
>>>> 4. busybox is a bit more broken:
>>>>
>>>> $ printf '%s\n' a b c d e | busybox head -c 2 -n 5
>>>> a
>>>> b
>>>> c$
>>>>
>>>> ie if -c is passed it always specifies the byte-counting behavior, but
>>>> the actual byte count can be modified by subsequent -n options...
>>>>
>>>> I prefer 1. 'cause I see no reason to do 2.
>>
>> There are three good reasons for 2.:
>>
>>  1)
>>
>>> FWIW our tail(1) does 2, so IMO head should as well.
>>
>> I agree with Dmitrij, for two more reasons in addition to the already
>> quite good one Dmitrij mentions - which, by the way, is not just us,
>> but POSIX, too, in the case of tail(1).
>
> GNU tail, busybox tail, FreeBSD tail and NetBSD tail all allow mixing -c
> and -n options.

My bad, I mixed things up: FreeBSD and NetBSD tail(1) behave just like
ours.  As this has been mentioned before I propose not to change its
behavior.

> Our tail(1) is the one that is different here, so it's
> not clear to me that the right move is to follow FreeBSD regarding
> head(1) and to differ from them regarding tail(1).
>
> To repeat myself, the addition of this rather silly option is supposed
> to reduce differences from other implementations so that we can stop
> wasting time about it.
>
>>  2) POSIX says: http://pubs.opengroup.org/onlinepubs/9699919799/
>>
>>     Guideline 11:
>>     The order of different options relative to one another should
>>     not matter, unless the options are documented as mutually-exclusive
>>     and such an option is documented to override any incompatible
>>     options preceding it.
>>
>>     So, Jeremie, as it stands, your diff is outright wrong.  You
>>     document the two options to conflict (by marking them with "|"
>>     in the SYNOPSIS) but implement them as silently overriding
>>     each other.  That's NOT ok.
>
> Yeah, that came to my mind earlier today.  So, to make things clear: if
> those options were explicitly documented as mutually-exclusive *and*
> that the last option passed wins, that point would be addressed.  Right?

Here's a diff that further tweaks the manpage, wording stolen from
df(1).

Index: head.1
===================================================================
RCS file: /cvs/src/usr.bin/head/head.1,v
retrieving revision 1.23
diff -u -p -r1.23 head.1
--- head.1	25 Oct 2015 21:50:32 -0000	1.23
+++ head.1	11 Mar 2016 17:42:24 -0000
@@ -37,7 +37,7 @@
 .Nd display first few lines of files
 .Sh SYNOPSIS
 .Nm head
-.Op Fl Ar count | Fl n Ar count
+.Op Fl c Ar count | Fl Ar count | Fl n Ar count
 .Op Ar
 .Sh DESCRIPTION
 The
@@ -56,6 +56,12 @@ is omitted, it defaults to 10.
 .Pp
 The options are as follows:
 .Bl -tag -width Ds
+.It Fl c Ar count
+Copy the first
+.Ar count
+bytes of each input file to the standard output.
+.Ar count
+must be a positive decimal integer.
 .It Fl Ar count | Fl n Ar count
 Copy the first
 .Ar count
@@ -63,6 +69,14 @@ lines of each input file to the standard
 .Ar count
 must be a positive decimal integer.
 .El
+.Pp
+It is not an error to specify more than one of the mutually exclusive
+options
+.Fl c
+and
+.Fl n .
+Where more than one of these options is specified, the last option given
+overrides the others.
 .Pp
 If more than one file is specified,
 .Nm
Index: head.c
===================================================================
RCS file: /cvs/src/usr.bin/head/head.c,v
retrieving revision 1.20
diff -u -p -r1.20 head.c
--- head.c	9 Oct 2015 01:37:07 -0000	1.20
+++ head.c	10 Mar 2016 02:32:49 -0000
@@ -40,7 +40,7 @@
 static void usage(void);
 
 /*
- * head - give the first few lines of a stream or of each of a set of files
+ * head - give the first few bytes/lines of a stream or of each of a set of files
  *
  * Bill Joy UCB August 24, 1977
  */
@@ -51,9 +51,9 @@ main(int argc, char *argv[])
 	FILE *fp;
 	long cnt;
 	int ch, firsttime;
-	long linecnt = 10;
+	long count = 10;
 	char *p = NULL;
-	int status = 0;
+	int dobytes = 0, status = 0;
 
 	if (pledge("stdio rpath", NULL) == -1)
 		err(1, "pledge");
@@ -66,13 +66,18 @@ main(int argc, char *argv[])
 		argv++;
 	}
 
-	while ((ch = getopt(argc, argv, "n:")) != -1) {
+	while ((ch = getopt(argc, argv, "c:n:")) != -1) {
 		switch (ch) {
+		case 'c':
+			dobytes = 1;
+			p = optarg;
+			break;
 		case 'n':
+			dobytes = 0;
 			p = optarg;
 			break;
 		default:
-			usage();
+			usage();
 		}
 	}
 	argc -= optind, argv += optind;
@@ -80,9 +85,10 @@ main(int argc, char *argv[])
 	if (p) {
 		const char *errstr;
 
-		linecnt = strtonum(p, 1, LONG_MAX, &errstr);
+		count = strtonum(p, 1, LONG_MAX, &errstr);
 		if (errstr)
-			errx(1, "line count %s: %s", errstr, p);
+			errx(1, "%s count %s: %s", dobytes ? "bytes" : "lines",
+			    errstr, p);
 	}
 
 	for (firsttime = 1; ; firsttime = 0) {
@@ -105,10 +111,16 @@ main(int argc, char *argv[])
 			}
 			++argv;
 		}
-		for (cnt = linecnt; cnt && !feof(fp); --cnt)
-			while ((ch = getc(fp)) != EOF)
-				if (putchar(ch) == '\n')
-					break;
+		if (dobytes) {
+			for (cnt = count; cnt && !feof(fp); --cnt)
+				if ((ch = getc(fp)) != EOF)
+					putchar(ch);
+		} else {
+			for (cnt = count; cnt && !feof(fp); --cnt)
+				while ((ch = getc(fp)) != EOF)
+					if (putchar(ch) == '\n')
+						break;
+		}
 		fclose(fp);
 	}
 	/*NOTREACHED*/
@@ -118,6 +130,6 @@ main(int argc, char *argv[])
 static void
 usage(void)
 {
-	fputs("usage: head [-count | -n count] [file ...]\n", stderr);
+	fputs("usage: head [-c count | -count | -n count] [file ...]\n", stderr);
 	exit(1);
 }

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE

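PS: for anyone who wants to poke at the byte vs. line semantics outside
the tree, here is a minimal standalone sketch of the two copy loops from
the head.c diff.  The helper name head_copy() and its signature are
hypothetical and only for illustration; they are not part of the diff,
which keeps the loops inline in main() and writes to stdout.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of the diff: copy the first `count`
 * bytes (dobytes != 0) or the first `count` lines (dobytes == 0) from
 * `in` to `out`.  Returns the number of bytes written.
 */
static long
head_copy(FILE *in, FILE *out, long count, int dobytes)
{
	long cnt, written = 0;
	int ch;

	if (dobytes) {
		/* Byte mode: stop after `count` bytes or at EOF. */
		for (cnt = count; cnt > 0 && (ch = getc(in)) != EOF; --cnt) {
			putc(ch, out);
			written++;
		}
	} else {
		/* Line mode: stop after `count` newlines or at EOF. */
		for (cnt = count; cnt > 0 && !feof(in); --cnt)
			while ((ch = getc(in)) != EOF) {
				putc(ch, out);
				written++;
				if (ch == '\n')
					break;
			}
	}
	return written;
}
```

With the input from the busybox example (the five lines a to e),
head_copy(in, out, 2, 1) writes the two bytes "a\n", whereas
head_copy(in, out, 5, 0) writes all ten bytes.  Under "last option
wins", head -c 2 -n 5 would take the second path.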