Ingo Schwarze writes:
> Hi,
Hi Ingo,
> two general remarks:
>
> 1) The head(1) utility is supposed to handle text files. Our
> manual page doesn't mention that technicality - in general, our
> manuals avoid excessive technicality in favour of readability -
> but POSIX is explicit:
> "Input files shall be text files, but the line length
> is not restricted to {LINE_MAX} bytes."
> So, the -c option is badly designed. It requires a text file
> as input, but usually produces something on output that will
> no longer be a text file, because the output won't usually end
> in a newline character. Adding a trailing newline in case a
> line is cut in the middle would also be a bad idea. Cutting
> at the last line break before the character count would have
> been better design, but it's not an option for us because it's
> not what everyone else does. Note that tail(1) does not suffer
> from the same ailment. tail -c does not turn text files into
> non-text files, it does preserve the trailing newline. Of
> course, tail(1) -c cuts characters in half, so it isn't stellar
> design either...
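
To make the difference concrete, here is a rough sketch of the two
cut points being compared. It is only an illustration: cut_bytes()
and cut_at_line_break() are names made up for this mail, not
functions from any head(1) implementation, and the second one is the
hypothetical "better design" that nobody actually ships.

```c
#include <stddef.h>

/*
 * What head -c does everywhere: keep at most n bytes, wherever
 * that happens to land, even in the middle of a line.
 */
static size_t
cut_bytes(size_t len, size_t n)
{
	return len < n ? len : n;
}

/*
 * Hypothetical alternative: back up to just after the last newline
 * found within the first n bytes.  Returns 0 if there is none, so
 * the output is either empty or still ends in a newline.
 */
static size_t
cut_at_line_break(const char *buf, size_t len, size_t n)
{
	size_t end = cut_bytes(len, n);

	while (end > 0 && buf[end - 1] != '\n')
		end--;
	return end;
}
```

With the input "a\nb\nc\n" and a count of 3, the first function keeps
"a\nb" (no trailing newline) while the second keeps only "a\n".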
>
> 2) In a text utility, it feels quite strange to add an option
> counting bytes in 2016 when there is no way to count characters
> or display columns. But given that's what everyone else does,
> so be it. If the GNU folks ever notice this, we'll very
> probably get option inflation here. But we can deal with
> that when it happens.
>
> Dmitrij D. Czarkoff wrote on Thu, Mar 10, 2016 at 09:27:05AM +0100:
>> Jeremie Courreges-Anglas said:
>
>>> The situation is a bit muddy. :)
>>> 1. GNU head obeys the last command line option
>>> 2. FreeBSD errors out if both -c and -n are specified
>>> 3. NetBSD always follows -c if it has been specified, probably mixing -c
>>> and -n was overlooked
>>> 4. busybox is a bit more broken:
>>>
>>> $ printf '%s\n' a b c d e | busybox head -c 2 -n 5
>>> a
>>> b
>>> c$
>>>
>>> ie if -c is passed it always specifies the byte-counting behavior, but
>>> the actual byte count can be modified by subsequent -n options...
>>>
>>> I prefer 1. 'cause I see no reason to do 2.
>
> There are three good reasons for 2.:
>
> 1)
>
>> FWIW our tail(1) does 2, so IMO head should as well.
>
> I agree with Dmitrij, for two more reasons in addition to the already
> quite good one Dmitrij mentions - which, by the way, is not just us,
> but POSIX, too, in the case of tail(1).
GNU tail, busybox tail, FreeBSD tail and NetBSD tail all allow mixing -c
and -n options. Our tail(1) is the one that is different here, so it's
not clear to me that the right move is to follow FreeBSD regarding
head(1) and to differ from them regarding tail(1).
To repeat myself, the addition of this rather silly option is supposed
to reduce differences from other implementations so that we can stop
wasting time on it.
> 2) POSIX says: http://pubs.opengroup.org/onlinepubs/9699919799/
>
> Guideline 11:
> The order of different options relative to one another should
> not matter, unless the options are documented as mutually-exclusive
> and such an option is documented to override any incompatible
> options preceding it.
>
> So, Jeremie, as it stands, your diff is outright wrong. You
> document the two options to conflict (by marking them with "|"
> in the SYNOPSIS) but implement them as silently overriding
> each other. That's NOT ok.
Yeah, that came to my mind earlier today. So, to make things clear: if
those options were explicitly documented as mutually exclusive *and*
the manual said that the last option passed wins, that point would be
addressed. Right?
> 3) When different implementations conflict, let us be conservative.
> It doesn't help users if a non-standard option behaves in some
> random way conflicting with what other systems do. If somebody
> uses a non-standard option in a way treated differently by
> different systems, they will get bitten somewhere no matter what.
> So it's better to error out and tell them they are doing it wrong.
Among the head(1) implementations I mentioned, only one errors out if
both -c and -n are passed. If FreeBSD wants to be different, fine,
that's their problem; maybe they'd change their implementation if this
difference were reported. So I really don't find your argument
compelling here.
> I suggest you rework the diff to error out.
I don't want to sound dense, but I think that I'd rather tweak the
manpage than change the proposed implementation.
> I don't like this
> option even then, it's badly designed, but given how widespread it
> is and how little bloat it causes, I doubt that shooting this diff
> down and forcing porters to waste their time fighting this particular
> windmill is a wise course.
--
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF DDCC 0DFA 74AE 1524 E7EE