Re: old style tail(1) options and bin/57483

2023-06-30 Thread Mouse
>> Do we want to support postfix options in something like old style
>> +qF ?

Personally, I curse every time I run into a tail that doesn't support
"tail +0f" or "tail -f".  I think those and "tail -%d" are the only
forms I use often enough that changing them would be any kind of issue
for me.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: old style tail(1) options and bin/57483

2023-06-30 Thread RVP

On Fri, 30 Jun 2023, Valery Ushakov wrote:


> The man page seems to be completely silent about the old style
> options.
>
> What exactly are we aiming for here?  Do we want to support postfix
> options in something like old style +qF ?



It would be nice to retain `[+-]N' as a shortcut for `-n [+-]N'.
You can ditch every other compat as far as I'm concerned. You
can even remove obsolete() entirely and I wouldn't shed any tears.
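
For illustration, recognising only the `[+-]N' shortcut could look
something like the sketch below.  The function name is hypothetical and
this is not code from tail.c; it only shows how narrow the test becomes
once every other old style form is dropped.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: true only for a sign followed by one or more
 * digits and nothing else, i.e. the form that would be rewritten to
 * "-n [+-]N".  Anything with a trailing letter (such as "+0f") is
 * deliberately not matched.
 */
bool
is_count_shortcut(const char *arg)
{
	assert(arg != NULL);

	if (arg[0] != '+' && arg[0] != '-')
		return false;
	if (!isdigit((unsigned char)arg[1]))
		return false;		/* bare "+" or "-", or "-f" etc. */
	for (const char *p = arg + 1; *p != '\0'; p++) {
		if (!isdigit((unsigned char)*p))
			return false;	/* trailing junk such as "+0f" */
	}
	return true;
}
```

With this test, "+20" and "-5" would be accepted, while "-f", "+0f" and
a plain "20" would be left for normal option or file handling.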

-RVP



Re: printf(1), sh(1), POSIX.2 and octal escape sequences

2023-06-30 Thread Martin Neitzel
KRE> It depends upon the usage.   But if you're processing escapes, you
KRE> need to also process \\ to mean a literal '\' of course,  [...]

Not necessarily -- '\134' would be good enough :-)

Just joking, of course.  The weekend is nigh.

Martin Neitzel


Trivial program size inflation

2023-06-30 Thread Mouse
Based on something at work, I was looking at executable sizes.  I
eventually tried a program stripped about as far down as I could:

int main(void);
int main(void)
{
 return(0);
}

and built it -static.  size on the resulting binary:

sparc, my mutant 1.4T:

   text    data     bss     dec     hex filename
  12616     124     288   13028    32e4 main

amd64, my mutant 5.2:

   text    data     bss     dec     hex filename
 152613    4416   16792  173821   2a6fd main

amd64, 9.0_STABLE (ftp.n.o):

   text    data     bss     dec     hex filename
 562318   29064 2176416 2767798  2a3bb6 main

12K to do nothing is bad enough (I'm going to be looking at why it's
that big).  149K is even more disturbing (I'll be looking at that too).
But over half a meg of text and two megs of BSS?  To do nothing?
Surely something is wrong somewhere.

Not that NetBSD is alone in this.  On an Ubuntu machine at work, I see

   text    data     bss     dec     hex filename
 761750   20804    6016  788570   c085a main

but I hardly think Ubuntu's sins are relevant to NetBSD. :-)

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: printf(1), sh(1), POSIX.2 and octal escape sequences

2023-06-30 Thread Robert Elz
Date: Fri, 30 Jun 2023 17:51:13 +0200
From: tlaro...@polynum.com

  | So what is established behavior in this case

It depends upon the usage.   But if you're processing escapes, you
need to also process \\ to mean a literal '\' of course, and once
you have that, if the user wants to pass the string \000 to some
application, they can simply write it as \\000.  There's no need
to assume that \000 as input must have been meant to be \000 as
output just because inserting a literal '\0' is stupid.

As to what to actually do if someone does write \000 (or \0 with
1, 2 or 3 0's) that's kind of up to you.   You can do what the user
said, insert a '\0', and by so doing terminate the input at that
point, or you can simply throw it away, inserting nothing for that
sequence, or generate an error if you want.   Only someone idiotic
enough to actually write \000 in their config file is going to notice.
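
The choices above can be sketched in C roughly as follows.  This is only
an illustration under stated assumptions: the function name, the
nul_policy enum and the return convention are hypothetical, only \\ and
one to three octal digits are handled, and any other escape (or a value
that does not fit in a byte) is rejected rather than guessed at.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical knob for the three ways of treating a decoded '\0'. */
enum nul_policy { NUL_INSERT, NUL_DROP, NUL_ERROR };

/*
 * Decode "\\" and "\o", "\oo", "\ooo" from src into dst.
 * Returns the number of bytes stored (not counting the final
 * terminator), or -1 on bad input or if dst is too small.
 */
long
decode_escapes(const char *src, char *dst, size_t dstsize,
    enum nul_policy policy)
{
	size_t n = 0;

	assert(src != NULL && dst != NULL);
	if (dstsize == 0)
		return -1;

	while (*src != '\0') {
		unsigned int c;

		if (*src != '\\') {
			c = (unsigned char)*src++;
		} else if (src[1] == '\\') {
			c = '\\';		/* "\\" is a literal backslash */
			src += 2;
		} else if (src[1] >= '0' && src[1] <= '7') {
			int digits = 0;

			c = 0;
			src++;			/* skip the backslash */
			while (digits < 3 && *src >= '0' && *src <= '7') {
				c = c * 8 + (unsigned int)(*src++ - '0');
				digits++;
			}
			if (c > 0377)
				return -1;	/* e.g. \777: not a byte */
			if (c == 0) {
				if (policy == NUL_ERROR)
					return -1;
				if (policy == NUL_DROP)
					continue;	/* insert nothing */
			}
		} else {
			return -1;	/* unknown escape or trailing '\' */
		}
		if (n + 1 >= dstsize)
			return -1;
		dst[n++] = (char)c;
	}
	dst[n] = '\0';
	return (long)n;
}
```

With NUL_INSERT the '\0' really is stored, so anything that later treats
dst as a C string will see the input end at that point, while the
returned length still covers the whole decoded buffer.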

  | ---and, BTW, most utilities
  | ignore errors with octal sequences (printf(1) for example).

Historic practice, it is just what always has been done.

kre



Re: printf(1), sh(1), POSIX.2 and octal escape sequences

2023-06-30 Thread tlaronde
On Fri, Jun 30, 2023 at 03:37:18PM +, David Holland wrote:
> On Wed, Jun 28, 2023 at 06:32:10PM +0200, tlaro...@polynum.com wrote:
>  > > If you want to write a two digit octal number you cannot continue with
>  > > another octal digit. In C you could do "...\77" "7" and have it concat
>  > > the literals. In config files (without concatenation) you need some
>  > > other trick.
>  > 
>  > I beg to differ: since due to this very unfortunate "variable length"
>  > feature, your scanner has to read char by char, it can reject the third
>  > digit since it would yield an out of range byte value.
> 
> The behavior of escapes in C strings is widely used and well
> understood. Don't improvise.
> 
> There are such things as invalid inputs. Reject them with a reasonable
> diagnostic message instead of trying to guess what the user might have
> meant. Works out much better in the long run.

For this one I will go with the established behavior, but what should I
do when someone passes, in octal or in hex, "\000" or "\x00"?

I have decided that this value will be put back as an escape
sequence (possibly for an argument of some program), since if the
program "interprets" the escape sequence (as current inetd(8) does)
while obviously manipulating C strings internally, it will certainly
not provide what was intended... (supposing the user knows what he
wants, and this is, I admit, quite an optimistic view).

So what is the established behavior in this case? And, BTW, most utilities
ignore errors with octal sequences (printf(1) for example).
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: printf(1), sh(1), POSIX.2 and octal escape sequences

2023-06-30 Thread David Holland
On Wed, Jun 28, 2023 at 06:32:10PM +0200, tlaro...@polynum.com wrote:
 > > If you want to write a two digit octal number you cannot continue with
 > > another octal digit. In C you could do "...\77" "7" and have it concat
 > > the literals. In config files (without concatenation) you need some
 > > other trick.
 > 
 > I beg to differ: since due to this very unfortunate "variable length"
 > feature, your scanner has to read char by char, it can reject the third
 > digit since it would yield an out of range byte value.

The behavior of escapes in C strings is widely used and well
understood. Don't improvise.

There are such things as invalid inputs. Reject them with a reasonable
diagnostic message instead of trying to guess what the user might have
meant. Works out much better in the long run.

-- 
David A. Holland
dholl...@netbsd.org


Re: old style tail(1) options and bin/57483

2023-06-30 Thread Robert Elz
Date: Fri, 30 Jun 2023 15:37:02 +0300
From: Valery Ushakov

  | What exactly are we aiming for here?  Do we want to support postfix
  | options in something like old style +qF ?

What we want I will leave for others to determine, but in v7 tail
there was a single (optional) "option" (must be argv[1] or nothing)
in the form +/- [NNN] [bclr]

That is, to be treated as this option it had to start with either '+' or '-'.
If one of those was the first char, then came an optional string of digits,
followed by one of 'b' 'c' 'l' or 'r' (or nothing at all).   Anything else
as the terminating char there was an error.  Anything following
that character (if present) was ignored.   If its first char was not '+'
or '-' argv[1] was simply ignored.

For 'r' a digit string (or missing one) with value 0 was treated as as
many lines as fit in the final 4KB (or something like that), and the lines
were printed "backwards" (last line first) - all other variants printed the
output in the order it appeared in the file.   For all other cases, a
digit string of 0 (or missing) just meant 0.   'b' multiplied that number
by 512 (ie: blocks) and then treated it as characters (like 'c').
'c' did nothing, the number was simply a char count, l (and r) caused the
number to mean lines.   Leading '+' skipped that much from the start of
the file, leading '-' started that much before the end of the file
(with caveats as to how much was possible).

If this option was missing, the default was "-10l".   (print the final 10
lines of the file).
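
The v7 syntax described above can be sketched in C roughly as follows.
This is an illustration, not the historical source: the struct, the
function name and the return convention are hypothetical, a missing unit
letter is assumed to mean lines, and overflow of the digit string is not
checked.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result type; the field names are illustrative only. */
struct v7_tail_opt {
	bool from_start;	/* '+': from start of file; '-': from end */
	bool by_lines;		/* 'l' or 'r' (and, by assumption, no letter) */
	bool reverse;		/* 'r': print last line first */
	long count;		/* lines or chars; 'b' already scaled by 512 */
};

/*
 * Parse what would be argv[1].  Returns 1 if it was an option, 0 if it
 * is absent or does not start with '+' or '-' (out keeps the "-10l"
 * default), and -1 if the terminating character is not one of "bclr".
 */
int
v7_parse_tail_opt(const char *arg, struct v7_tail_opt *out)
{
	assert(out != NULL);

	/* Default when the option is missing: "-10l". */
	out->from_start = false;
	out->by_lines = true;
	out->reverse = false;
	out->count = 10;

	if (arg == NULL || (arg[0] != '+' && arg[0] != '-'))
		return 0;

	out->from_start = (arg[0] == '+');

	const char *p = arg + 1;
	long n = 0;			/* a missing digit string means 0 */
	while (isdigit((unsigned char)*p)) {
		n = n * 10 + (*p - '0');
		p++;
	}

	switch (*p) {			/* anything after this char is ignored */
	case 'b':
		out->by_lines = false;
		n *= 512;		/* blocks, then treated as characters */
		break;
	case 'c':
		out->by_lines = false;
		break;
	case 'r':
		out->by_lines = true;
		out->reverse = true;	/* count 0: caller picks "what fits" */
		break;
	case 'l':
	case '\0':
		out->by_lines = true;
		break;
	default:
		return -1;
	}
	out->count = n;
	return 1;
}
```

For example, "+2b" would come back as from_start set with a count of
1024 characters, and "file" would return 0 with the "-10l" default left
in place, which matches the "simply ignored" behaviour for an argv[1]
that is not an option.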

Only one file was handled (and as usual, if no file arg was given, stdin
was used).   But somewhat bizarrely, the file arg had to be argv[2]: if
argv[1] was present but didn't start with a '+' or '-', it was simply
ignored (so "tail file" would read stdin, not file).

Supporting all of that certainly seems pointless, if not impossible.
Doing what tail's "obsolete()" function does, and looking for this
form of option, anywhere in the arg list, seems to simply be wrong.

The earliest CSRG version that is in the SCCS files, is from 1980,
and so probably for BSD 4.0 and already had added the 'f' option, and
allowed the 'b' 'c' 'l' and 'r' chars (and 'f') to follow the number
in any order, and any number of them (though obviously, some combinations
made no sense).   It still required the single file arg to be argv[2]
if present - that remained until what is close to our current version
arrived, the checkin log message says:
    new version from scratch; POSIX 1003.2 version
in July 1991 (which means that it would only have been in one or more
of the 4.4 (semi) releases I think, ie: not 4.3).

Until the new version appeared, with POSIX style options, there was still
just a single optional "option" and a single optional "file".

kre



old style tail(1) options and bin/57483

2023-06-30 Thread Valery Ushakov
bin/57483 reports that tail(1) doesn't correctly handle old style
options in all cases.  The current approach taken by tail is to
massage the command line to convert old style options into the new
style options and then use getopt to parse only the new style.
Unfortunately the code that does the conversion is a bit naive, so it
doesn't notice that in -fn +20 the "+20" is not a standalone old style
option but an argument to the new style -n hidden/fused into "-fn".
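
To illustrate the check the conversion pass is missing, here is a
minimal sketch in C.  It is not the code in tail.c: the function name is
hypothetical, and the getopt-style option string used in the example
("Fb:c:fn:r") is an assumption about which flags take arguments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: true if arg is a new style option cluster whose
 * last flag takes an argument that is not attached, so the NEXT argv
 * element belongs to it and must not be rewritten as an old style
 * option.  optstring uses getopt(3) syntax.
 */
bool
cluster_wants_arg(const char *arg, const char *optstring)
{
	assert(arg != NULL && optstring != NULL);

	if (arg[0] != '-' || arg[1] == '\0' || strcmp(arg, "--") == 0)
		return false;

	for (const char *p = arg + 1; *p != '\0'; p++) {
		const char *o = (*p == ':') ? NULL : strchr(optstring, *p);

		if (o == NULL)
			return false;	/* not a known flag, e.g. "-20" */
		if (o[1] == ':')
			/* rest of the cluster, if any, is the argument */
			return p[1] == '\0';
	}
	return false;			/* only flags without arguments */
}
```

A conversion loop could call this on each element and skip the following
one when it returns true; with the assumed option string, "-fn" returns
true (so "+20" is left alone), while "-f", "-n20" and "-20" return false.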

I sketched a prototype to parse both new and old style options
together but the comments about incompatibilities with historic
behavior give me pause.  Cf.

  https://anonhg.netbsd.org/src/file/tip/usr.bin/tail/tail.c#l76

The man page seems to be completely silent about the old style
options.

What exactly are we aiming for here?  Do we want to support postfix
options in something like old style +qF ?

-uwe