Re: old style tail(1) options and bin/57483
>> Do we want to support postfix options in something like old style
>> +qF ?

Personally, I curse every time I run into a tail that doesn't support
"tail +0f" or "tail -f". I think that and "tail -%d" are the only
forms I use enough for it to be any kind of issue for me for them to
change.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
Re: old style tail(1) options and bin/57483
On Fri, 30 Jun 2023, Valery Ushakov wrote:

> The man page seems to be completely silent about the old style
> options. What exactly are we aiming for here? Do we want to support
> postfix options in something like old style +qF ?

It would be nice to retain `[+-]N' as a shortcut for `-n [+-]N'.
You can ditch every other compat as far as I'm concerned. You can even
remove obsolete() entirely and I wouldn't shed any tears.

-RVP
Re: printf(1), sh(1), POSIX.2 and octal escape sequences
KRE> It depends upon the usage. But if you're processing escapes, you
KRE> need to also process \\ to mean a literal '\' of course, [...]

Not necessarily -- '\134' would be good enough :-)

Just joking, of course. The weekend is nigh.

Martin Neitzel
Trivial program size inflation
Based on something at work, I was looking at executable sizes. I
eventually tried a program stripped about as far down as I could:

    int main(void);
    int main(void)
    {
     return(0);
    }

and built it -static. size on the resulting binary:

sparc, my mutant 1.4T:

   text    data     bss     dec     hex filename
  12616     124     288   13028    32e4 main

amd64, my mutant 5.2:

   text    data     bss     dec     hex filename
 152613    4416   16792  173821   2a6fd main

amd64, 9.0_STABLE (ftp.n.o):

   text    data     bss     dec     hex filename
 562318   29064 2176416 2767798  2a3bb6 main

12K to do nothing is bad enough (I'm going to be looking at why it's
that big). 149K is even more disturbing (I'll be looking at that too).
But over half a meg of text and two megs of BSS? To do nothing? Surely
something is wrong somewhere.

Not that NetBSD is alone in this. On an Ubuntu machine at work, I see

   text    data     bss     dec     hex filename
 761750   20804    6016  788570   c085a main

but I hardly think Ubuntu's sins are relevant to NetBSD. :-)

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
Re: printf(1), sh(1), POSIX.2 and octal escape sequences
    Date:        Fri, 30 Jun 2023 17:51:13 +0200
    From:        tlaro...@polynum.com
    Message-ID:

  | So what is established behavior in this case

It depends upon the usage. But if you're processing escapes, you
need to also process \\ to mean a literal '\' of course, and once you
have that, if the user wants to pass the string \000 to some application,
they can simply write it as \\000 -- there's no need to assume that \000
as input must have been meant to be \000 as output just because inserting
a literal '\0' is stupid.

As to what to actually do if someone does write \000 (or \0 with 1, 2 or
3 0's), that's kind of up to you. You can do what the user said, insert a
'\0', and by so doing terminate the input at that point, or you can simply
throw it away, inserting nothing for that sequence, or generate an error
if you want. Only someone idiotic enough to actually write \000 in their
config file is going to notice.

  | ---and, BTW, most utilities
  | ignore errors with octal sequences (printf(1) for example).

Historic practice, it is just what always has been done.

kre
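The choices described above can be sketched in C. This is only an illustration, not code from inetd(8), printf(1) or sh(1): the function name unescape() and its return convention are hypothetical. It assumes the C rule of one to three octal digits per escape, and of the three possible policies for an escaped NUL it picks "generate an error".

```c
#include <stddef.h>

/*
 * Hypothetical helper: decode `\\` and `\ooo` (one to three octal
 * digits) from `in` into `out`, which holds `outsz` bytes.
 * Returns the number of bytes written (not counting the terminator),
 * or -1 on invalid input: an unknown escape, a trailing backslash,
 * a value above 0377, an escaped NUL, or not enough room in `out`.
 */
long
unescape(const char *in, char *out, size_t outsz)
{
	size_t n = 0;

	if (outsz == 0)
		return -1;	/* no room even for the terminator */

	while (*in != '\0') {
		unsigned int val;

		if (*in != '\\') {
			val = (unsigned char)*in++;	/* ordinary character */
		} else if (in[1] == '\\') {
			val = '\\';			/* \\ is a literal backslash */
			in += 2;
		} else if (in[1] >= '0' && in[1] <= '7') {
			int digits = 0;

			val = 0;
			in++;				/* skip the backslash */
			while (digits < 3 && *in >= '0' && *in <= '7') {
				val = val * 8 + (unsigned int)(*in++ - '0');
				digits++;
			}
			if (val > 0377 || val == 0)
				return -1;	/* out of range, or an escaped NUL */
		} else {
			return -1;	/* unknown escape or trailing backslash */
		}
		if (n + 1 >= outsz)
			return -1;	/* need room for this byte and the terminator */
		out[n++] = (char)val;
	}
	out[n] = '\0';
	return (long)n;
}
```

With this sketch, the input \\000 decodes to the four characters \000, the input \134 decodes to a single backslash, and the input \000 is rejected. Switching to the "throw it away" policy would mean skipping the store when val is 0 instead of returning -1.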
Re: printf(1), sh(1), POSIX.2 and octal escape sequences
On Fri, Jun 30, 2023 at 03:37:18PM +, David Holland wrote:
> On Wed, Jun 28, 2023 at 06:32:10PM +0200, tlaro...@polynum.com wrote:
> > > If you want to write a two digit octal number you can not continue with
> > > another octal digit. In C you could do "...\77" "7" and have it concat
> > > the literals. In config files (without concatenation) you need some
> > > other trick.
> >
> > I beg to differ: since due to this very unfortunate "variable length"
> > feature, your scanner has to read char by char, it can reject the third
> > digit since it would yield an out of range byte value.
>
> The behavior of escapes in C strings is widely used and well
> understood. Don't improvise.
>
> There are such things as invalid inputs. Reject them with a reasonable
> diagnostic message instead of trying to guess what the user might have
> meant. Works out much better in the long run.

For this one I will go with the established behavior, but what should
I do when someone passes, in octal or in hexadecimal, "\000" or "\x00"?

I have decided that this value will be put back as an escape sequence
(possibly for an argument of some program), since if the program
"interprets" the escape sequence (as current inetd(8) does) while,
obviously, manipulating C strings internally, it will certainly not
produce what was intended... (supposing the user knows what he wants,
and this is, I admit, quite an optimistic view).

So what is established behavior in this case---and, BTW, most utilities
ignore errors with octal sequences (printf(1) for example).
--
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: printf(1), sh(1), POSIX.2 and octal escape sequences
On Wed, Jun 28, 2023 at 06:32:10PM +0200, tlaro...@polynum.com wrote:
> > If you want to write a two digit octal number you can not continue with
> > another octal digit. In C you could do "...\77" "7" and have it concat
> > the literals. In config files (without concatenation) you need some
> > other trick.
>
> I beg to differ: since due to this very unfortunate "variable length"
> feature, your scanner has to read char by char, it can reject the third
> digit since it would yield an out of range byte value.

The behavior of escapes in C strings is widely used and well
understood. Don't improvise.

There are such things as invalid inputs. Reject them with a reasonable
diagnostic message instead of trying to guess what the user might have
meant. Works out much better in the long run.

--
David A. Holland
dholl...@netbsd.org
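For reference, the C behavior under discussion can be shown in a few lines. This is a minimal sketch with made-up identifier names; it relies only on the standard rules that an octal escape takes at most three digits and that adjacent string literals are concatenated after escapes have been converted.

```c
#include <stddef.h>

/*
 * Two adjacent literals: the escape \77 (the byte 077, '?') ends with
 * the first literal, so the following 7 is an ordinary character.
 */
static const char split_literal[] = "\77" "7";

/*
 * One literal: an octal escape takes at most three digits, so \077 is
 * the byte 077 and the fourth digit is an ordinary '7'.
 */
static const char padded_escape[] = "\0777";

/*
 * By contrast, "\777" is a single escape with value 0777 (511), which
 * does not fit in a byte; compilers diagnose it rather than guessing,
 * so it is left out of this compilable sketch.
 */

/* Length of split_literal, not counting the terminating NUL. */
size_t
split_literal_len(void)
{
	return sizeof split_literal - 1;
}

/* Length of padded_escape, not counting the terminating NUL. */
size_t
padded_escape_len(void)
{
	return sizeof padded_escape - 1;
}
```

Both arrays hold the same two bytes, '?' followed by '7'. Padding the escape to three digits is one trick that works without literal concatenation, provided the config-file parser follows the C three-digit rule.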
Re: old style tail(1) options and bin/57483
    Date:        Fri, 30 Jun 2023 15:37:02 +0300
    From:        Valery Ushakov
    Message-ID:

  | What exactly are we aiming for here? Do we want to support postfix
  | options in something like old style +qF ?

What we want I will leave for others to determine, but in v7 tail there
was a single (optional) "option" (must be argv[1] or nothing) in the form

	+/- [NNN] [bclr]

That is, to be treated as this option it had to start with either '+' or
'-'. If one of those was the first char, then came an optional string of
digits, followed by one of 'b' 'c' 'l' or 'r' (or nothing at all). Anything
else as the terminating char there was an error. Anything following that
character (if present) was ignored. If its first char was not '+' or '-'
argv[1] was simply ignored.

For 'r' a digit string (or missing one) with value 0 was treated as
as many lines as fit in the final 4KB (or something like that), and the
lines were printed "backwards" (last line first) - all other variants
printed the output in the order it appeared in the file.

For all other cases, a digit string of 0 (or missing) just meant 0.

'b' multiplied that number by 512 (ie: blocks) and then treated it as
characters (like 'c'). 'c' did nothing, the number was simply a char count,
l (and r) caused the number to mean lines.

Leading '+' skipped that much from the start of the file, leading '-'
started that much before the end of the file (with caveats as to how
much was possible).

If this option was missing, the default was "-10l" (print the final
10 lines of the file).

Only one file was handled (and as usual, if no file arg was given, stdin
was used). But somewhat bizarrely, the file arg had to be argv[2], if
argv[1] was present, but didn't start with a '+' or '-' it was simply
ignored (so "tail file" would read stdin, not file).

Supporting all of that certainly seems pointless, if not impossible.

Doing what tail's "obsolete()" function does, and looking for this form
of option, anywhere in the arg list, seems to simply be wrong.

The earliest CSRG version that is in the SCCS files, is from 1980, and so
probably for BSD 4.0 and already had added the 'f' option, and allowed
the 'b' 'c' 'l' and 'r' chars (and 'f') to follow the number in any
order, and any number of them (though obviously, some combinations made
no sense). It still required the single file arg to be argv[2] if
present - that remained until what is close to our current version
arrived, the checkin log message says:

	new version from scratch; POSIX 1003.2 version

in July 1991 (which means that it would only have been in one or more of
the 4.4 (semi) releases I think, ie: not 4.3).

Until the new version appeared, with POSIX style options, there was
still just a single optional "option" and a single optional "file".

kre
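The v7 option form described above is small enough to sketch as a parser. The struct and function names below are hypothetical and are not taken from any tail(1) source. The sketch also assumes that a missing unit letter means lines (as the "-10l" default suggests), does not guard against overflow of the count, and leaves the special meaning of a zero count with 'r' to the caller.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical result type for this sketch. */
struct v7opt {
	bool from_start;	/* '+': count from the start; '-': from the end */
	long count;		/* lines or characters, depending on `lines` */
	bool lines;		/* true for 'l' and 'r', false for 'b' and 'c' */
	bool reverse;		/* 'r': print the last line first */
};

/*
 * Parse one v7-style option of the form [+-][NNN][bclr].
 * Returns 1 if `arg` was parsed, 0 if it does not start with '+' or
 * '-' (v7 simply ignored such an argv[1]), and -1 if the character
 * ending the digit string is not one of b, c, l, r or end of string.
 * Anything after the unit letter is ignored, as in v7.
 */
int
parse_v7opt(const char *arg, struct v7opt *o)
{
	const char *p = arg;

	if (*p != '+' && *p != '-')
		return 0;
	o->from_start = (*p++ == '+');
	o->count = 0;		/* a missing digit string just means 0 */
	o->lines = true;	/* assumption: no letter means lines */
	o->reverse = false;

	while (isdigit((unsigned char)*p))
		o->count = o->count * 10 + (*p++ - '0');

	switch (*p) {
	case '\0':
	case 'l':
		break;
	case 'b':
		o->count *= 512;	/* blocks, then treated as characters */
		o->lines = false;
		break;
	case 'c':
		o->lines = false;
		break;
	case 'r':
		o->reverse = true;	/* count 0 had a special meaning in v7 */
		break;
	default:
		return -1;
	}
	return 1;
}
```

Under these assumptions "+2b" yields a start-relative count of 1024 characters, "-r" yields a reversed listing with count 0, and "-5x" is an error. The 4BSD variant, which accepts the letters in any order and adds 'f', would need a loop over the trailing letters instead of a single switch.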
old style tail(1) options and bin/57483
bin/57483 reports that tail(1) doesn't correctly handle old style
options in all cases.

The current approach taken by tail is to massage the command line to
convert old style options into the new style options and then use getopt
to parse only the new style. Unfortunately the code that does the
conversion is a bit naive, so it doesn't notice that in

  -fn +20

the "+20" is not a standalone old style option but an argument to the
new style -n hidden/fused into "-fn".

I sketched a prototype to parse both new and old style options
together but the comments about incompatibilities with historic
behavior give me pause. Cf.

https://anonhg.netbsd.org/src/file/tip/usr.bin/tail/tail.c#l76

The man page seems to be completely silent about the old style
options. What exactly are we aiming for here? Do we want to support
postfix options in something like old style +qF ?

-uwe
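One way a conversion pass could avoid the "-fn +20" mistake is to check whether the previous argument is an option cluster that still needs its argument. The helper below is a hypothetical sketch, not the prototype mentioned above and not code from tail.c. It takes a getopt(3)-style option string as a parameter, and the string "Fb:c:fn:qr" used in the example afterwards is an assumption for illustration only.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `arg` is a new-style option cluster (such as "-fn")
 * whose last option takes an argument that must come from the next
 * argv element.  `optstring` uses getopt(3) syntax: a letter followed
 * by ':' takes an argument.
 */
bool
cluster_wants_arg(const char *arg, const char *optstring)
{
	const char *p, *spec;

	if (arg[0] != '-' || arg[1] == '\0' || strcmp(arg, "--") == 0)
		return false;

	for (p = arg + 1; *p != '\0'; p++) {
		if (*p == ':' || (spec = strchr(optstring, *p)) == NULL)
			return false;	/* not a known option, e.g. old style "-20" */
		if (spec[1] == ':')
			return p[1] == '\0';	/* "-n" wants the next arg; "-n20" has it inline */
	}
	return false;	/* only flag options, e.g. "-f" */
}
```

With the assumed option string, cluster_wants_arg("-fn", "Fb:c:fn:qr") is true, so a following "+20" belongs to -n and must not be rewritten, while "-f", "-fn20" and "+20" all return false. A conversion loop would call this on argv[i - 1] before treating argv[i] as an old style option.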