Re: faster printf

2017-11-20 Thread Todd C. Miller
On Sun, 19 Nov 2017 21:49:43 +0100, Theo Buehler wrote:

> Since I am not aware of any further objections and most responses seemed
> positive, I think it's time to move forward.
>
> For your convenience, I attached the diff again below.

OK millert@

 - todd

Re: faster printf

2017-11-19 Thread Theo Buehler
Since I am not aware of any further objections and most responses seemed
positive, I think it's time to move forward.

For your convenience, I attached the diff again below.

ok?

Index: lib/libc/stdio/vfprintf.c
===
RCS file: /var/cvs

Re: faster printf

2017-11-18 Thread Ingo Schwarze
Hi Theo,

Theo de Raadt wrote on Fri, Nov 17, 2017 at 11:59:47AM -0700:

> how should this work and what would be the best direction

The following two aspects provide no clear guidance what is better:

1. Both printing invalid bytes (in particular to terminals) and losing information that was

Re: faster printf

2017-11-17 Thread Theo de Raadt
> So C99 explicitly requires failure *for encoding errors* and
> explicitly requires multibyte encoding for the format string.
> So it appears that *everybody* (except us) is in blatant violation
> of C99.
>
> To hell with multibyte characters! How on earth do so many dragons
> fit into such a sm

Re: faster printf

2017-11-17 Thread Theo de Raadt
> Todd's research revealed that jtc@ got the information from the
> C standard in 1995, so i just checked what C89 (sic!) says:
>
> 4.9.6.1 The fprintf function
> [...]
> The format shall be a multibyte character sequence, beginning and
> ending in its initial shift state. The format is c

Re: faster printf

2017-11-17 Thread Ingo Schwarze
Hi Theo,

Theo de Raadt wrote on Fri, Nov 17, 2017 at 10:43:10AM -0700:
> Ingo Schwarze wrote:
>> I don't think, though, that the commit message should advertise
>> this as a performance improvement. It should be called an intentional
>> change of behaviour, now using the format string as a byte

Re: faster printf

2017-11-17 Thread Theo de Raadt
> I don't think, though, that the commit message should advertise
> this as a performance improvement. It should be called an intentional
> change of behaviour, now using the format string as a byte string
> like everyone else, no matter whether POSIX explicitly specifies
> it as a character strin

Re: faster printf

2017-11-17 Thread Todd C. Miller
On Fri, 17 Nov 2017 10:20:49 -0700, "Todd C. Miller" wrote:

> I've done a brief survey using the test program at the end of
> this message. Here are the results:

Here's the missing test program. It compares how mbrtowc() and
snprintf() treat an invalid UTF-8 sequence. I chose a simple one.

 -

Re: faster printf

2017-11-17 Thread Ingo Schwarze
Ingo Schwarze wrote on Fri, Nov 17, 2017 at 03:07:48PM +0100:

[ regarding cases where this may matter in practice ]

> (2) Programs legitimately calling *printf() with a variable format
> string in any non-POSIX locale, even if it's just UTF-8.

Whoa. I just realized there is a very widespre

Re: faster printf

2017-11-17 Thread Theo de Raadt
> On Thu, 16 Nov 2017 11:27:45 -0700, "Theo de Raadt" wrote:
>
> > Yes, I already proposed that someone made a mistake a while ago.
>
> This was added in NetBSD in 1995:
>
>
> revision 1.17
> date: 1995/05/02 19:52:41; author: jtc; state: Exp; lines: +15 -8;
> The

Re: faster printf

2017-11-17 Thread Todd C. Miller
On Thu, 16 Nov 2017 11:27:45 -0700, "Theo de Raadt" wrote:

> Yes, I already proposed that someone made a mistake a while ago.

This was added in NetBSD in 1995:

revision 1.17
date: 1995/05/02 19:52:41; author: jtc; state: Exp; lines: +15 -8;
The C Standard says tha

Re: faster printf

2017-11-16 Thread Theo de Raadt
> Quick answer, more later:
>
> Theo de Raadt wrote on Thu, Nov 16, 2017 at 09:52:39AM -0700:
> > Todd Miller wrote:
>
> >> Also, POSIX isn't explicit as to whether that restriction applies
> >> to the format string or just the arguments to %lc and %ls conversions.
> >>
> >> What it does say is:

Re: faster printf

2017-11-16 Thread Ingo Schwarze
Hi Theo,

Quick answer, more later:

Theo de Raadt wrote on Thu, Nov 16, 2017 at 09:52:39AM -0700:
> Todd Miller wrote:
>> Also, POSIX isn't explicit as to whether that restriction applies
>> to the format string or just the arguments to %lc and %ls conversions.
>>
>> What it does say is:
>>
>>

Re: faster printf

2017-11-16 Thread Todd C. Miller
On Thu, 16 Nov 2017 09:52:39 -0700, "Theo de Raadt" wrote:

> > Also, POSIX isn't explicit as to whether that restriction applies
> > to the format string or just the arguments to %lc and %ls conversions.
> >
> > What it does say is:
> >
> > The format is composed of zero or more directives:

Re: faster printf

2017-11-16 Thread Theo de Raadt
> Also, POSIX isn't explicit as to whether that restriction applies
> to the format string or just the arguments to %lc and %ls conversions.
>
> What it does say is:
>
> The format is composed of zero or more directives: ordinary
> characters, which are simply copied to the output stream,

Re: faster printf

2017-11-16 Thread Todd C. Miller
On Thu, 16 Nov 2017 16:19:52 +0100, Stefan Sperling wrote:

> I would expect EILSEQ during %lc and %ls conversions which explicitly
> expect wide characters as arguments, but not for arbitrary data that
> happens to be part of the format string.

It is worth noting that this restriction is a POSIX
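[ Illustration, not part of the original message. ] A minimal sketch of the
case Stefan describes, where a %ls conversion is the place an encoding
error would surface. The helper name format_wide() and its interface are
assumptions made for this example; only snprintf(), errno and EILSEQ come
from the standards under discussion. Whether a given wide string actually
fails depends on the current locale and on the libc.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <wchar.h>

/*
 * Hypothetical helper (not from the thread): format a wide string with
 * %ls and report whether the call failed with an encoding error.
 * Returns snprintf()'s result; *ilseq is set to 1 only when the call
 * returned a negative value and errno is EILSEQ, otherwise to 0.
 */
static int
format_wide(char *buf, size_t size, const wchar_t *ws, int *ilseq)
{
	int ret;

	assert(buf != NULL && ws != NULL && ilseq != NULL);

	errno = 0;
	/* %ls converts each wide character to the locale's multibyte form. */
	ret = snprintf(buf, size, "%ls", ws);
	*ilseq = (ret < 0 && errno == EILSEQ);
	return ret;
}
```

A caller would check the return value first and look at the EILSEQ flag
only when it is negative. Pure ASCII wide strings convert in any locale,
so they never take the error path.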

Re: faster printf

2017-11-16 Thread Stefan Sperling
On Thu, Nov 16, 2017 at 09:57:06AM -0500, Ted Unangst wrote:
> Ingo Schwarze wrote:
> > [EILSEQ]
> > A wide-character code that does not correspond to a valid
> > character has been detected.
> >
> > That means that the functions are *required* to fail ("shall fail")
> > if encodin

Re: faster printf

2017-11-16 Thread Ted Unangst
Ingo Schwarze wrote:
> [EILSEQ]
> A wide-character code that does not correspond to a valid
> character has been detected.
>
> That means that the functions are *required* to fail ("shall fail")
> if encoding errors can be detected, that -1 must be returned, and
> that errno must b

Re: faster printf

2017-11-15 Thread Ingo Schwarze
Hi Todd,

Todd C. Miller wrote on Tue, Nov 14, 2017 at 09:09:13AM -0700:
> On Tue, 14 Nov 2017 09:26:47 +0100, Theo Buehler wrote:
>> If we only support UTF-8 and ASCII, we do not need complicated multibyte
>> decoding to recognize a '%' in the format string.
>>
>> In his commit message, enh clai

Re: faster printf

2017-11-15 Thread Ingo Schwarze
Hi Theo,

it's bad that i slacked on this, causing people to spend so much
effort, but i failed to make up my mind at first. I think now i see
clearly.

Theo Buehler wrote on Tue, Nov 14, 2017 at 09:26:47AM +0100:

> There is a simplification and optimization for __vfprintf()
> from android pointed

Re: faster printf

2017-11-14 Thread Todd C. Miller
On Tue, 14 Nov 2017 09:26:47 +0100, Theo Buehler wrote:

> If we only support UTF-8 and ASCII, we do not need complicated multibyte
> decoding to recognize a '%' in the format string.
>
> In his commit message, enh claims that there is a 10x speedup. In my own
> benchmarking on amd64, a speedup be

faster printf

2017-11-14 Thread Theo Buehler
There is a simplification and optimization for __vfprintf() from android
pointed out by tedu:

https://github.com/aosp-mirror/platform_bionic/commit/5305a4d4a723b06494b93f2df81733b83a0c46d3

If we only support UTF-8 and ASCII, we do not need complicated multibyte
decoding to recognize a '%' in the
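[ Illustration, not part of the original message. ] A minimal sketch of why
a plain byte scan is enough to find '%' when the only encodings are ASCII
and UTF-8: every byte of a multibyte UTF-8 sequence has its high bit set,
so the byte 0x25 can only ever be a literal '%'. The helper name
literal_run_len() is an assumption made for this example; it is not the
code from the diff or from the bionic commit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the diff): return the number of bytes
 * at the start of fmt that precede the first '%', or the full length
 * of fmt if it contains no '%'.  No multibyte decoding is done; the
 * format is treated as a byte string.
 */
static size_t
literal_run_len(const char *fmt)
{
	assert(fmt != NULL);

	/* strcspn() stops at the first '%' or at the terminating NUL. */
	return strcspn(fmt, "%");
}
```

For the format "abc%d" this yields 3, and for a format that starts with
the two-byte UTF-8 encoding of U+00E9 followed by "%s" it yields 2. Bytes
that are not valid UTF-8 are simply counted as part of the literal run,
which is the behaviour change debated in the replies.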