On Sun, 19 Nov 2017 21:49:43 +0100, Theo Buehler wrote:
> Since I am not aware of any further objections and most responses seemed
> positive, I think it's time to move forward.
>
> For your convenience, I attached the diff again below.
OK millert@
- todd
Since I am not aware of any further objections and most responses seemed
positive, I think it's time to move forward.
For your convenience, I attached the diff again below.
ok?
Index: lib/libc/stdio/vfprintf.c
===
RCS file: /var/cvs
Hi Theo,
Theo de Raadt wrote on Fri, Nov 17, 2017 at 11:59:47AM -0700:
> how should this work and what would be the best direction
The following two aspects provide no clear guidance as to which is better:
1. Both printing invalid bytes (in particular to terminals)
and losing information that was
> So C99 explicitly requires failure *for encoding errors* and
> explicitly requires multibyte encoding for the format string.
> So it appears that *everybody* (except us) is in blatant violation
> of C99.
>
> To hell with multibyte characters! How on earth do so many dragons
> fit into such a sm
> Todd's research revealed that jtc@ got the information from the
> C standard in 1995, so i just checked what C89 (sic!) says:
>
> 4.9.6.1 The fprintf function
> [...]
> The format shall be a multibyte character sequence, beginning and
> ending in its initial shift state. The format is c
Hi Theo,
Theo de Raadt wrote on Fri, Nov 17, 2017 at 10:43:10AM -0700:
> Ingo Schwarze wrote:
>> I don't think, though, that the commit message should advertise
>> this as a performance improvement. It should be called an intentional
>> change of behaviour, now using the format string as a byte
> I don't think, though, that the commit message should advertise
> this as a performance improvement. It should be called an intentional
> change of behaviour, now using the format string as a byte string
> like everyone else, no matter whether POSIX explicitly specifies
> it as a character strin
On Fri, 17 Nov 2017 10:20:49 -0700, "Todd C. Miller" wrote:
> I've done a brief survey using the test program at the end of
> this message. Here are the results:
Here's the missing test program. It compares how mbrtowc() and
snprintf() treat an invalid UTF-8 sequence. I chose a simple one.
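
The program itself is cut off in this archive view. What follows is
not Todd's program, only a minimal sketch of the kind of comparison he
describes. The helper names, the buffer size, and the locale and input
passed to compare() are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: 1 if mbrtowc() reports an encoding error for s. */
static int
mbrtowc_rejects(const char *s)
{
	mbstate_t mbs;
	wchar_t wc;

	assert(s != NULL);
	memset(&mbs, 0, sizeof(mbs));	/* start in the initial shift state */
	return mbrtowc(&wc, s, strlen(s), &mbs) == (size_t)-1;
}

/*
 * Hypothetical helper: use s as the format string and return what
 * snprintf() returns.  s must not contain conversions that need
 * arguments.  If err is not NULL, errno is stored there; it is only
 * meaningful when the return value is negative.
 */
static int
snprintf_result(const char *s, int *err)
{
	char buf[16];
	int r;

	assert(s != NULL);
	errno = 0;
	r = snprintf(buf, sizeof(buf), s);
	if (err != NULL)
		*err = errno;
	return r;
}

/* Print both results for s under the given LC_CTYPE locale. */
static void
compare(const char *locale, const char *s)
{
	int err, r;

	if (setlocale(LC_CTYPE, locale) == NULL) {
		printf("locale %s not available\n", locale);
		return;
	}
	printf("mbrtowc rejects input: %d\n", mbrtowc_rejects(s));
	r = snprintf_result(s, &err);
	printf("snprintf returns: %d\n", r);
	if (r < 0)
		printf("errno: %s\n", strerror(err));
}
```

Calling, for example, compare("en_US.UTF-8", "\xff") would exercise an
input that is not valid UTF-8 (the byte 0xff never occurs in UTF-8).
The locale name is an assumption and may not exist on a given system.
No expected output is given here on purpose, since whether snprintf()
fails on such a format string is exactly what differs between
implementations in this thread.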
-
Ingo Schwarze wrote on Fri, Nov 17, 2017 at 03:07:48PM +0100:
[ regarding cases where this may matter in practice ]
> (2) Programs legitimately calling *printf() with a variable format
> string in any non-POSIX locale, even if it's just UTF-8.
Whoa. I just realized there is a very widespre
> On Thu, 16 Nov 2017 11:27:45 -0700, "Theo de Raadt" wrote:
>
> > Yes, I already proposed that someone made a mistake a while ago.
>
> This was added in NetBSD in 1995:
>
>
> revision 1.17
> date: 1995/05/02 19:52:41; author: jtc; state: Exp; lines: +15 -8;
> The
On Thu, 16 Nov 2017 11:27:45 -0700, "Theo de Raadt" wrote:
> Yes, I already proposed that someone made a mistake a while ago.
This was added in NetBSD in 1995:
revision 1.17
date: 1995/05/02 19:52:41; author: jtc; state: Exp; lines: +15 -8;
The C Standard says tha
> Quick answer, more later:
>
> Theo de Raadt wrote on Thu, Nov 16, 2017 at 09:52:39AM -0700:
> > Todd Miller wrote:
>
> >> Also, POSIX isn't explicit as to whether that restriction applies
> >> to the format string or just the arguments to %lc and %ls conversions.
> >>
> >> What it does say is:
Hi Theo,
Quick answer, more later:
Theo de Raadt wrote on Thu, Nov 16, 2017 at 09:52:39AM -0700:
> Todd Miller wrote:
>> Also, POSIX isn't explicit as to whether that restriction applies
>> to the format string or just the arguments to %lc and %ls conversions.
>>
>> What it does say is:
>>
>>
On Thu, 16 Nov 2017 09:52:39 -0700, "Theo de Raadt" wrote:
> > Also, POSIX isn't explicit as to whether that restriction applies
> > to the format string or just the arguments to %lc and %ls conversions.
> >
> > What it does say is:
> >
> > The format is composed of zero or more directives:
> Also, POSIX isn't explicit as to whether that restriction applies
> to the format string or just the arguments to %lc and %ls conversions.
>
> What it does say is:
>
> The format is composed of zero or more directives: ordinary
> characters, which are simply copied to the output stream,
On Thu, 16 Nov 2017 16:19:52 +0100, Stefan Sperling wrote:
> I would expect EILSEQ during %lc and %ls conversions which explicitly
> expect wide characters as arguments, but not for arbitrary data that
> happens to be part of the format string.
It is worth noting that this restriction is a POSIX
On Thu, Nov 16, 2017 at 09:57:06AM -0500, Ted Unangst wrote:
> Ingo Schwarze wrote:
> > [EILSEQ]
> > A wide-character code that does not correspond to a valid
> > character has been detected.
> >
> > That means that the functions are *required* to fail ("shall fail")
> > if encodin
Ingo Schwarze wrote:
> [EILSEQ]
> A wide-character code that does not correspond to a valid
> character has been detected.
>
> That means that the functions are *required* to fail ("shall fail")
> if encoding errors can be detected, that -1 must be returned, and
> that errno must b
Hi Todd,
Todd C. Miller wrote on Tue, Nov 14, 2017 at 09:09:13AM -0700:
> On Tue, 14 Nov 2017 09:26:47 +0100, Theo Buehler wrote:
>> If we only support UTF-8 and ASCII, we do not need complicated multibyte
>> decoding to recognize a '%' in the format string.
>>
>> In his commit message, enh clai
Hi Theo,
it's bad that i slacked on this, causing people to spend so much effort,
but i failed to make up my mind at first. I think now i see clearly.
Theo Buehler wrote on Tue, Nov 14, 2017 at 09:26:47AM +0100:
> There is a simplification and optimization for __vfprintf()
> from android pointed
On Tue, 14 Nov 2017 09:26:47 +0100, Theo Buehler wrote:
> If we only support UTF-8 and ASCII, we do not need complicated multibyte
> decoding to recognize a '%' in the format string.
>
> In his commit message, enh claims that there is a 10x speedup. In my own
> benchmarking on amd64, a speedup be
There is a simplification and optimization for __vfprintf() from android
pointed out by tedu:
https://github.com/aosp-mirror/platform_bionic/commit/5305a4d4a723b06494b93f2df81733b83a0c46d3
If we only support UTF-8 and ASCII, we do not need complicated multibyte
decoding to recognize a '%' in the
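
The point about not needing multibyte decoding to find a '%' can be
sketched as below. This is not the code from the diff or from the
bionic commit; literal_run() and count_percent() are hypothetical
names used only for illustration. The sketch relies on one property of
UTF-8: every byte of a multibyte sequence has its high bit set, so the
byte 0x25 can only ever be an actual percent sign.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: length of the run of ordinary bytes at the
 * start of fmt, i.e. the offset of the first '%' or of the
 * terminating NUL.  The scan is purely byte-wise; mbrtowc() is never
 * called, so invalid byte sequences are passed over like any others.
 */
static size_t
literal_run(const char *fmt)
{
	assert(fmt != NULL);
	return strcspn(fmt, "%");
}

/* Count the '%' bytes in fmt, walking it one literal run at a time. */
static size_t
count_percent(const char *fmt)
{
	size_t n = 0;

	for (;;) {
		fmt += literal_run(fmt);	/* skip ordinary bytes */
		if (*fmt == '\0')
			return n;
		n++;				/* found a '%' */
		fmt++;				/* step over it */
	}
}
```

With this approach, a format such as "\xc3\xa4%s" (a two-byte UTF-8
character followed by a conversion) and one such as "\xff\xfe%d"
(two bytes that are not valid UTF-8) are treated the same way: both
have a literal run of two bytes before the '%'. That matches the
"format string as a byte string" behaviour discussed in the thread,
and it also means encoding errors in the format string are no longer
detected.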