Hi Andrey,

Andrey Chernov wrote on Wed, Aug 24, 2016 at 09:15:15PM +0300:
> And if we plan to change original 44lite
> function behavior, all BSD camps should agree at least.
I agree that it is preferable to do such bugfixes in consensus, if
possible; that helps to improve quality and coherence. Your review
already revealed one additional issue that i didn't see at first
(__SERR after malloc(3) failure), and i have now merged that into
OpenBSD as well.
> On 24.08.2016 20:49, Ingo Schwarze wrote:
>> But i'd like to get both getln(3) and getwln(3) fixed, and both of
>> them in a similar way. And for getwln(3), the situation of being
>> able to read some wide characters before running into an (encoding)
>> error is not uncommon at all, and in that case, you noticed yourself
>> that the *next* read will typically not fail again, so the typical
>> loop will treat the encoding error as a newline, put the library
>> into an undefined shift state, and happily go on to read garbage.
> Could you show some code? In my testing fgetwln() fails on next read if
> previously there was partial line with tail EILSEQ. Stdio not advance
> its pointer over the sequence with EILSEQ.
See below for a radically stripped down version of FreeBSD rev(1).
When i revert my fgetwln(3) patch (as you did in FreeBSD) and compile
and run that stripped down rev(1) on OpenBSD, i get this:

   $ export LC_CTYPE=en_US.UTF-8
   $ printf "one\200two\200three" | ./frev
   eno
   owt
   eerht
   frev: Illegal byte sequence

Is there perhaps yet another bug, maybe somewhere in OpenBSD fgetwc(3),
that advances a pointer where it shouldn't? What result do you see
when you run that test program on FreeBSD?
>> Actually, even in getln(3), this can go quite wrong in the form of
>> a race condition - for example, a program running with input connected
>> to a terminal using SIG_IGN on SIGTTIN. Imagine the first read
>> fails with EIO, then the program comes back to the foreground, and
>> by the time of the next read, that read succeeds again. So some
>> data is lost, but the temporary error looks exactly like a newline
>> (unless somebody checks ferror(3) after a successful read, which
>> isn't very reasonable to do).
> I don't know from where you got this newline idea. 1) if those
> functions goes so far to return newline, it means they meet no
> errors at all. 2) If the last character is not newline, it means
> (W)EOF or error.
Sorry, i didn't mean to say that the returned string contains an actual
newline character; it doesn't. What i meant to say is: Without
my patch, when fgetwln(3) runs into an encoding error, it returns
the string up to that point, and the normal loop treats that the
same way as if the input had contained a newline in place of
the encoding error - see the test output above.
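
To spell out the three cases a caller would have to tell apart when a
partial line can be returned on error, here is a minimal sketch. The
names line_end and classify_line() are hypothetical and not part of
any libc; the sketch assumes the caller samples ferror(3) right after
each successful read and passes the result in as stream_error.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* How a chunk handed back by a line reader ended (hypothetical names). */
enum line_end {
	END_NEWLINE,	/* complete line, terminated by L'\n' */
	END_EOF,	/* last line of the input, no trailing newline */
	END_ERROR	/* partial line, cut short by a read or encoding error */
};

/*
 * Classify a chunk of len wide characters.
 * stream_error is the value of ferror(3) sampled by the caller
 * immediately after the read that returned the chunk.
 */
enum line_end
classify_line(const wchar_t *line, size_t len, int stream_error)
{
	assert(line != NULL && len > 0);

	/* A trailing newline means the reader met no error on this line. */
	if (line[len - 1] == L'\n')
		return END_NEWLINE;

	/* No newline: only the error indicator separates EOF from failure. */
	return stream_error ? END_ERROR : END_EOF;
}
```

In a loop like the one in frev.c, that would mean calling
classify_line(beg, sz, ferror(stdin)) after each successful read and
leaving the loop on END_ERROR, which is exactly the extra check that
typical callers do not make.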
Thanks,
Ingo
$ cat frev.c
/*
 * Copyright (c) 1987, 1992, 1993
 *	The Regents of the University of California. All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 4. Neither the name of the University nor the names of its contributors
 *    may be used to endorse or promote products derived from this software
 *    without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 */
#include <sys/types.h>
#include <err.h>
#include <locale.h>
#include <stdio.h>
#include <wchar.h>

int
main(void)
{
	wchar_t *beg, *end, *back;
	size_t sz;

	setlocale(LC_ALL, "");

	/* Loop on lines. */
	while ((beg = fgetwln(stdin, &sz)) != NULL) {
		end = beg + sz - 1;
		if (*end == L'\n')
			--end;
		/* Backward loop on characters. */
		for (back = end; back >= beg; --back)
			putwchar(*back);
		putwchar(L'\n');
	}
	if (ferror(stdin))
		err(1, NULL);
	return 0;
}