On Mon, Sep 08, 2014 at 02:49:46PM +0100, Steven Chamberlain wrote: > retitle 760841 pspp: random testsuite failures > thanks > > Affects GNU/Linux too! It's odd the issue has only been seen yet on > kfreebsd rather than linux buildds; I reproduced it on linux-amd64, and > with similar ~2% probability: > > > ~/pspp-0.8.3$ for i in $(seq 1 1000) ; do printf > > 'e\0n\0t\0r\0\351\0e\0\n\0' | tests/libpspp/u8-istream-test read - Auto | > > xxd ; done | sort | uniq -c | sort -bn > > 21 0000000: 6500 6e00 7400 7200 c3a9 0065 000a 00 e.n.t.r....e... > > 979 0000000: 656e 7472 c3a9 650a entr..e. > > For sanity, I've checked that printf is working correctly: > > > ~/pspp-0.8.3$ for i in $(seq 1 1000) ; do printf > > 'e\0n\0t\0r\0\351\0e\0\n\0' | xxd ; done | sort | uniq -c | sort -bn > > 1000 0000000: 6500 6e00 7400 7200 e900 6500 0a00 e.n.t.r...e... > > So it may be related to how the tests/libpspp/u8-istream-test program > does its reads from stdin? I'm also curious why the read is truncated > only sometimes - possibly a glibc issue there?
Thanks for the analysis. This is a PSPP bug in the library that u8-istream-test uses. I pushed the following fix to the PSPP upstream repository. Friedrich, do you want to apply that to the PSPP packaging? I'll sponsor the upload for you. Thanks, Ben. --8<--------------------------cut here-------------------------->8-- From: Ben Pfaff <b...@cs.stanford.edu> Date: Tue, 9 Sep 2014 08:51:44 -0700 Subject: [PATCH] u8-istream: Fix handling of partial reads. The u8-istream code did not retry upon a partial read, assuming that that was the end of the file. When the partial read was shorter than ENCODING_GUESS_MIN, this could cause the encoding guesser, in turn, to guess the wrong encoding (especially if the encoding was really UTF-16 and the partial read was an odd number of bytes). Reported at https://bugs.debian.org/760841. Reported by Friedrich Beckmann and Steven Chamberlain. --- src/libpspp/u8-istream.c | 20 ++++++++++++++------ tests/libpspp/u8-istream.at | 4 +++- 2 files changed, 17 insertions(+), 7 deletions(-) diff --git a/src/libpspp/u8-istream.c b/src/libpspp/u8-istream.c index 2486ca1..6347a67 100644 --- a/src/libpspp/u8-istream.c +++ b/src/libpspp/u8-istream.c @@ -1,5 +1,5 @@ /* PSPP - a program for statistical analysis. - Copyright (C) 2010, 2011, 2012, 2013 Free Software Foundation, Inc. + Copyright (C) 2010, 2011, 2012, 2013, 2014 Free Software Foundation, Inc. This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by @@ -203,14 +203,22 @@ fill_buffer (struct u8_istream *is) is->head = is->buffer; /* Read more input. */ + n = 0; do { - n = read (is->fd, is->buffer + is->length, - U8_ISTREAM_BUFFER_SIZE - is->length); + ssize_t retval = read (is->fd, is->buffer + is->length, + U8_ISTREAM_BUFFER_SIZE - is->length); + if (retval > 0) + { + n += retval; + is->length += retval; + } + else if (retval == 0) + return n; + else if (errno != EINTR) + return n > 0 ? n : -1; } - while (n < 0 && errno == EINTR); - if (n > 0) - is->length += n; + while (is->length < U8_ISTREAM_BUFFER_SIZE); return n; } diff --git a/tests/libpspp/u8-istream.at b/tests/libpspp/u8-istream.at index 24af08c..5b5b4e0 100644 --- a/tests/libpspp/u8-istream.at +++ b/tests/libpspp/u8-istream.at @@ -140,7 +140,9 @@ AT_CLEANUP AT_SETUP([read UTF-16 as Auto]) AT_KEYWORDS([u8_istream]) AT_CHECK([i18n-test supports_encodings UTF-16 UTF-16BE UTF-16LE]) -AT_CHECK([printf '\0e\0n\0t\0r\0\351\0e\0\n' | u8-istream-test read - Auto], +dnl The "sleep 1" checks for a bug in which u8-istream did not properly +dnl handle receiving data in multiple chunks. +AT_CHECK([{ printf '\0e\0n\0t\0'; sleep 1; printf 'r\0\351\0e\0\n'; } | u8-istream-test read - Auto], [0], [entr??e ]) AT_CHECK([printf 'e\0n\0t\0r\0\351\0e\0\n\0' | u8-istream-test read - Auto], -- 1.9.1 -- To UNSUBSCRIBE, email to debian-bsd-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: https://lists.debian.org/20140909155353.ga30...@blp.benpfaff.org