On 21/09/17 17:23, Jack Howarth wrote:
> On Thu, Sep 21, 2017 at 1:20 AM, Pádraig Brady <p...@draigbrady.com> wrote:
>
>> On 18/09/17 18:07, Jack Howarth wrote:
>>> On Mon, Sep 18, 2017 at 7:40 PM, Jim Meyering <j...@meyering.net> wrote:
>>>
>>>> On Mon, Sep 18, 2017 at 4:26 PM, Jack Howarth
>>>> <howarth.mailing.li...@gmail.com> wrote:
>>>>> On Mon, Sep 18, 2017 at 5:08 PM, Jim Meyering <j...@meyering.net> wrote:
>>>> ...
>>>>>> Is there any chance your failing test was via a python2 framework? I'm
>>>>>> asking (on Pádraig's behalf) because there is a known problem whereby
>>>>>> SIGPIPE is mishandled in that case, and that might explain this
>>>>>> failure, since the data-generation phase relies on SIGPIPE killing
>>>>>> this test's "yes" command.
>>>>>
>>>>> I doubt it as the hang doesn't happen under 10.13 when run on a JHFS
>>>>> formatted volume.
>>>>
>>>> How did you run the tests?
>>>>
>>>
>>> Actually, I forgot to mention that the coreutils test suite hang only
>>> occurred on the APFS volumes when the coreutils built against the gettext
>>> and libiconv from fink. A build outside of fink which didn't build against
>>> those packages didn't show the hang in the coreutils test suite. The fink
>>> gettext and libiconv packages that I am using are those from...
>>>
>>> https://sourceforge.net/p/fink/package-submissions/4955/
>>>
>>> and
>>>
>>> https://sourceforge.net/p/fink/package-submissions/5004/
>>>
>>> which are both patched for the format string strictness in High Sierra. I
>>> found that using --disable-nls in configuring coreutils was insufficient to
>>> suppress the test suite hang which I assume is due to the presence of...
>>>
>>> #define HAVE_LIBINTL_H 1
>>>
>>> in the generated ./lib/config.h
>>>
>>> despite the presence of...
>>>
>>> /* #undef HAVE_DCGETTEXT */
>>> /* #undef HAVE_GETTEXT */
>>>
>>> when --disable-nls is used so it still could be a Unicode related change in
>>> APFS, no?
>>> Jack
>>
>> The libintl bit reminded me of
>> https://lists.gnu.org/archive/html/bug-gnulib/2014-10/msg00014.html
>> I.E. on OSX enabling those libs creates implicit threads I think.
>> Perhaps that's messing with SIGPIPE handling and only the implicit
>> thread gets it, thus not killing the main yes(1) thread.
>> However the yes(1) is also protected with a timeout(1) call.
>> Perhaps timeout(1) is a silent noop. We should support OSX through
>> DYLD_INSERT_LIBRARIES,
>> but perhaps there is something preventing that on your system?
>> But then would the timeout tests fail. Could you check the timeout tests
>> with:
>>
>> make SUBDIRS=. TESTS=tests/misc/filter.sh check
>>
>> In any case we should protect calls to timeout(1) to ensure it's supported.
>> The attached does that at least.
>>
>> cheers,
>> Pádraig.
>>
>
> Pádraig,
> The hang on APFS volumes doesn't seem to be related to CoreFoundation
> threading. If I repeat the steps that I used to track down a similar issue
> in make 4.0/4.1 by rebuilding libiconv with --disable-nls and coreutils
> with the same --disable-nls so that neither are linked against
> CoreFoundation, the test suite hang still occurs. Also, for the stock
> build, adding your proposed timeout changes doesn't eliminate the hang in
> the test suite either.

Is it a wait or a CPU spin?
Could you use the equivalent of strace on your platform
to see what's happening?

thanks,
Pádraig
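
For the wait-versus-spin question, a rough first check that needs no tracer is to sample the process state with ps. A state beginning with R means runnable or on-CPU, which is consistent with a spin; S, I, D or U means sleeping or blocked, which is consistent with a wait. This is only a sketch: the helper name proc_activity is hypothetical, it assumes a ps that accepts "-o stat=" and "-p" (as the macOS and Linux ones do), and a single sample can mislead, so it is worth repeating a few times.

```shell
# Hypothetical helper (not part of coreutils): classify a process as
# waiting or spinning from the first letter of its ps state column.
proc_activity() {
    pid=$1
    # "stat=" prints the state column without a header line.
    state=$(ps -o stat= -p "$pid" 2>/dev/null | tr -d ' ' | cut -c1)
    case $state in
        R)       echo spin ;;            # runnable or currently on a CPU
        S|I|D|U) echo wait ;;            # sleeping or blocked in the kernel
        '')      echo gone ;;            # ps found no such process
        *)       echo "other:$state" ;;  # stopped, zombie, etc.
    esac
}
```

It would be called with the PID of the hung yes(1) or test process while the suite appears stuck. For the syscall-level view, dtruss is the usual macOS counterpart to strace, though it needs root and may be restricted by System Integrity Protection.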