Re: rhel8 test failure confirmation? [PATCH for problem affecting Automake testsuite]

2023-03-31 Thread Jacob Bachmeyer

A quick introduction to the situation for the Autoconf list:

The Automake maintainers have encountered a bizarre issue with sporadic 
random test failures, seemingly due to "disk writes not taking effect" 
(as Karl Berry mentioned when starting the thread).  Bogdan appears to 
have traced the issue to autom4te caching and offered a patch.  I have 
attached a copy of Bogdan's patch.


Bogdan's patch is a subtle change:  the cache is now considered stale 
unless it is /newer/ than the source files, rather than being considered 
stale only if the source files are newer.  In short, this patch causes 
the cache to be considered stale if its timestamp /matches/ the source 
file, while it is currently considered valid if the timestamps match.  I 
am forwarding the patch to the Autoconf list now because I concur with 
the change, noting that Time:HiRes is also limited by the underlying 
filesystem and therefore is not a "magic bullet" solution.  Assuming the 
cache files are stale unless proven otherwise is therefore correct.


Note again that this is _Bogdan's_ patch I am forwarding unchanged.  I 
did not write it (but I agree with it).


[further comments inline below]

Bogdan wrote:
Bogdan , Sun Mar 05 2023 22:31:55 GMT+0100 (Central 
European Standard Time)
Karl Berry , Sat Mar 04 2023 00:00:56 GMT+0100 
(Central European Standard Time)
 Note that 'config.h' is older (4 seconds) than './configure', 
which

 shouldn't be the case as it should get updated with new values.

Indeed. That is the same sort of thing as I was observing with nodef.
But what (at any level) could be causing that to happen?
Files just aren't getting updated as they should be.

I haven't yet tried older releases of automake to see if their tests
succeed on the systems that are failing now. That's next on my list.


[...]


  Another tip, maybe: cache again. When I compare which files are 
newer than the only trace file I get in the failing 'backcompat2' 
test ('autom4te.cache/traces.0'), I see that 'configure.ac' is older 
than this file in the succeeding run, but it's newer in the failing 
run. This could explain why 'configure' doesn't get updated to put 
new values in config.h (in my case) - 'autom4te' thinks it's up-to-date.

  The root cause may be in 'autom4te', sub 'up_to_date':

   # The youngest of the cache files must be older than the oldest of
   # the dependencies.
   # FIXME: These timestamps have only 1-second resolution.
   # Time::HiRes fixes this, but assumes Perl 5.8 or later.

(lines 913-916 in my version).


This comment Bogdan cites is not correct:  Time::HiRes could be 
installed from CPAN on Perls older than 5.8, and might be missing from a 
5.8 or later installation if the distribution packager separated it into 
another package.  Nor is Time::HiRes guaranteed to fix the issue; the 
infamous example is the FAT filesystem, where timestamps only have 
2-second resolution.  Either way, Time::HiRes is now used if available, 
so this "FIXME" is fixed now.  :-)


  Perhaps 'configure.ac' in the case that fails is created "not late 
enough" (still within 1 second) when compared to the cache, and the 
cached values are taken, generating the old version of 'configure' 
which, in turn, generates old versions of the output files.


  Still a guess, but maybe a bit more probable now.

  Does it work when you add '-f' to '$AUTOCONF'? It does for me - 
again, about 20 sequential runs of the same set of tests and about 5 
parallel with 4 threads. Zero failures.
  I'd probably get the same result if I did a 'rm -fr autom4te.cache' 
before each '$AUTOCONF' invocation.

[...]

  More input (or noise):

1) The t/backcompat2.sh test (the only test which fails for me) is a 
test which modifies configure.ac and calls $AUTOCONF several times.


2) Autom4te (part of Autoconf) has a 1-second resolution in checking 
if the input files are newer than the cache.


Maybe.  That comment could be wrong; the actual "sub mtime" is in 
Autom4te::FileUtils.  Does your version of that module use Time::HiRes?  
Git indicates that use of Time::HiRes was added to Autoconf at commit 
3a9802d60156809c139e9b4620bf04917e143ee2 which is between the 2.72a and 
2.72c snapshot tags.


3) Thus, a sequence: 'autoconf' + quickly modify configure.ac + 
quickly run 'autoconf' may cause autom4te to use the old values from 
the cache instead of processing the new configure.ac. "Quickly" means 
within the same second.


It might be broader than that if your version is already using 
Time::HiRes.  If so, what filesystems are involved?  I could see a 
possible bug where multiple writes get the same mtime if they get 
flushed to disk together.  Time::HiRes will not help if this happens; 
your patch will work around such a bug.


4) I ran the provided list of tests (t/backcompat2.sh, 
t/backcompat3.sh, t/get-sysconf.sh, t/lex-depend.sh, t/nodef.sh, 
t/remake-aclocal-version-mismatch.sh, t/subdir-add2-pr46.sh, 
t/testsuite-summary-reference-log.sh) in batches of 

Re: rhel8 test failure confirmation?

2023-03-31 Thread Bogdan
Bogdan , Sun Mar 05 2023 22:31:55 GMT+0100 (Central 
European Standard Time)
Karl Berry , Sat Mar 04 2023 00:00:56 GMT+0100 
(Central European Standard Time)
 Note that 'config.h' is older (4 seconds) than './configure', 
which

 shouldn't be the case as it should get updated with new values.

Indeed. That is the same sort of thing as I was observing with nodef.
But what (at any level) could be causing that to happen?
Files just aren't getting updated as they should be.

I haven't yet tried older releases of automake to see if their tests
succeed on the systems that are failing now. That's next on my list.


[...]


  Another tip, maybe: cache again. When I compare which files are 
newer than the only trace file I get in the failing 'backcompat2' test 
('autom4te.cache/traces.0'), I see that 'configure.ac' is older than 
this file in the succeeding run, but it's newer in the failing run. 
This could explain why 'configure' doesn't get updated to put new 
values in config.h (in my case) - 'autom4te' thinks it's up-to-date.

  The root cause may be in 'autom4te', sub 'up_to_date':

   # The youngest of the cache files must be older than the oldest of
   # the dependencies.
   # FIXME: These timestamps have only 1-second resolution.
   # Time::HiRes fixes this, but assumes Perl 5.8 or later.

(lines 913-916 in my version).

  Perhaps 'configure.ac' in the case that fails is created "not late 
enough" (still within 1 second) when compared to the cache, and the 
cached values are taken, generating the old version of 'configure' 
which, in turn, generates old versions of the output files.


  Still a guess, but maybe a bit more probable now.

  Does it work when you add '-f' to '$AUTOCONF'? It does for me - 
again, about 20 sequential runs of the same set of tests and about 5 
parallel with 4 threads. Zero failures.
  I'd probably get the same result if I did a 'rm -fr autom4te.cache' 
before each '$AUTOCONF' invocation.


  If it does work for you, then maybe it would be better to add '-f' 
to the 'AUTOCONF' variable where it's defined? Tests with a single 
'$AUTOCONF' invocation won't notice (they will create their cache 
anyway), while tests with multiple calls can maybe get fixed all at once.



[...]

  More input (or noise):

1) The t/backcompat2.sh test (the only test which fails for me) is a 
test which modifies configure.ac and calls $AUTOCONF several times.


2) Autom4te (part of Autoconf) has a 1-second resolution in checking 
if the input files are newer than the cache.


3) Thus, a sequence: 'autoconf' + quickly modify configure.ac + 
quickly run 'autoconf' may cause autom4te to use the old values from 
the cache instead of processing the new configure.ac. "Quickly" means 
within the same second.


4) I ran the provided list of tests (t/backcompat2.sh, 
t/backcompat3.sh, t/get-sysconf.sh, t/lex-depend.sh, t/nodef.sh, 
t/remake-aclocal-version-mismatch.sh, t/subdir-add2-pr46.sh, 
t/testsuite-summary-reference-log.sh) in batches of 20 or more runs.


5) With the tools as they are on my system, I got a failure in the 
t/backcompat2.sh test in the first batch (18th round, IIRC).


6) I modified my autom4te using the attached patch, which essentially 
makes the mentioned sub 'up_to_date' work as if the cache is out of 
date if its modtime (up to 1-second precision) is not only earlier, 
but also equal to the modtime of any dependencies (including 
configure.ac).


7) After modifying autom4te, I ran 120 rounds of the same set of tests 
in single-threaded mode, and additional 120 rounds in parallel mode 
(-j4). Total of 240 runs of all those 8 mentioned tests each. ZERO 
FAILURES.


8) I brought autom4te to its original state and started running the 
tests again. I got the first failure quite early (32nd run, IIRC).


  I believe that shortening the sleep in Automake in the suspected 
commit (720a1153134b833de9298927a432b4ea266216fb "m4: speed up 
filesystem modification checks") is not an error in Automake. It 
simply causes this issue in autom4te (not enough precision) to show up.


  I'm not 100% sure about my results and conclusions, because 
t/backcompat2.sh doesn't seem to be that quick in modifying 
configure.ac to cause a false cache hit in autom4te. But these results 
seem to confirm my previous results where using -f to bypass the cache 
helped.


  Another thing is that we have a bit of a special case here - quick 
modifications and reconfigurations. So, even though autom4te has a 
"FIXME" there, probably over 99% of users are not impacted. People 
usually modify configure.ac manually, then run autoconf manually 
(after 1 second) and then just ./configure.


  What can we do about this?

- have autom4te patched and wait for the fix to reach a release (and 
get installed on every possible end-user system?), and revert the 
sleep until this is done,


- revert the commit and increase the test time again,

- fix the randomly-failing tests by adding '-f' to second and next 
invocations of