Re: -j fails on DYNIX/ptx

2000-05-31 Thread Paul Eggert

   Date: Wed, 31 May 2000 14:22:39 -0400 (EDT)
   From: "Paul D. Smith" <[EMAIL PROTECTED]>

   Two solutions immediately present themselves:

    1) Just wrap stat(2) in a loop checking for EINTR, even though that's
       not possible on any standard UNIX system.

Actually, EINTR is possible on a POSIX-compliant system, since POSIX
allows (but does not require) stat to fail with EINTR.  Wrapping stat
is the right fix, I think.

Also, while we're on the subject, that code should check for other
failures.  For example, if the file can't be stat'ed because the
parent directory is unreadable, an error should be reported.

Here's a proposed patch.


2000-05-31  Paul Eggert  <[EMAIL PROTECTED]>

	* remake.c (name_mtime): Check for stat failures.
	Retry if EINTR.

===
RCS file: remake.c,v
retrieving revision 3.79.0.2
retrieving revision 3.79.0.3
diff -pu -r3.79.0.2 -r3.79.0.3
--- remake.c	2000/05/22 16:45:52	3.79.0.2
+++ remake.c	2000/05/31 21:25:10	3.79.0.3
@@ -1216,8 +1216,13 @@ name_mtime (name)
 {
   struct stat st;
 
-  if (stat (name, &st) < 0)
-    return (FILE_TIMESTAMP) -1;
+  while (stat (name, &st) != 0)
+    if (errno != EINTR)
+      {
+        if (errno != ENOENT && errno != ENOTDIR)
+          perror_with_name ("stat:", name);
+        return (FILE_TIMESTAMP) -1;
+      }
 
   return FILE_TIMESTAMP_STAT_MODTIME (st);
 }




RE: -j fails on DYNIX/ptx

2000-05-31 Thread Howard Chu

I can confirm that many filesystem operations can fail with EINTR when
operating over NFS. However, someone had to be causing the signals in
question - SIGALRM, maybe, or keyboard SIGINT, etc. Protecting stat() is
usually the most important remedy, since failure will cause you to think
that a file does not exist, even though it does. The big question is what
is generating the signals - presumably it's SIGCHLD in this case, and
there's probably not much you can do about that.

  -- Howard Chu
  Chief Architect, Symas Corp.   Director, Highland Sun
  http://www.symas.com   http://highlandsun.com/hyc

> -----Original Message-----
> From: Paul D. Smith [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, May 31, 2000 12:22 PM
> To: Michael Sterrett -Mr. Bones.-
> Cc: [EMAIL PROTECTED]
> Subject: Re: -j fails on DYNIX/ptx
>
>
> %% "Michael Sterrett -Mr. Bones.-" <[EMAIL PROTECTED]> writes:
>
>   ms> Do the specifications say that EINTR is not required or that it
>   ms> is forbidden?
>
> Hmm.  They say that a function may have more return error codes than
> listed in the standards, so you're right: I guess there's nothing
> technically preventing EINTR.
>
> stat() is listed as required to be safe to call within a signal handler,
> which isn't directly related of course.
>
>   ms>  EINTR   A signal was caught during the stat() or
>   ms>  lstat() function.
>
>   ms> Without the code, there's no way for me to know if Solaris will
>   ms> actually ever fail with EINTR, but the man page seems to indicate
>   ms> that it *could*.
>
> I suspect, but can't prove, that you won't get EINTR from stat(2) for
> "normal" filesystems.
>
>   ms> I guess I'm still not convinced that this problem couldn't be
>   ms> reproduced if sufficiently adverse conditions were
>   ms> encountered, even on "normal" UNIX systems.  I'm for adding
>   ms> the loop on EINTR to the GNU make code base.
>
> I just find it hard to believe that there have been no other reported
> cases of this anywhere else; if it's possible, no matter how obscure, it
> seems like _someone_ else would have hit it somewhere, by now.  Sigh.
>
> The problem is, there are a number of places that use stat(2) in GNU
> make in addition to this one.  Do we need to armor them all?  What about
> other system calls?
>
> If this is a real (even potential) problem for most/all OS's, then maybe
> a different solution than wrapping all the system calls in EINTR checks
> is in order.
>
> --
> ---
>  Paul D. Smith <[EMAIL PROTECTED]>  Find some GNU make tips at:
>  http://www.gnu.org  http://www.ultranet.com/~pauld/gmake/
>  "Please remain calm...I may be mad, but I am a professional." --Mad Scientist




Re: -j fails on DYNIX/ptx

2000-05-31 Thread Paul D. Smith

%% "Michael Sterrett -Mr. Bones.-" <[EMAIL PROTECTED]> writes:

  ms> Do the specifications say that EINTR is not required or that it
  ms> is forbidden?

Hmm.  They say that a function may have more return error codes than
listed in the standards, so you're right: I guess there's nothing
technically preventing EINTR.

stat() is listed as required to be safe to call within a signal handler,
which isn't directly related of course.

  ms>  EINTR   A signal was caught during the stat() or
  ms>  lstat() function.

  ms> Without the code, there's no way for me to know if Solaris will
  ms> actually ever fail with EINTR, but the man page seems to indicate
  ms> that it *could*.

I suspect, but can't prove, that you won't get EINTR from stat(2) for
"normal" filesystems.

  ms> I guess I'm still not convinced that this problem couldn't be
  ms> reproduced if sufficiently adverse conditions were
  ms> encountered, even on "normal" UNIX systems.  I'm for adding
  ms> the loop on EINTR to the GNU make code base.

I just find it hard to believe that there have been no other reported
cases of this anywhere else; if it's possible, no matter how obscure, it
seems like _someone_ else would have hit it somewhere, by now.  Sigh.

The problem is, there are a number of places that use stat(2) in GNU
make in addition to this one.  Do we need to armor them all?  What about
other system calls?

If this is a real (even potential) problem for most/all OS's, then maybe
a different solution than wrapping all the system calls in EINTR checks
is in order.

-- 
---
 Paul D. Smith <[EMAIL PROTECTED]>  Find some GNU make tips at:
 http://www.gnu.org  http://www.ultranet.com/~pauld/gmake/
 "Please remain calm...I may be mad, but I am a professional." --Mad Scientist




Re: -j fails on DYNIX/ptx

2000-05-31 Thread Michael Sterrett -Mr. Bones.-

On Wed, 31 May 2000, Paul D. Smith wrote:

> OK, I've investigated this further.  The reason you never see this
> problem on "normal" UNIX systems is that it's not legal for stat(2) to
> fail with EINTR.  In other words, stat(2) is not interruptible, by
> definition, due to signals.  Looking at both the POSIX and SingleUNIX
> specifications for stat(2), EINTR is not a legal error state when
> stat(2) returns.
> 
> Two solutions immediately present themselves:
> 
>  1) Just wrap stat(2) in a loop checking for EINTR, even though that's
> not possible on any standard UNIX system.  I don't think this would
> be much of a slowdown in the code, but others might disagree (for
> sure this stat(2) is one of the most common system calls make uses).
> 
>  2) Use a configure check for this OS (I don't see how a configure macro
> for this can easily be written) and only wrap the stat(2) in an
> EINTR loop on this OS (i386-sequent-sysv4).

Paul - 

Do the specifications say that EINTR is not required or that it
is forbidden?

From the man page for stat(2) on Solaris:


--CUT---
ERRORS
 stat() and lstat() fail if one or more of the following  are
 true:

 EINTR   A signal was caught during the stat() or
 lstat() function.
--CUT---

Without the code, there's no way for me to know if Solaris will
actually ever fail with EINTR, but the man page seems to indicate
that it *could*.

I guess I'm still not convinced that this problem couldn't be
reproduced if sufficiently adverse conditions were encountered, even
on "normal" UNIX systems.  I'm for adding the loop on EINTR to the
GNU make code base.

Thanks for your work on this,

Michael Sterrett
  -Mr. Bones.-
[EMAIL PROTECTED]





Re: -j fails on DYNIX/ptx

2000-05-31 Thread Paul D. Smith

%% "Michael Sterrett -Mr. Bones.-" <[EMAIL PROTECTED]> writes:

Re: an issue with GNU make 3.78 and above on DYNIX/ptx...

  ms> $ gmake --version
  ms> GNU Make version 3.78.1, by Richard Stallman and Roland McGrath.
  ms> Built for i386-sequent-sysv4

  ms> $ uname -a
  ms> DYNIX/ptx roll 4.0 V4.4.4 i386

  ms> Incidentally, I can also reproduce it on V4.4.7.

  ms> TARGETS = $(patsubst %.abc,%.xyz,$(wildcard *[0-9].abc))

  ms> %.xyz: %.abc
  ms>   @touch $@

  ms> all: $(TARGETS)

  ms> There are 100 files in the directory called 1.abc, 2.abc, and so on.

  ms> $ gmake -j
  ms> gmake: *** No rule to make target `12.abc', needed by `12.xyz'.  Stop.
  ms> gmake: *** Waiting for unfinished jobs

  ms> The number varies, but it usually fails somewhere.  Without the -j
  ms> option, or with -j1, the build completes as expected.  Also, gmake
  ms> -j2 is just as unreliable so I don't think it's a resource or
  ms> memory problem.

The problem turns out to be that the stat(2) system call in
remake.c:name_mtime() is failing with EINTR.  Wrapping it in a loop to
repeat on EINTR solves the problem.
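
As a minimal sketch of what "wrapping it in a loop" means, the helper
below retries stat(2) only when it fails with EINTR and passes every
other result through unchanged.  The name stat_retry is hypothetical;
it is not a function in remake.c, and the real change would live inside
name_mtime() itself.

```c
#include <errno.h>
#include <sys/stat.h>

/* Hypothetical helper, not part of GNU make: call stat(2) again
   whenever it fails only because a signal interrupted it.  Any other
   failure is returned to the caller with errno left intact.  */
static int
stat_retry (const char *path, struct stat *st)
{
  int r;

  do
    r = stat (path, st);
  while (r != 0 && errno == EINTR);

  return r;
}
```

A caller would use stat_retry (name, &st) exactly where it used
stat (name, &st) before.  On systems where stat never returns EINTR the
loop body runs once, so the extra cost is a single comparison.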

-

OK, I've investigated this further.  The reason you never see this
problem on "normal" UNIX systems is that it's not legal for stat(2) to
fail with EINTR.  In other words, stat(2) is not interruptible, by
definition, due to signals.  Looking at both the POSIX and SingleUNIX
specifications for stat(2), EINTR is not a legal error state when
stat(2) returns.

Two solutions immediately present themselves:

 1) Just wrap stat(2) in a loop checking for EINTR, even though that's
not possible on any standard UNIX system.  I don't think this would
be much of a slowdown in the code, but others might disagree (for
sure this stat(2) is one of the most common system calls make uses).

 2) Use a configure check for this OS (I don't see how a configure macro
for this can easily be written) and only wrap the stat(2) in an
EINTR loop on this OS (i386-sequent-sysv4).
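
For comparison, option 2 might look roughly like the sketch below.  It
assumes a macro, here called STAT_MAY_EINTR, that a configure check or
a per-OS header would define; no such macro exists in GNU make, and the
helper name checked_stat is equally hypothetical.

```c
#include <errno.h>
#include <sys/stat.h>

/* STAT_MAY_EINTR is a hypothetical macro that configure (or a per-OS
   header) would define on systems such as i386-sequent-sysv4.  When it
   is not defined, the retry condition is the constant 0 and the loop
   below runs exactly once.  */
#ifdef STAT_MAY_EINTR
# define STAT_SHOULD_RETRY(r) ((r) != 0 && errno == EINTR)
#else
# define STAT_SHOULD_RETRY(r) 0
#endif

/* Hypothetical helper: stat(2), retried on EINTR only where the
   build was configured to expect it.  */
static int
checked_stat (const char *path, struct stat *st)
{
  int r;

  do
    r = stat (path, st);
  while (STAT_SHOULD_RETRY (r));

  return r;
}
```

The trade-off is the one described above: the unconditional loop of
option 1 needs no configure machinery, while this version keeps other
platforms byte-for-byte unchanged at the cost of a check that is hard
to write.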

-- 
---
 Paul D. Smith <[EMAIL PROTECTED]>  Find some GNU make tips at:
 http://www.gnu.org  http://www.ultranet.com/~pauld/gmake/
 "Please remain calm...I may be mad, but I am a professional." --Mad Scientist




RE: HP-UX 64 bit bug

2000-05-31 Thread Mark Syms

Sorry folks, false alarm. The culprit seems to be the Imake that
generated the makefiles GNU make died on: it wasn't being built
correctly for some reason, and so generated bad rules.

Regards,

Mark Syms

> -----Original Message-----
> From: Mark Syms 
> Sent: Wednesday, May 24, 2000 10:37 AM
> To:   '[EMAIL PROTECTED]'
> Subject:  HP-UX 64 bit bug
> 
> We are having some problems using GNU make on  a new HP 9000 L2000 machine
> (64 bit) having moved from using older 32 bit machines.
> 
> It appears that the dependency checking is getting confused somewhere when
> trying to build a file.
> 
> Makefile snippet
> 
> 
> transport.c: $(TRANSCOMMSRC)/transport.c
>   $(RM) $@
>   $(LN) $? $@
> 
> TRANSCOMMSRC is ../../lib/xtrans
> transport.c is itself a symbolic link to a source tree i.e. the expected
> result is a symbolic link to a symbolic link to a real file.
> 
> HP 9000 L2000 (64 bit PA RISC)
> --
> 
> Reading makefiles...
> Reading makefile `Makefile'...
> Updating makefiles
> Makefile `Makefile' might loop; not remaking it.
> Updating goal targets
> Considering target file `../../lib/xtrans/transport.c'.
>  Looking for an implicit rule for `../../lib/xtrans/transport.c'.
>  Trying pattern rule with stem `transport'.
>  Trying rule dependency `/src/cascade/main/X/lib/ICE'.
>  Trying implicit dependency `../../lib/xtrans//transport.c'.
>  Found an implicit rule for `../../lib/xtrans/transport.c'.
>   Considering target file `/src/cascade/main/X/lib/ICE'.
> 
> 
> HP 9000 C110 (32 bit PA-RISC)
> -
> 
> Reading makefiles...
> Reading makefile `Makefile'...
> Updating makefiles
> Makefile `Makefile' might loop; not remaking it.
> Updating goal targets
> Considering target file `transport.c'.
>  File `transport.c' does not exist.
>   Considering target file `../../lib/xtrans/transport.c'.
>Looking for an implicit rule for `../../lib/xtrans/transport.c'.
>Trying pattern rule with stem `transport'.
>Trying implicit dependency
> `../../lib/xtrans//src/cascade/main/X/lib/ICE/transport.c'.
>Trying pattern rule with stem `transport'.
>Trying implicit dependency `../../lib/xtrans/transport.y'.
> 
> --
> ---
> 
> As can be seen from the debug information, the 32 bit and 64 bit machines
> diverge on line 8, with the 64 bit machine trying a rule dependency and
> the 32 bit machine using an implicit rule. If this continues the
> following happens:
> 
> Considering target file `../../lib/xtrans///transport.c'.
>  Looking for an implicit rule for `../../lib/xtrans///transport.c'.
>  Trying pattern rule with stem `transport'.
>  Trying rule dependency `/src/cascade/main/X/lib/ICE'.
>  Trying implicit dependency `../../lib/xtranstransport.c'.
>  Found an implicit rule for `../../lib/xtrans///transport.c'.
>   Considering target file `/src/cascade/main/X/lib/ICE'.
>   File `/src/cascade/main/X/lib/ICE' was considered already.
>   Considering target file `../../lib/xtranstransport.c'.
>Looking for an implicit rule for `../../lib/xtranstransport.c'.
>Trying pattern rule with stem `transport'.
>Trying rule dependency `/src/cascade/main/X/lib/ICE'.
> 
> As can easily be seen, extra /'s are added during rule checking; this
> continues, seemingly without termination, until a limit is reached
> (recursion depth, possibly?).
> 
> This problem has been observed in 3.75, 3.77 and 3.79 gmake with binary
> and source (built on both 32 or 64 bit platforms) distributions. Having
> had a cursory look with a debugger on the 64 bit machine it seems that the
> rules are corrupt before the parsing operation is completed.
> 
> Any assistance would be appreciated. If any further information is
> required please contact me.
> 
> Mark Syms
> 
> Software Engineer
> Citrix Systems (Research and Development) Ltd
> +44 1223 568 953
> [EMAIL PROTECTED]