> >> Interesting observation. The code for waiting for valid port file values
> >> basically looks like
> >>
> >>      for (int i = 0; i < 10; i++) {
> >>          checkPortFile();
> >>          if (successful)
> >>              break;
> >>          sleep(500);
> >>      }
> >>
> >> so the fact that it even reaches 6676 ms looks suspicious when it comes to
> >> load.
> >
> >
> > Why? Under load those sleep(500)'s might not return for much longer; and the
> > whole thing might be preempted at any point for an extended period of
> > time.

What I meant was that the code taking 6676 ms to complete, well beyond the
nominal 10 x 500 ms = 5000 ms of sleeping, increases my suspicion that this
is a load issue.


> What about putting another loop around this loop which prints a
> warning to stdout (e.g. "..trying to connect to sjavac server since X
> seconds") for another five or so times. We could also print the system
> load [1] although I'm not sure it's worth it. On our AIX machines it
> is often a network/NFS problem which causes long startup times of new
> executables and this won't be observable by looking at the system load
> (but it may at least give a hint).
> 
> [1] 
> http://docs.oracle.com/javase/6/docs/api/java/lang/management/OperatingSystemMXBean.html#getSystemLoadAverage%28%29

I don't think a loop around the loop is necessary. The actual code (which is 
slightly different from the snippet I posted above) already prints a message 
between each attempt.
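
If the system load were ever added to that per-attempt message, a minimal
sketch could look like the following. LoadReport and describeLoad are made-up
names, not part of sjavac; getSystemLoadAverage() is the method linked in [1]
and returns a negative value where the load average is not available.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper, not taken from the sjavac sources.
class LoadReport {
    // Returns a short text describing the one-minute system load average,
    // or notes that the platform does not provide it (negative return value).
    static String describeLoad() {
        double load = ManagementFactory.getOperatingSystemMXBean()
                                       .getSystemLoadAverage();
        if (load < 0)
            return "system load: unavailable";
        return String.format("system load: %.2f", load);
    }

    public static void main(String[] args) {
        System.out.println(describeLoad());
    }
}
```

As noted in the quoted mail, this would not reveal network/NFS delays.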

I was thinking of just bumping the timeout from 5 seconds to, say, 60 seconds. 
If it's a load issue, we should see something like "Port file values found after 
9000 ms", in which case we know for sure that it was a premature timeout issue. 
If no port files materialize after >60 seconds, we can probably safely assume 
that the issue is due to something else.
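
As a rough sketch of the longer-timeout idea (this is not the actual sjavac
code; PortFileWait, waitFor and the check passed in are made-up names), the
wait could be driven by a deadline rather than a fixed attempt count, and
report how long it actually took:

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch, not taken from the sjavac sources.
class PortFileWait {
    // Calls check every sleepMillis until it succeeds or timeoutMillis has
    // elapsed. Returns the elapsed time in ms on success, or -1 on timeout.
    static long waitFor(BooleanSupplier check, long timeoutMillis, long sleepMillis)
            throws InterruptedException {
        long start = System.nanoTime();
        while (true) {
            // Measure real elapsed time, so delays caused by load or
            // preemption show up in the reported number.
            long elapsed = (System.nanoTime() - start) / 1_000_000;
            if (check.getAsBoolean())
                return elapsed;
            if (elapsed >= timeoutMillis)
                return -1;
            Thread.sleep(sleepMillis);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for the real port file check: succeeds on the third attempt.
        int[] attempts = {0};
        long elapsed = waitFor(() -> ++attempts[0] >= 3, 60_000, 500);
        if (elapsed >= 0)
            System.out.println("Port file values found after " + elapsed + " ms");
        else
            System.out.println("No port file values found within the timeout");
    }
}
```

Measuring the elapsed wall-clock time, instead of counting attempts, is what
would make a premature timeout visible in the log.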

-- Andreas
