In Quoting:
The code that Paul posted was a snippet from the test_cluster script (at the
location he mentions). Look for that section of the file and modify it to
match what he shows. This simply increases the minimum timeout for the
test.
> Also it seems that Chris Hazelrig has found a solution to getting the
> MPICH test to work by removing the RedHat LAM/MPI, however he
> was using RedHat 7.1 and I'm using RedHat 7.3. And if this is a solution
> what is the best way to remove these from my path without affecting
> OSCAR's setup?
>Mr. Hazelrig's problem had little to do with any MPICH problems. He had
>the wrong LAM installed, which caused problems with switcher, and thus
>with the shell environment, and thus with the tests. You can check the
>same things I suggested he check, to make sure that LAM mixups are not the
>problem for you.
>Thanks,
>Jason
What I was trying to say was that, in our particular instance, increasing the initial timeout on the MPICH test allowed the test to pass. Further investigation and discussion with others suggested that, given the latency of the 100Mbit switch coupled with the 100Mbit NICs, increasing the interval to 600 seconds was not unreasonable. Thanks also to Jason for clearing up the other misconceptions on the oscar-users list. The different versions of LAM/MPICH and the 7.1 version of RedHat had nothing to do with my post. I hope this clears things up for Shane and the others having problems.
regards,
paul bounds
