Hmm, well, something is still the matter.  Presumably I didn't get
everything rebuilt and linked correctly.  But it's not clear what
exactly is the matter, and I've double-checked that the correct pvfs
and mpich are in use.  And I rebuilt the mpich executable with the new
mpich, so . . .

The PVFS servers still run and ping successfully.  The mpi job still
craps out, but it looks to be failing differently:

Here is a run without debug (connection refused errors?):

http://www.parl.clemson.edu/~bradles/downloads/anl-io-bm-mx-16-2.o168517

And here is with the debug enabled:

http://www.parl.clemson.edu/~bradles/downloads/anl-io-bm-mx-16-2.o168520

That looks more like the timeout stuff again, I guess, but with a lot
less network activity this time around.

Any more assistance?

Cheers,
Brad

On Thu, Mar 5, 2009 at 2:37 PM, Scott Atchley <[email protected]> wrote:
> On Mar 5, 2009, at 1:52 PM, Bradley Settlemyer wrote:
>
>> Heh, the job works whenever I do that:
>>
>> http://www.parl.clemson.edu/~bradles/downloads/anl-io-bm-mx-16-2.o168456
>>
>> However, this run had a really slow write in the second instance:
>>
>> http://www.parl.clemson.edu/~bradles/downloads/anl-io-bm-mx-16-2.o168495
>>
>> Both include debug from two procs (on separate nodes).  Hope that is okay.
>>
>> Cheers,
>> Brad
>
> Brad,
>
> This bug is fixed in PVFS 2.8.1.
>
> What happened in the second run above is that the client disconnected and
> then reconnected. The server did not realize that the client had gone away,
> and it never replied to the new connection request.
>
> Remember to unset PVFS2_DEBUGMASK or your performance will be horrible. :-)
>
> Scott
>
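
Scott's reminder about PVFS2_DEBUGMASK can be sketched as a small
pre-launch step. This is only a minimal sketch, not how the runs above
were launched: the helper name env_without_pvfs_debug is hypothetical,
and it assumes the job is started from a Python wrapper that hands the
launcher its environment. Whether that environment reaches the remote
nodes depends on the MPI launcher in use.

```python
import os


def env_without_pvfs_debug(env=None):
    """Return a copy of env (default: os.environ) without PVFS2_DEBUGMASK.

    Hypothetical helper for illustration; the original mapping is not modified.
    """
    cleaned = dict(os.environ if env is None else env)
    # Drop the client debug mask so the run is not slowed by debug logging.
    cleaned.pop("PVFS2_DEBUGMASK", None)
    return cleaned


if __name__ == "__main__":
    clean_env = env_without_pvfs_debug()
    # Report whether the variable survived the cleanup (it should not).
    print("PVFS2_DEBUGMASK still set:", "PVFS2_DEBUGMASK" in clean_env)
```

The returned dictionary could then be passed as the env argument of
subprocess.run when starting the job, leaving the calling shell's own
environment untouched.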

_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
