I'm getting huge logs with repeated errors of this sort:

mca_oob_tcp_accept: accept() failed: Too many open files (24).

I googled a bit, and this seems to be an MPI complaint about too many
open files. I checked ulimit, and it says "unlimited".
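
One caveat on that check: in bash, plain "ulimit" with no option reports the
file-size limit (-f), not the open-file limit (-n), so "unlimited" may not
describe the limit that matters here. Error 24 is EMFILE on Linux, which is
the per-process descriptor limit. Below is a minimal sketch of how that limit
and the current descriptor count could be inspected. It assumes Linux with
/proc mounted and Python available on the node; the helper names fd_limits
and open_fd_count are made up for illustration and are not part of any MPI
tooling.

```python
import os
import resource


def fd_limits():
    """Return the (soft, hard) open-file-descriptor limits of this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count(pid="self"):
    """Count descriptors currently open in a process (Linux only, via /proc).

    Assumption: /proc is mounted and we may read /proc/<pid>/fd.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


if __name__ == "__main__":
    soft, hard = fd_limits()
    print(f"soft limit: {soft}, hard limit: {hard}")
    print(f"descriptors open right now: {open_fd_count()}")
```

The numbers are only meaningful for the environment the script runs in. A
daemon started by a batch system or at boot may well have a different limit
than an interactive login shell, so it would probably have to be run the same
way the MPI job is launched.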

Any tips about what I ought to be looking at? I'm a bit lost as to how I can
get to the source of this particular error. Furthermore, it does not occur
systematically, only once in a while. Unfortunately, once it happens it
seems to be a catastrophe!

Any suggestions?

-- 
Rahul
_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
