I'm getting huge logs with repeated errors of this sort: mca_oob_tcp_accept: accept() failed: Too many open files (24).
I googled a bit, and this seems to be MPI complaining that too many files are open. I checked ulimit, and it reports "unlimited".

Any tips on what I ought to be looking at? I'm a bit lost as to how to get to the source of this particular error. Furthermore, it does not occur systematically, only once in a while. Unfortunately, once it happens, it seems to be a catastrophe!

Any suggestions?

-- 
Rahul
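
For anyone chasing the same message, here is a minimal diagnostic sketch, not a fix. It assumes a Linux node with Python 3 available; the function names open_file_limits and count_open_fds are made up for illustration. The "(24)" in the log is errno 24, which on Linux is EMFILE, the per-process open-descriptor limit. Note also that in bash a bare "ulimit" reports the file-size limit (-f), while "ulimit -n" reports the open-files limit, so "unlimited" may not be describing the limit that matters here.

```python
import errno
import os
import resource


def open_file_limits():
    """Return the (soft, hard) RLIMIT_NOFILE limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def count_open_fds(pid="self"):
    """Count open descriptors by listing /proc/<pid>/fd (Linux only).

    For pid="self" the count includes the descriptor that is briefly
    opened to read the directory itself.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


if __name__ == "__main__":
    soft, hard = open_file_limits()
    # Show which errno the "Too many open files" message corresponds to.
    print(f"{errno.errorcode[errno.EMFILE]} = errno {errno.EMFILE}")
    print(f"open-file limit: soft={soft} hard={hard}")
    print(f"descriptors open in this process: {count_open_fds()}")
```

Passing the PID of a running MPI daemon or rank to count_open_fds (for example count_open_fds(12345), with a PID you own or as root) would show how close that process is to its soft limit. Since the failure is intermittent, sampling that count periodically may reveal whether descriptors are slowly leaking or spiking at job launch.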
_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
