I would like to try out the notifier framework, but I am having trouble finding documentation for it. I have been digging around the website and not finding much.

Currently we have a problem where hosts are throwing up errors like:
[nyx0891.engin.umich.edu][[25560,1],45][btl_tcp_endpoint.c: 631:mca_btl_tcp_endpoint_complete_connect] connect() failed: Connection timed out (110)
[nyx0887.engin.umich.edu][[25560,1],36][btl_tcp_endpoint.c: 631:mca_btl_tcp_endpoint_complete_connect] connect() failed: Connection timed out (110)
[nyx0881.engin.umich.edu][[25560,1],13][btl_tcp_endpoint.c: 631:mca_btl_tcp_endpoint_complete_connect] connect() failed: Connection timed out (110)
[nyx0888.engin.umich.edu][[25560,1],44][btl_tcp_endpoint.c: 631:mca_btl_tcp_endpoint_complete_connect] connect() failed: Connection timed out (110)
[nyx0880.engin.umich.edu][[25560,1],12][btl_tcp_endpoint.c: 631:mca_btl_tcp_endpoint_complete_connect] connect() failed: Connection timed out (110)
[nyx0880.engin.umich.edu][[25560,1],10][btl_tcp_endpoint.c: 631:mca_btl_tcp_endpoint_complete_connect] connect() failed: Connection timed out (110)
etc,

We would like to be notified when this happens, so we can put timestamps on events occurring on the network. Is this even possible with the framework? We don't see any interfaces going up and down, or any errors on interfaces, so we are looking to isolate the problem further. Only the MPI library knows when this happens.
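
As a stopgap outside the notifier framework, and assuming the job's stderr can be captured or piped through a filter, a minimal Python sketch like the one below could timestamp each connect() failure as it is read. The function names and the field names "job" and "proc" are only my labels for the bracketed fields in the messages above, not documented names, and the timestamp is the time the line is read, not the time the connect() call actually failed.

```python
import re
from datetime import datetime

# Matches one record of the form shown in the errors above:
# [host][[N,N],N][file: line:function] connect() failed: message (errno)
# The group names "job" and "proc" are guesses at what the fields mean.
PATTERN = re.compile(
    r"\[(?P<host>[^\]]+)\]"
    r"\[\[(?P<job>\d+,\d+)\],(?P<proc>\d+)\]"
    r"\[(?P<file>[^:\]]+):\s*(?P<line>\d+):(?P<func>[^\]]+)\]"
    r"\s*connect\(\) failed: (?P<msg>.+?) \((?P<errno>\d+)\)"
)


def parse_failures(text):
    """Return one dict per 'connect() failed' record found in text."""
    return [m.groupdict() for m in PATTERN.finditer(text)]


def stamp(record, now=None):
    """Format one record with a local timestamp (time of reading)."""
    now = now or datetime.now()
    return "{} {} proc={} {} ({})".format(
        now.isoformat(timespec="seconds"),
        record["host"],
        record["proc"],
        record["msg"],
        record["errno"],
    )


def stamp_stream(lines):
    """Yield a timestamped summary for every failure seen in lines."""
    for raw in lines:
        for record in parse_failures(raw):
            yield stamp(record)
```

Passing sys.stdin to stamp_stream() and printing each result would turn this into a filter for the job's output. It would only narrow down when the failures are reported, so it does not replace a notification from inside the MPI library.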

The only obstacle is the lack of docs for the notifier framework, though I am assuming I am just blind. Any help with it would be great!

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985


