Hi,

Rob Latham <[email protected]> writes:

>> (in any case, I have tried setting the striping_unit as well, but no
>> difference). So far I have no idea what is going on. ~1500 procs is
>> where the trouble begins, but the number of processors that breaks the
>> program is not fixed. I ran it successfully with 1515 processors, then it
>> failed with 1480...
>
> I suppose all one can do is get a backtrace from a few processors (by, for
> example, attaching to a hung process with gdb) and see if you are stuck in
> communication or if you are stuck in a case where the processes are making
> very many teeny-tiny read operations (so not stuck, but performing I/O so
> poorly as to be making imperceptible progress)
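For reference, attaching to the hung ranks as suggested above could be sketched roughly as follows. This is only a sketch under assumptions: the function name `dump_backtraces` and the program name passed to it are hypothetical, it has to be run on the compute node where the ranks live, and gdb and pgrep must be available there.

```shell
# Print a backtrace of every thread for each local process whose name
# matches exactly the (hypothetical) executable name given as $1.
dump_backtraces() {
    prog="$1"
    for pid in $(pgrep -x "$prog"); do
        echo "=== PID $pid ==="
        # -batch makes gdb run the command, detach, and exit
        gdb -p "$pid" -batch -ex "thread apply all bt" 2>&1
    done
}
```

Calling `dump_backtraces my_prog` on a couple of nodes should show whether the ranks are sitting inside MPI calls or inside read calls.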

I will try to attach to some processes and see if I can get somewhere, but
the issue definitely seems to be a communication one: I changed the program
so that no actual reading is done, just opening the file and closing it,
and it still hangs at the h5fopen_f call. So for some reason the file
cannot even be opened when I go beyond ~1500 procs...

Thanks,
-- 
Ángel de Vicente
http://www.iac.es/galeria/angelv/
