On May 10, 2010, at 12:02 PM, Igor wrote:
> at the
> end of execution I get the following not very pleasant error messages.
> They repeat as many times as number of cores was passed to mpirun:
>
> *** An error occurred in MPI_Comm_free
> *** after MPI was finalized
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> [blade87:23606] Abort after MPI_FINALIZE completed successfully; not
> able to guarantee that all
> other processes were killed
> The questions I have:
> 1. What could be the reason of MPI errors?
I honestly don't know. We get them too, and agree that they are annoying.
> I had to disable some *warnings* in openmpi to make it work. As far as
> I understood they concerned something like Infiniband network connection
> between the nodes, etc. This was done with:
> export OMPI_MCA_btl="^udapl,openib"
Warnings prevented a build? That's odd.
> Also, I had the error while making easy_install of mpi4py. However I
> don't think it's critical because it did install. Some of install log:
> ..
> Running mpi4py-1.2.1/setup.py -q bdist_egg --dist-dir
> /tmp/easy_install-CYyW8n/mpi4py-1.2.1/egg-dist-tmp-J78hzC
> _configtest.c:3:17: error: mpe.h: No such file or directory
> ..
> Installed .../site-packages/mpi4py-1.2.1-py2.6-linux-x86_64.egg
> Processing dependencies for mpi4py
> Finished processing dependencies for mpi4py
Hmmm, OK.
> 2. How can I print var.getGlobalValue() once only and not the number
> of times equal to -np value? (mainProcessor() doesn't work or I don't
> know from where to import it).
from fipy.tools import parallel
# only the process with ID 0 prints
if parallel.procID == 0:
    print(var.getGlobalValue())
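To spell the pattern out, here is a minimal sketch of the same guard wrapped in a helper. The name print_once and its arguments are hypothetical, not part of FiPy; in a real run you would pass parallel.procID as proc_id. The loop only simulates four processes so that the snippet runs without MPI.

```python
from __future__ import print_function


def print_once(value, proc_id, root=0):
    """Print value only on the process whose ID equals root.

    Hypothetical helper for illustration; returns True if it printed.
    """
    if proc_id == root:
        print(value)
        return True
    return False


# Simulate what four processes would each do with the same global value.
# Under mpirun, each process would call print_once exactly once with its own ID.
printed = [print_once(3.14, proc_id) for proc_id in range(4)]
```

Only the first simulated process prints; the other three stay silent.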
>
> Possible useful PS:
>
> 1. My administrator has configured trilinos like this:
>> mkdir build
>> cd build
>> cmake -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=ON -D
>> Trilinos_ENABLE_TESTS:BOOL=ON -D TPL_ENABLE_MPI:BOOL=ON -D
>> MPI_BASE_DIR:PATH="%_libdir/openmpi/%openmpi-gcc/" -D
>> Trilinos_ENABLE_PyTrilinos:BOOL=ON -D BUILD_SHARED_LIBS:BOOL=ON -D
>> Trilinos_ENABLE_ALL_OPTIONAL_PACKAGES:BOOL=ON
>> -DPythonInterp_FIND_VERSION:STRING=2.6 -D
>> CMAKE_INSTALL_PREFIX:PATH=%buildroot/opt/trilinos -D
>> PyTrilinos_INSTALL_PREFIX:PATH=%buildroot%_prefix/local/Python26/ ..
>> make %{?_smp_mflags}
OK, thanks. That looks reasonable enough.
> 2. (!) I don't see swig in the configure (possible source of error I get?)
If swig is findable on the current $PATH, I don't believe it's necessary
to specify SWIG_EXECUTABLE.
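If you want to check, a quick sketch like the following reports whether swig is visible on $PATH. The helper name find_on_path is made up for illustration. Note that shutil.which needs Python 3.3 or later, so this will not run as-is on the Python 2.6 used for this build; there, distutils.spawn.find_executable should do a similar job.

```python
import shutil


def find_on_path(name):
    """Return the full path of an executable found on $PATH, or None."""
    return shutil.which(name)


swig_path = find_on_path("swig")
if swig_path is None:
    print("swig not found on $PATH")
else:
    print("swig found at " + swig_path)
```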
> 3. nm libpytrilinos.so does not return any symbols!
Probably because CMAKE_BUILD_TYPE is not set to DEBUG. I no longer think the
presence or absence of symbols is critical; it's just the first difference I
noticed between my serial and parallel builds.
> 4. Some benchmark I did:
> Grid2D: 140x120 --- no gain on 2 real cores vs 1, even a bit slower
> (possibly due to message overhead)
> Grid2D: 140x120 --- 2.1 times gain on 8 cores (4 real hyperthreaded cores) vs
> 1
> Grid2D: 280x120 --- 3.42 times gain on 8 cores (4 real hyperthreaded cores)
> vs 1
Thanks for these.
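For reference, one way to read those numbers is as parallel efficiency, i.e. speedup divided by core count. This is only a sketch of the arithmetic on the figures quoted above; the function name is hypothetical, and counting the hyperthreaded machine as either 8 logical or 4 physical cores is my assumption.

```python
def parallel_efficiency(speedup, n_cores):
    """Fraction of ideal linear speedup achieved on n_cores."""
    return speedup / float(n_cores)


# (grid size, speedup on the 8-core run relative to 1 core), as quoted above
benchmarks = [("140x120", 2.1), ("280x120", 3.42)]

for grid, speedup in benchmarks:
    logical = parallel_efficiency(speedup, 8)   # counting hyperthreads
    physical = parallel_efficiency(speedup, 4)  # counting real cores only
    print("%s: %.1f%% of ideal on 8 logical cores, %.1f%% on 4 physical cores"
          % (grid, 100 * logical, 100 * physical))
```

The larger grid comes out noticeably closer to ideal, which would be consistent with your guess about message overhead, though two data points are not much to go on.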