Hi,

I'm experiencing a strange problem running LIGGGHTS on a 48-core
workstation running Ubuntu 14.04.4 LTS.

If I cold boot the workstation and start one of the LIGGGHTS examples,
everything looks fine:

$ mpirun -np 48 liggghts < in.chute_wear

launches the example on all 48 cores; htop in a second window shows that
all cores are occupied and running at nearly 100% load.

So far so good. Now I reboot the workstation (warm boot) and do the
exact same steps as above.

This time the job runs on only a few cores (16 to 20), and those cores
don't even reach 100% load.

So now I'm trying to find out what is wrong. Unfortunately, I can't
just ask the vendor of the workstation, since I work for that vendor
and am the one trying to solve this issue. :-)

I guess that something Open MPI needs is initialized differently on a
cold boot than on a warm boot. But how can I find out what is wrong?

I already looked for differences in the Ubuntu boot logs, but found
none.

The output of ompi_info --all, even in the parsable format, shows no
difference between cold boot and warm boot.
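
One way to make the cold/warm comparison concrete is to snapshot the
kernel's view of the CPUs after each boot and diff the two files. This
is only a sketch: the function name snapshot_cpu_state and the file
names are hypothetical, and it assumes a Linux system with /proc and
/sys mounted and the coreutils nproc tool installed. It collects data
for comparison; it does not diagnose the problem by itself.

```shell
#!/bin/sh
# Hypothetical helper: write the kernel's view of the CPUs to a file so
# that a cold-boot snapshot can be diffed against a warm-boot snapshot.
snapshot_cpu_state() {
    out="$1"
    {
        echo "== online CPUs (sysfs) =="
        cat /sys/devices/system/cpu/online
        echo "== CPUs usable by this process (nproc) =="
        nproc
        echo "== installed CPUs (nproc --all) =="
        nproc --all
        echo "== affinity list inherited from this shell =="
        grep Cpus_allowed_list /proc/self/status
    } > "$out" 2>&1
}

# Usage (file names are placeholders):
#   after a cold boot:  snapshot_cpu_state cold.txt
#   after a warm boot:  snapshot_cpu_state warm.txt
#   then:               diff cold.txt warm.txt
```

If the warm-boot snapshot lists fewer online or allowed CPUs, the cause
probably sits below Open MPI. mpirun also accepts --report-bindings,
which reports where each rank was bound when binding is in effect.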

Any ideas what could go wrong after the reboot to cause this behaviour?

Thanks,
Rainer
-- 
Dipl.-Inf. (FH) Rainer Koenig
Project Manager Linux Clients
Dept. PDG WPS R&D SW OSE

Fujitsu Technology Solutions
Bürgermeister-Ullrich-Str. 100
86199 Augsburg
Germany

Telephone: +49-821-804-3321
Telefax:   +49-821-804-2131
Mail:      mailto:rainer.koe...@ts.fujitsu.com

Internet         ts.fujitsu.com
Company Details  ts.fujitsu.com/imprint.html
