JFYI: the same issue is also present in Open MPI 4.1.1.
I cannot open a GitLab issue due to lack of an account (*), so I would kindly ask
somebody to open one, if possible.
Have a nice day
Paul Kapinos
(* too many accounts in my life. )
On 4/16/21 6:02 PM, Paul Kapinos wrote:
Dear Open MPI developers,
rnal)... no
> ...
> PMIx support: Internal
This is surprising and feels like an error. Could you have a look at this? Thank
you!
Have a nice day,
Paul Kapinos
P.S. grep for 'PMIx' in the config.log:
https://rwth-aachen.sciebo.de/s/xtNIx2dJlTy2Ams
(pastebin and gist both need accounts and I ha
rovide this and all other kind of information)
Have a nice day,
Paul Kapinos
[1] https://developer.nvidia.com/hpc-compilers
FCLD libmpi_usempif08.la
/usr/bin/ld: .libs/comm_spawn_multiple_f08.o: relocation R_X86_64_32S against
`.rodata' can not be used when making a shared object
Likely I'm seeing another issue of the same type as
https://github.com/open-mpi/ompi/issues/4466
Most amazing is that only one version of Open MPI (the patched 3.0.0 one) stops
working instead of all of them. Seems we're lucky. WOW.
I will report on the results of the 3.0.0p rebuild.
best,
Paul Kapinos
$ objdump -S
release; I'm fighting
with some ISVs to get them to update their software to 1.10.x NOW; we know about one who
just managed to go from 1.6.x to 1.8.x half a year ago...)
Thank you very much!
Paul Kapinos
On 10/12/2017 09:31 AM, Gilles Gouaillardet wrote:
> Paul,
>
>
> i made PR #4331 htt
patch on the 1.10.7 release - likely because you develop on a much, much
newer version of Open MPI.
Q1: to *which* release should patch #4331 be applied?
Q2: I assume it is unlikely that this patch will be back-ported to 1.10.x?
Best
Paul Kapinos
On 10/12/2017 09:31 AM, Gilles Gouaillardet wrote:
4 and
2.0.2.
Well, the question: is there a way / a chance to effectively disable the busy wait
when using Open MPI?
Best,
Paul Kapinos
[1] http://www.open-mpi.de/faq/?category=running#force-aggressive-degraded
[2]
http://blogs.cisco.com/performance/polling-vs-blocking-message-passingprogre
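For reference, a minimal sketch of the knob behind the 'degraded mode' FAQ entry [1],
assuming the usual MCA parameter mpi_yield_when_idle; process count and binary name are
placeholders:
$ mpiexec -mca mpi_yield_when_idle 1 -np 4 ./a.out
Note that this only makes idle ranks yield the CPU; the progress engine still polls, so it
is not a true blocking wait.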
> DESCRIPTION
>This routine, or MPI_Init_thread, must be called before any other MPI
>routine (apart from MPI_Initialized) is called. MPI can be initialized
>at most once; subsequent calls to MPI_Init or MPI_Init_thread are
>erroneous.
ered as installed
improperly?
Best,
Paul Kapinos
--
Dipl.-Inform. Paul Kapinos - High Performance Computing,
RWTH Aachen University, IT Center
Seffenter Weg 23, D 52074 Aachen (Germany)
Tel: +49 241/80-24915
ude/lustre/liblustreapi.h file, included from
'openmpi-2.0.1/ompi/mca/fs/lustre/fs_lustre.c' file (line 46, doh).
Well, it is up to you whether to change or keep the way the Lustre headers are being
included in Open MPI. Just my $0.02.
Have a nice day,
Paul Kapinos
pk224850@lnm001:/w0/tmp/pk224850/Op
es the Intel/17 + OpenMPI combination do not like this?
If NO, why the hell do none of the other compiler+MPI combinations complain about this?
:o)
Have a nice day,
Paul Kapinos
P.S. Did you also notice this one?
https://www.mail-archive.com/users@lists.open-mpi.org//msg30
On 08/08/16 18:01, Nathan Hjelm wrote:
On Aug 08, 2016, at 05:17 AM, Paul Kapinos wrote:
Dear Open MPI developers,
there is already a thread about 'sm BTL performance of the openmpi-2.0.0'
https://www.open-mpi.org/community/lists/devel/2016/07/19288.php
and we also see 30% bandwidth
, please? It's about 30%
of performance
Best
Paul
P.S. btl_openib_get_alignment and btl_openib_put_alignment are by default '0' -
setting them to higher values did not change the behaviour...
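A sketch of how one could inspect and override the two parameters named above - the value
64 is just a placeholder for illustration, and newer releases may need '--level 9' added to
ompi_info to show these parameters:
$ ompi_info --param btl openib | grep alignment
$ mpiexec -mca btl_openib_get_alignment 64 -mca btl_openib_put_alignment 64 -np 2 ./a.out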
ave a nice day,
Paul Kapinos
pk224850@linuxc2:/opt/MPI/openmpi-1.8.1/linux/intel/include[519]$ ls -la
mpp/shmem.fh
lrwxrwxrwx 1 pk224850 pk224850 11 Jul 13 13:20 mpp/shmem.fh -> ../shmem.fh
Cheers,
Gilles
On Wednesday, July 13, 2016, Paul Kapinos <kapi...@itc.rwth-aachen.de> wrote:
And by the way: is there a way to limit the maximum include depth in CMake for
header files? This would work around this 'infinite include loop' issue...
Have a nice day,
Paul Kapinos
..
access("/opt/MPI/openmpi-1.10.2/linux/intel_16.0.2.181/include/mpp/shmem.f
Hello all, JFYI and for log purposes:
*In short: the 'caddr_t' issue is known and is addressed in new(er) ROMIO releases.*
Below is the (off-list) answer (snippet) from Rob Latham.
On 12/08/15 13:16, Paul Kapinos wrote:
In short: ROMIO in current Open MPI versions cannot be configured using ol
irs may be found
at (3) (89 MB!)
Best wishes
Paul Kapinos
1) https://www.open-mpi.org/community/lists/devel/2014/10/16106.php
2) https://www.open-mpi.org/community/lists/devel/2014/10/16109.php
https://github.com/hppritcha/ompi/commit/53fd425a6a0843a5de0a8c544901fbf01246ed31
3)
https://rwt
fixes your problem, the fix is already included
in the upcoming v1.10.1 release.
Indeed, that was it. Fixed!
Many thanks for support!
Best
Paul
a about this 20%+ bandwidth loss?
Best
Paul Kapinos
MCA btl: parameter "btl_openib_verbose" (current val
in somewhere in the bush.
Could someone with more experience in linking libs and especially Open MPI take a
look at this? (Sorry for pushing this, but all this smells to me like a
general linking problem rooted somewhere in Open MPI and '--disable-dlopen', see
"fun fact
P.S. We also have Sun/Oracle Studio:
$ module avail studio
On 12/11/14 19:45, Jeff Squyres (jsquyres) wrote:
Ok.
FWIW: I test with gcc and the intel compiler suite. I do not have a PGI
license to test with.
t login to get an in-depth feeling?
Best
Paul Kapinos
Attached: some logs from the installation on 27.05 and today's try, and quota.h
(changed on 29.09). Note that the kernel also changed (and maybe the Scientific
Linux version, from 6.4 to 6.5?)
pk224850@cluster:~[502]$ ls -la /usr/include/sys/q
m
To: , Open MPI Developers , "Kapinos,
Paul" , "Göbbert, Jens Henrik"
On 10/28/2014 06:00 AM, Paul Kapinos wrote:
Dear Open MPI and ROMIO developers,
We use Open MPI v1.6.x and 1.8.x on our cluster.
We have a Lustre file system; we wish to use MPI_IO.
So the OpenMPI'
.rz.RWTH-Aachen.DE 2.6.32-431.29.2.el6.x86_64 #1 SMP Tue Sep 9
13:45:55 CDT 2014 x86_64 x86_64 x86_64 GNU/Linux
pk224850@cluster:~[510]$ cat /etc/issue
Scientific Linux release 6.5 (Carbon)
Note that openmpi/1.8.1 seems to be fully OK (MPI_IO works) in our environment.
Best
Paul Kapinos
P.S.
2014, at 7:51 AM, Ralph Castain wrote:
Forwarding this for Paul until his email address gets updated on the User list:
Begin forwarded message:
Date: October 17, 2014 at 6:35:31 AM PDT
From: Paul Kapinos
To: Open MPI Users
Cc: "Kapinos, Paul" ,
Subject: Open MPI 1.8: link pr
Is that behaviour intentional? (It is quite inconvenient, huh.)
Best
Paul Kapinos
P.S. Tested versions: 1.6.5, 1.7.4
#!
d a bunch of diagnostic statements that should help me
track it down.
Thanks
Ralph
On Feb 12, 2014, at 1:26 AM, Paul Kapinos wrote:
As said, the change in behaviour is new in 1.7.4 - all previous versions have worked.
Moreover, setting "-mca oob_tcp_if_include ib0" is a workaround for o
why we don't pickup and use
ib0 if it is present and specified in if_include - we should be doing it.
For now, can you run this with "-mca oob_base_verbose 100" on your cmd line and
send me the output? Might help debug the behavior.
Thanks
Ralph
On Feb 11, 2014, at 1:22 AM
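Putting the ib0 workaround and the suggested verbosity together, the debug run would look
roughly like this (process count and binary name are placeholders):
$ mpiexec -mca oob_tcp_if_include ib0 -mca oob_base_verbose 100 -np 2 ./a.out
The verbose oob output then shows which interfaces the TCP oob component considers and why
ib0 is or is not picked up.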
-
No idea why the file share/openmpi/help-oob-tcp.txt has not been installed in
1.7.4, as we compile this version in pretty much the same way as previous versions...
Best,
Paul Kapinos
On 12/04/13 14:53, Jeff Squyres (jsquyres) wrote:
On Dec 4, 2013, at 4:31 AM, Paul Kapinos wrote:
Argh - what a shame not to see "btl:usnic" :-|
What a shame you don't have Cisco hardware to use the usnic BTL! :-p
Well, this is far above my decision level :o)
Look
On 12/03/13 23:27, Jeff Squyres (jsquyres) wrote:
On Nov 22, 2013, at 1:19 PM, Paul Kapinos wrote:
Well, I've tried this patch on the current 1.7.3 (where the code has moved some 12
lines - now beginning at line 2700).
!! - no "skipping device" output! Also when starting the main processes
osity patch did
not work.
Well, is there any progress on this front? Or, can I activate more verbosity /
what did I do wrong with the patch? (see attached file)
Best!
Paul Kapinos
*) the nodes used for testing are also Bull BCS nodes, but consisting of just two
boards instead of 4
--
thank you in advance
Mohammad
tructure
of nodes automatically?
Background: Currently we're using the 'carto' framework on our kinda special
'Bull BCS' nodes. Each such node consists of 4 boards, each with its own IB card, but builds
a shared memory system. Clearly, communication should go over the nearest IB
inte
ile-system=testfs+ufs+nfs+lustre'
--enable-mpi-ext ..
(adding paths, compiler-specific optimisation things and -m32 or -m64)
A config.log file is attached FYI.
Best
Paul
ure. But I'd like to hear Paul's input on this first.
Did it work with log_num_mtt=26?
I don't have that kind of machines to test this.
-- YK
On Nov 3, 2012, at 6:33 PM, Yevgeny Kliteynik wrote:
Hi Paul,
On 10/31/2012 10:22 PM, Paul Kapinos wrote:
Hello Yevgeny, hello all,
Yev
, but the maximum
amount of registrable memory I was able to get was one TB (23/5). All tries
to get more (24/5, 23/6 for 2 TB) led to non-responding InfiniBand HCAs.
Are there any other limits in the kernel that have to be adjusted in order to be
able to register that much memory?
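For context, a sketch assuming the 23/5 and 24/5 pairs refer to the mlx4_core module
parameters log_num_mtt / log_mtts_per_seg: the registrable memory is roughly
2^log_num_mtt * 2^log_mtts_per_seg * page_size, so 23/5 with 4 KiB pages gives
2^40 bytes = 1 TB. Raising the limit would then be a modprobe option plus a driver
reload (the file name below is only the usual convention):
$ echo "options mlx4_core log_num_mtt=24 log_mtts_per_seg=5" | sudo tee -a /etc/modprobe.d/mlx4_core.conf
$ # reload the mlx4_core module (or reboot) for the new limit to take effect
Whether the HCA and its firmware actually cope with the larger MTT table is a separate
question - see the hanging-HCA observation above.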
unning#mpi-preconnect) there are no such
huge latency outliers for the first sample.
Well, we know about the warm-up and lazy connections.
But 200x ?!
Any comments on whether this is expected?
Best,
Paul Kapinos
(*) E.g. HPCC explicitly says in http://icl.cs.utk.edu/hpcc/faq/index.html#132
> Addit
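The preconnect workaround referenced above is, as far as I know, an MCA flag whose name
differs between releases (mpi_preconnect_mpi on newer ones, mpi_preconnect_all on some
older ones); process count and binary name are placeholders:
$ mpiexec -mca mpi_preconnect_mpi 1 -np 64 ./a.out
This moves all connection setup into MPI_Init, so the first real message no longer pays the
lazy-connection price - whether that fully explains a 200x outlier is the open question
above.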
probably forbid any fallback to work around these scenarios in the future.
Maybe a bit more verbosity at this place would be a good idea?
Best,
Paul Kapinos
lowing 2x memory to be registered could be a good idea.
Does this make sense?
Best,
Paul Kapinos
P.S. The example program used is of course a synthetic thing, but it closely
mimics the Serpent software. (However, Serpent usually uses
chunks, whereby the actual error arises if all t
el/2012/08/11377.php
so currently I cannot check it again. Remind me again once the link issue is
fixed!
Best,
Paul
s):
$ mpiexec a.out 108000 108001
Well, we know about the need to raise the values of one of these parameters, but
I wanted to let you know that your workaround for the problem is still not
100% perfect, but only 99%.
Best,
Paul Kapinos
P.S: A note about the inform
his:
-mca oob_tcp_if_include ib0 -mca btl_tcp_if_include ib0
Nevertheless, I cannot reproduce your initial issue with 1.6.1rc2 in our
environment.
Best
Paul Kapinos
$ time /opt/MPI/openmpi-1.6.1rc2mt/linux/intel/bin/mpiexec -mca
oob_tcp_if_include ib0 -mca btl_tcp_if_include ib0 -np 4 -H
linuxscc005,linuxsc
ue? Are you interested in reproducing this?
Best,
Paul Kapinos
P.S.: The same test with Intel MPI cannot run using DAPL, but runs very fine over
'ofa' (= native verbs, as Open MPI uses it). So I believe the problem is rooted in
the communication pattern of the program; it sends very LARGE messages