Kind of sounds to me like they are using the wrong proc when receiving. Here is an example of what a modex receive should look like:
https://github.com/open-mpi/ompi/blob/main/opal/mca/btl/ugni/btl_ugni_endpoint.c#L44
-Nathan
On Aug 3, 2022, at 11:29 AM, "Jeff Squyres (jsquyres) via devel" wrote:
Gl
It really is a shame that this could not go forward. There are really three end
goals in mind:
1) Consistency. We all have different coding styles and following a common
coding style is more and more considered a best practice. The number of
projects using clang-format grows continuously. I find it
for that at this time?
>
>> -Original Message-
>> From: devel [mailto:devel-boun...@lists.open-mpi.org] On Behalf Of
>> Nathan Hjelm via devel
>> Sent: Friday, April 12, 2019 11:19 AM
>> To: Open MPI Developers
>> Cc: Nathan Hjelm ; Castain, Ralph H
>> ; Y
d but it'll use more resources.
If you do that it may run out of resources and deadlock or crash. I recommend
either 1) adding a barrier every 100 iterations, 2) using allreduce, or 3)
enabling coll/sync (which essentially does 1). Honestly, 2 is probably the
easiest option and depending on how large you run may not be any slowe
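Option 1 from that list can be sketched in a few lines (illustrative only, not code from the thread; the iteration count and interval are arbitrary):

```c
/* Sketch of option 1: a tight collective loop throttled by an
 * occasional barrier so unexpected-message resources cannot pile up.
 * The iteration count and interval are illustrative. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    double in = 1.0, out = 0.0;
    for (int i = 0; i < 10000; ++i) {
        MPI_Reduce(&in, &out, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (i % 100 == 99) {
            /* resynchronize every 100 iterations */
            MPI_Barrier(MPI_COMM_WORLD);
        }
    }
    MPI_Finalize();
    return 0;
}
```

Run under mpirun as usual; the barrier bounds how far ahead any rank can get.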
That is accurate. We expect to support OPA with the btl/ofi component. It
should give much better performance than osc/pt2pt + mtl/ofi. What would be
good for you to do on your end is verify everything works as expected and that
the performance is on par for what you expect.
-Nathan
> On Apr 1
Appears to be broken. It's failing and simply saying:
Testing in progress..
-Nathan
On Oct 18, 2018, at 11:34 AM, Geoffrey Paulsen wrote:
I've re-enabled IBM CI for PRs.
___
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/
Ah yes, 18f23724a broke things so we had to fix the fix. Didn't apply it to the
v2.x branch. Will open a PR to bring it over.
-Nathan
On Oct 17, 2018, at 11:28 AM, Eric Chamberland
wrote:
Hi,
since commit 18f23724a, our nightly base test is broken on v2.x branch.
Strangely, on branch v3.x
Nope. We just never bothered to disable it on osx. I think Jeff was working on
a patch.
-Nathan
> On Sep 28, 2018, at 3:21 PM, Barrett, Brian via devel
> wrote:
>
> Is there any practical reason to have the memory patcher component enabled
> for MacOS? As far as I know, we don’t have any t
no
Sent from my iPhone
> On Aug 27, 2018, at 8:51 AM, Jeff Squyres (jsquyres) via devel
> wrote:
>
> Will this get through?
>
> --
> Jeff Squyres
> jsquy...@cisco.com
>
> On Jul 16, 2018, at 11:18 PM, Marco Atzeri wrote:
>
>> Am 16.07.2018 um 23:05 schrieb Jeff Squyres (jsquyres) via devel:
>>> On Jul 13, 2018, at 4:35 PM, Marco Atzeri wrote:
>>>
For one. The C++ bindings are no longer part of the standard and they are not
built by default in v3.1x. They will be removed entirely in Open MPI v5.0.0.
Not sure why the Fortran one is not building.
-Nathan
> On Jul 13, 2018, at 2:02 PM, Marco Atzeri wrote:
>
> Hi,
> may be I am missing s
Looks like a bug to me. The second argument should be a value in v3.x.x.
-Nathan
> On Jul 6, 2018, at 4:00 PM, r...@open-mpi.org wrote:
>
> I’m seeing this when building the v3.0.x branch:
>
> runtime/ompi_mpi_init.c:395:49: warning: passing argument 2 of
> ‘opal_atomic_cmpset_32’ makes intege
Should have it fixed today or tomorrow. Guess I didn't have a sufficiently old
gcc to catch this during testing.
-Nathan
> On Jun 2, 2018, at 1:09 AM, gil...@rist.or.jp wrote:
>
> Hi Ralph,
>
>
>
> see my last comment in https://github.com/open-mpi/ompi/pull/5210
>
>
>
> long story shor
I put together a pull request that does the following:
1) Make all MPI-3.0 obsoleted interfaces conditionally built. They will
still show up in mpi.h (though I can remove them from there with some
configury magic) but will either be #if 0'd out or marked with
__attribute__((__error__)). The latter
ox's publicly-stated
support positions).
Is it time to deprecate / print warning messages / remove the openib BTL?
Begin forwarded message:
From: Nathan Hjelm
Subject: Re: [OMPI users] Eager RDMA causing slow osu_bibw with 3.0.0
Date: April 5, 2018 at 12:48:08 PM EDT
To: Open MPI Users
Fixed in master and in the 3.0.x branch. Try the nightly tarball.
> On Mar 9, 2018, at 10:01 AM, Alan Wild wrote:
>
> I’ve been running the OSU micro benchmarks (
> http://mvapich.cse.ohio-state.edu/benchmarks/ ). on my various MPI
> installations. One test that has been consistently faili
All MPI implementations have support for using CMA to transfer data between
local processes. The performance is fairly good (not as good as XPMEM) but the
interface limits what we can do with to remote process memory (no atomics). I
have not heard about this new proposal. What is the benefit of
Should be fixed by PR #4569 (https://github.com/open-mpi/ompi/pull/4569).
Please test and let me know.
-Nathan
> On Dec 1, 2017, at 7:37 AM, DERBEY, NADIA wrote:
>
> Hi,
>
> Our validation team detected a hang when running osu_bibw
> micro-benchmarks from the OMB 5.3 suite on openmpi 2.0.2
I have a fix I am working on. Will open a PR tomorrow morning.
-Nathan
> On Jun 22, 2017, at 6:11 PM, r...@open-mpi.org wrote:
>
> Here’s something even weirder. You cannot build that file unless mpi.h
> already exists, which it won’t until you build the MPI layer. So apparently
> what is happ
A quick glance makes me think this might be related to the info changes. I am
taking a look now.
-Nathan
On Jun 01, 2017, at 09:35 AM, "r...@open-mpi.org" wrote:
Hey folks
I scanned the nightly MTT results from last night on master, and the RTE looks
pretty solid. However, there are a LOT o
Probably a bug. Can you open an issue on github?
-Nathan
> On May 8, 2017, at 2:16 PM, Dahai Guo wrote:
>
>
> Hi,
>
> The attached test code pass with MPICH well, but has problems with OpenMPI.
>
> There are three tests in the code, the first passes, the second one hangs,
> and the third o
By default MPI errors are fatal and abort. The error message says it all:
*** An error occurred in MPI_Reduce
*** reported by process [3645440001,0]
*** on communicator MPI_COMM_WORLD
*** MPI_ERR_COUNT: invalid count argument
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort
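If aborting is not what you want, the standard remedy (not part of the quoted message) is to install `MPI_ERRORS_RETURN` on the communicator so the error code comes back to the caller; a minimal sketch:

```c
/* Sketch: make MPI errors on MPI_COMM_WORLD return instead of aborting.
 * The negative count is deliberately invalid to trigger MPI_ERR_COUNT. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    double in = 1.0, out = 0.0;
    int rc = MPI_Reduce(&in, &out, -1, MPI_DOUBLE, MPI_SUM, 0,
                        MPI_COMM_WORLD);
    if (rc != MPI_SUCCESS) {
        fprintf(stderr, "MPI_Reduce returned error %d instead of aborting\n", rc);
    }
    MPI_Finalize();
    return 0;
}
```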
>
> Sounds like Nathan is going to fix it shortly.
>
>
>> On Mar 23, 2017, at 10:39 AM, Nathan Hjelm wrote:
>>
>> Looks like an uncaught merge error. The call should have an _ before it.
>> Will fix.
>>
>> On Mar 23, 2017, at 6:33 AM, Clement F
Looks like an uncaught merge error. The call should have an _ before it. Will
fix.
> On Mar 23, 2017, at 6:33 AM, Clement FOYER wrote:
>
> Hi everyone,
>
> While testing localy on my computer code using one-sided operations, the
> selected module is the pt2pt OSC component, which always crash
l)
>
>
> Cheers,
>
>
> Gilles
>
>
> On 11/22/2016 12:29 PM, Nathan Hjelm wrote:
>> MPI_Win_lock does not have to be blocking. In osc/rdma it is blocking in
>> most cases but not others (lock all with on-demand is non-blocking) but in
>> osc/pt2pt
lling process to the target rank on the specified window",
> which can be read as being a noop if no pending operations exists.
>
> George.
>
>
>
> On Mon, Nov 21, 2016 at 8:29 PM, Nathan Hjelm wrote:
> MPI_Win_lock does not have to be blocking. In osc/rdma it is b
MPI_Win_lock does not have to be blocking. In osc/rdma it is blocking in most
cases but not others (lock all with on-demand is non-blocking) but in osc/pt2pt
it is almost always non-blocking (it has to be blocking for proc self). If you
really want to ensure the lock is acquired you can call MPI
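The reply is cut off; the call it presumably names is `MPI_Win_flush` (an assumption on my part, though that is the usual way to force lock acquisition in Open MPI). A hedged sketch under that assumption:

```c
/* Sketch, assuming MPI_Win_flush is the truncated suggestion: lock,
 * then flush, so the lock is actually held before we rely on it. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size, buf = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    MPI_Win win;
    MPI_Win_create(&buf, sizeof buf, sizeof buf, MPI_INFO_NULL,
                   MPI_COMM_WORLD, &win);

    int target = (rank + 1) % size;  /* illustrative target choice */
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, target, 0, win);
    MPI_Win_flush(target, win);      /* forces lock acquisition in OMPI */
    MPI_Win_unlock(target, win);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```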
Yeah, that looks like a bug to me. We need to keep the check before the lock
but otherwise this is fine and should be fixed in 2.0.2.
-Nathan
> On Sep 21, 2016, at 3:16 AM, DEVEZE, PASCAL wrote:
>
> I encountered a deadlock in sync_wait_mt().
>
> After investigations, it appears that a first
ly MTT, so we can monitor progress on this support. It might
> not make it into tonight's testing, but should be tomorrow. I might also try
> to add it to our Jenkins testing too.
>
> On Wed, Sep 7, 2016 at 7:36 PM, Nathan Hjelm wrote:
> Thanks for reporting this! Glad the pr
Thanks for reporting this! Glad the problem is fixed. We will get this into
2.0.2.
-Nathan
> On Sep 7, 2016, at 9:39 AM, Vallee, Geoffroy R. wrote:
>
> I just tried the fix and i can confirm that it fixes the problem. :)
>
> Thanks!!!
>
>> On Sep 2, 2016, at 6:18 AM, Jeff Squyres (jsquyres)
Posted a possible fix to the intercomm hang. See
https://github.com/open-mpi/ompi/pull/2061
-Nathan
> On Sep 7, 2016, at 6:53 AM, Nathan Hjelm wrote:
>
> Looking at the code now. This code was more or less directly translated from
> the blocking version. I wouldn’t be surprised
Looking at the code now. This code was more or less directly translated from
the blocking version. I wouldn’t be surprised if there is an error that I
didn’t catch with MTT on my laptop.
That said, there is an old comment about not using bcast to avoid a possible
deadlock. Since the collective
This might be the last straw in IA64 support. If we can’t even test it anymore
it might *finally* be time to kill the asm. If someone wants to use IA64 they
can use the builtin atomic support.
-Nathan
> On Aug 30, 2016, at 4:42 PM, Paul Hargrove wrote:
>
> I don't recall the details of the la
The best way to put this is his compiler defaults to --std=gnu89. That gives
him about 90% of what we require from C99 but has weirdness like __restrict.
The real solution is the list of functions that are called out on link and spot
fixing with the gnu_inline attribute if -fgnu89-inline does n
Looking at the bug in google cache
(https://webcache.googleusercontent.com/search?q=cache:p2WZm7Vlt2gJ:https://llvm.org/bugs/show_bug.cgi%3Fid%3D5960+&cd=1&hl=en&ct=clnk&gl=us)
then isn’t the answer to just use -fgnu89-inline on this platform? Does that
not solve the linking issue? From what I c
FYI, C99 has been required since late 2012. Going through the commits there is
no way Open MPI could possibly compile with --std=c89 or --std=gnu99. Older
compilers require we add --std=c99 so we cannot remove the configure check.
commit aebd1ea43237741bd29878604b742b14cc87d68b
Author: Nathan
patch is simple and non-performance impacting.
>
> Original Message
> From: Nathan Hjelm
> Sent: Saturday, August 27, 2016 20:23
> To: Open MPI Developers
> Reply To: Open MPI Developers
> Subject: Re: [OMPI devel] C89 support
>
> Considering gcc more or less had full C99 supp
Considering gcc more or less had full C99 support in 3.1 (2002) and SLES10
dates back to 2004 I find this surprising. Clang's goal from the beginning was
full C99 support. Checking back it looks like llvm 1.0 (2003) had C99 support.
What version of clang/llvm are you using?
-Nathan
> On Aug 27,
__malloc_initialize_hook got “poisoned” in a newer release of glibc. We
disabled use of that symbol in 2.0.x. It might be worth adding to 1.10.4 as
well.
-Nathan
> On Aug 25, 2016, at 8:09 AM, Karol Mroz wrote:
>
> __malloc_initialize_hook
> On Aug 19, 2016, at 4:24 PM, r...@open-mpi.org wrote:
>
> Hi folks
>
> I had a question arise regarding a problem being seen by an OMPI user - has
> to do with the old bugaboo I originally dealt with back in my LANL days. The
> problem is with an app that repeatedly hammers on a collective, a
>>>>>> [manage:25442] Signal: Segmentation fault (11)
>>>>>> [manage:25442] Signal code: Address not mapped (1)
>>>>>> [manage:25442] Failing at address: 0x8
base_btl_array_get_size(&bml_endpoint->btl_rdma);
>>>>>>> + int num_eager_btls = mca_bml_base_btl_array_get_size
>>> (&bml_endpoint->
>>>>> btl_eager);> double weight_total = 0;> int num_btls_used = 0;>> @@
>>> -57,6
>
0[core 4[hwt 0]], socket
>>>> 0[core 5[hwt0]]:> >> [B/B/B/B/B/B][./././././.]> >>
>>> [manage.cluster:27444] MCW rank 1 bound to socket 0[core 0[hwt
> 0]],socket>
>>>>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so> >> cket 0[core
gt; double
> weight_total = 0;> + int rdma_count = 0;>> - for(i = 0; i < num_btls && i <
>
>> mca_pml_ob1.max_rdma_per_request; i+
>> +) {> - rdma_btls[i].bml_btl => - mca_bml_base_btl_array_get_next
> (&bml_endpoint->btl_rdma);> - rdma_btls[i].btl
tls[rdma_count++].btl_reg = NULL;
+
+ weight_total += bml_btl->btl_weight;
}
- mca_pml_ob1_calc_weighted_length(rdma_btls, i, size,
weight_total);
+ mca_pml_ob1_calc_weighted_length (rdma_btls, rdma_count, size,
weight_total);
- return i;
+ return rdma_count;
}
> On Aug 7, 2016, at 6:51 PM,
On Aug 08, 2016, at 05:17 AM, Paul Kapinos wrote:
Dear Open MPI developers,
there is already a thread about 'sm BTL performance of the openmpi-2.0.0':
https://www.open-mpi.org/community/lists/devel/2016/07/19288.php
and we also see 30% bandwidth loss, on communication *via InfiniBand*. And we also have a
We have established that the bug does not occur unless an RDMA network is being
used (openib, ugni, etc). The fix has been identified and will be included in
the 2.0.1 release.
-Nathan
> On Aug 8, 2016, at 3:20 AM, Christoph Niethammer wrote:
>
> Hello Howard,
>
>
> If I use tcp I get sligh
> mca_pml_ob1.max_rdma_per_request;
>> i++) {
>>> + mca_bml_base_btl_t *bml_btl = mca_bml_base_btl_array_get_next
>> (&bml_endpoint->btl_rdma);
>>> +bool ignore = true;
>>> +
>>> +for (int i = 0 ; i < num_eager_bt
h (rdma_btls, rdma_count, size,
weight_total);
-return i;
+return rdma_count;
}
> On Aug 7, 2016, at 6:51 PM, Nathan Hjelm wrote:
>
> Looks like the put path probably needs a similar patch. Will send another
> patch soon.
>
>> On Aug 7, 2016, at 6:01 PM, tmish.
= 0 ; i < num_eager_btls ; ++i) {
>> +mca_bml_base_btl_t *eager_btl =
> mca_bml_base_btl_array_get_index (&bml_endpoint->btl_eager, i);
>> +if (eager_btl->btl_endpoint == bml_btl->btl_endpoint) {
>> +ignore = false;
>> +
tinue;
+}
if (btl->btl_register_mem) {
/* do not use the RDMA protocol with this btl if 1) leave pinned
is disabled,
-Nathan
> On Aug 5, 2016, at 8:44 AM, Nathan Hjelm wrote:
>
> Nope. We are not going to change the flags as this will disable the btl for
>
Nope. We are not going to change the flags as this will disable the btl for
one-sided. Not sure what is going on here as the openib btl should be 1) not
used for pt2pt, and 2) polled infrequently. The btl debug log suggests both of
these are the case. Not sure what is going on yet.
-Nathan
> O
Look at ompi/mpi/c/start.c. Just add a check there that redirects to the coll
component for coll requests. You will probably need to expand the coll
interface in ompi/mca/coll/coll.h to include a start function.
> On Jul 29, 2016, at 10:20 AM, Bradley Morgan wrote:
>
>
> Hello Gilles,
>
> Th
sm is deprecated in 2.0.0 and will likely be removed in favor of vader in 2.1.0.
This issue is probably this known issue:
https://github.com/open-mpi/ompi-release/pull/1250
Please apply those commits and see if it fixes the issue for you.
-Nathan
> On Jul 26, 2016, at 6:17 PM, tmish...@jcity.m
It looks to me like double free on both send and receive requests. The receive
free is an extra OBJ_RELEASE of MPI_DOUBLE which was not malloced (invalid
free). The send free is an assert failure in OBJ_RELEASE of an OBJ_NEW() object
(invalid magic). I plan to look at it in the next couple of da
> On Jul 12, 2016, at 12:01 AM, Sreenidhi Bharathkar Ramesh
> wrote:
>
> [ query regarding an old thread ]
>
> Hi,
>
> It looks like "--disable-smp-locks" is still available as an option.
>
> 1. Will this be continued or deprecated ?
It was completely discontinued. The problem with the opti
It's correct as written. The && takes precedence over the || and the statement
gets evaluated in the order I intended. I will add the parentheses to quiet the
warning when I get a chance.
> On Jul 3, 2016, at 9:01 AM, Ralph Castain wrote:
>
> I agree with the compiler - I can’t figure out exactl
https://github.com/open-mpi/ompi/pull/1788
On Jun 16, 2016, at 05:16 PM, Nathan Hjelm wrote:
Not sure why that happened, but it is indeed a regression. Will submit a fix now.
-Nathan
On Jun 16, 2016, at 02:19 PM, Lisandro Dalcin wrote:
Could you please check/confirm you are supporting passing
split_type=MPI_UNDEFINED to MPI_Comm_split_type() ? IIRC, this is a
regression from 2.0.0rc2.
$ ca
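A minimal reproducer for the reported case (sketch; per MPI-3, a rank passing `MPI_UNDEFINED` should simply get `MPI_COMM_NULL` back):

```c
/* Sketch: MPI_Comm_split_type with split_type = MPI_UNDEFINED must
 * return MPI_COMM_NULL rather than erroring out. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    MPI_Comm newcomm;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_UNDEFINED, 0,
                        MPI_INFO_NULL, &newcomm);
    printf("got MPI_COMM_NULL: %s\n",
           newcomm == MPI_COMM_NULL ? "yes" : "no");
    MPI_Finalize();
    return 0;
}
```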
Will track this one down as well tomorrow.
-Nathan
> On Jun 15, 2016, at 7:13 PM, Paul Hargrove wrote:
>
> With a PPC64/Fedora20/gcc-4.8.3 system configuring for "-m32":
>
> configure --prefix=[...] --enable-debug \
> CFLAGS=-m32 --with-wrapper-cflags=-m32 \
> CXXFLAGS=-m32 --w
Ok, that was working. Our PPC64 system is back up and I will finally be able to
fix it tomorrow.
-Nathan
> On Jun 15, 2016, at 7:35 PM, Paul Hargrove wrote:
>
> Also seen now on a big-endian Power7 with XLC-13.1
>
> -Paul
>
> On Wed, Jun 15, 2016 at 6:20 PM, Paul Hargrove wrote:
> On a litt
Oops, my compiler didn’t catch that. Will fix that now.
> On Jun 2, 2016, at 7:07 PM, George Bosilca wrote:
>
> Nathan,
>
> I see a lot of [for once valid] complains from clang regarding the last UGNI
> related commit. More precisely the MCA_BTL_ATOMIC_SUPPORTS_FLOAT value is too
> large with
The osc hang is fixed by a PR to fix bugs in start in cm and ob1. See #1729.
-Nathan
> On Jun 2, 2016, at 5:17 AM, Gilles Gouaillardet
> wrote:
>
> fwiw,
>
> the onsided/c_fence_lock test from the ibm test suite hangs
>
> (mpirun -np 2 ./c_fence_lock)
>
> i ran a git bisect and it incrimina
release branches on a rolling basis. You're too
> late for the 2.0.x series, but the door is open for the v2.1.x series (and
> beyond, of course).
>
>
>
> > On May 30, 2016, at 11:15 AM, Nathan Hjelm wrote:
> >
> > I should clarify. The PR adds support fo
I should clarify. The PR adds support for ARM64 atomics and CMA when the linux
headers are not installed. It does not update the timer code and still needs
some testing.
-Nathan
> On May 30, 2016, at 8:37 AM, Nathan Hjelm wrote:
>
> We already have a PR open to add ARM64 support. Pl
We already have a PR open to add ARM64 support. Please test
https://github.com/open-mpi/ompi/pull/1634 and let me know if it works for you.
Additional contributions are greatly appreciated!
-Nathan
> On May 30, 2016, at 4:32 AM, Sreenidhi Bharathkar Ramesh
> wrote:
>
> Hello,
>
> We may be
Only thing I can think of is the request stuff. It was working last time I
tested George’s branch. I will take a look at MTT tomorrow.
-Nathan
> On May 26, 2016, at 8:24 PM, Ralph Castain wrote:
>
> I’m seeing a lot of onesided hangs on master when trying to run an MTT scan
> on it tonight -
add_procs is always called at least once. This is how we set up shared
memory communication. It will then be invoked on-demand for non-local
peers with the reachability argument set to NULL (because the bitmask
doesn't provide any benefit when adding only 1 peer).
-Nathan
On Tue, May 17, 2016 at
The return code of your progress function should be related to the
activity (send, recv, put, get, etc completion) on your network. The
return is not really used right now but may be meaningful in the
future.
Your BTL signals progress through two mechanisms:
1) Send completion is indicated by e
Go ahead, I don't have access to xlc so I couldn't verify myself. I
don't fully understand why the last : can be omitted when there are no
clobbers.
-Nathan
On Wed, May 04, 2016 at 01:34:48PM -0500, Josh Hursey wrote:
>Did someone pick this up to merge into master & v2.x?
>I can confirm
Should be fixed by https://github.com/open-mpi/ompi/pull/1618
Thanks for catching this.
-Nathan
On Mon, May 02, 2016 at 02:30:19PM -0700, Paul Hargrove wrote:
>I have a Pentium III Linux system which fails "make check" with:
>make[3]: Entering directory
>
> `/home/phargrov/OMPI/ope
Fixed by https://github.com/open-mpi/ompi/pull/1619
Thanks for catching this.
-Nathan
On Mon, May 02, 2016 at 01:57:07PM -0700, Paul Hargrove wrote:
>I have an x86-64/Linux system with a fairly standard install of Scientific
>Linux 6.3 (a RHEL clone like CentOS).
>However, it appear
Looks like patcher/linux needs some work on 32-bit systems. I will try
to get this fixed in the next day or two.
-Nathan
On Mon, May 02, 2016 at 03:36:35PM -0700, Paul Hargrove wrote:
>New since the last time I did testing, I now have access to Linux MIPS64
>(Cavium Octeon II) systems.
>
Thanks Paul, I will see what I can do to fix this one.
-Nathan
On Mon, May 02, 2016 at 02:30:19PM -0700, Paul Hargrove wrote:
>I have a Pentium III Linux system which fails "make check" with:
>make[3]: Entering directory
>
> `/home/phargrov/OMPI/openmpi-2.0.0rc2-linux-x86-OpenSuSE-1
at a time, and the only pml that
>could use btl's was ob1. If that is the case, how can the openib btl run
>at the same time as cm and yalla?
>
>Also, what is UD?
>
>Thanks,
> David
>
>On 04/21/2016 09:25 AM, Nathan Hjelm wrote:
>
> T
The openib btl should be able to run alongside cm/mxm or yalla. If I
have time this weekend I will get on the mustang and see what the
problem is. The best answer is to change the openmpi-mca-params.conf in
the install to have pml = ob1. I have seen little to no benefit with
using MXM on mustang.
Hah, just caught that as well. Commented on the commit on
github. Definitely looks wrong.
-Nathan
On Thu, Apr 07, 2016 at 05:43:17PM +, Dave Goodell (dgoodell) wrote:
> [inline]
>
> On Apr 7, 2016, at 12:53 PM, git...@crest.iu.edu wrote:
> >
> > This is an automated email from the git hook
This is done to provide the functionality when the compiler doesn't
support inline asm. I do not know how testing is done with the atomics
in opal/asm/base, so it's possible some of them are incorrect.
-Nathan
On Fri, Apr 01, 2016 at 02:39:39PM +0530, Sreenidhi Bharathkar Ramesh wrote:
>
The prepare_dst function was a bottleneck to providing fast one-sided
support using network RDMA. As the function was only used in the RDMA
path it was removed in favor of btl_register_mem + a more complete
put/get interface. You can look at the way the various btls moved the
functionality. The si
etti situation.
>
>Thank you
>Durga
>Life is complex. It has real and imaginary parts.
>On Fri, Mar 4, 2016 at 11:06 AM, Nathan Hjelm wrote:
>
> On Thu, Mar 03, 2016 at 05:26:45PM -0500, dpchoudh . wrote:
> >Hello all
> >
> >
On Thu, Mar 03, 2016 at 05:26:45PM -0500, dpchoudh . wrote:
>Hello all
>
>Here is a 101 level question:
>
>OpenMPI supports many transports, out of the box, and can be extended to
>support those which it does not. Some of these transports, such as
>infiniband, provide hardwar
I will add to how crazy this is. The C standard has been very careful
to not break existing code. For example the C99 boolean is _Bool not
bool because C reserves _[A-Z]* for its own use. This means a valid C89
program is a valid C99 and C11 program. It looks like this is not true in
C++.
-Nathan
Hmm, I think you are correct. There may be instances where two different
local processes may use the same CID for different communicators. It
should be sufficient to add the PID of the current process to the
filename to ensure it is unique.
-Nathan
On Tue, Feb 02, 2016 at 09:33:29PM +0900, Gille
orms with this idea.
> There are environments, for example, that use PBSPro for one part of
>the system (e.g., IO nodes), but something else for the compute
>section.
>>
>> Personally, I'd rather follow Howard's suggestion.
>
On Mon, Jan 25, 2016 at 05:55:20PM +, Jeff Squyres (jsquyres) wrote:
> Hmm. I'm of split mind here.
>
> I can see what Howard is saying here -- adding complexity is usually a bad
> thing.
>
> But we have gotten these problem reports multiple times over the years:
> someone *thinking* that
Looks like there is a missing conditional in
mca_btl_vader_component_close(). Will add it and PR to 1.10 and 2.x.
-Nathan
On Tue, Dec 15, 2015 at 11:18:11AM +0100, Justin Cinkelj wrote:
> I'm trying to port Open MPI to OS with threads instead of processes.
> Currently, during MPI_Finalize, I get
This happens because we do not currently have a way to detect
connectivity without allocating ompi_proc_t's for every rank in the
window. I added the osc_rdma_btls MCA variable to act as a short-circuit
that avoids the costly connectivity lookup. By default the value is
ugni,openib. You can set it
I think this is from a known issue. Try applying this and run again:
https://github.com/open-mpi/ompi/commit/952d01db70eab4cbe11ff4557434acaa928685a4.patch
-Nathan
On Wed, Oct 14, 2015 at 06:33:07PM +0200, Paul Kapinos wrote:
> Dear Open MPI developer,
>
> We're puzzled by reproducible perform
On Wed, Oct 14, 2015 at 02:40:00PM +0100, Vladimír Fuka wrote:
> Hello,
>
> I have a problem with using the quadruple (128bit) or extended
> (80bit) precision reals in Fortran. I did my tests with gfortran-4.8.5
> and OpenMPI-1.7.2 (preinstalled OpenSuSE 13.2), but others confirmed
> this beha
Hah, oops. Typo in the coverity fixes. Fixing now.
-Nathan
On Tue, Sep 22, 2015 at 10:24:29AM -0600, Howard Pritchard wrote:
>Hi Folks,
>Is anyone seeing a problem compiling ompi today?
>This is what I'm getting
> CC osc_pt2pt_passive_target.lo
>In file included from .
No, it was not. Will fix.
-Nathan
On Wed, Sep 16, 2015 at 07:26:58PM -0700, Ralph Castain wrote:
>Yes - Nathan made some changes related to the add_procs code. I doubt that
>configure option was checked...
>On Wed, Sep 16, 2015 at 7:13 PM, Jeff Squyres (jsquyres)
> wrote:
>
>
Not sure. I give a +1 for blowing them away. We can bring them back
later if needed.
-Nathan
On Wed, Sep 16, 2015 at 01:19:24PM -0400, George Bosilca wrote:
>As they don't even compile why are we keeping them around?
> George.
>On Wed, Sep 16, 2015 at 12:05 PM, Natha
cesses.
>
> Thanks
> Edgar
>
> On 9/16/2015 10:42 AM, Nathan Hjelm wrote:
> >
> >The reproducer is working for me with master on OS X 10.10. Some changes
> >to ompi_comm_set went in yesterday. Are you on the latest hash?
> >
> >-Nathan
> >
> &g
iboffload and bfo are opal_ignore'd by default. Neither exists in the
release branch.
-Nathan
On Wed, Sep 16, 2015 at 12:02:29PM -0400, George Bosilca wrote:
>While looking into a possible fix for this problem we should also cleanup
>in the trunk the leftover from the OMPI_FREE_LIST.
>
I just realized my branch is behind master. Updating now and will retest.
-Nathan
On Wed, Sep 16, 2015 at 10:43:45AM -0500, Edgar Gabriel wrote:
> yes, I did fresh pull this morning, for me it deadlocks reliably for 2 and
> more processes.
>
> Thanks
> Edgar
>
> On 9/16/
The reproducer is working for me with master on OS X 10.10. Some changes
to ompi_comm_set went in yesterday. Are you on the latest hash?
-Nathan
On Wed, Sep 16, 2015 at 08:49:59AM -0500, Edgar Gabriel wrote:
> something is borked right now on master in the management of inter vs. intra
> communica
The formatting of the code got all messed up. Please send a diff and I
will take a look. ompi free list no longer exists in master or the next
release branch but the change may be worthwhile for the opal free list
code.
-Nathan
On Wed, Sep 16, 2015 at 04:03:44PM +0300, Алексей Рыжих wrote:
>
+1
On Mon, Aug 24, 2015 at 07:08:02PM +, Jeff Squyres (jsquyres) wrote:
> FWIW, we have had verbal agreement in the past that the v1.8 series was the
> last one to contain MX support. I think it would be fine for all MX-related
> components to disappear from v1.10.
>
> Don't forget that M
f Squyres (jsquyres)
> wrote:
>
> (the fix has been merged in to v1.8 and v1.10 branches)
> > On Aug 20, 2015, at 12:18 PM, Nathan Hjelm wrote:
> >
> >
> > I see the problem. Both Ralph and I missed an error in the
> > che