y require the
> notion of a jobid and rank-within-that-job. If the current ones don't, I
> assure you that at least one off-trunk one definitely does
>
> Some of the MTL's, of course, definitely rely on those fields.
>
>
> On Jul 23, 2014, at 7:15 PM, George Bosilca <bo
We are talking MB, not KB, aren't we?
George.
On Thu, Jul 24, 2014 at 2:57 PM, Rolf vandeVaart
wrote:
> WHAT: Bump up the minimum sm pool size to 128K from 64K.
> WHY: When running OSU benchmark on 2 nodes and utilizing a larger
> btl_smcuda_max_send_size, we can run
All,
I take advantage of this thread to clarify what is missing to have a perfectly
MPI agnostic BTL interface. Some of these issues are pretty straightforward
(getting rid of RTE and OMPI vestiges), some others will require some thinking
from their developers in order to cope with a not
This means you are trying to initialize things too early. Most of the
information made available in opal/util/proc.h is only available once the
RTE has been set up, i.e. only after the call to rte_init. Thus, the BTL can only
use it after the init call...
George.
On Mon, Jul 28, 2014 at 1:01 PM,
So do we want to sequence the BTL interfaces between jobs or only between
local processes on the same job?
I'm also fine with removing this option if it is not in use.
George.
On Mon, Jul 28, 2014 at 1:09 PM, Ralph Castain wrote:
>
> On Jul 28, 2014, at 10:02 AM, Jeff
do_component_setup()
> >>
> >> Yuck.
> >>
> >> Is there a better way?
> >>
> >> Crazy idea: should we add more hooks during the init / setup sequence?
> E.g., a BTL component_init_after_rte_has_been_initialized() that is
> guaranteed to be called before any module functions are
such information.
Patience ...
George.
On Mon, Jul 28, 2014 at 1:38 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
> Well, I'm slightly confused as the BTL are initialized outside opal_init.
> There must be a specific call to mca_base_framework_open for the BTL, and
> curr
This has been clear from day one: everything that relies on RML for setup will
need to be rewritten. This is not only SM, it is also related to IB.
Meanwhile, one must build with dlopen enabled in order to get access to
these calls.
George.
On Mon, Jul 28, 2014 at 4:02 PM, Nathan Hjelm
rint name2
>>
>> $2 = (const orte_process_name_t *) 0xbaf76c
>>
>> (gdb) print *name2
>>
>> $3 = {jobid = 2452946945, vpid = 1}
>>
>> (gdb)
>>
>>
>>
>>
>>
>>
>>
>> >-Original Message-
>>
issue across the code base now, so we'll have to troll and fix it. I was
> doing the minimal change required to fix the trunk in the meantime.
>
> On Jul 30, 2014, at 9:06 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>
> Yes. opal_process_name_t has basically no meaning by
gs the "new" way, that's all
>
> On Jul 30, 2014, at 9:17 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>
> No, this is not going to be an issue if the opal_identifier_t is used
> correctly (aka only via the exposed accessors).
>
> George.
>
>
>
>
Why do you want to add new versions? This will lead to having two, almost
identical, sets of atomics that are conceptually equivalent but different
in terms of code. And we will have to maintain both!
I did a similar change in a fork of OPAL in another project but instead of
adding another
I can also picture an environment where different projects can supply
components that would technically belong to a framework from another
project. Let me take an example. Imagine we decide to keep the RML-based
connection setup for SM, something that is not currently possible in the OPAL
layer. In
What is your definition of “global job size”?
George.
On Jul 31, 2014, at 11:06 , Pritchard Jr., Howard wrote:
> Hi Folks,
>
> I think given the way we want to use the btl's in lower levels like opal,
> it is pretty disgusting for a btl to need to figure out on its own
I definitely think you misunderstood the scope of this RFC. The information
that is so important to you to configure the mailbox size is available to you
when you need it. This information is made available by the PML through the
call to add_procs, which comes with all the procs in the
Hjelm <hje...@lanl.gov> wrote:
>
> That is what I would prefer. I was trying to not disturb things too
> much :). Please bring the changes over!
>
> -Nathan
>
> On Wed, Jul 30, 2014 at 03:18:44PM -0400, George Bosilca wrote:
>> Why do you want to add new
...@lbl.gov> wrote:
>
>>
>> On Thu, Jul 31, 2014 at 4:13 PM, George Bosilca <bosi...@icl.utk.edu>
>> wrote:
>>
>>> Paul, I know you have a pretty diverse range of computers. Can you try to
>>> compile and run a “make check” with the following patch?
>
A missing include. Should be fixed by r32388.
Thanks,
George.
On Thu, Jul 31, 2014 at 11:15 PM, Paul Hargrove wrote:
>
> $ INST/bin/mpirun -mca btl sm,self -np 2 examples/ring_c'
> ld.so.1: ring_c: fatal: relocation error: file
>
i.org>
> <r...@open-mpi.org> wrote:
>
>
> FWIW: we had Siegmar try that and it didn't solve the problem. Paul?
>
>
> On Jul 31, 2014, at 8:28 PM, svn-commit-mai...@open-mpi.org wrote:
>
>
> Author: bosilca (George Bosilca)
> Date: 2014-07-31 23:28
Castain <r...@open-mpi.org> wrote:
>
> On Aug 1, 2014, at 8:27 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>
> This commit brings two things. One is the renaming suggested by Gilles.
> The second one is forcing the ORTE process down on the OPAL. This doesn't
> fit the
ove that assert?
>
>
> On Aug 1, 2014, at 9:30 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>
> I missed the fact that the app doesn't force it. But if this is indeed the
> case then it is extremely weird that you are seeing someone else releasing
> your proc.
>
> Regarding
Another version of the atomic patch. Paul has tested it on a bunch of
platforms. At this point we have confirmation from all architectures except
SPARC (v8+ and v9).
George.
atomics.patch
Description: Binary data
On Jul 31, 2014, at 19:13 , George Bosilca <bosi...@icl.utk.edu>
on a heterogeneous cluster,
> this is now fixed in r32425.
>
> i am not convinced i chose the most elegant way to achieve the desired
> result ...
> could you please double check this commit ?
>
> Thanks,
>
> Gilles
>
> On 2014/08/02 0:14, George Bosilca wrote:
>
>
t around this problem. I missed that he did it by
> location instead of named fields - perhaps we should do that instead?
>
As soon as we impose the ORTE naming scheme at the OPAL level (aka. the
notion of jobid and vpid) this approach will become possible.
George.
>
>
> On A
Yossi,
I think you raised an interesting corner-case, and a possible bug in the MTL
implementation. As the request is marked as complete by the CM/PML the cancel
should never succeed. As the CM/PML is forcing the completion on all bend
requests, it should also enforce that all completed
org> wrote:
>
> On Aug 5, 2014, at 10:23 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>
> On Tue, Aug 5, 2014 at 1:15 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
>> Hmmm...wouldn't that then require that you know (a) the other side is
>> little
, Paul Hargrove <phhargr...@lbl.gov> wrote:
I have confirmed that George's latest version works on both SPARC ABIs.
ARMv7 and three MIPS ABIs still pending...
-Paul
On Fri, Aug 1, 2014 at 9:40 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
Another version of the atomic patch. Paul has tested i
are of jobid and vpid but this is a bit
> more heavyweight imho.
>
> i'll try this today and make sure it works.
>
> any thoughts ?
>
> Cheers,
>
> Gilles
>
>
> On Wed, Aug 6, 2014 at 8:17 AM, Ralph Castain <r...@open-mpi.org>
> <r...@open-mpi.org>
I have an extremely vague recollection about a similar issue in the
datatype engine: on the SPARC architecture the 64 bits integers must be
aligned on a 64bits boundary or you get a bus error.
Takahiro you can confirm this by printing the value of data when signal is
raised.
George.
On Fri,
Paul's tests identified a small issue with the previous patch (a real
corner-case for ARM v5). The patch below is fixing all known issues.
Btw, there is still room for volunteers for the .asm work.
George.
On Tue, Aug 5, 2014 at 2:23 PM, George Bosilca <bosi...@icl.utk.edu>
This is a gigantic patch for an almost trivial issue. The current problem
is purely related to the fact that in a single location (nidmap.c) the
orte_process_name_t (which is a structure of 2 integers) is supposed to be
aligned based on the uint64_t requirements. Bad assumption!
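
The alignment point above can be illustrated with a minimal C sketch. It assumes a stand-in type, example_name_t, holding two 32-bit integers; that type and both helper functions are hypothetical and are not the actual ORTE definitions. Such a structure has the size of a uint64_t but is only guaranteed the alignment of its 32-bit members, so a portable conversion copies the bytes instead of casting the pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a two-integer process name; this is not the
 * real orte_process_name_t definition. */
typedef struct {
    uint32_t jobid;
    uint32_t vpid;
} example_name_t;

/* Pack the name into a 64-bit value without assuming that the structure
 * sits on a 64-bit boundary: memcpy has no alignment requirement. */
static uint64_t example_name_to_u64(const example_name_t *name)
{
    uint64_t value;
    assert(name != NULL);
    memcpy(&value, name, sizeof(value));
    return value;
}

/* Inverse operation: recover the two fields from the packed value. */
static example_name_t example_u64_to_name(uint64_t value)
{
    example_name_t name;
    memcpy(&name, &value, sizeof(name));
    return name;
}
```

Casting an example_name_t pointer to a uint64_t pointer and dereferencing it would instead rely on an alignment that the structure does not promise, which is the kind of access that can raise a bus error on strict-alignment architectures.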
Looking at the
store pointer points to the store function
> (db_hash.c:178)
> and proc is only used in memcpy at line 194, so 64 bits alignment is not
> required.
> (and comment is explicit : /* to protect alignment, copy the data across
> */
>
> that might sound pedantic, but are
r32467 should fix the problem.
George.
On Fri, Aug 8, 2014 at 1:20 PM, Jeff Squyres (jsquyres)
wrote:
> That'll do it...
>
> George: can you fix?
>
>
> On Aug 8, 2014, at 1:11 PM, Ralph Castain wrote:
>
> > I think it might be getting pulled in from
It is not that I care, but it was one of our supported platforms and we
don't usually drop support for anything without a proper RFC.
George.
On Mon, Aug 11, 2014 at 12:09 PM, Dave Goodell (dgoodell) <
dgood...@cisco.com> wrote:
> On Aug 7, 2014, at 11:37 PM, George Bosi
Dave,
We all understand your concerns. However, the current issue has nothing to
do with Nathan, the code for supporting ARMv5 is already in the patch I
submitted and that Paul validated.
What Nathan said he might take a look at is a different method for
generating assembly code, one that only
There are many differences between the trunk and 1.8 regarding the TCP BTL.
The major one I remember is that the TCP in the trunk is reporting errors
to the upper level via the callbacks attached to fragments, while the 1.8
TCP BTL doesn't.
So, I guess that once a connection to a particular
this in the 1.8 ...
On Wed, Aug 13, 2014 at 3:33 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com
> wrote:
> On Aug 13, 2014, at 12:52 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>
> > There are many differences between the trunk and 1.8 regarding the TCP
> BTL.
The atomic.h file should also be trimmed of the SPARC relic.
George.
Index: opal/include/opal/sys/atomic.h
===
--- opal/include/opal/sys/atomic.h (revision 32531)
+++ opal/include/opal/sys/atomic.h (working copy)
@@ -162,8
SHARED is only supported when the pthread library supports spinlocks,
while in all other cases it falls back to using atomic locks. Providing
support only for a small fraction of environments without reporting errors
or providing any alternative on other systems is difficult to accept.
I
Nathan,
Indeed the original design allowed for multiple usages of the same
descriptor, not concurrent as the text in the btl.h indicates but
consecutive. The MCA_BTL_FLAGS_RDMA_MATCHED flag is a weirdness needed for
Portal, and I am not sure it is currently in use anywhere in the code base.
My
The MPI standard clearly states (in 8.7.1 Allowing User Functions at
Process Termination) that the mechanism you describe is only allowed on
MPI_COMM_SELF. The most relevant part starts at line 14.
George.
On Tue, Aug 26, 2014 at 11:20 AM, Lisandro Dalcin wrote:
> Another
alc...@gmail.com> wrote:
> On 26 August 2014 21:29, George Bosilca <bosi...@icl.utk.edu> wrote:
> > The MPI standard clearly states (in 8.7.1 Allowing User Functions at
> Process
> > Termination) that the mechanism you describe is only allowed on
> > MPI_COMM_SELF. The
The proposed patch has several issues, all of them detailed on the ticket.
A correct patch as well as a broader tester are provided.
George.
On Tue, Aug 26, 2014 at 8:21 PM, Jeff Squyres (jsquyres) wrote:
> Good catch.
>
> I filed
Lisandro,
Thanks for the tester. I pushed a fix in the trunk (r32613) and I requested
a CMR for the 1.8.3.
George.
On Tue, Aug 26, 2014 at 6:53 AM, Lisandro Dalcin wrote:
> I've just installed 1.8.2, something is still wrong with
> HINDEXED_BLOCK datatypes.
>
> Please
t 2014 23:59, George Bosilca <bosi...@icl.utk.edu> wrote:
> > Lisandro,
> >
> > You rely on a feature clearly prohibited by the MPI standard. Please read
> > the entire section I pinpointed you to (8.7.1).
> >
> > There are 2 key sentences in the section.
>
It complains about 0x2b1b1ed9 being misaligned, which seems like a valid
complaint. What is the dst value when this triggers? What is
var->mbv_storage?
George.
On Thu, Sep 11, 2014 at 5:29 AM, Jeff Squyres (jsquyres) wrote:
> On Sep 10, 2014, at 4:23 PM, Jeff Squyres
Or copy the handshake protocol design of the TCP BTL...
George.
On Fri, Sep 19, 2014 at 6:23 PM, Ralph Castain wrote:
> You know, I'm almost beginning to dread opening my email in the morning
> for fear of seeing another "race condition" subject line! :-)
>
> I think the
, Sep 28, 2014 at 6:29 AM, Lisandro Dalcin <dalc...@gmail.com> wrote:
> On 22 April 2014 03:02, George Bosilca <bosi...@icl.utk.edu> wrote:
> > Btw, the proposed validator was incorrect the first printf instead of
> >
> > printf("[%d] rbuf[%d]=%2d expected:
I see no change in the topo interface in any of the patches attached to this
thread. Is there any other patch related to this discussion?
George.
> On Oct 1, 2014, at 14:52, Jeff Squyres (jsquyres) wrote:
>
>> On Oct 1, 2014, at 6:48 AM, Gilles Gouaillardet
>>
On Oct 3, 2014, at 17:06 , Howard wrote:
> Hello OMPI folks,
>
> As part of the code cleanup for release 1.9/2.0, there are several
> opal components that are on the radar for possible removal.
>
> These include:
>
> mca/btl/template (not clear anyone is maintaining
It’s a tough call. This proposal will create significant differences between
the debug and fast builds. As the entire objects will be set to zero this might
reduce bugs in the debug build, bugs that will be horribly difficult to track
in any non-debug builds. Moreover, if the structures are
Even to US attendees Atlanta might seem more appealing, as it is one hop
away from most locations and it has a reasonable weather forecast for
January/February (not as good as Dallas I concede).
George.
On Wed, Nov 5, 2014 at 1:18 PM, Jeff Squyres (jsquyres)
wrote:
> SHORT
I pushed a slightly better patch for the TCP BTL
(54ddb0aece0892dcdb1a1293a3bd3902b5f3acdc). The correct scheme would be to
OBJ_RETAIN the proc once it is attached to the btl_proc and release it upon
destruction of the btl_proc. However, for some obscure reason this doesn't
quite work, as the
> On Nov 11, 2014, at 17:13 , Jeff Squyres (jsquyres)
> wrote:
>
>> More particularly, it looks like add_procs is being called a second time
>> during MPI_Intercomm_create and being passed a process that is already
>> connected (passed into the first add_procs call). Is
Edgar,
The restriction you are facing doesn't come from Open MPI, but instead it
comes from the default behavior of how dlopen loads the .so files. As we do
not manually force the RTLD_GLOBAL flag the scope of our modules is local,
which means that the symbols defined in this library are not made
Takahiro,
Sorry for the delay in answering. Thanks for the bug report and the patch.
I applied your patch, and added some tougher tests to make sure we catch
similar issues in the future.
Thanks,
George.
On Mon, Sep 29, 2014 at 8:56 PM, Kawashima, Takahiro <
t-kawash...@jp.fujitsu.com> wrote:
The FIFO implementation doesn't look right to me. I don't have time to look
at it right now, but just looking at the push it will not correctly succeed
if two threads are pushing items at the same time.
A FIFO is a very sensitive algorithm, and should be treated accordingly.
Moreover, there is no
You can't use the PML error reporting mechanism in this particular
instance, it is too early in the setup process (in the BTL component init
function) and the PML has not set up the error callback yet.
This function is called during the MPI_Init, at a time where most of the
Open MPI infrastructure
t 12:35 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com>
> wrote:
> >
> > Oh, good catch -- thanks.
> >
> > I wouldn't call abort -- that will dump core. Just show_help() and
> exit(nonzero), I guess.
> >
> >
> > On Dec 4, 2014, at 3:31 PM, Ge
After updating to the latest master (3a14c8e), I started having issues with the
VPATH build on Mac OS X. The autogen.pl and configure succeeded but when make
is invoked I got the following error:
Making all in opal
Making all in include
/Applications/Xcode.app/Contents/Developer/usr/bin/make
y bug-report on this (and the work-around) here:
> http://www.open-mpi.org/community/lists/devel/2014/11/16371.php
>
> 2014-12-09 7:57 GMT+01:00 George Bosilca <bosi...@icl.utk.edu>:
>
>> After updating to the latest master (3a14c8e), I started having issues with
>> the VPATH build
The overall design in OMPI was that no OMPI module should be allowed to
decide if threads are on (thus it should not rely on the value
returned by opal_using_threads
during its initialization stage). Instead, they should respect the level
of thread support requested as an argument during the
question is for Pascal at Bull: why do you feel
> this earlier setting is required?
>
This might make it possible to see whether functions that require protection,
such as opal_lifo_push, will work by default or whether one should use their
atomic versions directly?
George.
>
>
> On
t I
found.
George.
>
> Cheers,
>
> Gilles
>
> On 2014/12/12 10:30, Ralph Castain wrote:
>
> Just to help me understand: I don’t think this change actually changed any
> behavior. However, it certainly *allows* a different behavior. Isn’t that
> true?
>
>
it_thread().
>
> I saw also that opal_using_threads() exists and was used by other BTLs.
>
>
>
> Maybe the solution is to find the way to set enable_mpi_threads to 0 when
> MPI_Init() is called.
>
>
>
>
>
> *De :* devel [mailto:devel-boun...@open-mpi.org] *De la part de* Ge
I also noticed a drastic increase in the number of linking warnings. This
is on 64 bits SciLinux Carbon (6.6) using the Intel compilers 14.0.0
20130728. I ran some tests and everything seems to work just fine, so this
might not be such a deal breaker.
George.
libtool: install: warning:
An opal_pmix.fence seems like a perfect replacement.
George.
On Fri, Dec 19, 2014 at 10:26 AM, Adrian Reber wrote:
> Again I am trying to get the FT code working. This time I am unsure how
> to resolve the code changes from this commit:
>
> commit
The trunk is broken:
libfabric/libfabric/include/fi.h:50:25: error: stdatomic.h: No such file or
directory
George.
On Fri, Dec 19, 2014 at 2:03 AM, Gilles Gouaillardet <
gilles.gouaillar...@iferc.org> wrote:
> Jeff,
>
> the include path is $top_srcdir/opal/mca/event/libevent2021/libevent
>
>
> Jeff Squyres
>
> jsquy...@cisco.com
>
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> --
> *From:* devel [devel-boun...@open-mpi.org] on behalf of George Bosilca [
> bosi...@icl.utk.ed
Successive alterations of the build system have made this option less relevant
and especially less meaningful. However, while removing it sounds like a
desirable cleanup, we have to keep in mind that this will enable all locks
and all memory barriers even in cases where they are not necessary
(via
so turns on the lock prefix
for the atomic operations, forcing them to always be atomic. I am not sure
that this has no unexpected side-effects on the code.
George.
>
>
> On Jan 6, 2015, at 4:12 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>
> > Successive alteration
Why do you need the memory write barrier inside the loop ?
George.
On Thu, Jan 8, 2015 at 11:16 AM, Nathan Hjelm wrote:
>
> Fixed on master. I forgot a write memory barrier in the 64-bit version
> of opal_fifo_pop_atomic.
>
> -Nathan
>
> On Thu, Jan 08, 2015 at 02:29:05PM
I have some comments about this ticket and the corresponding patch.
Honestly, the patch lacks most of the things we have talked about during
our last developers meeting. However, my main concern in this particular
email is about the SIGNAL flag.
1. The fact that currently there is little
Today's trunk compiled with icc fails to complete the check on 2 tests:
opal_lifo and opal_tree.
For opal_tree the output is:
OPAL dss:unpack: got type 9 when expecting type 3
Failure : failed tree deserialization size compare
SUPPORT: OMPI Test failed: opal_tree_t (1 of 12 failed)
and
opal_tree_item_deserialize_fn_t
> deserialize,
> - char *curr_delim,
> + volatile char *curr_delim,
> int depth)
> {
> int idx = 1, rc;
>
> On 2015/01/16 8:57, George Bosilca wrote:
>
The extent should not be part of the decision, what matters is the amount
of data to be pushed on the wire, and not its span in memory.
George.
On Mon, Jan 19, 2015 at 12:17 AM, Gilles Gouaillardet <
gilles.gouaillar...@iferc.org> wrote:
> Adrian,
>
> i just fixed this in the master
> (
>
Btw,
MPI_Type_hvector(2, 1, 0, MPI_INT, );
Is just a weird datatype. Because the stride is 0, this datatype has a memory
layout that includes 2 times the same int. I'm not sure this was indeed
intended...
George.
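
The difference between the amount of data and its span in memory can be modelled in plain C, without calling MPI. The sketch below is only an illustration under stated assumptions: example_layout_t and example_hvector_layout are hypothetical names, the stride is in bytes as for an hvector, and alignment padding of the extent is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of a strided layout: bytes of payload plus the
 * lower and upper bounds of the memory it touches. */
typedef struct {
    size_t size;   /* bytes that would go on the wire */
    ptrdiff_t lb;  /* lowest byte offset touched */
    ptrdiff_t ub;  /* one past the highest byte offset touched */
} example_layout_t;

/* Model `count` blocks of `blocklength` elements, with block starts
 * separated by `stride_bytes` bytes. */
static example_layout_t example_hvector_layout(int count, int blocklength,
                                               ptrdiff_t stride_bytes,
                                               size_t elem_size)
{
    example_layout_t out;
    ptrdiff_t block_bytes = (ptrdiff_t)blocklength * (ptrdiff_t)elem_size;
    int i;

    assert(count > 0 && blocklength > 0 && elem_size > 0);

    out.size = (size_t)count * (size_t)blocklength * elem_size;
    out.lb = 0;
    out.ub = block_bytes;
    for (i = 1; i < count; i++) {
        ptrdiff_t start = (ptrdiff_t)i * stride_bytes;
        if (start < out.lb) out.lb = start;
        if (start + block_bytes > out.ub) out.ub = start + block_bytes;
    }
    return out;
}
```

With count 2, blocklength 1, stride 0 and a 4-byte int, this model gives a size of 8 bytes but a span (ub minus lb) of only 4 bytes, because both blocks overlap the same int.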
On Mon, Jan 19, 2015 at 12:17 AM, Gilles Gouaillardet <
>
> Gilles
>
> On 2015/01/21 3:00, Jeff Squyres (jsquyres) wrote:
> > George is right -- Gilles: was this the correct solution?
> >
> > Put differently: the extent of the 20K vector created below is 4 (bytes).
> >
> >
> >
> >> On Jan 19, 2015, at 2:39 A
of datatype
entries, so the cost was prohibitive.
George.
On Wed, Jan 21, 2015 at 9:43 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com
> wrote:
> On Jan 20, 2015, at 10:10 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
> >
> > Receiving with such a datatype
a tentative fix is available at https://github.com/open-mpi/ompi/pull/355
>
> i asked Nathan to review it before it lands into the master
>
> Cheers,
>
> Gilles
>
>
> On 2015/01/22 7:08, George Bosilca wrote:
>
> Current trunk compiled with any compiler (gcc or icc)
During some experiments we have identified several major issues with coll ML
with a very recent version of Open MPI master (22ab638 Jan 20 13:21:44). Based
on the description below I consider these issues as major drawbacks that
require immediate action (or disabling coll ML by default in all
I took care of the TCP warnings.
George.
On Tue, Jan 27, 2015 at 7:20 AM, Ralph Castain wrote:
> btl_tcp_frag.c: In function 'mca_btl_tcp_frag_dump':
>
> btl_tcp_frag.c:99: warning: comparison between signed and unsigned
>
> btl_tcp_frag.c:104: warning: comparison
The RM should not be expected to read and accept the code itself, but his
role should be limited to accepting the idea behind the patch and making
sure it is compatible with the rules in place. As such, removing the
RM-approval mark is not yielding any benefits.
Moreover, based on the ideas
My feeling is that the current patch hides the symptoms without addressing
the real issue.
As a side note: The compiler incriminated in this thread works perfectly
for 128 bits atomic operations in other projects where I use atomic LIFO &
FIFO (but not the one from OMPI as I already raised my
Dave,
Based on your ompi_info.all the following bandwidths are reported on your
system:
MCA btl: parameter "btl_openib_bandwidth" (current value:
"4", data source: default, level: 5 tuner/detail, type: unsigned)
Approximate maximum bandwidth of
nt FIFOs with CAS2, and even after peer reviews some of
them turned out to be incorrect. What I am saying is that we are quick to
blame these failures on the icc compiler, while we have no formal proof
that the FIFO algorithm in Open MPI is correct.
George.
>
> Cheers,
>
> Gilles
>
ks with recent icc, i would not go "all in" with this ...
>
> And as you pointed, even if the problem does come from the compiler, that
> does not mean ompi algo are necessarily correct.
>
> Cheers,
>
> Gilles
>
> George Bosilca <bosi...@icl.utk.edu>
I did alter these to 40960 and 10240 as someone else
>> suggested to me. The attached graph shows the base red line, along with
>> the manual balanced blue line and auto balanced green line (0's for both).
>> This shift lower suggests to me that the higher TCP latency
Seriously?
George.
On Thu, Feb 12, 2015 at 1:00 PM, Nathan Hjelm wrote:
>
> I think I see the issue. Looks like there is a missing memory barrier
> after the head consistency code. I will add one and see if that fixes
> your problem.
>
> BTW, I can't reproduce the issue on
While looking at the MPI_Testany issue, I came across a very unsettling
sentence in the MPI standard (3.0 page 58 line 36).
> The array is indexed from zero in C, and from one in Fortran.
This sentence seems to indicate that the index returned by the TestAny and
TestSome (as well as the
Sorry but I miss the connection between this test and the issue of TestAny
in Fortran?
On Thu, Feb 19, 2015 at 2:00 PM, Dave Goodell (dgoodell) <dgood...@cisco.com
> wrote:
> On Feb 19, 2015, at 10:15 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>
> > While looking
> George,
> >
> > this is correctly handled in ompi_testany_f :
> >
> > /* Increment index by one for fortran conventions. Note that
> >all Fortran compilers have FALSE==0; we just need to check
> >for any nonzero value (because T
> should be.
>> >
>> >Dave
>> >
>> > On Wed, Feb 11, 2015 at 11:00 AM, <devel-requ...@open-mpi.org> wrote:
>> > Send devel mailing list submissions to
>> > de...@open-mpi.org
>> >
>> >
Just pushed some fixes into the trunk. However, the naming of the MCA
variable for verbs fork is not following our usual requirements. I guess
the code owners should address this topic.
George.
On Thu, Feb 26, 2015 at 4:52 PM, Howard Pritchard
wrote:
> Hi Folks,
>
>
A better fix is underway. One that will be checked on a verbs-enabled
environment.
George
On Thu, Feb 26, 2015 at 5:08 PM, Jeff Squyres (jsquyres) wrote:
> Howard --
>
> It looks like https://github.com/open-mpi/ompi/pull/415 was merged before
> it was ready. George
synonyms (and marked them as deprecated).
George.
On Thu, Feb 26, 2015 at 5:15 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
> A better fix is underway. One that will be checked on a verbs-enabled
> environment.
>
> George
>
>
> On Thu, Feb 26, 2015 at 5:08
I answered to the PR but I'll bring my comment here as well. In addition to
the performance implication, there might be a correctness implication here.
del_procs does not have to be called globally by all participating
processes at the same time, and can be called with a subset of processes. As an
Gilles,
I might misread this commit, but the changes proposed here do not look correct
to me. At no point can new_comm be equal to MPI_COMM_NULL in this code
(especially not at line 172 in topo_base_cart_create.c).
George.
> On Mar 9, 2015, at 02:26 , git...@crest.iu.edu wrote:
>
Do you have the same behavior when you disable the vader BTL ? (--mca btl
^vader).
George.
On Fri, Mar 13, 2015 at 2:20 PM, Orion Poplawski
wrote:
> We currently have openmpi-1.8.4-99-20150228 built in Fedora Rawhide. I'm
> now
> seeing crashes/hangs when running the
I had the same impression but then I went and read the Intel documentation
and xchg is one of these exceptions where the lock is implicit.
George.
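
As a hedged illustration of the exchange discussed above, here is a minimal test-and-set lock written with C11 atomics instead of inline assembly. The example_* names are hypothetical and this is not the Open MPI atomics code. On x86, compilers typically implement atomic_exchange on a memory operand with xchg, which is atomic without an explicit lock prefix.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical spinlock built on a single atomic exchange. */
typedef struct {
    atomic_int locked;  /* 0 = free, 1 = held */
} example_spinlock_t;

static void example_lock_init(example_spinlock_t *lock)
{
    assert(lock != NULL);
    atomic_init(&lock->locked, 0);
}

/* Try to take the lock: swap in 1 and look at the previous value.
 * Returns 1 if the lock was free and is now held by the caller. */
static int example_trylock(example_spinlock_t *lock)
{
    return atomic_exchange_explicit(&lock->locked, 1,
                                    memory_order_acquire) == 0;
}

/* Release the lock with a store that publishes earlier writes. */
static void example_unlock(example_spinlock_t *lock)
{
    atomic_store_explicit(&lock->locked, 0, memory_order_release);
}
```

A second example_trylock on a held lock returns 0 until example_unlock is called, since the exchange always observes the previously stored value.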
On Wed, Mar 25, 2015 at 4:59 PM, Dave Goodell (dgoodell) wrote:
> On Mar 25, 2015, at 3:02 PM, git...@crest.iu.edu wrote:
>