Re: [OMPI devel] SM component init unload

2012-07-03 Thread George Bosilca
Juan, Something weird is going on there. The selection mechanism for the SM coll and SM BTL should be very similar. However, the SM BTL successfully selects itself while the SM coll fails to determine that all processes are local. In the coll SM the issue is that the remote procs do not have

Re: [OMPI devel] non-blocking barrier

2012-07-06 Thread George Bosilca
No, it is not right. With the ibarrier usage you're making below, the output should be similar to the first case (all should leave at the earliest at 6.0). The ibarrier is still a synchronizing point, all processes MUST reach it before anyone is allowed to leave. However, if you move the ibarrier

Re: [OMPI devel] SM component init unload

2012-07-06 Thread George Bosilca
mand line (with tuned too). >>> I am not going to use this release in production, only for playing with the >>> code :-) >>> >>> Regards, >>> Juan Antonio. >>> >>> El 04/07/2012, a las 02:59, George Bosilca escribió: >>> >>>> Ju

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26801 - trunk/ompi/include

2012-07-19 Thread George Bosilca
vision so the trunk > remains buildable? > > > On Jul 18, 2012, at 7:23 AM, svn-commit-mai...@open-mpi.org wrote: > >> Author: bosilca (George Bosilca) >> Date: 2012-07-18 10:23:23 EDT (Wed, 18 Jul 2012) >> New Revision: 26801 >> URL: https://svn.open-mpi.

Re: [OMPI devel] Existing frameworks for remote device memory exclusive read/write

2012-07-23 Thread George Bosilca
Dima, A while back we investigated the potential of a memcpy module in the OPAL layer. We had some proof of concept, but finally didn't go forward due to lack of resources. However, the skeleton of the code is still in the trunk (in opal/mca/memcpy). While I don't think it will cover all

[OMPI devel] Blame the compiler …

2012-07-23 Thread George Bosilca
These compiler guys that enforce standards with random limitations because they understand the benefit of never-ending "help" messages … ;) george show_help_lex.c:1185: warning: 'input' defined but not used ../../../../ompi/opal/mca/hwloc/base/hwloc_base_open.c: In function

Re: [OMPI devel] [patch] MPI_Cancel should not cancel a request if it has a matched recv frag

2012-07-26 Thread George Bosilca
Takahiro, Indeed we were way too lax on canceling the requests. I modified your patch to correctly deal with the MEMCHECK macro (remove the call from the branch that will require a completion function). The modified patch is attached below. I will commit asap. Thanks, george. Index:

Re: [OMPI devel] [patch] MPI_Cancel should not cancel a request if it has a matched recv frag

2012-07-26 Thread George Bosilca
some manner, or else there will be a gap in the arriving sequence > numbers, and the matching logic will prevent any further progress. > > Rich > > -Original Message- > From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On > Behalf Of George Bosilca >

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26868 - in trunk/orte/mca/plm: base rsh

2012-07-26 Thread George Bosilca
r26868 seems to have some issues. It works well as long as all processes are started on the same node (aka. there is a single daemon), but it breaks with the error message attached below if there are more than two daemons. $ mpirun -np 2 --bynode ./runme [node01:07767] [[21341,0],1]

[OMPI devel] The hostfile option

2012-07-27 Thread George Bosilca
I'm somewhat puzzled by the behavior of the -hostfile in Open MPI. Based on the FAQ it is supposed to provide a list of resources to be used by the launcher (in my case ssh) to start the processes. Makes sense so far. However, if the configuration file contains a value for orte_default_hostfile,

Re: [OMPI devel] The hostfile option

2012-07-30 Thread George Bosilca
en hostfile, but weren't in the default > hostfile > > And subsequently do the same for -host. I think that would retain the spirit > of the discussion, but provide more flexibility and provide a tad more > "expected" behavior. > > I don't have an iron in this fire

Re: [OMPI devel] The hostfile option

2012-07-31 Thread George Bosilca
On Jul 30, 2012, at 15:29 , Ralph Castain wrote: > > On Jul 30, 2012, at 2:37 AM, George Bosilca wrote: > >> I think that as long as there is a single home area per cluster the >> difference between the different approaches might seem irrelevant to most of >> the

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r27161 - trunk/orte/mca/grpcomm/base

2012-08-30 Thread George Bosilca
A strange race condition happening for undisclosed reasons, and only fixable by replication, is jeopardizing our reference count system. That sounds definitely almost scary (!) I think that the proposed solution is just a band-aid. It somehow fixes this particular instance of the issue but

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r27161 - trunk/orte/mca/grpcomm/base

2012-08-30 Thread George Bosilca
On Aug 30, 2012, at 17:15 , Ralph Castain <r...@open-mpi.org> wrote: > > On Aug 30, 2012, at 8:10 AM, George Bosilca <bosi...@eecs.utk.edu> wrote: > >> A strange race condition happening for undisclosed reasons, and only fixable >> by replication is jeopar

Re: [OMPI devel] RFC: hwloc object userdata

2012-10-03 Thread George Bosilca
In case such functionality becomes necessary, I would suggest we use a mechanism similar to the attributes in MPI (but without the multi-language mess). That will allow whoever wants to attach data to a hwloc node to do it without the mess of dealing with reserving a slot. It might require

Re: [OMPI devel] MPI_Reduce Hangs in my Application

2012-10-10 Thread George Bosilca
Your code works for me on two platforms. Thus, I guess the problem is with the communication layer (BTL) in Open MPI. What network do you use? If Ethernet, how many interfaces? Thanks, george. On Oct 10, 2012, at 09:30 , Santhosh Kokala wrote: > I have a

Re: [OMPI devel] MPI_Reduce Hangs in my Application

2012-10-10 Thread George Bosilca
38 errors:0 dropped:0 overruns:0 carrier:0 > collisions:0 txqueuelen:1000 > RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) > > From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On > Behalf Of George Bosilca > Sent: Wednesday, October 10, 2012

Re: [OMPI devel] [patch] Invalid MPI_Status for null or inactive request

2012-10-15 Thread George Bosilca
Takahiro, I fail to see the cases your patch addresses. I recognize I did not have the time to look over all the instances where we deal with persistent inactive requests, but at the first occurrence, the one in req_test.c line 68, the case you exhibit there is already covered by the test

Re: [OMPI devel] Cross Memory Attach: What am I Missing?

2012-10-18 Thread George Bosilca
Check the permissions granted by pam. Look in the /etc/security to check for any type of restrictions. george. On Oct 17, 2012, at 23:30 , "Gutierrez, Samuel K" wrote: > Hi, > > I'm trying to run with CMA support, but process_vm_readv is failing with > EPERM when trying

Re: [OMPI devel] [patch] SEGV on processing unexpected messages

2012-10-18 Thread George Bosilca
Takahiro, Nice catch. A nicer fix will be to check the type of the header, and copy the header accordingly. Attached is a patch following this idea. Thanks, george. hdr_copy.patch Description: Binary data On Oct 18, 2012, at 03:06 , "Kawashima, Takahiro"

Re: [OMPI devel] [OMPI svn] svn:open-mpi r27451 - in trunk: ompi/mca/allocator/bucket ompi/mca/bcol/basesmuma ompi/mca/bml/base ompi/mca/btl ompi/mca/btl/base ompi/mca/btl/openib ompi/mca/btl/sm ompi/

2012-10-24 Thread George Bosilca
I have some issues starting my applications lately. Here is an example: mpirun -x LD_LIBRARY_PATH -np 8 -hostfile /etc/hostfile -bynode ./testing_dtrmm -N 4000 -p 4 -x And the corresponding output: /home/bosilca/opt/trunk/debug/bin/orted: Error: unknown option "-p" And then the daemons

Re: [OMPI devel] Multirail + Open MPI 1.6.1 = very big latency for the first communication

2012-11-01 Thread George Bosilca
It will depend on the protocol used by the OpenIB BTL to wire up the peers (OOB, UDCM, RDMACM). In the worst case (OOB), the connection process will be done using TCP. We are looking at a handshake (over TCP 40 ms latency for a one-way message is standard, the handshake will take at least

Re: [OMPI devel] RFC: fix frameworks usage of opal_output

2012-11-01 Thread George Bosilca
On Nov 1, 2012, at 16:18 , Nathan Hjelm <hje...@lanl.gov> wrote: > On Thu, Nov 01, 2012 at 04:07:32PM -0400, George Bosilca wrote: >> Nathan, >> >> Here is a quick question regarding the topo framework. >> >> - The mca_topo_base_output is opened

Re: [OMPI devel] RFC: fix frameworks usage of opal_output

2012-11-01 Thread George Bosilca
On Nov 1, 2012, at 19:07 , Nathan Hjelm wrote: > I was going to address this second inconsistency with another patch but now > seems like a good time to see if anyone has an opinion about how this > should be fixed. I can think of two simple fixes: > 1) Since

Re: [OMPI devel] RFC: fix various leaks in trunk (touches coll/ml, vprotocol, pml/v, btl/openib, and mca/base)

2012-11-05 Thread George Bosilca
+1! george. On Nov 5, 2012, at 18:59 , Jeff Squyres wrote: > +1 on the ompi/mca/btl/openib/btl_openib_mca.c and > opal/mca/base/mca_base_param.c. > > I didn't check the others. > > > On Nov 5, 2012, at 6:31 PM, Nathan Hjelm wrote: > >> What: I used valgrind on

Re: [OMPI devel] [OMPI svn] svn:open-mpi r27580 - in trunk: ompi/mca/btl/openib ompi/mca/btl/wv ompi/mca/coll/ml opal/util/keyval orte/mca/rmaps/rank_file

2012-12-03 Thread George Bosilca
I remember there were some discussions about lex (or flex) and their version, but I don't remember the specifics. Whatever the outcome was, we're back at having a problem there, more specifically a missing reference (opal_util_keyval_yylex_destroy) which seems to indicate the issue was not

Re: [OMPI devel] [OMPI svn] svn:open-mpi r27580 - in trunk: ompi/mca/btl/openib ompi/mca/btl/wv ompi/mca/coll/ml opal/util/keyval orte/mca/rmaps/rank_file

2012-12-03 Thread George Bosilca
&@ redhat) and cleans up the lex state correctly in modern flex. It > should be done in the next day or so. > > -Nathan > > On Monday, December 03, 2012 6:28 PM, devel-boun...@open-mpi.org > [devel-boun...@open-mpi.org] on behalf of George Bosilca > [bosi...@icl.utk.edu]

Re: [OMPI devel] [OMPI svn] svn:open-mpi r27580 - in trunk: ompi/mca/btl/openib ompi/mca/btl/wv ompi/mca/coll/ml opal/util/keyval orte/mca/rmaps/rank_file

2012-12-04 Thread George Bosilca
pal/util/keyval/keyval_lex.c? If that works you might want to run > configure/make from an empty directory. > > -Nathan > > On Monday, December 03, 2012 6:28 PM, devel-boun...@open-mpi.org > [devel-boun...@open-mpi.org] on behalf of George Bosilca > [bosi...@icl.utk.edu] wrote

Re: [OMPI devel] CRIU checkpoint support in Open-MPI?

2012-12-06 Thread George Bosilca
Samuel, Yes, all contributions are welcomed. It should be almost trivial to write a new backend in Open MPI to support what the kernel developers will agree to add as C/R capabilities. A good starting point to look at are the existing modules in opal/mca/crs. george. On Dec 6, 2012, at

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r27739 - in trunk: ompi/mca/btl/sm ompi/mca/common/sm ompi/mca/mpool/sm opal/mca/shmem opal/mca/shmem/mmap opal/mca/shmem/posix opal/mca/shmem/sysv opal/m

2013-01-04 Thread George Bosilca
Sam, This is a major change and would have deserved an RFC, as it imposes a drastic/major non-scalable change (up to now the backend file creation was centralized, now in addition we exchange the data through the modex). A quick look highlights the fact that quite a lot of new modex entries have

Re: [OMPI devel] [OMPI svn] svn:open-mpi r27744 - trunk/ompi/runtime

2013-01-04 Thread George Bosilca
pi.org> wrote: > >> Whoa - that function is used, I believe, to retrieve the pointer to the >> hostname info in the ompi_proc_t >> >> >> On Jan 4, 2013, at 12:50 PM, svn-commit-mai...@open-mpi.org wrote: >> >>> Author: bosilca (George Bosilca)

Re: [OMPI devel] [OMPI svn] svn:open-mpi r27744 - trunk/ompi/runtime

2013-01-04 Thread George Bosilca
tion that completes the set. Or > let's set a policy and go thru every class and framework defined in > opal/orte/ompi and remove all APIs that aren't currently used - after all, we > can restore those from svn someday too, can't we? > > > On Jan 4, 2013, at 1:18 PM, George Bos

Re: [OMPI devel] [OMPI svn] svn:open-mpi r27744 - trunk/ompi/runtime

2013-01-04 Thread George Bosilca
their local areas. >>> >>> I'd prefer to have some currently-unused function that completes the set. >>> Or let's set a policy and go thru every class and framework defined in >>> opal/orte/ompi and remove all APIs that aren't currently used - after all, >>> we c

[OMPI devel] mpirun @ 100%

2013-01-07 Thread George Bosilca
I just noticed that mpirun (r27751) is taking 100% of CPU even for apps with no output. George.

Re: [OMPI devel] "Open MPI"-based MPI library used by K computer

2013-01-10 Thread George Bosilca
>> http://www.fujitsu.com/downloads/MAG/vol48-3/paper11.pdf >>> >>> What are the criteria for adding papers to the "Open MPI Publications" page? >>> >>> Rayson >>> >>> == >>> Open

Re: [OMPI devel] mpirun @ 100%

2013-01-15 Thread George Bosilca
, 2013, at 1:30 AM, George Bosilca <bosi...@icl.utk.edu> wrote: > >> I just noticed that mpirun (r27751) is taking 100% of CPU even for apps with >> no output. >> >> George. >> ___ >> devel mailing list >>

Re: [OMPI devel] [patch] MPI-2.2: Ordering of attribution deletion callbacks on MPI_COMM_SELF

2013-01-17 Thread George Bosilca
Takahiro, Thanks for the patch. I deplore the loss of the hash table in the attribute management, as the potential of transforming all attribute operations to a linear complexity is not very appealing. As you already took the decision C, it means that at the communicator destruction stage the

Re: [OMPI devel] MPI-2.2 status #2223, #3127

2013-01-18 Thread George Bosilca
Takahiro, The MPI_Dist_graph effort is happening in ssh://h...@bitbucket.org/bosilca/ompi-topo. I would definitely be interested in seeing some test cases, and giving this branch a tough test. George. On Jan 18, 2013, at 02:43 , "Kawashima, Takahiro" wrote: >

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-18 Thread George Bosilca
Luckily for us all the definitions contain the same constant (orte). r27864 should fix this. George. On Jan 18, 2013, at 06:21 , Paul Hargrove wrote: > My employer has a nice new Cray XC30 (aka Cascade), and I thought I'd give > Open MPI a quick test. > > Given that

Re: [OMPI devel] MPI-2.2 status #2223, #3127

2013-01-18 Thread George Bosilca
org/jsquyres/ompi-topo-fixes-fixed? Or did you effectively > fork, and you guys will put back to SVN when you're done? > > > On Jan 18, 2013, at 5:47 AM, George Bosilca <bosi...@icl.utk.edu> > wrote: > >> Takahiro, >> >> The MPI_Dist_graph effort i

Re: [OMPI devel] MPI-2.2 status #2223, #3127

2013-01-18 Thread George Bosilca
m the topo-fixes-fixed repo? IIRC, there was some other > fixes/updates to the topo base in there, not just the new dist_graph > improvements. > > > On Jan 18, 2013, at 11:06 AM, George Bosilca <bosi...@icl.utk.edu> > wrote: > >> It's a fork from the official ompi

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r27881 - trunk/ompi/mca/btl/tcp

2013-01-22 Thread George Bosilca
rge -- > > Similar question on this one: should it be CMR'ed to v1.7? (I kinda doubt > it's appropriate for v1.6) > > > On Jan 21, 2013, at 6:41 AM, svn-commit-mai...@open-mpi.org wrote: > >> Author: bosilca (George Bosilca) >> Date: 2013-01-21 06:41:08 EST (Mo

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r27880 - trunk/ompi/request

2013-01-22 Thread George Bosilca
o.com> wrote: > George -- > > Is there any reason not to CMR this to v1.6 and v1.7? > > > On Jan 21, 2013, at 6:35 AM, svn-commit-mai...@open-mpi.org wrote: > >> Author: bosilca (George Bosilca) >> Date: 2013-01-21 06:35:42 EST (Mon, 21 Jan 2013) >> New Rev

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r27881 - trunk/ompi/mca/btl/tcp

2013-01-23 Thread George Bosilca
tionality, > and target that stuff for v1.7? Or should all of this just wait until 1.9? > > (I don't really care either way; I'm asking out of curiosity) > > > On Jan 22, 2013, at 7:24 PM, George Bosilca <bosi...@icl.utk.edu> wrote: > >> Nobody cared about erro

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r27881 - trunk/ompi/mca/btl/tcp

2013-01-24 Thread George Bosilca
http://fault-tolerance.org/ George. On Wed, Jan 23, 2013 at 5:10 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote: > On Jan 23, 2013, at 10:27 AM, George Bosilca <bosi...@icl.utk.edu> wrote: > >> While we always strive to improve this functionality, it was ava

Re: [OMPI devel] Open MPI on Cray XC30 - suspicous configury

2013-01-28 Thread George Bosilca
What Paul is saying is that there is a path mismatch between the two cases. A few lines above, using_cle5_install is only set to yes if /usr/lib/alps/libalps.a exists. Then in the snippet pasted in Paul's email if using_cle5_install is yes then you set the orte_check_alps_libdir to something in

Re: [OMPI devel] [OMPI bugs] [Open MPI] #3489: Move r27954 to v1.7 branch

2013-01-28 Thread George Bosilca
Ralph, What if I say it wasn't a "stale" option nobody cares about. You just removed one of the critical pieces of the configury, completely disabling the work of other people. I am absolutely sorry that I didn't make it in the 27 minutes you generously provided for comments. Removing from the

Re: [OMPI devel] [OMPI svn] svn:open-mpi r28016 - trunk/ompi/mca/btl/tcp

2013-02-01 Thread George Bosilca
Jeff, So far, all interfaces specified via MCA parameters for the BTL TCP are required to exist. Otherwise an error message is printed and an error returned to the upper level, with the intent that no BTLs of this type will be enabled (as an example btl_tcp_component.c:682). If I correctly

Re: [OMPI devel] [OMPI svn] svn:open-mpi r28016 - trunk/ompi/mca/btl/tcp

2013-02-01 Thread George Bosilca
George. On Fri, Feb 1, 2013 at 6:50 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote: > On Feb 1, 2013, at 6:28 PM, George Bosilca <bosi...@icl.utk.edu> wrote: > >> So far, all interfaces specified via MCA parameters for the BTL TCP >> are required to

Re: [OMPI devel] [OMPI svn] svn:open-mpi r28016 - trunk/ompi/mca/btl/tcp

2013-02-04 Thread George Bosilca
If it ain't broke, don't fix it. I am more than skeptical about the interest of this new notation. The two behaviors you describe for include and exclude do not look conflicting to me. Inclusion is a strong request, the user enforces the usage of a specific interface. If the interface is not

Re: [OMPI devel] [OMPI svn] svn:open-mpi r28029 - trunk/opal/class

2013-02-04 Thread George Bosilca
Ralph, There are valid reasons why we decided not to add such macros. Adding elements to a list does not increase the element ref count. Similarly, removing an element from a list does not decrease its refcount either. Thus, there is no obvious link between the refcount of the elements in a list
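
The point about list membership and reference counts can be illustrated with a minimal sketch. The types and functions below (item_t, list_t, list_push, list_pop) are hypothetical stand-ins written for this illustration; they are not the OPAL opal_list_t or opal_object_t definitions. The sketch only assumes what the message states: insertion and removal leave the refcount untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted list element (not the OPAL types). */
typedef struct item {
    int refcount;
    struct item *next;
} item_t;

typedef struct {
    item_t *head;
    size_t length;
} list_t;

/* A freshly constructed item is owned once by its creator. */
static void item_init(item_t *it)
{
    assert(it != NULL);
    it->refcount = 1;
    it->next = NULL;
}

/* Linking an item into the list does NOT retain it. */
static void list_push(list_t *l, item_t *it)
{
    it->next = l->head;
    l->head = it;
    l->length++;
}

/* Unlinking an item does NOT release it; returns NULL on an empty list. */
static item_t *list_pop(list_t *l)
{
    item_t *it = l->head;
    if (it != NULL) {
        l->head = it->next;
        it->next = NULL;
        l->length--;
    }
    return it;
}
```

With this model, an item pushed and later popped still has the refcount it started with, so a macro that releases every element while emptying a list would only be correct if the list was the sole owner of each element.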

Re: [OMPI devel] [OMPI svn] svn:open-mpi r28016 - trunk/ompi/mca/btl/tcp

2013-02-04 Thread George Bosilca
On Mon, Feb 4, 2013 at 8:45 PM, Jeff Squyres (jsquyres) wrote: > That will still be quite difficult to do in MTT. Remember: all the tests > that are run in MTT are shared across all of us via the ompi-tests SVN repo. > Are you suggesting that I alias every test in the

Re: [OMPI devel] MCA variable system slides and notes

2013-02-05 Thread George Bosilca
The major benefit of the second method is that it has the obvious potential to save us some memory. Not much I guess, but somewhere on the order of a few KB. But in order to save this memory, the originator must keep a pointer to the data in order to be able to free it after the mca_params

Re: [OMPI devel] mpi/java question

2013-02-20 Thread George Bosilca
What is wrong with MPI_INT64_T? (MPI 3.0 standard, page 26.) George. On Feb 20, 2013, at 21:12 , Ralph Castain wrote: > > On Feb 20, 2013, at 12:08 PM, Dmitri Gribenko wrote: > >> On Wed, Feb 20, 2013 at 10:05 PM, Ralph Castain

Re: [OMPI devel] v1.7.0rc7

2013-02-26 Thread George Bosilca
These warnings are now fixed (r28106). Thanks for reporting them. George. On Feb 26, 2013, at 04:27 , marco atzeri wrote: > CC to_self.o > /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c: > In function

Re: [OMPI devel] Open MPI BTL meeting in Knoxville

2013-03-05 Thread George Bosilca
to forward these notes to everyone. Here's some notes > from the BTL meeting we had in Knoxville a few weeks ago. > >> Date: >> Feb. 12, 2013 >> >> People: >> Thomas Herault >> George Bosilca >> Jeff Squyres >> Brian Barrett >> A

Re: [OMPI devel] RFC: assert() to ensure OBJ_CONSTRUCT'ed objects don't get destroyed

2013-03-07 Thread George Bosilca
Please refrain from doing so, the assumption #1 this patch is based on is false. First, OBJ_CONSTRUCT can be run to construct a specific type of object in a preallocated memory region (not only on the stack or heap). In fact, it is the only way we can dynamically initialize an object in a

Re: [OMPI devel] RFC: assert() to ensure OBJ_CONSTRUCT'ed objects don't get destroyed

2013-03-08 Thread George Bosilca
On Mar 8, 2013, at 11:55 , Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote: > On Mar 7, 2013, at 7:37 PM, George Bosilca <bosi...@icl.utk.edu> wrote: > >> An example will be a memory region without a predefined size, that I >> manipulate as opal_list_item_t

Re: [OMPI devel] RFC: assert() to ensure OBJ_CONSTRUCT'ed objects don't get destroyed

2013-03-08 Thread George Bosilca
, 2013, at 02:19 , Ralph Castain <r...@open-mpi.org> wrote: > > On Mar 7, 2013, at 4:37 PM, George Bosilca <bosi...@icl.utk.edu> wrote: > >> Please refrain from doing so, the assumption #1 this patch is based on is >> false. First, OBJ_CONSTRUCT can be run to constr

Re: [OMPI devel] RFC: assert() to ensure OBJ_CONSTRUCT'ed objects don't get destroyed

2013-03-08 Thread George Bosilca
quests); and then ompi_freelist.c:86. George. > > > > On Mar 8, 2013, at 7:22 AM, George Bosilca <bosi...@icl.utk.edu> wrote: > >> I'm sorry Ralph, I'm puzzled by your approach. You knowingly use a broken >> example to justify a patch that under correct/consistent usage

Re: [OMPI devel] RFC: assert() to ensure OBJ_CONSTRUCT'ed objects don't get destroyed

2013-03-08 Thread George Bosilca
On Mar 8, 2013, at 15:56 , "Jeff Squyres (jsquyres)" <jsquy...@cisco.com> wrote: > On Mar 8, 2013, at 9:39 AM, George Bosilca <bosi...@icl.utk.edu> wrote: > >> I have a more advanced use case for you. Based on the MPI standard, >> MPI_Finalize can be call

Re: [OMPI devel] RFC: assert() to ensure OBJ_CONSTRUCT'ed objects don't get destroyed

2013-03-08 Thread George Bosilca
"Jeff Squyres (jsquyres)" <jsquy...@cisco.com> wrote: > On Mar 8, 2013, at 10:20 AM, George Bosilca <bosi...@icl.utk.edu> wrote: > >>> If the app REQUEST_FREE'd a nonblocking send/receive, don't we block in >>> ompi_mpi_finalize() before th

Re: [OMPI devel] RFC: assert() to ensure OBJ_CONSTRUCT'ed objects don't get destroyed

2013-03-08 Thread George Bosilca
On Mar 8, 2013, at 17:37 , "Jeff Squyres (jsquyres)" wrote: > He removed a bunch of text in the middle (see > https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/143). In short: there is > NO way for a user to know when a REQUEST_FREEd request has completed, because >

Re: [OMPI devel] assert in opal_datatype_is_contiguous_memory_layout

2013-04-08 Thread George Bosilca
Eric, Thanks for the report. I used your example to replicate the issue and I confirm it appears in all versions in debug mode. However, the assert in the convertor code is correct and your code as well. The issue is more complex, and it is triggered by a usage of the convertor which should

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r28319 - trunk/opal/datatype

2013-04-10 Thread George Bosilca
-mai...@open-mpi.org wrote: > >> Author: bosilca (George Bosilca) >> Date: 2013-04-09 19:01:54 EDT (Tue, 09 Apr 2013) >> New Revision: 28319 >> URL: https://svn.open-mpi.org/trac/ompi/changeset/28319 >> >> Log: >> Fix an issue identified by Thom

Re: [OMPI devel] Bugfix for pending zero byte packages

2013-04-25 Thread George Bosilca
Sure, it should be included in the 1.6 as well. George. On Apr 25, 2013, at 03:39 , Jeff Squyres (jsquyres) wrote: > Ok; thanks. > > It looks like this should go to v1.6, too -- right (Nathan/George/Brian)? > > > > On Apr 24, 2013, at 9:31 PM, Ralph Castain

Re: [OMPI devel] [EXTERNAL] Developer meeting: mid/late summer?

2013-04-27 Thread George Bosilca
I would but that particular week I'm teaching a summer school. Hopefully you can setup a webex. George. On Apr 27, 2013, at 00:21 , "Jeff Squyres (jsquyres)" wrote: > Ok, we can probably do this. > > Is anyone else interested? > > > On Apr 24, 2013, at 1:25 PM,

Re: [OMPI devel] [OMPI svn] svn:open-mpi r28417 - trunk/ompi/mca/vprotocol/base

2013-04-30 Thread George Bosilca
This commit broke the trunk. George. On Apr 30, 2013, at 17:21 , svn-commit-mai...@open-mpi.org wrote: > Author: hjelmn (Nathan Hjelm) > Date: 2013-04-30 11:21:42 EDT (Tue, 30 Apr 2013) > New Revision: 28417 > URL: https://svn.open-mpi.org/trac/ompi/changeset/28417 > > Log: > vprotocol:

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r27880 - trunk/ompi/request

2013-04-30 Thread George Bosilca
t;>> MPI development team, >>> Fujitsu >>> >>>> To be honest it was hanging in one of my repos for some time. If I'm not >>>> mistaken it is somehow related to one active ticket (but I couldn't find >>>> the info). It might be good to pu

Re: [OMPI devel] Datatype initialization bug?

2013-05-17 Thread George Bosilca
Takahiro, Nice catch, I really wonder how this one survived for so long. I pushed a patch in r28535 addressing this issue. It is not the best solution, but it provides an easy way to address the issue. A little bit of history. A datatype is composed of (let's keep it short) 2 components, a

Re: [OMPI devel] Datatype initialization bug?

2013-05-22 Thread George Bosilca
Takahiro, I used your second patch, the one that removes the copy of the description in the OMPI level (r28553). Thanks for your help and your patience in investigating these issues. George. On May 22, 2013, at 02:05 , "Kawashima, Takahiro" wrote: > George, > >

Re: [OMPI devel] RFC: Add static initializer for opal_mutex_t

2013-06-08 Thread George Bosilca
static initializer because there > was no static initializer for mutexes in windows. > > > On Jun 7, 2013, at 9:28 AM, George Bosilca <bosi...@icl.utk.edu> wrote: > >> I'm curious to know why Windows support is to be blamed for the lack of such >> functionality? >

Re: [OMPI devel] RFC: Add static initializer for opal_mutex_t

2013-06-10 Thread George Bosilca
On Jun 10, 2013, at 17:18 , Nathan Hjelm <hje...@lanl.gov> wrote: > On Sat, Jun 08, 2013 at 12:28:02PM +0200, George Bosilca wrote: >> All Windows objects that are managed as HANDLES can easily be modified to >> have a static initializer. A clean solution is att

Re: [OMPI devel] RFC: improve the hash function used by opal_hash_table_t

2013-06-11 Thread George Bosilca
The one-at-a-time version computes on chars; if the performance of the hash function is a critical element in the equation then you will be better off avoiding its usage. I would suggest going with Murmur (http://en.wikipedia.org/wiki/MurmurHash) instead, which is faster and performs well in
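
For reference, a minimal sketch of Bob Jenkins' one-at-a-time hash shows why it "computes on chars": the loop body consumes one byte of the key per iteration. The function name oaat_hash is chosen for this illustration, and the code is not taken from opal_hash_table_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bob Jenkins' one-at-a-time hash. Each loop iteration mixes exactly one
 * byte of the key, which is the per-char cost discussed in the message.
 * Illustrative name; not the Open MPI implementation. */
static uint32_t oaat_hash(const void *key, size_t len)
{
    const unsigned char *p = (const unsigned char *)key;
    uint32_t h = 0;

    assert(key != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        h += p[i];
        h += h << 10;
        h ^= h >> 6;
    }
    /* Final avalanche so the last bytes affect all output bits. */
    h += h << 3;
    h ^= h >> 11;
    h += h << 15;
    return h;
}
```

Block-oriented hashes such as Murmur read the key several bytes at a time, which is where the speed difference mentioned in the message comes from.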

Re: [OMPI devel] RFC: improve the hash function used by opal_hash_table_t

2013-06-11 Thread George Bosilca
On Jun 12, 2013, at 00:22 , Nathan Hjelm wrote: > Though a hardware accelerated crc32 (if available) would probably work great > as well. http://google-opensource.blogspot.fr/2011/04/introducing-cityhash.html with code available under MIT @ https://code.google.com/p/cityhash/

Re: [OMPI devel] BTL sendi

2013-06-19 Thread George Bosilca
Then let me provide a more elaborate answer. In the original design of the btl_sendi operation we do not provide an upper limit for the sendi (in the same sense as the eager protocol). Thus, an upper layer (PML in this instance) cannot know if the sendi will succeed or not before the call

Re: [OMPI devel] Problem when using struct types at specific offsets

2013-06-21 Thread George Bosilca
Thomas, I'm not aware of any other issue with the datatypes. There might be an easy way to see what the issue with your application is. If you can debug your application, and know exactly which datatype has problems, then attach with gdb and call ompi_datatype_dump(type), where type is the

Re: [OMPI devel] RGET issue when send is less than receive

2013-06-21 Thread George Bosilca
The amount of bytes received is atomically updated on the completion callback, and the completion test is clearly spelled out in the recv_request_pml_complete_check function (of course minus the lock part). Rolf, I think your patch is correct. That being said, req_bytes_expected is a special

Re: [OMPI devel] Problem when using MPI_Type_create_struct + MPI_Type_dup

2013-06-24 Thread George Bosilca
Thomas, I tried your test with the current svn version of the 1.6 (to be 1.6.5 I guess), and your test passes without any issues. George. On Jun 24, 2013, at 15:22 , Thomas Jahns wrote: > Hello, > > the following code exposes a problem we are experiencing with our OpenMPI >

[OMPI devel] RFC MPI 2.2 Dist_graph addition

2013-06-24 Thread George Bosilca
WHAT: Support for MPI 2.2 dist_graph WHY: To become [almost entirely] MPI 2.2 compliant WHEN: Monday July 1st As discussed during the last phone call, a missing functionality of the MPI 2.2 standard (the distributed graph topology) is ready for prime-time. The attached patch provides

Re: [OMPI devel] Cross Memory Attach support in OpenMPI

2013-06-27 Thread George Bosilca
https://svn.open-mpi.org/trac/ompi/changeset/26134 George. On Jun 27, 2013, at 16:43 , Lukasz Flis wrote: > Dear All, > > Some time ago there was a discussion on this list regarding enabling CMA > support in OpenMPI. There were 2 positive votes > >

Re: [OMPI devel] RFC MPI 2.2 Dist_graph addition

2013-07-01 Thread George Bosilca
Guys, Thanks for the patch and for the tests. All these changes/cleanups are correct, I have incorporated them all in the patch. Please find below the new patch. As the deadline for the RFC is today, I'll move forward and push the changes into the trunk, and if there are still issues we can

Re: [OMPI devel] RFC MPI 2.2 Dist_graph addition

2013-07-01 Thread George Bosilca
The patch has been pushed into the trunk in r28687. George. On Jul 1, 2013, at 13:55 , George Bosilca <bosi...@icl.utk.edu> wrote: > Guys, > > Thanks for the patch and for the tests. All these changes/cleanups are > correct, I have incorporate them all in the patch.

Re: [OMPI devel] Barrier Implementation Oddity

2013-07-01 Thread George Bosilca
Yes, Bruck for barrier is a variant of the dissemination algorithm as described in: - Debra Hensgen, Raphael Finkel, and Udi Manber. Two algorithms for barrier synchronization. International Journal of Parallel Programming, 17(1):1–17, 1988. George. On Jun 29, 2013, at 12:05 , Ronny
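
As a rough illustration of the dissemination pattern cited above, the sketch below computes the number of rounds and the peers a rank talks to in each round. It assumes the textbook formulation (in round k a rank signals the rank 2^k ahead of it and waits on the rank 2^k behind it, modulo the process count). The function names are invented for this sketch, and it is not the Open MPI tuned collective code.

```c
#include <assert.h>

/* Number of rounds: the smallest k such that 2^k >= nprocs. */
static int dissemination_rounds(int nprocs)
{
    int rounds = 0;

    assert(nprocs > 0);
    for (long long dist = 1; dist < nprocs; dist *= 2) {
        rounds++;
    }
    return rounds;
}

/* Rank that `rank` signals in the given round (distance 2^round ahead). */
static int dissemination_send_peer(int rank, int round, int nprocs)
{
    int dist = 1 << round;

    assert(dist < nprocs);
    return (rank + dist) % nprocs;
}

/* Rank that `rank` waits on in the given round (distance 2^round behind). */
static int dissemination_recv_peer(int rank, int round, int nprocs)
{
    int dist = 1 << round;

    assert(dist < nprocs);
    return (rank - dist + nprocs) % nprocs;
}
```

For 5 processes this gives 3 rounds with distances 1, 2 and 4; after the last round every rank has heard, directly or transitively, from every other rank, which is what makes the pattern a barrier.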

Re: [OMPI devel] RFC MPI 2.2 Dist_graph addition

2013-07-01 Thread George Bosilca
Ahem … George. topo.patch Description: Binary data On Jul 1, 2013, at 13:55 , George Bosilca <bosi...@icl.utk.edu> wrote: > Guys, > > Thanks for the patch and for the tests. All these changes/cleanups are > correct, I have incorporate them all in the patch. Please

Re: [OMPI devel] RFC MPI 2.2 Dist_graph addition

2013-07-01 Thread George Bosilca
> [savbu-usnic-a:24891] 4 more processes have sent help message > help-mpi-errors.txt / mpi_errors_are_fatal > [savbu-usnic-a:24891] Set MCA parameter "orte_base_help_aggregate" to 0 to > see all help / error messages > [6:51] savbu-usnic-a:~/s/o/dist_graph ❯❯❯ > ---

[OMPI devel] RFC: OMPI_FREE_LIST_{GET|WAIT} lose the rc argument

2013-07-02 Thread George Bosilca
Our macros for the OMPI-level free list had one extra argument, a possible return value to signal that the operation of retrieving the element from the free list failed. However in this case the returned pointer was set to NULL as well, so the error code was redundant. Moreover, this was a

Re: [OMPI devel] [EXTERNAL] RFC: OMPI_FREE_LIST_{GET|WAIT} lose the rc argument

2013-07-02 Thread George Bosilca
> > > On Jul 2, 2013, at 10:40 AM, "Barrett, Brian W" <bwba...@sandia.gov> wrote: > >> On 7/2/13 8:22 AM, "George Bosilca" <bosi...@icl.utk.edu> wrote: >> >>> Our macros for the OMPI-level free list had one extra argument, a possible >

Re: [OMPI devel] [EXTERNAL] RFC: OMPI_FREE_LIST_{GET|WAIT} lose the rc argument

2013-07-04 Thread George Bosilca
RFC completed at revision r28722. George. On Jul 2, 2013, at 18:17 , "Barrett, Brian W" <bwba...@sandia.gov> wrote: > Jeff thought it was me and I thought it was you, so I think we're ok :). > > Brian > > On 7/2/13 9:45 AM, "George Bosilca" <bosi

Re: [OMPI devel] Annual OMPI membership review: SVN accounts

2013-07-09 Thread George Bosilca
Indeed Thomas is now part of UTK. George. On Jul 9, 2013, at 7:47, Brice Goglin wrote: > Le 09/07/2013 00:32, Jeff Squyres (jsquyres) a écrit : >> INRIA >> >> bgoglin: Brice Goglin >> arougier: Antoine Rougier

Re: [OMPI devel] [bug] One-sided communication with a duplicated datatype

2013-07-14 Thread George Bosilca
Takahiro, Nice catch. That particular code was an over-optimization … that failed. Please try with the patch below. Let me know if it's working as expected, I will push it in the trunk once confirmed. George. Index: ompi/datatype/ompi_datatype_args.c

Re: [OMPI devel] [bug] One-sided communication with a duplicated datatype

2013-07-14 Thread George Bosilca
Takahiro, Please find below another patch, this time hopefully fixing all issues. The problem with my original patch and with yours was that they try to address the packing of the data representation without fixing the computation of the required length. As a result the length on the packer

Re: [OMPI devel] [bug] One-sided communication with a duplicated datatype

2013-07-15 Thread George Bosilca
Thanks for testing it. It is now in trunk r28790. George. On Jul 15, 2013, at 12:29 , KAWASHIMA Takahiro wrote: > George, > > Thanks. I've confirmed your patch. > I wrote a simple program to test your patch and no problems are found. > The test program is

Re: [OMPI devel] RFC: revised ORTE error handling

2013-07-15 Thread George Bosilca
Ralph, Sorry for the late answer, we have quite a few things on our todo list right now. Here are a few concerns I have about the proposed approach. 1. We would have preferred to have a list of processes for the ompi_errhandler_runtime_callback function. We don't necessarily care about the

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread George Bosilca
Nathan, I read your code and it's definitely looking good. I have however a few minor issues with your patch. 1. MPI_Aint is unsigned as it must represent the difference between two arbitrary memory locations. In your MPI_Type_get_[true_]extent_x you go through size_t possibly reducing its

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread George Bosilca
On Jul 16, 2013, at 22:29 , Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote: > On Jul 16, 2013, at 4:22 PM, George Bosilca <bosi...@icl.utk.edu> wrote: > >> Btw, I have a question to you fellow MPI Forum attendees. I just can't >> remember why the M

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread George Bosilca
> > On Tue, Jul 16, 2013 at 10:48:12PM +0200, George Bosilca wrote: >> It's a typo, MPI_Aint is of course unsigned. >> >> George. >> >> On Jul 16, 2013, at 22:37 , David Goodell (dgoodell) <dgood...@cisco.com> >> wrote: >> >>> On

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread George Bosilca
On Jul 16, 2013, at 23:07 , "Jeff Squyres (jsquyres)" <jsquy...@cisco.com> wrote: > On Jul 16, 2013, at 5:03 PM, George Bosilca <bosi...@icl.utk.edu> wrote: > >>> Yes, it can -- it has to be the largest integer type (i.e., it even has to

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread George Bosilca
On Jul 16, 2013, at 23:11 , "David Goodell (dgoodell)" <dgood...@cisco.com> wrote: > On Jul 16, 2013, at 4:03 PM, George Bosilca <bosi...@icl.utk.edu> > wrote: > >> On Jul 16, 2013, at 22:29 , Jeff Squyres (jsquyres) <jsquy...@cisco.com> >> w

[OMPI devel] ompi_info

2013-07-16 Thread George Bosilca
I would like to question the choice for the new … spartan ompi_info output? I would not mind restoring the default behavior, aka. have a verbose "--all", instead of some [random] MCA params. Btw, something is wrong in the following output. I have a "btl = sm,self" in my
