r20275 looks good. I suggest that we CMR that into 1.3 and get rc6 rolled
and tested. (actually, Jeff just did the CMR...so off to rc6)
--brad
On Wed, Jan 14, 2009 at 1:16 PM, Edgar Gabriel wrote:
so I am not entirely sure why the bug only happened on trunk, it could
in theory also appear on v1.3 (is there a difference on how
pointer_arrays are handled between the two versions?)
Anyway, it passes now on both with changeset 20275. We should probably
move that over to 1.3 as well.
So, if it looks okay on 1.3...then there should not be anything holding up
the release, right? Otherwise, George, we need to decide whether this is
a blocker, or if we go ahead and release with this as a known issue
and schedule the fix for 1.3.1. My vote is to go ahead and release, but
I'm already debugging it. The good news is that it only seems to appear
with the trunk; with 1.3 (after copying the new tuned module over), all the
tests pass.
Now if somebody can tell me a trick on how to tell mpirun not to kill the
debugger under my feet, then I could even see where the problem
All these errors are in the MPI_Finalize, it should not be that hard
to find. I'll take a look later this afternoon.
george.
On Jan 14, 2009, at 06:41 , Tim Mattox wrote:
Unfortunately, although this fixed some problems when enabling hierarch coll,
there is still a segfault in two of IU's tests that only shows up when we set
-mca coll_hierarch_priority 100
See this MTT summary to see how the failures improved on the trunk,
but that there are still two that
Here we go by the book :)
https://svn.open-mpi.org/trac/ompi/ticket/1749
george.
On Jan 13, 2009, at 23:40 , Jeff Squyres wrote:
Let's debate tomorrow when people are around, but first you have to
file a CMR... :-)
On Jan 13, 2009, at 10:28 PM, George Bosilca wrote:
George, I suggest that you file a CMR for r20267 and we can
go from there. If it makes 1.3, it makes it; otherwise we have
it ready for 1.3.1. At this point the earliest 1.3 will go out is
Wednesday late morning (presuming I'm the one moving
the bits), and is more likely to hit the website in the
Unfortunately, this pinpoints the fact that we didn't test the
collective module mixing enough. I went over the tuned collective
functions and changed all instances to use the correct module
information. It is now on the trunk, revision 20267. Simultaneously, I
checked that all other
Thanks for digging into this. Can you file a bug? Let's mark it for
v1.3.1.
I say 1.3.1 instead of 1.3.0 because this *only* affects hierarch, and
since hierarch isn't currently selected by default (you must
specifically elevate hierarch's priority to get it to run), there's no
danger
I just debugged the Reduce_scatter bug mentioned previously. The bug is
unfortunately not in hierarch, but in tuned.
Here is the code snippet causing the problem:

int reduce_scatter (..., mca_coll_base_module_t *module)
{
    ...
    /* BUG: forwards tuned's own `module` to coll_reduce instead of
       the module information that matches coll_reduce */
    err = comm->c_coll.coll_reduce (..., module);
    ...
}