Creating nightly hwloc snapshot SVN tarball was a success.
Snapshot: hwloc 1.3.1rc3r4071
Start time: Thu Dec 15 21:04:17 EST 2011
End time: Thu Dec 15 21:07:12 EST 2011
Your friendly daemon,
Cyrador
I have an idea. How about we set the MPIR variables as weak? Just tested
it with STAT.
Can you replace orte/tools/orterun/orterun.c with the attached version and see
if it fixes the issue?
-Nathan
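A minimal sketch of the weak-symbol idea, assuming GCC or Clang (the attached orterun.c is not reproduced in this archive, so this is not the actual change). MPIR_proctable_size is the variable named in the thread. The SKETCH_WEAK macro and the two helper functions are hypothetical names introduced only for illustration. At static link time a strong definition of the same name in another object replaces the weak one. How the dynamic loader treats weak versus strong definitions across a shared library depends on the platform, so this does not claim to reproduce the fix.

```c
#include <assert.h>

/* Assumption: GCC or Clang. Other compilers fall back to a plain definition. */
#if defined(__GNUC__)
#define SKETCH_WEAK __attribute__((weak))
#else
#define SKETCH_WEAK
#endif

/* Weak definition: if another object linked into the same image provides a
 * strong definition of MPIR_proctable_size, the static linker keeps the
 * strong one and silently drops this one instead of reporting a duplicate. */
SKETCH_WEAK int MPIR_proctable_size = 0;

/* Hypothetical helper: store a new process-table size. */
void sketch_set_proctable_size(int n)
{
    assert(n >= 0);
    MPIR_proctable_size = n;
}

/* Hypothetical helper: read the size back through the same symbol. */
int sketch_get_proctable_size(void)
{
    return MPIR_proctable_size;
}
```

Running nm on the resulting object should list the variable with a weak type letter (V or W) rather than B or D, which is one way to check that the attribute took effect.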
On Thu, 15 Dec 2011, Ashley Pittman wrote:
padb just calls gdb; you can see the error
padb just calls gdb; you can see the error with gdb alone, using just the trace
I sent when I started this thread.
Perhaps the difference is in the gdb versions. I could give you a login to my
test machine if you need one.
Ashley.
On 15 Dec 2011, at 22:49, Nathan Hjelm wrote:
> What's odd is
What's odd is totalview, STAT, and GDB see the correct values despite them being
in the B section. What does padb do differently?
This is a dynamic, optimized build of 1.5.5rc1.
-Nathan Hjelm
HPC-3, LANL
On Thu, 15 Dec 2011, Ashley Pittman wrote:
If I add a new symbol to
If I add a new symbol to orte/mca/debugger/base/debugger_base_open.c and
declare it in orte/mca/debugger/base/base.h, the same as MPIR_proctable_size is
defined then it appears in the .so but not in the binary, if I then reference
this variable in orte/tools/orterun/orterun.c the symbol
+1 on Ralph's comment -- it's not on the trunk. Perhaps the CMR didn't
properly remove it from v1.5, but that explains why it's not in the v1.5
Makefile.am.
On Dec 15, 2011, at 5:08 PM, George Bosilca wrote:
> This is quite impressive. After digging a little bit more, it appears that
> the
orte/tools/orterun/debuggers.c does not exist anymore (it's not in the 1.5.5rc1
tarball). I don't know why the symbols are showing up in section B of orterun.
Investigating now.
-Nathan Hjelm
HPC-3, LANL
On Thu, 15 Dec 2011, George Bosilca wrote:
On Dec 15, 2011, at 16:55 , Ashley Pittman
This is quite impressive. After digging a little bit more, it appears that the
orte/tools/orterun/debuggers.c is in the repository but it is not used for
compilation. Thus, I really don't see where the second definition is coming
from?
george.
On Dec 15, 2011, at 17:02 , George Bosilca
This file does not exist in the trunk, and should not exist in 1.5 any more.
Perhaps the patch for 1.5 didn't correctly delete it?
On Dec 15, 2011, at 3:02 PM, George Bosilca wrote:
> ./orte/tools/orterun/debuggers.c:142:struct MPIR_PROCDESC *MPIR_proctable =
> NULL;
>
That appears to be a similar problem to the MPIR_Breakpoint bug. Let me play
around and see if I can find a fix.
-Nathan Hjelm
HPC-3, LANL
On Thu, 15 Dec 2011, Ashley Pittman wrote:
There is a problem with 1.5.5rc1 that prevents padb from loading the process
table start from the orterun
On Dec 15, 2011, at 16:55 , Ashley Pittman wrote:
> There is a problem with 1.5.5rc1 that prevents padb from loading the process
> table start from the orterun process. What appears to be happening is that
> MPIR_proctable and MPIR_proctable_size are present in both orterun itself and
> also
There is a problem with 1.5.5rc1 that prevents padb from loading the process
table start from the orterun process. What appears to be happening is that
MPIR_proctable and MPIR_proctable_size are present in both orterun itself and
also in libopen-rte.so, the code is correctly setting them in
On 15 Dec 2011, at 20:16, Ashley Pittman wrote:
>
> On 14 Dec 2011, at 04:36, Jeff Squyres wrote:
>
>> In the usual place:
>>
>> http://www.open-mpi.org/software/ompi/v1.5/
>>
>> Please test! I would really like to get this out by the end of the week.
>
> As with 1.4 I've tested it on
Right -- the symbol isn't declared in orterun. It's in libopen-rte.so.
My changes ensure that the .o file that MPIR_Breakpoint is defined in will be
pulled in by the linker to be in the mpirun process.
On Dec 15, 2011, at 3:30 PM, Nathan Hjelm wrote:
> Your changes don't break anything but
Your changes don't break anything but they also don't cause MPIR_Breakpoint to
appear in orterun:
ct-login1:/scratch2/hjelmn hjelmn$ nm `type -p orterun` | grep MPIR
0060b0e0 B MPIR_attach_fifo
0060b2e0 B MPIR_being_debugged
0060b7b0 B MPIR_debug_state
0060ada0 B
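For readers unfamiliar with the nm listing above: the letter in the second column is the symbol type, and B marks a global symbol in the uninitialized data (BSS) section. A minimal sketch of reading such lines in C follows, assuming the default three-field `address type name` layout. The function names are hypothetical, and lines without an address (such as undefined symbols) are deliberately not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parse one line of default nm output: "<hex address> <type> <name>".
 * Returns the type letter and copies the symbol name into name_out,
 * or returns '\0' if the line does not have all three fields. */
char sketch_nm_symbol_type(const char *line, char *name_out, size_t name_len)
{
    char addr[32];
    char type;
    char name[256];

    assert(line != NULL && name_out != NULL && name_len > 0);
    if (sscanf(line, "%31[0-9a-fA-F] %c %255s", addr, &type, name) != 3) {
        return '\0';
    }
    strncpy(name_out, name, name_len - 1);
    name_out[name_len - 1] = '\0';
    return type;
}

/* Map a few common nm type letters to a short description. */
const char *sketch_nm_type_meaning(char type)
{
    switch (type) {
    case 'B': return "global symbol in uninitialized data (BSS)";
    case 'D': return "global symbol in initialized data";
    case 'T': return "global symbol in text (code)";
    case 'U': return "undefined, expected from another object";
    case 'V':
    case 'W': return "weak symbol";
    default:  return "other";
    }
}
```

Passing the line `0060b0e0 B MPIR_attach_fifo` yields the type B and the name MPIR_attach_fifo, which matches the thread's observation that the MPIR variables sit in section B of orterun.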
On Dec 15, 2011, at 2:51 PM, George Bosilca wrote:
> This patch is not correct. All these variables have been moved into the ORTE
> layer (they are declared in orte/mca/debugger/base/base.h), so they should be
> in fact removed from the MPI level files.
>
> While I don't think moving them all
This patch is not correct. All these variables have been moved into the ORTE
layer (they are declared in orte/mca/debugger/base/base.h), so they should be
in fact removed from the MPI level files.
While I don't think moving them all into ORTE was a good choice, changing
their definition in
On 8 Dec 2011, at 22:13, Jeff Squyres wrote:
> 1.4.5rc1 is now posted in the usual place:
>
> http://www.open-mpi.org/software/ompi/v1.4/
>
> Gearing up for a pre-Christmas release -- please test! There have only been
> a few bug fixes since 1.4.4. See
>
Ok, here's what I did:
https://svn.open-mpi.org/trac/ompi/changeset/25660
--> pulls in symbols like MPIR_Breakpoint via a different dummy function
https://svn.open-mpi.org/trac/ompi/changeset/25661
--> Fixes the ORTE_DECLSPEC typos that George found
LANL: Can you verify that this (still) works
On Dec 15, 2011, at 10:28 AM, Ralph Castain wrote:
>> I have had the chance now to test it with totalview and stat 1.1.0. Looks
>> good. I pushed the fix to the trunk and it will need to be CMRed to 1.5.
Ralph and I just talked about this on the phone some more -- I don't think
mpirun --oversubscribe or OMPI_MCA_rmaps_base_oversubscribe=1
On Dec 15, 2011, at 11:27 AM, TERRY DONTJE wrote:
> There's an oversubscribe option I can set in my case, right?
>
> Thanks,
>
> --td
>
> On 12/15/2011 1:22 PM, Ralph Castain wrote:
>>
>> This is fixed, to a degree, with
There's an oversubscribe option I can set in my case, right?
Thanks,
--td
On 12/15/2011 1:22 PM, Ralph Castain wrote:
This is fixed, to a degree, with r25659. However, note that there is
one big change that occurred back when we first committed the mapping
change.
As I noted at that time,
This is fixed, to a degree, with r25659. However, note that there is one big
change that occurred back when we first committed the mapping change.
As I noted at that time, we changed the default for RM-given allocations to be
no-oversubscribe. So your MTTs may well fail if they weren't updated
On 15/12/2011 16:31, bgog...@osl.iu.edu wrote:
> Author: bgoglin
> Date: 2011-12-15 10:31:50 EST (Thu, 15 Dec 2011)
> New Revision: 4069
> URL: https://svn.open-mpi.org/trac/hwloc/changeset/4069
>
> Log:
> Fix a long-standing obsolete PREDEFINED in the website doxygen config
There are still
I'll take a look, Terry - it has to be the change I made yesterday.
On Dec 15, 2011, at 8:37 AM, TERRY DONTJE wrote:
> Last night's MTT test results for 1.7a1r25652 from IU and Oracle are showing
> failures during some of the spawn tests; see
> http://www.open-mpi.org/mtt/index.php?do_redir=2036.
Last night's MTT test results for 1.7a1r25652 from IU and Oracle are
showing failures during some of the spawn tests; see
http://www.open-mpi.org/mtt/index.php?do_redir=2036.
Essentially, the tests are failing with the message:
All nodes which are allocated for this job are already filled.
I
On Dec 15, 2011, at 8:21 AM, Nathan Hjelm wrote:
>
>
> On Wed, 14 Dec 2011, Ralph Castain wrote:
>
>> Yes - we were having problems making symbols in orterun visible for the
>> "stat" debugger when built dynamically. The symbols are actually
>> instantiated in the debugger base, but they
On Wed, 14 Dec 2011, Ralph Castain wrote:
Yes - we were having problems making symbols in orterun visible for the "stat" debugger
when built dynamically. The symbols are actually instantiated in the debugger base, but they need
to be "seen" in orterun prior to us calling orte_init. So, we
On 12/14/2011 10:36 PM, Brice Goglin wrote:
I committed the silence-warning patch but I will keep the other part for
now. I am a bit afraid of changing that much code in 1.3.1 without being
sure whether it's necessary.
Sounds good to me.
I certainly have no grounds to argue that RHL8 support
On 14/12/2011 22:42, Paul H. Hargrove wrote:
>
>
> On 12/14/2011 1:21 PM, Brice Goglin wrote:
>> The attached patch might work. I am not sure all this is actually
>> necessary because things have been working fine so far, apart from your
>> warnings.
>
> Yup, the patch silences the warnings.