In the usual place - this is an early rc as it doesn’t yet contain the
thread-multiple fix that is impacting performance. However, I wanted to give people a
chance to run all their non-threaded functional validation tests.
The release candidate includes a wide range of bug fixes as reported by
Hi Folks,
I'm trying to use MTT on a cluster where it looks like the only functional
compiler that
1) can build Open MPI master
2) can also build the IBM test suite
may be PGI. I can't compile right now, so I'm trying to fix it. But I'm now
wondering
whether we are still supporting building
I’m unaware of any conscious decision to cut PGI off - I think it has been more
a case of nobody having a license to use for testing.
> On Dec 11, 2014, at 7:37 AM, Howard Pritchard wrote:
>
> Hi Folks,
>
> I'm trying to use mtt on a cluster where it looks like the only
On Dec 11, 2014, at 7:40 AM, Ralph Castain wrote:
> I’m unaware of any conscious decision to cut pgi off - I think it has been
> more a case of nobody having a license to use for testing.
+1
--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
On Thu, Dec 11, 2014 at 07:37:17AM -0800, Howard Pritchard wrote:
>Hi Folks,
>I'm trying to use mtt on a cluster where it looks like the only functional
>compiler that
>1) can build open mpi master
>2) can also build the ibm test suite
>may be PGI. Can't compile right now,
Okay, I'll try to fix things. There's a problem in opal_datatype_internal.h, then a
meltdown with libfabric owing to the fact that it's probably
only been used in a GNU environment. I'll open an issue on that one and assign it
to Jeff.
I think we should turn this libfabric build off unless someone asks for
it.
On Dec 11, 2014, at 9:58 AM, Howard Pritchard wrote:
> Okay, I'll try to fix things. There's a problem in opal_datatype_internal.h, then a
> meltdown with libfabric owing to the fact that it's probably
> only been used in a GNU environment. I'll open an issue on that one and assign it to
>
Jeff,
PGI compiler(s) are available on our Cluster:
$ module avail pgi
there are a lot of older versions, too:
$ module load DEPRECATED
$ module avail pgi
best
Paul
P.S. In our standard environment, Intel compiler and Open MPI are active, so
$ module unload openmpi intel
$ module load pgi
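For anyone attempting the PGI build after loading the module, a configure invocation might look like the following. This is only a sketch: the driver names pgcc/pgc++/pgfortran and the install prefix are assumptions, not taken from this thread; verify the drivers your PGI module actually provides.

```shell
# Sketch of a PGI build of Open MPI; check driver names with
# "which pgcc" after "module load pgi".
./configure CC=pgcc CXX=pgc++ FC=pgfortran \
    --prefix=$HOME/ompi-pgi-install
make -j 8 && make install
```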
Hi Jeff & Ralph,
Thanks for the response, and sorry for the delay in my reply. Attending the
developers meeting sounds like a good idea, but I will not be back from my
vacation until the 15th, so I cannot confirm whether I can attend the
developers meeting before then. I will
Ok. Howard asked me about this in person this week at the MPI Forum. I think
we all agree that this sounds like an interesting prospect; we just need to
make some adjustments in the OMPI infrastructure to make it happen. That will
take some discussion.
On Dec 11, 2014, at 11:58 AM,
Howard --
One thing I neglected to say -- if libfabric/usnic support on master is causing
problems for you, you can configure without libfabric:
./configure --without-libfabric ...
(which will, of course, also disable anything that requires libfabric)
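To check whether an existing build picked up libfabric-dependent bits, ompi_info can be grepped. This is a sketch; the exact component names present on master at the time are an assumption.

```shell
# List built components; usnic/ofi hits suggest libfabric support was
# compiled in. No output means it was disabled or not found.
ompi_info --all | grep -i -e usnic -e ofi
```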
The intent is that we build things by
Howard,
I regularly test release candidates against the PGI installations on
NERSC's systems (and sometimes elsewhere). In fact, I have a test of
1.8.4rc2 against pgi-14.4 "in the pipe" right now.
I believe Larry Baker of USGS is also a PGI user (in production, rather
than just testing as I do).
On 11 Dec 2014, at 2:12 PM, Paul Hargrove wrote:
> I believe Larry Baker of USGS is also a PGI user (in production, rather than
> just testing as I do).
That is correct.
Although we are running a rather old Rocks cluster kit (CentOS based) which is
so old that we cannot run the latest PGI
Testing the 1.8.4rc2 tarball on my x86-64 Solaris-11 systems I am getting
the following crash for both "-m32" and "-m64" builds:
$ mpirun -mca btl sm,self,openib -np 2 -host pcp-j-19,pcp-j-20
examples/ring_c
[pcp-j-19:18762] *** Process received signal ***
[pcp-j-19:18762] Signal: Segmentation
Ah crud - incomplete commit means we didn’t send the topo string. Will roll rc3
in a few minutes.
Thanks, Paul
Ralph
> On Dec 11, 2014, at 3:08 PM, Paul Hargrove wrote:
>
> Testing the 1.8.4rc2 tarball on my x86-64 Solaris-11 systems I am getting the
> following crash for
The overall design in OMPI was that no OMPI module should be allowed to
decide if threads are on (thus it should not rely on the value
returned by opal_using_threads
during its initialization stage). Instead, they should respect the level
of thread support requested as an argument during the
Just to help me understand: I don’t think this change actually changed any
behavior. However, it certainly *allows* a different behavior. Isn’t that true?
If so, I guess the real question is for Pascal at Bull: why do you feel this
earlier setting is required?
> On Dec 11, 2014, at 4:21 PM,
George,
please allow me to jump in with naive comments ...
currently (master) both the openib and usnic BTLs invoke opal_using_threads()
in component_init():
btl_openib_component_init(int *num_btl_modules,
                          bool enable_progress_threads,
                          bool enable_mpi_threads)
I think I've reported this earlier in the 1.8 series.
If I compile on an SGI UV (e.g. blacklight at PSC) configure picks up the
presence of xpmem headers and enables the vader BTL.
However, the port of vader to SGI's "flavor" of xpmem is incomplete and the
following build failure results:
Don't see an rc3 yet.
My Solaris-10/SPARC runs fail slightly differently (see below).
It looks sufficiently similar that it MIGHT be the same root cause.
However, lacking an rc3 to test I figured it would be better to report this
than to ignore it.
The problem is present with both V8+ and V9
No, that looks different - it’s failing in mpirun itself. Can you get a line
number on it?
Sorry for delay - I’m generating rc3 now
> On Dec 11, 2014, at 6:59 PM, Paul Hargrove wrote:
>
> Don't see an rc3 yet.
>
> My Solaris-10/SPARC runs fail slightly differently (see
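One way to get the line number Ralph asks for, assuming the build kept debug symbols (-g) and a core file was produced. The tool invocation and paths here are illustrative, not from the thread; on Solaris, dbx works similarly.

```shell
# Open the core in gdb non-interactively and print a backtrace with
# file:line information (paths are placeholders).
gdb /path/to/mpirun /path/to/core -batch -ex bt
```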
Backtrace for the Solaris-10/SPARC SEGV appears below.
I've changed the subject line to distinguish this from the earlier report.
-Paul
program terminated by signal SEGV (no mapping at the fault address)
0x7d93b634: strlen+0x0014: lduh [%o2], %o1
Current function is guess_strlen
Ralph,
Sorry to be the bearer of more bad news.
The "good" news is I've seen the new warning regarding the lack of a
loopback interface.
The BAD news is that it is occurring on a Linux cluster that I've verified
DOES have 'lo' configured on the front-end and compute nodes (UP and
RUNNING
Paul,
About the five warnings:
can you confirm you are *not* running mpirun on n15 or n16?
if my guess is correct, then you can get up to 5 warnings: mpirun + 2
orted + 2 MPI tasks
do you have any oob_tcp_if_include or oob_tcp_if_exclude settings in
your openmpi-mca-params.conf?
here is
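As an illustration of the settings Gilles mentions, an openmpi-mca-params.conf might contain entries like these. The interface names (eth0, lo) are placeholders, not values from this thread; use the interfaces actually present on your nodes.

```shell
# Write an example MCA params file; eth0/lo are placeholder names.
cat > openmpi-mca-params.conf.example <<'EOF'
# restrict the out-of-band TCP channel to one interface
oob_tcp_if_include = eth0
# keep loopback out of the TCP BTL
btl_tcp_if_exclude = lo
EOF
cat openmpi-mca-params.conf.example
```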
I honestly think it has to be a selected interface, Gilles, else we will fail
to connect.
> On Dec 11, 2014, at 8:26 PM, Gilles Gouaillardet
> wrote:
>
> Paul,
>
> About the five warnings:
> can you confirm you are *not* running mpirun on n15 or n16?
> if my
Thanks Paul - I will post a fix for this tomorrow. Looks like SPARC isn’t
returning an architecture type for some reason, and I didn’t protect against it.
> On Dec 11, 2014, at 7:39 PM, Paul Hargrove wrote:
>
> Backtrace for the Solaris-10/SPARC SEGV appears below.
> I've