Jeff Squyres wrote:
On Jul 23, 2008, at 10:37 AM, Terry Dontje wrote:
This seems to work for me too. What is interesting is that my experiments
have shown that if you run on RHEL 5.1 you don't need to set
mpi_yield_when_idle to 0.
Yes, this makes sense -- on RHEL 5.1, it's a much newer Linux kernel.
What is the architecture where you noticed the performance degradation?
This is on a two-year-old system with two single-core AMD chips,
running both Solaris and Linux under 64-bit addressing.
--td
Thanks,
george.
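Since kernel scheduling is the variable here, a minimal sketch of what
the yield_when_idle tradeoff amounts to may help (illustrative only,
not OMPI's actual progress loop):

    /* When idle ranks yield, older kernel schedulers can deprioritize
       them aggressively, which shows up as added latency when the node
       is NOT oversubscribed.  Newer kernels (e.g. RHEL 5.1's) handle
       this better, matching Terry's observation above. */
    #include <sched.h>

    static int yield_when_idle = 1;   /* mirrors mpi_yield_when_idle */

    static void progress_until_done(int (*poll_once)(void))
    {
        while (poll_once() == 0) {    /* nothing completed yet */
            if (yield_when_idle)
                sched_yield();        /* give up the timeslice */
        }
    }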
On Jul 11, 2008, at 1:32 PM, Terry Dontje wrote:
Has anyone else seen the trunk incur approximately a 10% increase in
latency? I think this has happened in the last couple of weeks. I have
verified that it isn't due to the recheck put into
sm_component_progress. I am about ready to try to track this down but
wanted to throw this out
Ralph H Castain wrote:
On 7/11/08 7:48 AM, "Terry Dontje" <terry.don...@sun.com> wrote:
Jeff Squyres wrote:
Check that -- Ralph and I talked more about #1383 and have come up
with a decent/better solution that a) is not wonky and b) does not
involve MCA parameter synonyms. We're working on it in an hg and will
put it back when done (probably within a business day or three).
So repairing this new hole will quite likely open another one somewhere
else. So even if we can "fix" the duplicate stdin problem...did the kid
really improve the situation?
On 7/10/08 7:05 AM, "Terry Dontje" <terry.don...@sun.com> wrote:
I see that Jeff has updated the ticket saying that he is looking at the
code to see if he can generate a fix, so the below may be superfluous.
Anyway, what were the issues fixed in 1.3? It really comes down to how
much more pain we are giving our users by rolling back to 1.2 or not.
Note, I
Brian W. Barrett wrote:
On Mon, 7 Jul 2008, Terry Dontje wrote:
Just curious: has anyone done comparisons of latency measurements as one
changes the size of a job? That is, changing the size of the job (and
number of nodes used) and just taking the half round-trip latency of two
of the processes in the job. I am roughly seeing an addition of 5% to
the
ng the ack packets.
--td
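For anyone who wants to try reproducing this, a minimal half round-trip
latency probe in the spirit of the measurement Terry describes (a
sketch, not his actual benchmark; launch it at varying -np):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, i, iters = 10000;
        char byte = 0;
        double t0, t1;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        for (i = 0; i < iters; i++) {
            if (rank == 0) {        /* ping */
                MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) { /* pong */
                MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        t1 = MPI_Wtime();
        if (rank == 0)  /* half round-trip, in microseconds */
            printf("latency: %.2f us\n", (t1 - t0) / iters / 2 * 1e6);
        MPI_Finalize();
        return 0;
    }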
george.
On Jun 19, 2008, at 2:16 PM, Terry Dontje wrote:
Galen, George and others that might have SM BTL interest.
In my quest of looking at MPI_Iprobe performance I found what I think
is an issue. If you have an application that is using the SM BTL and
does a sma
ifs from the critical path for receives ...
george.
On Jun 18, 2008, at 3:57 PM, Brian W. Barrett wrote:
On Wed, 18 Jun 2008, Terry Dontje wrote:
Jeff Squyres wrote:
Perhaps we did that as a latency optimization...?
George / Brian / Galen -- do you guys know/remember why this was done
on the unexpected queue or do I
need to FINI the request and regenerate it again?
--td
On Jun 17, 2008, at 11:43 AM, Terry Dontje wrote:
I've run into an issue while running HPL where a message has been sent
(in shared memory in this case) and the receiver calls iprobe but
doesn't see said message on the first call to iprobe (even though it is
there), yet does see it on the second call to iprobe. Looking at
mca_pml_ob1_iprobe
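The shape of the symptom, as a standalone reproducer (a sketch to
illustrate the report, not the HPL run itself; with the bug present,
the poll count below would come out as 2 instead of 1):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, flag = 0, polls = 0, payload = 42;
        MPI_Status st;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0)
            MPI_Send(&payload, 1, MPI_INT, 1, 7, MPI_COMM_WORLD);
        /* crude ordering so the message is (almost certainly) already
           sitting in the SM fifo before rank 1 starts probing */
        MPI_Barrier(MPI_COMM_WORLD);
        if (rank == 1) {
            while (!flag) {
                MPI_Iprobe(0, 7, MPI_COMM_WORLD, &flag, &st);
                polls++;
            }
            MPI_Recv(&payload, 1, MPI_INT, 0, 7, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("iprobe saw the message on poll %d\n", polls);
        }
        MPI_Finalize();
        return 0;
    }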
Terry Dontje wrote:
When compiling the latest trunk under SLES 9 I am seeing the following
error:
../../../../../opal/mca/maffinity/libnuma/maffinity_libnuma_module.c:118:
error: `MPOL_MF_MOVE' undeclared (first use in this function)
Looks like SLES 9's numaif.h does not define MPOL_MF_MOVE. Can we
somehow guard this so the trunk still builds there?
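One possible guard, as a sketch (this is an assumption about the
intent, not the fix that was actually committed):

    /* maffinity_libnuma_module.c: tolerate pre-MPOL_MF_MOVE headers.
       Defining the flag to 0 means mbind() will still set the policy
       but won't migrate already-touched pages on old SLES 9 kernels. */
    #include <numaif.h>

    #ifndef MPOL_MF_MOVE
    #define MPOL_MF_MOVE 0
    #endif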
Jeff Squyres wrote:
Brian and I were chatting the other day about random OMPI stuff and
the topic of the memory hooks came up again. Brian was wondering if
we should [finally] revisit this topic -- there's a few things that
could be done to make life "better". Two things jump to mind:
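Whatever the two items were, the usual context for memory hooks is the
RDMA registration cache; the generic shape of such a hook is below
(emphatically not OMPI's implementation, and it ignores the usual
dlsym-recursion and thread-safety caveats):

    /* LD_PRELOAD-style interposer: evict RDMA registrations for a
       buffer before the allocator gets the memory back. */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdlib.h>

    static void (*real_free)(void *);

    static void cache_evict(void *ptr)
    {
        /* look up and deregister any registration covering ptr */
        (void)ptr;
    }

    void free(void *ptr)
    {
        if (!real_free)
            real_free = (void (*)(void *))dlsym(RTLD_NEXT, "free");
        cache_evict(ptr);
        real_free(ptr);
    }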
Jeff Squyres wrote:
Brian and I chatted a bit about this off-list, and I think we're in
agreement now:
- do not change the default value or meaning of
btl_base_warn_component_unused.
- major point of confusion: the openib BTL is actually fairly unique
in that it can (and does) tell the
So are you proposing to set btl_base_warn_component_unused to 0 or
something more BTL specific?
--td
Jeff Squyres wrote:
What: Change default in openib BTL to not complain if no OpenFabrics
devices are found
Why: Many linuxes are shipping libibverbs these days, but most users
still don't
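For reference, the knob being discussed can already be set per-user
today, e.g. in $HOME/.openmpi/mca-params.conf:

    btl_base_warn_component_unused = 0

or on the command line via mpirun --mca btl_base_warn_component_unused 0.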
Am I right to assume that bug fixes are allowed?
--td
Brad Benton wrote:
All:
As of today (May 13, 2008), the trunk is under v1.3 feature freeze
until it is stabilized and branched (targeted for May 23, 2008). Here
are the guidelines for activity in the trunk while we are under the
v1.3
Roland Dreier wrote:
> > > Can I make a /tmp branch from the hg read-only branch that is not tied
> > > to the svn /tmp branches.
> > Why do you want to do that?
> >
> > Mercurial is a fully distributed system, so you could just start
> > committing to one of your local copies of the
Ralph Castain wrote:
Sure:
hg clone http://www.open-mpi.org/hg/hgwebdir.cgi/ompi-svn-mirror my-tmp
I want the tmp to reside on www.open-mpi.org not in my own directory.
--td
On 5/2/08 9:57 AM, "Terry Dontje" <terry.don...@sun.com> wrote:
Jeff Squyres wrote:
Roland Dreier wrote:
> Can I make a /tmp branch from the hg read-only branch that is not tied
> to the svn /tmp branches.
Why do you want to do that?
Mercurial is a fully distributed system, so you could just start
committing to one of your local copies of the repository, and I can't
see
Jeff Squyres wrote:
On May 2, 2008, at 11:04 AM, Terry Dontje wrote:
Is there a way to make a hg specific /tmp branch?
I'm not sure what you're asking...?
Can I make a /tmp branch from the hg read-only branch that is not tied
to the svn /tmp branches.
--td
Jeff Squyres wrote:
Taking steps towards Mercurial, we have setup a read-only Mercurial
mirror of the official OMPI SVN repository:
http://www.open-mpi.org/hg/hgwebdir.cgi/ompi-svn-mirror/
Anything you commit to SVN should show up on the HG mirror within 30
minutes.
This mirror is
Jeff Squyres wrote:
On Mar 27, 2008, at 8:02 AM, Lenny Verkhovsky wrote:
- I don't think we can delete the MCA param ompi_paffinity_alone; it
exists in the v1.2 series and has historical precedent.
It will not be deleted; it will just use the same infrastructure
(slot_list parameter
Jeff Squyres wrote:
I can't remember if I posted about this before or not -- should we
disable trunk/VT building by default while the configuration issues
are being worked out?
If we want to guarantee that things build by default, I would think the
answer to the above would be yes.
--td
increasing number of such "included" packages, how complex is -that-
release discussion going to get?!?
On 2/7/08 4:48 AM, "Terry Dontje" <terry.don...@sun.com> wrote:
Torsten Hoefler wrote:
Hi Brian,
Let me start by reminding everyone that I have
an impasse. I think both sides need to go to their respective
corners, count to ten and then maybe the respective communities should
consider trying to have two other people come together to discuss the
issues and concerns.
--td
Terry Dontje - Senior Staff Engineer
Sun Microsystems, Inc.
Jeff Squyres wrote:
I should also note the following:
- LAM/MPI does the same thing (increments refcount when GROUP_EMPTY is
returned to the user, and allows GROUP_EMPTY in GROUP_FREE)
- MPICH2 has the following comment in GROUP_FREE (and code to match):
/* Cannot free the
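Whatever that MPICH2 comment concludes, the user-visible case being
debated looks like this (a sketch; whether diff below aliases
MPI_GROUP_EMPTY is exactly the implementation choice in question):

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Group world, diff;

        MPI_Init(&argc, &argv);
        MPI_Comm_group(MPI_COMM_WORLD, &world);
        /* the difference of a group with itself is the empty group;
           some MPIs hand back the MPI_GROUP_EMPTY handle here */
        MPI_Group_difference(world, world, &diff);
        /* freeing a handle the user legitimately received -- the
           refcount bump at return time is what makes this safe */
        MPI_Group_free(&diff);
        MPI_Group_free(&world);
        MPI_Finalize();
        return 0;
    }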
If the guidelines are made for the BTLs, Sun will handle the uDAPL BTL.
We can also help in testing too.
--td
George Bosilca wrote:
Yes, "us" means UTK. Our math folks are pushing hard for this. I'll
gladly accept any help, even if it's only for testing. For
development, I dispose of some of
Jeff Squyres wrote:
The MPICH guys presented TCP results with THREAD_MULTIPLE at Euro PVM/
MPI and frankly, I was amazed that it worked at all.
I seriously doubt that we're going to advance the state of threading
in the 1.2 series (which is nowhere near as far along as it is in the
1.3 series).
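For context, "working with THREAD_MULTIPLE" means an application like
this gets the level it asked for (a minimal check; any MPI-2
implementation should accept it, even if it grants less):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided;

        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        if (provided < MPI_THREAD_MULTIPLE)
            printf("asked for MPI_THREAD_MULTIPLE, got level %d\n",
                   provided);
        MPI_Finalize();
        return 0;
    }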
I haven't tried to debug it yet, but I am getting the following errors
when building the vt-integration tmp branch on Solaris, so I don't
think the branch is ready for putback yet.
--td
cc -DHAVE_CONFIG_H -I. -I..
At today's OMPI concall meeting it was decided that some bug management
was appropriate. In order to do this we would like the community to do
the following two tasks:
1. Review all 1.2.X bugs under Trac and send email to the devel alias as
to the bugs you think should be fixed for 1.2.5 and beyond.
Brad Penoff wrote:
On Nov 12, 2007 3:26 AM, Jeff Squyres wrote:
I have no objections to bringing this into the trunk, but I agree that
an .ompi_ignore is probably a good idea at first.
I'll try to cook up a commit soon then!
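For anyone unfamiliar with the mechanism: an empty .ompi_ignore file in
a component's directory makes autogen.sh skip that component for
everyone, and listing usernames in an .ompi_unignore file alongside it
re-enables the component just for those developers. E.g. (paths and
names hypothetical):

    touch ompi/mca/btl/newbtl/.ompi_ignore
    echo myusername >> ompi/mca/btl/newbtl/.ompi_unignore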
One question that I'd like to have
George Bosilca wrote:
On Nov 6, 2007, at 8:38 AM, Terry Dontje wrote:
George Bosilca wrote:
If I understand your question correctly, then we don't need any
extension. Each request has a unique ID (from the PERUSE perspective).
However, if I remember well, this is only half implemented in our
Currently, in order to do message tracing, one either has to rely on
some error-prone postprocessing of data or replicate some MPI internals
up in the PMPI layer. It would help Sun's tools group (and I believe U
Dresden as well) if Open MPI would create a couple of APIs that exposed
the following:
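Whatever the exact list, the error-prone status quo Terry mentions --
replicating internals in the PMPI layer -- looks like this in miniature
(a sketch; the only stable "id" available up there is the request
handle itself):

    #include <mpi.h>
    #include <stdio.h>

    int MPI_Isend(void *buf, int count, MPI_Datatype dt, int dest,
                  int tag, MPI_Comm comm, MPI_Request *req)
    {
        int rc = PMPI_Isend(buf, count, dt, dest, tag, comm, req);
        /* log the request handle as an opaque value; correlating it
           across completion paths is where this approach gets ugly */
        printf("isend dest=%d tag=%d req=%p\n",
               dest, tag, (void *)*req);
        return rc;
    }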
Christian Bell wrote:
On Mon, 15 Oct 2007, Brian Barrett wrote:
No! :)
It would be good for everyone to read the Libtool documentation to see
why versioning on the release number would be a really bad idea. Then
comment. But my opinion would be that you should change based on
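For reference, the Libtool scheme being alluded to: -version-info takes
current:revision:age, and the documented procedure per release is:
1. If the library source changed at all, bump revision.
2. If any interfaces were added, removed, or changed, bump current and
set revision to 0.
3. If interfaces were only added, bump age; if any were removed or
changed, set age to 0.
So in an Automake fragment (values hypothetical):

    libfoo_la_LDFLAGS = -version-info 3:0:1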
Will these new scripts be using the same wrapper config files as the
binaries did?
--td
Richard Graham wrote:
What: Change the mpicc/mpicxx/mpif77/mpif90 from being binaries to being
shell scripts
Why: Our build environment assumes that wrapper compilers will use the same
binary format
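For context, the compiled wrappers already read their knowledge from
plain-text data files that a shell script could parse just as well; the
format is roughly like this (field values illustrative, from memory):

    project=Open MPI
    language=C
    compiler=cc
    preprocessor_flags=
    compiler_flags=
    linker_flags=
    libs=-lmpi -lopen-rte -lopen-pal
    includedir=${includedir}
    libdir=${libdir}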
Jeff Squyres wrote:
George has proposed to bring the DDT over from the trunk to the v1.2
branch before v1.2.5 in order to fix some pending bugs.
What does this entail (i.e., does this affect the PML interface at
all)? Also, by saying "before v1.2.5" I am assuming you mean this fix
is to be in a new 1.2 RC.
thanks,
--td
Nikolay and Community,
Sorry to be so late in responding to your email but I've been working
with Pak to determine whether my hasty decision as RM yesterday was
hasty or not. To answer your question, we are still trying to determine
if the message queue support can go in or not and the below
Jeff Squyres wrote:
All organizations should review the README file (particularly the
list of supported systems, etc.) to ensure that it is good-to-go and
accurate for the 1.2.4 release.
Tim posted 1.2.4rc1 yesterday:
http://www.open-mpi.org/software/ompi/v1.2/
I would like to
unless there's something specific I'm after, i.e., benchmarks or
apps I'm using as a benchmark, rather than test suites.
You might look at some of the Purple benchmarks:
http://www.llnl.gov/asci/platforms/purple/rfp/benchmarks/limited/code_list.html
Andrew
Terry Dontje wrote:
What about Sandia and LANL? Is there anything that is run on their
large clusters to confirm things seem to work at high np's?
--td
Jeff Squyres wrote:
Cisco is not yet testing that large, but we plan to shortly start
testing at np>=128 (I'm waiting for an internal cluster within Cisco
to
Li-Ta Lo wrote:
On Thu, 2007-08-30 at 12:25 -0400, terry.don...@sun.com wrote:
Li-Ta Lo wrote:
On Wed, 2007-08-29 at 14:06 -0400, Terry D. Dontje wrote:
hmmm, interesting since my version doesn't abort at all.
Some problem with fortran compiler/language binding? My C translation
doesn't have any problem.
[ollie@exponential ~]$ mpirun -np 4 a.out 10
Target