The MPI Standard (in my opinion) should have avoided the word "buffer". To
me, a "buffer" is something you set aside as scratch space between the
application data structures and the communication calls.
In MPI, the communication is done directly from/to the application's data
structures, so the "buffer" arguments simply name the application's own memory.
For reading the data from an isend buffer to cause problems, the
underlying hardware would need to have a very unusual characteristic that
the MPI implementation is exploiting. People have imagined hardware
characteristics that could make reading an Isend buffer a problem but I
have never heard of such hardware actually existing.
I did not take the time to try to fully understand your approach so this
may sound like a dumb question:
Do you have an MPI_Bcast ROOT process in every MPI_COMM_WORLD and does
every non-ROOT MPI_Bcast call correctly identify the rank of ROOT in its
MPI_COMM_WORLD ?
An MPI_Bcast call in which the ranks do not agree on the root is erroneous and will typically hang or deliver the wrong data.
Sorry -
I missed the statement that all works when you add sleeps. That probably
rules out any possible error in the way MPI_Bcast was used.
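For reference, here is a minimal sketch (my own, with a made-up payload) of the usage that question is probing: every rank, root and non-root alike, must pass the same root argument to MPI_Bcast on the same communicator.

/* Minimal sketch: every rank must agree on the root of the broadcast. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value = 0;
    const int root = 0;                 /* the same value on every rank */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == root)
        value = 42;                     /* only the root holds the data */

    /* Root and non-root ranks make the same call, with the same root
       argument, on the same communicator. */
    MPI_Bcast(&value, 1, MPI_INT, root, MPI_COMM_WORLD);

    printf("rank %d got %d\n", rank, value);
    MPI_Finalize();
    return 0;
}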
Dick Treumann - MPI Team
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363
Randolf
I am confused about using multiple, concurrent mpirun operations. If
there are M uses of mpirun and each starts N tasks (carried out under pvm
or any other way) I would expect you to have M completely independent MPI
jobs with N tasks (processes) each. You could have some root in each of those jobs, but no MPI call in one job can involve tasks of another.
> - yes I know this should not happen, the question is why.
As of MPI 2.2 there is no longer a restriction against read access to a
live send buffer. The wording was changed so that it now prohibits only
"modify" access. You can look at the Communication Modes subsection in
chapter 3, but you will need to compare MPI 2.1 and 2.2 carefully to see
the change.
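A small sketch (mine, not from the thread) of the access pattern this change permits: the sender may read, but still must not modify, the buffer of a pending MPI_Isend until the send is completed.

/* Sketch: read (but do not modify) a live Isend buffer.  Run with 2 ranks. */
#include <mpi.h>
#include <stdio.h>
#define N 1000

int main(int argc, char **argv)
{
    int rank;
    double buf[N], sum = 0.0;
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        for (int i = 0; i < N; i++) buf[i] = (double)i;
        MPI_Isend(buf, N, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &req);

        /* Reading the buffer while the Isend is pending is allowed as of
           MPI 2.2; modifying it before MPI_Wait is still erroneous. */
        for (int i = 0; i < N; i++) sum += buf[i];

        MPI_Wait(&req, MPI_STATUS_IGNORE);
        printf("sum computed while sending: %g\n", sum);
    } else if (rank == 1) {
        MPI_Recv(buf, N, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    MPI_Finalize();
    return 0;
}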
It is hard to imagine how a total data load of 41,943,040 bytes could be a
problem. That is really not much data. By the time the BCAST is done, each
task (except root) will have received a single half-meg message from one
sender. That is not much.
IMB does shift the root, so some tasks may be involved in more than one broadcast at a time.
Network saturation could produce arbitrarily long delays, but the total data load
we are talking about is really small. It is the responsibility of an MPI
library to do one of the following:
1) Use a reliable message protocol for each message (e.g. Infiniband RC or
TCP/IP)
2) detect lost packets and retransmit them
Ashley's observation may apply to an application that iterates on many to
one communication patterns. If the only collective used is MPI_Reduce,
some non-root tasks can get ahead and keep pushing iteration results at
tasks that are nearer the root. This could overload them and cause some
extra buffering and delay at those tasks.
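A sketch of the kind of iteration loop being described (the counts and the do_work stand-in are mine): non-root ranks can return from each MPI_Reduce as soon as their contribution is handed off, so they may run many iterations ahead of the ranks nearer the root.

#include <mpi.h>

/* Stand-in for the real per-iteration computation. */
static double do_work(int iter) { return (double)iter; }

int main(int argc, char **argv)
{
    int rank;
    double local, global;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (int iter = 0; iter < 100000; iter++) {
        local = do_work(iter);
        /* Non-root ranks may return from MPI_Reduce as soon as their
           contribution is passed up the reduction tree; only the root has
           to wait for everyone.  Leaf ranks can therefore race ahead and
           flood the ranks nearer the root with pending contributions. */
        MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        /* An occasional MPI_Barrier here would throttle the fast ranks at
           the cost of extra synchronization. */
    }
    MPI_Finalize();
    return 0;
}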
I was pointing out that most programs have some degree of elastic
synchronization built in. Tasks (or groups or components in a coupled
model) seldom only produce data; they also consume what other tasks produce
and that limits the potential skew.
If step n for a task (or group or coupled component) depends on what its partners produced in step n-1, the skew cannot grow without bound.
Ashley
Can you provide an example of a situation in which these semantically
redundant barriers help?
I may be missing something, but my statement for the textbook would be:
"If adding a barrier to your MPI program makes it run faster, there is
almost certainly a flaw in it that is better solved another way."
> Richard Treumann wrote:
> Hi Ashley
> I understand the problem with descriptor flooding can be serious in
> an application with unidirectional data dependency. Perhaps we have
Tony
You are depending on luck. The MPI Standard allows the implementation to
assume that send and recv buffers are distinct unless MPI_IN_PLACE is
used. Any MPI implementation may have more than one algorithm for a given
MPI collective communication operation and the policy for switching
algorithms may change from release to release.
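A sketch (buffer contents are made up) of the difference: aliasing the send and receive buffers in a collective is erroneous even if it happens to work, while MPI_IN_PLACE is the sanctioned way to reduce into the same buffer.

#include <mpi.h>

int main(int argc, char **argv)
{
    double data[4] = { 1.0, 2.0, 3.0, 4.0 };

    MPI_Init(&argc, &argv);

    /* Erroneous: send and receive buffers alias each other.  It may appear
       to work with one algorithm and break with another.
       MPI_Allreduce(data, data, 4, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);  */

    /* Correct way to reduce into the same buffer: */
    MPI_Allreduce(MPI_IN_PLACE, data, 4, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}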
request_1 and request_2 are just local variable names.
The only thing that determines matching order is collective communication
(CC) issue order on the communicator. At each process, some CC is issued
first and some CC is issued second. The first issued CC at each process
will try to match the first issued CC at every other process.
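A sketch (values are made up) of the issue-order rule for collectives on one communicator: the first collective issued at every rank forms one operation, the second forms the next, regardless of variable names.

/* Run with at least 2 ranks. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, a = 0, b = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) a = 11;
    if (rank == 1) b = 22;

    /* Collectives on a communicator match purely by issue order: the first
       collective issued at each rank forms one operation, the second forms
       the next.  Variable names and request handles play no part. */
    MPI_Bcast(&a, 1, MPI_INT, 0, MPI_COMM_WORLD);   /* first  on this comm */
    MPI_Bcast(&b, 1, MPI_INT, 1, MPI_COMM_WORLD);   /* second on this comm */

    /* If any rank issued these two calls in the opposite order, its first
       call would try to match everyone else's first call, and the run would
       hang or deliver garbage. */
    printf("rank %d: a=%d b=%d\n", rank, a, b);
    MPI_Finalize();
    return 0;
}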
Sorry Richard,
what is CC issue order on the communicator? In particular, what does "CC" mean?
Amb
It sounds like you have more workers than you can keep fed. Workers are
finishing up and requesting their next assignment but sit idle because
there are so many other idle workers too.
Load balance does not really matter if the choke point is the master. The
work is being done as fast as the master can hand it out.
I will add to what Terry said by mentioning that the MPI implementation
has no awareness of ordinary POSIX or Fortran disk I/O routines. It
cannot help on those.
Any automated help the MPI implementation can provide would only apply to
MPI_File_xxx disk I/O. These are implemented by the MPI library itself.
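For contrast, a sketch (file name and sizes are mine) of the MPI_File_xxx style of I/O the library can optimize, because it sees every rank's request:

#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, buf[1024];
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    for (int i = 0; i < 1024; i++) buf[i] = rank;

    /* Collective MPI-IO: the library sees every rank's request and can
       merge and reorder them.  It knows nothing about plain fwrite(). */
    MPI_File_open(MPI_COMM_WORLD, "out.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_write_at_all(fh, (MPI_Offset)rank * sizeof(buf),
                          buf, 1024, MPI_INT, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    MPI_Finalize();
    return 0;
}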
When you use MPI message passing in your application, the MPI library
decides how to deliver the message. The "magic" is simply that when sender
process and receiver process are on the same node (shared memory domain)
the library uses shared memory to deliver the message from process to
process
Re: [OMPI users] a question about [MPI]IO on systems without network filesystem
On Thu, Sep 30, 2010 at 09:00:31AM -0400, Richard Treumann wrote:
> It is possible for MPI-IO to be implemented in a way that lets a single
> process (or a subset of the processes) do all the physical file access.
Brian
Most HPC applications are run with one processor and one working thread
per MPI process. In this case, the node is not being used for other work
so if the MPI process does release a processor, there is nothing else
important for it to do anyway.
In these applications, the blocking MPI call might as well poll the whole time; that gives the fastest response and costs nothing that matters.
Also -
HPC clusters are commonly dedicated to running parallel jobs with exactly
one process per CPU. HPC is about getting computation done and letting a
CPU time slice among competing processes always has overhead (CPU time not
spent on the computation).
Unless you are trying to run extra processes on the node, yielding the CPU buys you nothing.
It seems to me the MPI_Get_processor_name description is too ambiguous to
make this 100% portable. I assume most MPI implementations simply use the
hostname so all processes on the same host will return the same string.
The suggestion would work then.
However, it would also be reasonable for an implementation to return something other than the hostname, and then the trick would not identify co-located processes.
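A sketch of the suggestion being discussed (the gathering scheme is mine): it identifies co-located ranks only under the assumption that MPI_Get_processor_name returns the hostname.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
    char name[MPI_MAX_PROCESSOR_NAME];
    int  len, rank, size, same = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    memset(name, 0, sizeof(name));
    MPI_Get_processor_name(name, &len);

    /* Gather every rank's string and count how many match rank 0's.  This
       identifies co-located ranks ONLY IF the implementation returns the
       hostname, which the standard does not promise. */
    char *all = malloc((size_t)size * MPI_MAX_PROCESSOR_NAME);
    MPI_Allgather(name, MPI_MAX_PROCESSOR_NAME, MPI_CHAR,
                  all,  MPI_MAX_PROCESSOR_NAME, MPI_CHAR, MPI_COMM_WORLD);
    for (int i = 0; i < size; i++)
        if (strcmp(all, all + (size_t)i * MPI_MAX_PROCESSOR_NAME) == 0)
            same++;
    if (rank == 0)
        printf("%d of %d ranks report the same name as rank 0\n", same, size);
    free(all);
    MPI_Finalize();
    return 0;
}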
Hi Ron -
I am well aware of the scaling problems related to the standard send
requirements in MPI. It is a very difficult issue.
However, here is what the standard says: MPI 1.2, page 32 lines 29-37
===
a standard send operation that cannot complete because of lack of buffer
space will merely block, waiting for buffer space to become available or
for a matching receive to be posted.
===
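A sketch (mine, with arbitrary counts and sleep time) of the kind of program this requirement protects: the receiver starts late, and a compliant library must block the standard sends rather than fail once its buffer space is gone.

/* Run with at least 2 ranks. */
#include <mpi.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int rank, size, msg = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        sleep(300);                       /* the receiver starts late */
        for (int i = 0; i < 100000 * (size - 1); i++)
            MPI_Recv(&msg, 1, MPI_INT, MPI_ANY_SOURCE, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    } else {
        /* Standard sends: the library may deliver them eagerly at first,
           but once its buffer space is gone it must block these calls,
           not fail, until rank 0 starts receiving. */
        for (int i = 0; i < 100000; i++)
            MPI_Send(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }
    MPI_Finalize();
    return 0;
}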
Sorry for the typo - the reference is MPI 1.1.
Dick Treumann - MPI Team/TCEM
IBM Systems & Technology Group
Dept 0lva / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363
> On Mon, Feb 04, 2008 at 09:08:45AM -0500, Richard Treumann wrote:
> > To me, the MPI standard is clear that a program like this:
> >
> > task 0:
> > MPI_Init
> > sleep(3000);
> > start receiving messages
> >
> > each of tasks 1 to n-1:
> > MPI_Init
> > loop: send many small messages to task 0
> ... an application that breaks a particular
> MPI implementation. It doesn't necessarily make this implementation
> noncompliant with the standard.
>
> george.
>
> On Feb 4, 2008, at 9:08 AM, Richard Treumann wrote:
>
> > Is what George says accurate? If so, it sounds to me lik
Hi Gleb
There is no misunderstanding of the MPI standard or the definition of
blocking in the bug3 example. Both bug 3 and the example I provided are
valid MPI.
As you say, blocking means the send buffer can be reused when the MPI_Send
returns. This is exactly what bug3 is counting on.
MPI is a r
> Richard,
>
> You're absolutely right. What a shame :) If I had spent less time
> drawing the boxes around the code I might have noticed the typo. The
> Send should be an Isend.
>
>george.
Ron's comments are probably dead on for an application like bug3.
If bug3 is long running and libmpi is doing eager protocol buffer
management as I contend the standard requires then the producers will not
get far ahead of the consumer before they are forced into synchronous send
under the covers, and that throttles them.
Hi slimtimmy
I have been involved in several of the MPI Forum's discussions of how
MPI_Cancel should work and I agree with your interpretation of the
standard. By my reading of the standard, the MPI_Wait must not hang and the
cancel must succeed.
Making an MPI implementation work exactly as the standard describes for MPI_Cancel is, however, genuinely difficult.
Hi Jitendra
Before you worry too much about the inefficiency of using a contiguous
scratch buffer to pack into and send from and a second contiguous scratch
buffer to receive into and unpack from, it would be worth knowing how
OpenMPI processes a discontiguous datatype on your platform.
Gatherin
Robert -
A return from a blocking send means the application send buffer is
available for reuse. If it is a BSEND, the application buffer could be
available because the message data has been copied to the attached buffer
or because the data has been delivered to the destination. The application has no way to tell which.
Hi Robert
Dick Treumann - MPI Team
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363
Vincent
1) Assume you are running an MPI program which has 16 tasks in
MPI_COMM_WORLD, you have 16 dedicated CPUs and each task is single
threaded. (a task is a distinct process, a process can contain one or more
threads.) This is the most common traditional model. In this model, when a
task makes a blocking MPI call, there is nothing else useful for its dedicated CPU to do.
No - it is not guaranteed. (it is highly probable though)
The return from the MPI_Send only guarantees that the data is safely held
somewhere other than the send buffer so you are free to modify the send
buffer. The MPI standard does not say where the data is to be held. It only
says that once the send returns, you are free to reuse the send buffer.
> ... counters across all processes and determine if
> any are outstanding. It could be accomplished with a single
> MPI_Reduce(sent - received).
As far as I can see, Jeff's analysis is dead on. The matching order at P2
is based on the order in which the envelopes from P0 and P1 show up at P2.
The Barrier does not force an order between the communication paths P0->P2
vs. P1->P2.
The MPI standard does not even say what "show up" means, so no ordering between the two paths can be assumed.
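A sketch (ranks and values are mine) of the point: the barrier orders the two send calls, but not the arrival of their envelopes at P2, so either message may match P2's first wildcard receive.

/* Run with at least 3 ranks. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, a = 0, b = 0, x;
    MPI_Request req[2];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 2) {
        /* Two wildcard receives, posted before the barrier. */
        MPI_Irecv(&a, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &req[0]);
        MPI_Irecv(&b, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &req[1]);
    }
    if (rank == 0) {
        x = 100;
        MPI_Send(&x, 1, MPI_INT, 2, 0, MPI_COMM_WORLD);    /* before barrier */
    }
    MPI_Barrier(MPI_COMM_WORLD);
    if (rank == 1) {
        x = 200;
        MPI_Send(&x, 1, MPI_INT, 2, 0, MPI_COMM_WORLD);    /* after barrier  */
    }
    if (rank == 2) {
        /* The barrier orders the two MPI_Send calls, not the arrival of
           their envelopes at rank 2, so either message may legally match
           the first wildcard receive: a may end up 100 or 200. */
        MPI_Waitall(2, req, MPI_STATUSES_IGNORE);
        printf("a=%d b=%d\n", a, b);
    }
    MPI_Finalize();
    return 0;
}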
Guess I should have kept quiet a bit longer. As I recall we had already
seen a counterexample to Jeff's stronger statement and that motivated my
narrower one.
If there are no wildcard receives - every MPI_Barrier call is
semantically irrelevant.
Do you have a counterexample to that?
Dennis
In MPI, you must complete every MPI_Isend by MPI_Wait on the request handle
(or a variant like MPI_Waitall or MPI_Test that returns TRUE). An
un-completed MPI_Isend leaves resources tied up.
I do not know what symptom to expect from OpenMPI with this particular
application error, but the usual result is a slow leak of requests and the memory behind them.
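A minimal sketch of the required pattern: every MPI_Isend is eventually completed with MPI_Wait (or a variant).

#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, msg;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    msg = rank;

    if (rank != 0) {
        MPI_Request req;
        MPI_Isend(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
        /* ... overlap useful computation here ... */
        /* Every MPI_Isend must be completed; otherwise the request and the
           resources behind it stay tied up for the life of the job. */
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    } else {
        for (int i = 1; i < size; i++)
            MPI_Recv(&msg, 1, MPI_INT, i, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
    }
    MPI_Finalize();
    return 0;
}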
It is dangerous to hold a local lock (like a mutex) across a blocking MPI
call unless you can be 100% sure everything that must happen remotely will
be completely independent of what is done with local locks & communication
dependencies on other tasks.
It is likely that an MPI_Comm_spawn call in which other threads or tasks must participate will deadlock if one of them is blocked on that lock.
> "You would need to have a different input communicator for each thread
> that will make an MPI_Comm_spawn call" - I am confused with the term
> "single task communicator"
MPI standard compliant management of eager send requires that this program
work. There is nothing that says "unless eager limit is set too high/low."
Honoring this requirement in an MPI implementation can be costly. There are
practical reasons to pass up this requirement because most applications never stress it.
If you are hoping for a return on timeout, almost zero CPU use while
waiting and fast response you will need to be pretty creative. Here is a
simple solution that may be OK if you do not need both fast response and
low CPU load.
flag = 0;
for ( ; !flag && !is_time_up(); )
    MPI_Test( &request, &flag, MPI_STATUS_IGNORE );
The need for a "better" timeout depends on what else there is for the CPU
to do.
If you get creative and shift from {99% MPI_WAIT , 1% OS_idle_process} to
{1% MPI_Wait, 99% OS_idle_process} at a cost of only a few extra
microseconds added lag on MPI_Wait, you may be pleased by the CPU load
statistics.
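One way to get roughly that trade-off (a sketch; the 1 ms interval is arbitrary and is_time_up is the caller's deadline check from the earlier sketch):

#include <mpi.h>
#include <unistd.h>   /* usleep */

/* Wait for req to complete or for the caller's deadline to pass, napping
   about 1 ms between polls so the CPU is free for other work.
   Returns 1 if the request completed, 0 if the deadline expired. */
static int wait_with_timeout(MPI_Request *req, int (*is_time_up)(void))
{
    int flag = 0;
    while (!flag && !is_time_up()) {
        MPI_Test(req, &flag, MPI_STATUS_IGNORE);
        if (!flag)
            usleep(1000);   /* ~1 ms of extra lag buys a nearly idle CPU */
    }
    return flag;
}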
I cannot resist:
Jaison -
The MPI_Comm_spawn call specifies what you want to have happen. The child
launch is what does happen.
If we can come up with a way to have things happen correctly before we know
what it is that we want to have happen, the heck with this HPC stuff. Let's
get together and
Tim
MPI is a library providing support for passing messages among several
distinct processes. It offers datatype constructors that let an
application describe complex layouts of data in the local memory of a
process so a message can be sent from a complex data layout or received
into a complex layout without intermediate packing by the application.
The caller of MPI_INIT_THREAD says what level of thread safety he would
like to get from the MPI implementation. The implementation says what level
of thread safety it provides.
The implementation is free to provide more or less thread safety than
requested. The caller of MPI_INIT_THREAD should check the provided level and adapt to it.
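A sketch of that handshake: ask for one level, check what was provided, and adapt.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;

    /* Ask for full thread safety; the library answers with what it has. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    if (provided < MPI_THREAD_MULTIPLE) {
        /* Adapt: funnel all MPI calls through one thread, or protect them
           with a lock, depending on whether provided is FUNNELED or
           SERIALIZED. */
        printf("wanted MPI_THREAD_MULTIPLE, got level %d\n", provided);
    }
    MPI_Finalize();
    return 0;
}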
> Richard Treumann wrote:
> If the application will make MPI calls from multiple threads and
> MPI_INIT_THREAD has returned FUNNELED, the application must be
> willing to take the steps that ensure there will never be concurrent
> calls to MPI from the threads. The threads will take
A call to MPI_Init allows the MPI library to return any level of thread
support it chooses. This MPI 1.1 call does not let the application say what
it wants and does not let the implementation reply with what it can
guarantee.
If you are using only one MPI implementation and your code will never be ported, you can get away with whatever thread support that implementation happens to provide.
The program Jonathan offers as an example is valid use of MPI standard
send. With this message size it is fair to assume the implementation is
doing standard send with an eager send. The MPI standard is explicit about
how eager send, as an under-the-covers option for standard send, must work.
When the
abc def
When the parent does a spawn call, it presumably blocks until the child
tasks have called MPI_Init. The standard allows some flexibility on this
but at least after spawn, the spawn side must be able to issue
communication calls involving the children and expect them to work.
What you se
abc def
The MPI_Barrier call in the parent must be on the intercomm returned by the
spawn. The call in the children must be on the intercomm returned by
MPI_Comm_get_parent.
The semantic of an MPI_Barrier call on an intercomm is:
No MPI_Barrier caller in the local group returns until all members of the
remote group have made their MPI_Barrier calls.
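A sketch of the matching pair of calls (the child executable name and process count are made up):

/* parent.c */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Comm children;

    MPI_Init(&argc, &argv);
    MPI_Comm_spawn("child_prog", MPI_ARGV_NULL, 4, MPI_INFO_NULL, 0,
                   MPI_COMM_WORLD, &children, MPI_ERRCODES_IGNORE);
    /* Intercomm barrier: no parent returns until every child has entered
       the matching barrier, and no child returns until every parent has. */
    MPI_Barrier(children);
    MPI_Finalize();
    return 0;
}

/* child.c -- built as the separate executable named above */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Comm parent;

    MPI_Init(&argc, &argv);
    MPI_Comm_get_parent(&parent);   /* intercomm back to the spawning job */
    MPI_Barrier(parent);            /* matches the parent's barrier       */
    MPI_Finalize();
    return 0;
}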
I do not know what the OpenMPI message looks like or why people want to
hide it. It should be phrased to avoid any implication of a problem with
OpenMPI itself.
How about something like this:
"The application has called MPI_Abort. Open MPI is terminating the
application as the application requested."
Why should any software system offer an option which lets the user hide
all distinction between a run that succeeded and one that failed?
Dick Treumann - MPI Team
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363
The MPI standard says that MPI_Abort makes a "best effort". It also says
that an MPI implementation is free to lose the value passed into MPI_Abort
and deliver some other return code.
The standard does not say that MPI_Abort becomes a valid way to end a
parallel job if it is passed a zero.
To me it seems wrong to treat MPI_Abort with a zero argument as a normal way to end a job.
Assume your data is discontiguous in memory and making it contiguous is
not practical (e.g. there is no way to make cells of a row and cells of a
column both contiguous.) You have 3 options:
1) Use many small/contiguous messages
2) Allocate scratch space and pack/unpack
3) Use a derived datatype (see the sketch below)
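A sketch of option 3 for one common case, sending a column of a row-major matrix (dimensions are made up): a strided derived datatype avoids both the many small messages of option 1 and the scratch-buffer copies of option 2.

#include <mpi.h>
#define ROWS 100
#define COLS 80

/* Send column 'col' of a row-major ROWS x COLS matrix without packing:
   ROWS blocks of one double, strided COLS doubles apart. */
void send_column(double matrix[ROWS][COLS], int col, int dest, MPI_Comm comm)
{
    MPI_Datatype column;

    MPI_Type_vector(ROWS, 1, COLS, MPI_DOUBLE, &column);
    MPI_Type_commit(&column);
    MPI_Send(&matrix[0][col], 1, column, dest, 0, comm);
    MPI_Type_free(&column);
}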
An MPI send (of any kind) is defined by "local completion semantics".
When a send is complete, the send buffer may be reused. The only kind of
send that gives any indication whether the receive is posted is the
synchronous send. Neither standard send nor buffered send tells the sender
whether the receive has been posted.
Bsend does not guarantee to use the attached buffer, and a return from
MPI_Ibsend does not guarantee you can modify the application send buffer.
The implementation might try to optimize by scheduling a nonblocking
send from the application buffer that bypasses the copy to the attached
buffer.
If someone is deciding whether to use complex datatypes or stick with
contiguous ones, they need to look at their own situation. There is no
simple answer. The only thing that is fully predictable is that an MPI
operation, measured in isolation, will be no slower with contiguous data
than with discontiguous data.
The MPI standard requires that when there is a free-running task posting
isends to a task that is not keeping up on receives, the sending task will
switch to synchronous isend BEFORE the receive side runs out of memory and
fails.
There should be no need for the sender to use MPI_Issend, because the library must apply that throttling itself.
One difference is that putting a blocking send before the irecv is a
classic "unsafe" MPI program. It depends on eager send buffering to
complete the MPI_Send so the MPI_Irecv can be posted. The example with
MPI_Send first would be allowed to hang.
The original program is correct and safe MPI.
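A sketch of the two orderings being contrasted (function and parameter names are mine):

#include <mpi.h>

/* Pairwise exchange with 'peer'; buffers and count belong to the caller. */

void exchange_unsafe(double *sbuf, double *rbuf, int n, int peer, MPI_Comm comm)
{
    /* Both ranks send first.  Each MPI_Send may wait for a matching receive,
       so this only completes if the library buffers the message eagerly.
       The standard allows it to hang. */
    MPI_Send(sbuf, n, MPI_DOUBLE, peer, 0, comm);
    MPI_Recv(rbuf, n, MPI_DOUBLE, peer, 0, comm, MPI_STATUS_IGNORE);
}

void exchange_safe(double *sbuf, double *rbuf, int n, int peer, MPI_Comm comm)
{
    /* Post the receive first; the peer's send then always has a matching
       receive waiting, and no eager buffering is needed. */
    MPI_Request req;
    MPI_Irecv(rbuf, n, MPI_DOUBLE, peer, 0, comm, &req);
    MPI_Send(sbuf, n, MPI_DOUBLE, peer, 0, comm);
    MPI_Wait(&req, MPI_STATUS_IGNORE);
}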
I am not 100% sure I understand your situation. Is it this?
Process A has an ongoing stream of inputs. For each input unit, A does some
processing and then passes on work to B via a message. B receives the
message from A and does some additional work before sending a message to C.
C receives the message from B and finishes the work on that unit.
Hi George
I have run into the argument that in a case where the number of array
elements that will be accessed is == 0 it is "obviously" valid to pass NULL
as the array address. I recognize the argument has merit but I am not clear
that it really requires an MPI implementation that tries to validate its arguments to accept a NULL address.
Jeff paraphrased an unnamed source as suggesting that: "any MPI program
that relies on a barrier for correctness is an incorrect MPI application."
That is probably too strong.
How about this assertion?
If there are no wildcard receives - every MPI_Barrier call is semantically
irrelevant.
It
There is no synchronization operation in MPI that promises all tasks will
exit at the same time. For MPI_Barrier they will exit as close to the same
time as the implementation can reasonably support but as long as the
application is distributed and there are delays in the interconnect, it is
not possible to guarantee that all tasks exit at the same instant.
You can use MPI_REQUEST_GET_STATUS as a way to "test" without
deallocation.
I do not understand the reason you would forward the request (as a request)
to another function. The data is already in a specific receive buffer by
the time an MPI_Test returns TRUE, so calling the function and passing it the buffer would be more natural.
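A sketch of the MPI_REQUEST_GET_STATUS alternative (the helper name is mine): it reports completion without freeing the request, so the request can still be waited on later.

#include <mpi.h>

/* Report whether 'req' has completed without freeing it, so the request can
   still be passed along and completed with MPI_Wait later.  (MPI_Test, by
   contrast, deallocates the request once it returns TRUE.) */
int peek_done(MPI_Request req)
{
    int flag;
    MPI_Status status;

    MPI_Request_get_status(req, &flag, &status);   /* request left intact */
    return flag;
}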
Santolo
The MPI standard defines reduction operations where the operand/operation
pair has a meaningful semantic. I cannot picture a well-defined semantic
for:
999.0 BXOR 0.009
Maybe you can, but it is not an error that the MPI standard leaves out
BXOR on floating point types.
Tee Wen Kai -
You asked "Just to find out more about the consequences for exiting MPI
processes without calling MPI_Finalize, will it cause memory leak or other
fatal problem?"
Be aware that Jeff has offered you an Open MPI implementation-oriented
answer rather than an MPI-standard-oriented answer.