Re: [OMPI users] EXTERNAL: Re: unacceptable latency in gathering process

2012-10-03 Thread Ralph Castain
Hmmm...you probably can't without digging down into the diagnostics.

Perhaps we could help more if we had some idea how you are measuring this 
"latency". I ask because that is orders of magnitude worse than anything we 
measure - so I suspect the problem is in your app (i.e., that the time you are 
measuring is actually how long it takes you to get around to processing a 
message that was received some time ago).
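
A minimal sketch of one way to take that measurement (illustrative only, not code from this thread): log MPI_Wtime() on the sender immediately before MPI_Send, and on the gatherer immediately after MPI_Test first reports completion. It assumes rank 1 plays the SP and rank 0 the GP, and that the node clocks are close enough that relative deltas are meaningful.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, buf = 42;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 1) {                        /* "SP" side: timestamp the send */
        double t_send = MPI_Wtime();
        MPI_Send(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        printf("rank 1: MPI_Send called at %.6f s\n", t_send);
    } else if (rank == 0) {                 /* "GP" side: timestamp the pickup */
        MPI_Request req;
        MPI_Status  status;
        int flag = 0;
        MPI_Irecv(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
        while (!flag)                       /* poll, as in the controller loop below */
            MPI_Test(&req, &flag, &status);
        printf("rank 0: MPI_Test saw the message at %.6f s\n", MPI_Wtime());
    }
    MPI_Finalize();
    return 0;
}

Comparing the send timestamp with the pickup timestamp (after accounting for any clock offset) separates time spent inside the MPI library from time the application spends before getting back to its MPI_Test loop.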


On Oct 3, 2012, at 11:52 AM, "Hodge, Gary C"  wrote:

> How do I tell the difference between when the message was received and when 
> the message was picked up in MPI_Test?
>  
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On 
> Behalf Of Ralph Castain
> Sent: Wednesday, October 03, 2012 1:00 PM
> To: Open MPI Users
> Subject: EXTERNAL: Re: [OMPI users] unacceptable latency in gathering process
>  
> Out of curiosity, have you logged the time when the SP called "send" and 
> compared it to the time when the message was received, and when that message 
> is picked up in MPI_Test? In other words, have you actually verified that the 
> delay is in the MPI library as opposed to in your application?
>  
>  
> On Oct 3, 2012, at 9:40 AM, "Hodge, Gary C"  wrote:
> 
> 
> Hi all,
> I am running on an IBM BladeCenter, using Open MPI 1.4.1, and opensm subnet 
> manager for Infiniband
>  
> Our application has real time requirements and it has recently been proven 
> that it does not scale to meet future requirements.
> Presently, I am re-organizing the application to process work in a more 
> parallel manner than it does now.
>  
> Jobs arrive at the rate of 200 per second and are sub-divided into groups of 
> objects by a master process (MP) on its own node.
> The MP then assigns the object groups to 20 slave processes (SP), each 
> running on their own node, to do the expensive computational work in parallel.
> The SPs then send their results to a gatherer process (GP) on its own node 
> that merges the results for the job and sends it onward for final processing.
> The highest latency for the last 1024 jobs that were processed is then 
> written to a log file that is displayed by a GUI.
> Each process uses the same controller method for sending and  receiving 
> messages as follows:
>  
> For (each CPU that sends us input)
> {
> MPI_Irecv(….)
> }
>  
> While (true)
> {
> For (each CPU that sends us input)
> {
> MPI_Test(….)
> If (message was received)
> {
> Copy the message
> Queue the copy to our input queue
> MPI_Irecv(…)
> }
> }
> If (there are messages on our input queue)
> {
> … process the FIRST message on queue (this may queue messages 
> for output) ….
>  
> For (each message on our output queue)
> {
> MPI_Send(…)
> }
> } 
> }
>  
> My problem is that I do not meet our application's performance requirements 
> for a job (~ 20 ms) until I reduce the number of SPs from 20 to 4 or less.
> I added some debug into the GP and found that there are never more than 14 
> messages received in the for loop that calls MPI_Test.
> The messages that were sent from the other 6 SPs will eventually arrive at 
> the GP in a long stream after experiencing high latency (over 600 ms).
>  
> Going forward, we need to handle more objects per job and will need to have 
> more than 4 SPs to keep up.
> My thought is that I have to obey this 4 SPs to 1 GP ratio and create 
> intermediate GPs to gather results from every 4 slaves.
>  
> Is this a contention problem at the GP?
> Is there debugging or logging I can turn on in the MPI to prove that 
> contention is occurring?
> Can I configure MPI receive processing to improve upon the 4 to 1 ratio?
> Can I improve the controller method (listed above) to gain a performance 
> improvement?
>  
> Thanks for any suggestions.
> Gary Hodge
>  
>  



Re: [OMPI users] EXTERNAL: Re: unacceptable latency in gathering process

2012-10-03 Thread Hodge, Gary C
How do I tell the difference between when the message was received and when the 
message was picked up in MPI_Test?

From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Ralph Castain
Sent: Wednesday, October 03, 2012 1:00 PM
To: Open MPI Users
Subject: EXTERNAL: Re: [OMPI users] unacceptable latency in gathering process

Out of curiosity, have you logged the time when the SP called "send" and 
compared it to the time when the message was received, and when that message is 
picked up in MPI_Test? In other words, have you actually verified that the 
delay is in the MPI library as opposed to in your application?


On Oct 3, 2012, at 9:40 AM, "Hodge, Gary C" 
> wrote:


Hi all,
I am running on an IBM BladeCenter, using Open MPI 1.4.1, and opensm subnet 
manager for Infiniband

Our application has real time requirements and it has recently been proven that 
it does not scale to meet future requirements.
Presently, I am re-organizing the application to process work in a more 
parallel manner than it does now.

Jobs arrive at the rate of 200 per second and are sub-divided into groups of 
objects by a master process (MP) on its own node.
The MP then assigns the object groups to 20 slave processes (SP), each running 
on their own node, to do the expensive computational work in parallel.
The SPs then send their results to a gatherer process (GP) on its own node that 
merges the results for the job and sends it onward for final processing.
The highest latency for the last 1024 jobs that were processed is then written 
to a log file that is displayed by a GUI.
Each process uses the same controller method for sending and  receiving 
messages as follows:

For (each CPU that sends us input)
{
MPI_Irecv()
}

While (true)
{
For (each CPU that sends us input)
{
MPI_Test()
If (message was received)
{
Copy the message
Queue the copy to our input queue
MPI_Irecv(...)
}
}
If (there are messages on our input queue)
{
... process the FIRST message on queue (this may queue messages 
for output) 

For (each message on our output queue)
{
MPI_Send(...)
}
}
}

My problem is that I do not meet our application's performance requirements for 
a job (~ 20 ms) until I reduce the number of SPs from 20 to 4 or less.
I added some debug into the GP and found that there are never more than 14 
messages received in the for loop that calls MPI_Test.
The messages that were sent from the other 6 SPs will eventually arrive at the 
GP in a long stream after experiencing high latency (over 600 ms).

Going forward, we need to handle more objects per job and will need to have 
more than 4 SPs to keep up.
My thought is that I have to obey this 4 SPs to 1 GP ratio and create 
intermediate GPs to gather results from every 4 slaves.

Is this a contention problem at the GP?
Is there debugging or logging I can turn on in the MPI to prove that contention 
is occurring?
Can I configure MPI receive processing to improve upon the 4 to 1 ratio?
Can I improve the controller method (listed above) to gain a performance 
improvement?

Thanks for any suggestions.
Gary Hodge





Re: [OMPI users] Need solution- nodes can't find the paths.

2012-10-03 Thread Jeff Squyres
This list is intended for Open MPI support, not general Linux cluster support.  
You might be able to get more detailed help from other forums and/or your local 
cluster support admin / vendor.

Thanks!


On Oct 3, 2012, at 6:58 AM, Syed Ahsan Ali wrote:

> Thanks, John, for the detailed procedure. The fstab thing was in my mind, but I 
> was not sure how to make it happen on the compute nodes. I'll try this and let 
> you know.
>  
> Actually, the cluster and SAN were deployed by a local vendor of Dell, and they 
> are not very sure about this.
> 
> On Wed, Oct 3, 2012 at 3:49 PM, John Hearns  wrote:
> If I may ask, which company installed this cluster for you?
> Surely they will advise on how to NFS mount the storage on the compute nodes?
> 
> 
> 
> -- 
> Syed Ahsan Ali Bokhari 
> Electronic Engineer (EE)
> 
> Research & Development Division
> Pakistan Meteorological Department H-8/4, Islamabad.
> Phone # off  +92518358714
> Cell # +923155145014
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] unacceptable latency in gathering process

2012-10-03 Thread Ralph Castain
Out of curiosity, have you logged the time when the SP called "send" and 
compared it to the time when the message was received, and when that message is 
picked up in MPI_Test? In other words, have you actually verified that the 
delay is in the MPI library as opposed to in your application?


On Oct 3, 2012, at 9:40 AM, "Hodge, Gary C"  wrote:

> Hi all,
> I am running on an IBM BladeCenter, using Open MPI 1.4.1, and opensm subnet 
> manager for Infiniband
>  
> Our application has real time requirements and it has recently been proven 
> that it does not scale to meet future requirements.
> Presently, I am re-organizing the application to process work in a more 
> parallel manner than it does now.
>  
> Jobs arrive at the rate of 200 per second and are sub-divided into groups of 
> objects by a master process (MP) on its own node.
> The MP then assigns the object groups to 20 slave processes (SP), each 
> running on their own node, to do the expensive computational work in parallel.
> The SPs then send their results to a gatherer process (GP) on its own node 
> that merges the results for the job and sends it onward for final processing.
> The highest latency for the last 1024 jobs that were processed is then 
> written to a log file that is displayed by a GUI.
> Each process uses the same controller method for sending and  receiving 
> messages as follows:
>  
> For (each CPU that sends us input)
> {
> MPI_Irecv(….)
> }
>  
> While (true)
> {
> For (each CPU that sends us input)
> {
> MPI_Test(….)
> If (message was received)
> {
> Copy the message
> Queue the copy to our input queue
> MPI_Irecv(…)
> }
> }
> If (there are messages on our input queue)
> {
> … process the FIRST message on queue (this may queue messages 
> for output) ….
>  
> For (each message on our output queue)
> {
> MPI_Send(…)
> }
> } 
> }
>  
> My problem is that I do not meet our application's performance requirements 
> for a job (~ 20 ms) until I reduce the number of SPs from 20 to 4 or less.
> I added some debug into the GP and found that there are never more than 14 
> messages received in the for loop that calls MPI_Test.
> The messages that were sent from the other 6 SPs will eventually arrive at 
> the GP in a long stream after experiencing high latency (over 600 ms).
>  
> Going forward, we need to handle more objects per job and will need to have 
> more than 4 SPs to keep up.
> My thought is that I have to obey this 4 SPs to 1 GP ratio and create 
> intermediate GPs to gather results from every 4 slaves.
>  
> Is this a contention problem at the GP?
> Is there debugging or logging I can turn on in the MPI to prove that 
> contention is occurring?
> Can I configure MPI receive processing to improve upon the 4 to 1 ratio?
> Can I improve the controller method (listed above) to gain a performance 
> improvement?
>  
> Thanks for any suggestions.
> Gary Hodge
>  
>  



[OMPI users] unacceptable latency in gathering process

2012-10-03 Thread Hodge, Gary C
Hi all,
I am running on an IBM BladeCenter, using Open MPI 1.4.1, and opensm subnet 
manager for Infiniband

Our application has real time requirements and it has recently been proven that 
it does not scale to meet future requirements.
Presently, I am re-organizing the application to process work in a more 
parallel manner than it does now.

Jobs arrive at the rate of 200 per second and are sub-divided into groups of 
objects by a master process (MP) on its own node.
The MP then assigns the object groups to 20 slave processes (SP), each running 
on their own node, to do the expensive computational work in parallel.
The SPs then send their results to a gatherer process (GP) on its own node that 
merges the results for the job and sends it onward for final processing.
The highest latency for the last 1024 jobs that were processed is then written 
to a log file that is displayed by a GUI.
Each process uses the same controller method for sending and  receiving 
messages as follows:

For (each CPU that sends us input)
{
MPI_Irecv()
}

While (true)
{
For (each CPU that sends us input)
{
MPI_Test()
If (message was received)
{
Copy the message
Queue the copy to our input queue
MPI_Irecv(...)
}
}
If (there are messages on our input queue)
{
... process the FIRST message on queue (this may queue messages 
for output) 

For (each message on our output queue)
{
MPI_Send(...)
}
}
}
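
For reference, a compilable C rendering of this controller loop is sketched below. The fixed message length, the single message per SP, the 16-entry queue, and the stubbed-out "process"/"send output" steps are illustrative assumptions, not details taken from the description above.

#include <mpi.h>
#include <stdio.h>
#include <string.h>

#define MSG_LEN   64
#define MAX_PEERS 16          /* assumption: at most 16 sending ranks */

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    if (rank != 0) {                              /* SP: send one result, then quit */
        char msg[MSG_LEN];
        snprintf(msg, MSG_LEN, "result from rank %d", rank);
        MPI_Send(msg, MSG_LEN, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
    } else {                                      /* GP: the controller loop */
        int nsenders = (nprocs - 1 < MAX_PEERS) ? nprocs - 1 : MAX_PEERS;
        char bufs[MAX_PEERS][MSG_LEN], queue[MAX_PEERS][MSG_LEN];
        MPI_Request reqs[MAX_PEERS];
        int qhead = 0, qtail = 0, received = 0;

        for (int i = 0; i < nsenders; i++)        /* pre-post one Irecv per sender */
            MPI_Irecv(bufs[i], MSG_LEN, MPI_CHAR, i + 1, 0,
                      MPI_COMM_WORLD, &reqs[i]);

        while (received < nsenders || qhead != qtail) {
            for (int i = 0; i < nsenders; i++) {  /* poll every sender */
                int flag = 0;
                MPI_Status st;
                if (reqs[i] == MPI_REQUEST_NULL)
                    continue;                     /* already completed, nothing re-posted */
                MPI_Test(&reqs[i], &flag, &st);
                if (flag) {                       /* copy the message, queue the copy */
                    memcpy(queue[qtail++], bufs[i], MSG_LEN);
                    received++;
                    /* a long-running GP would re-post MPI_Irecv(bufs[i], ...) here */
                }
            }
            if (qhead != qtail) {                 /* process the FIRST queued message */
                printf("GP processing: %s\n", queue[qhead++]);
                /* ...and MPI_Send any produced output messages here... */
            }
        }
    }
    MPI_Finalize();
    return 0;
}

When run with, say, "mpirun -np 5 ./controller" (binary name hypothetical), rank 0 prints the four SP results in whatever order MPI_Test happens to discover them, which is the behaviour being investigated here.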

My problem is that I do not meet our application's performance requirements for 
a job (~ 20 ms) until I reduce the number of SPs from 20 to 4 or less.
I added some debug into the GP and found that there are never more than 14 
messages received in the for loop that calls MPI_Test.
The messages that were sent from the other 6 SPs will eventually arrive at the 
GP in a long stream after experiencing high latency (over 600 ms).

Going forward, we need to handle more objects per job and will need to have 
more than 4 SPs to keep up.
My thought is that I have to obey this 4 SPs to 1 GP ratio and create 
intermediate GPs to gather results from every 4 slaves.

Is this a contention problem at the GP?
Is there debugging or logging I can turn on in the MPI to prove that contention 
is occurring?
Can I configure MPI receive processing to improve upon the 4 to 1 ratio?
Can I improve the controller method (listed above) to gain a performance 
improvement?

Thanks for any suggestions.
Gary Hodge




Re: [OMPI users] one more problem with process bindings on openmpi-1.6.2

2012-10-03 Thread Ralph Castain

On Oct 3, 2012, at 8:40 AM, Siegmar Gross 
 wrote:

> Hi,
> 
>> As I said, in the absence of a hostfile, -host assigns ONE slot for
>> each time a host is named. So the equivalent hostfile would have
>> "slots=1" to create the same pattern as your -host cmd line.
> 
> That would mean that a hostfile has nothing to do with the underlying
> hardware and that it would be a mystery to find out how to set it up.

That's correct - an unfortunate aspect of using hostfiles. This is one of the 
big motivations for the changes in 1.7 and beyond.

> Now I found a different solution so that I'm a little bit satisfied that
> I don't need a different hostfile for every "mpiexec" command. I
> sorted the output and removed the output from "hostname" so that
> everything is more readable. Is the keyword "sockets" available in
> openmpi-1.7 and openmpi-1.9 as well?

No - it is no longer required with 1.7 and beyond because we now have the 
ability to directly sense the hardware, so we no longer need users to tell us.


> 
> tyr fd1026 252 cat host_sunpc0_1  
>   
> sunpc0 sockets=2 slots=4
> sunpc1 sockets=2 slots=4
> 
> tyr fd1026 253 mpiexec -report-bindings -hostfile host_sunpc0_1 \
>  -np 4 -npersocket 1 -cpus-per-proc 2 -bynode -bind-to-core hostname
> [sunpc0:12641] MCW rank 0 bound to socket 0[core 0-1]: [B B][. .]
> [sunpc1:01402] MCW rank 1 bound to socket 0[core 0-1]: [B B][. .]
> [sunpc0:12641] MCW rank 2 bound to socket 1[core 0-1]: [. .][B B]
> [sunpc1:01402] MCW rank 3 bound to socket 1[core 0-1]: [. .][B B]
> 
> tyr fd1026 254 mpiexec -report-bindings -host sunpc0,sunpc1 \
>  -np 4 -cpus-per-proc 2 -bind-to-core -bysocket hostname
> [sunpc0:12676] MCW rank 0 bound to socket 0[core 0-1]: [B B][. .]
> [sunpc1:01437] MCW rank 1 bound to socket 0[core 0-1]: [B B][. .]
> [sunpc0:12676] MCW rank 2 bound to socket 1[core 0-1]: [. .][B B]
> [sunpc1:01437] MCW rank 3 bound to socket 1[core 0-1]: [. .][B B]
> 
> tyr fd1026 258 mpiexec -report-bindings -hostfile host_sunpc0_1 \
>  -np 2 -npernode 1 -cpus-per-proc 4 -bind-to-core hostname
> [sunpc0:12833] MCW rank 0 bound to socket 0[core 0-1]
>   socket 1[core 0-1]: [B B][B B]
> [sunpc1:01561] MCW rank 1 bound to socket 0[core 0-1]
>   socket 1[core 0-1]: [B B][B B]
> 
> tyr fd1026 259 mpiexec -report-bindings -host sunpc0,sunpc1 \
>  -np 2 -cpus-per-proc 4 -bind-to-core hostname
> [sunpc0:12869] MCW rank 0 bound to socket 0[core 0-1]
>   socket 1[core 0-1]: [B B][B B]
> [sunpc1:01600] MCW rank 1 bound to socket 0[core 0-1]
>   socket 1[core 0-1]: [B B][B B]
> 
> 
> Thank you very much for your answers and your time. I have learned
> a lot about process bindings through our discussion. Now I'm waiting
> for a bug fix for my problem with rankfiles. :-))
> 
> 
> Kind regards
> 
> Siegmar
> 
> 
> 
>> On Oct 3, 2012, at 7:12 AM, Siegmar Gross 
>  wrote:
>> 
>>> Hi,
>>> 
>>> I thought that "slot" is the smallest manageable entity so that I
>>> must set "slot=4" for a dual-processor dual-core machine with one
>>> hardware-thread per core. Today I learned about the new keyword
>>> "sockets" for a hostfile (I didn't find it in "man orte_hosts").
>>> How would I specify a system with two dual-core processors so that
>>> "mpiexec -report-bindings -hostfile host_sunpc0_1 -np 4 
>>> -cpus-per-proc 2 -bind-to-core hostname" or even
>>> "mpiexec -report-bindings -hostfile host_sunpc0_1 -np 2 
>>> -cpus-per-proc 4 -bind-to-core hostname" would work in the same way
>>> as the commands below.
>>> 
>>> tyr fd1026 217 mpiexec -report-bindings -host sunpc0,sunpc1 -np 2 \
>>> -cpus-per-proc 4 -bind-to-core hostname
>>> [sunpc0:11658] MCW rank 0 bound to socket 0[core 0-1]
>>> socket 1[core 0-1]: [B B][B B]
>>> sunpc0
>>> [sunpc1:00553] MCW rank 1 bound to socket 0[core 0-1]
>>> socket 1[core 0-1]: [B B][B B]
>>> sunpc1
>>> 
>>> 
>>> Thank you very much for your help in advance.
>>> 
>>> 
>>> Kind regards
>>> 
>>> Siegmar
>>> 
>>> 
>>> 
> I recognized another problem with process bindings. The command
> works if I use "-host" but breaks if I use "-hostfile" with 
> the same machines.
> 
> tyr fd1026 178 mpiexec -report-bindings -host sunpc0,sunpc1 -np 4 \
> -cpus-per-proc 2 -bind-to-core hostname
> sunpc1
> [sunpc1:00086] MCW rank 1 bound to socket 0[core 0-1]: [B B][. .]
> [sunpc1:00086] MCW rank 3 bound to socket 1[core 0-1]: [. .][B B]
> sunpc0
> [sunpc0:10929] MCW rank 0 bound to socket 0[core 0-1]: [B B][. .]
> sunpc0
> [sunpc0:10929] MCW rank 2 bound to socket 1[core 0-1]: [. .][B B]
> sunpc1
> 
> 
 
 Yes, this works because you told us there is only ONE slot on each
 host. As a result, we split the 4 processes across the two hosts
 (both of which are now oversubscribed), resulting in TWO processes
 running on each host. Since there are 4 cores on each host, and you
 asked for 2 cores/process, we can make this work.

Re: [OMPI users] one more problem with process bindings on openmpi-1.6.2

2012-10-03 Thread Siegmar Gross
Hi,

> As I said, in the absence of a hostfile, -host assigns ONE slot for
> each time a host is named. So the equivalent hostfile would have
> "slots=1" to create the same pattern as your -host cmd line.

That would mean that a hostfile has nothing to do with the underlying
hardware and that it would be a mystery to find out how to set it up.
Now I found a different solution so that I'm a little bit satisfied that
I don't need a different hostfile for every "mpiexec" command. I
sorted the output and removed the output from "hostname" so that
everything is more readable. Is the keyword "sockets" available in
openmpi-1.7 and openmpi-1.9 as well?

tyr fd1026 252 cat host_sunpc0_1
sunpc0 sockets=2 slots=4
sunpc1 sockets=2 slots=4

tyr fd1026 253 mpiexec -report-bindings -hostfile host_sunpc0_1 \
  -np 4 -npersocket 1 -cpus-per-proc 2 -bynode -bind-to-core hostname
[sunpc0:12641] MCW rank 0 bound to socket 0[core 0-1]: [B B][. .]
[sunpc1:01402] MCW rank 1 bound to socket 0[core 0-1]: [B B][. .]
[sunpc0:12641] MCW rank 2 bound to socket 1[core 0-1]: [. .][B B]
[sunpc1:01402] MCW rank 3 bound to socket 1[core 0-1]: [. .][B B]

tyr fd1026 254 mpiexec -report-bindings -host sunpc0,sunpc1 \
  -np 4 -cpus-per-proc 2 -bind-to-core -bysocket hostname
[sunpc0:12676] MCW rank 0 bound to socket 0[core 0-1]: [B B][. .]
[sunpc1:01437] MCW rank 1 bound to socket 0[core 0-1]: [B B][. .]
[sunpc0:12676] MCW rank 2 bound to socket 1[core 0-1]: [. .][B B]
[sunpc1:01437] MCW rank 3 bound to socket 1[core 0-1]: [. .][B B]

tyr fd1026 258 mpiexec -report-bindings -hostfile host_sunpc0_1 \
  -np 2 -npernode 1 -cpus-per-proc 4 -bind-to-core hostname
[sunpc0:12833] MCW rank 0 bound to socket 0[core 0-1]
   socket 1[core 0-1]: [B B][B B]
[sunpc1:01561] MCW rank 1 bound to socket 0[core 0-1]
   socket 1[core 0-1]: [B B][B B]

tyr fd1026 259 mpiexec -report-bindings -host sunpc0,sunpc1 \
  -np 2 -cpus-per-proc 4 -bind-to-core hostname
[sunpc0:12869] MCW rank 0 bound to socket 0[core 0-1]
   socket 1[core 0-1]: [B B][B B]
[sunpc1:01600] MCW rank 1 bound to socket 0[core 0-1]
   socket 1[core 0-1]: [B B][B B]


Thank you very much for your answers and your time. I have learned
a lot about process bindings through our discussion. Now I'm waiting
for a bug fix for my problem with rankfiles. :-))


Kind regards

Siegmar



> On Oct 3, 2012, at 7:12 AM, Siegmar Gross 
 wrote:
> 
> > Hi,
> > 
> > I thought that "slot" is the smallest manageable entity so that I
> > must set "slot=4" for a dual-processor dual-core machine with one
> > hardware-thread per core. Today I learned about the new keyword
> > "sockets" for a hostfile (I didn't find it in "man orte_hosts").
> > How would I specify a system with two dual-core processors so that
> > "mpiexec -report-bindings -hostfile host_sunpc0_1 -np 4 
> > -cpus-per-proc 2 -bind-to-core hostname" or even
> > "mpiexec -report-bindings -hostfile host_sunpc0_1 -np 2 
> > -cpus-per-proc 4 -bind-to-core hostname" would work in the same way
> > as the commands below.
> > 
> > tyr fd1026 217 mpiexec -report-bindings -host sunpc0,sunpc1 -np 2 \
> >  -cpus-per-proc 4 -bind-to-core hostname
> > [sunpc0:11658] MCW rank 0 bound to socket 0[core 0-1]
> >  socket 1[core 0-1]: [B B][B B]
> > sunpc0
> > [sunpc1:00553] MCW rank 1 bound to socket 0[core 0-1]
> >  socket 1[core 0-1]: [B B][B B]
> > sunpc1
> > 
> > 
> > Thank you very much for your help in advance.
> > 
> > 
> > Kind regards
> > 
> > Siegmar
> > 
> > 
> > 
> >>> I recognized another problem with process bindings. The command
> >>> works if I use "-host" but breaks if I use "-hostfile" with 
> >>> the same machines.
> >>> 
> >>> tyr fd1026 178 mpiexec -report-bindings -host sunpc0,sunpc1 -np 4 \
> >>> -cpus-per-proc 2 -bind-to-core hostname
> >>> sunpc1
> >>> [sunpc1:00086] MCW rank 1 bound to socket 0[core 0-1]: [B B][. .]
> >>> [sunpc1:00086] MCW rank 3 bound to socket 1[core 0-1]: [. .][B B]
> >>> sunpc0
> >>> [sunpc0:10929] MCW rank 0 bound to socket 0[core 0-1]: [B B][. .]
> >>> sunpc0
> >>> [sunpc0:10929] MCW rank 2 bound to socket 1[core 0-1]: [. .][B B]
> >>> sunpc1
> >>> 
> >>> 
> >> 
> >> Yes, this works because you told us there is only ONE slot on each
> >> host. As a result, we split the 4 processes across the two hosts
> >> (both of which are now oversubscribed), resulting in TWO processes
> >> running on each host. Since there are 4 cores on each host, and
> >> you asked for 2 cores/process, we can make this work.
> >> 
> >> 
> >>> tyr fd1026 179 cat host_sunpc0_1 
> >>> sunpc0 slots=4
> >>> sunpc1 slots=4
> >>> 
> >>> 
> >>> tyr fd1026 180 mpiexec -report-bindings -hostfile host_sunpc0_1 -np 4 \
> >>> -cpus-per-proc 2 -bind-to-core hostname
> >> 
> >> And this will of course not work. In your hostfile, you told us there
> >> are FOUR slots on each host. Since the default is to map by slot, we
> >> correctly mapped all four processes to the first node. We then tried
> >> to bind 2 cores for each process, resulting in 8 cores - which is
> >> more than you have.

Re: [hwloc-users] hwloc 1.5, freebsd and linux output on the same hardware

2012-10-03 Thread Brice Goglin
On 03/10/2012 17:23, Sebastian Kuzminsky wrote:
> On Tue, Oct 2, 2012 at 5:14 PM, Samuel Thibault
> > wrote:
>
> There were two bugs which resulted into cpuid not being properly
> compiled. I have fixed them in the trunk, could you try again?
>
>
> I updated my checkout to r4882, reconfigured, rebuilt, and reran it,
> and it made the same output as 1.5.  So that's an improvement over the
> svn trunk yesterday, but it's not all the way fixed yet!
>
> I'll be around all day to run tests if you like ;-)
>

For what it's worth, I tested the x86 code on Linux on a dual E5-2650
machine and got the correct topology (exactly like your Linux on your
server). So the x86 detection code may be OK, but something else may not be.
There's still at least one bug in the FreeBSD code according to our
internal regression tool; stay tuned.

Brice



Re: [OMPI users] Load and link MPI Host at runtime

2012-10-03 Thread Jeff Squyres
On Oct 3, 2012, at 2:30 AM,  
 wrote:

> I'm looking for a document on the 'Run MPI At Run-time' topic.

I can't quite parse this.  Are you looking for a document with that name?  If 
so, I suggest Google.

> My idea is to load MPI and link host at run-time in a special situation. Please 
> help.

I'm afraid I don't know what you are asking for.  Can you clarify your question 
/ be (much) more precise?

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] one more problem with process bindings on openmpi-1.6.2

2012-10-03 Thread Ralph Castain
As I said, in the absence of a hostfile, -host assigns ONE slot for each time a 
host is named. So the equivalent hostfile would have "slots=1" to create the 
same pattern as your -host cmd line.


On Oct 3, 2012, at 7:12 AM, Siegmar Gross 
 wrote:

> Hi,
> 
> I thought that "slot" is the smallest manageable entity so that I
> must set "slot=4" for a dual-processor dual-core machine with one
> hardware-thread per core. Today I learned about the new keyword
> "sockets" for a hostfile (I didn't find it in "man orte_hosts").
> How would I specify a system with two dual-core processors so that
> "mpiexec -report-bindings -hostfile host_sunpc0_1 -np 4 
> -cpus-per-proc 2 -bind-to-core hostname" or even
> "mpiexec -report-bindings -hostfile host_sunpc0_1 -np 2 
> -cpus-per-proc 4 -bind-to-core hostname" would work in the same way
> as the commands below.
> 
> tyr fd1026 217 mpiexec -report-bindings -host sunpc0,sunpc1 -np 2 \
>  -cpus-per-proc 4 -bind-to-core hostname
> [sunpc0:11658] MCW rank 0 bound to socket 0[core 0-1]
>  socket 1[core 0-1]: [B B][B B]
> sunpc0
> [sunpc1:00553] MCW rank 1 bound to socket 0[core 0-1]
>  socket 1[core 0-1]: [B B][B B]
> sunpc1
> 
> 
> Thank you very much for your help in advance.
> 
> 
> Kind regards
> 
> Siegmar
> 
> 
> 
>>> I recognized another problem with process bindings. The command
>>> works if I use "-host" but breaks if I use "-hostfile" with 
>>> the same machines.
>>> 
>>> tyr fd1026 178 mpiexec -report-bindings -host sunpc0,sunpc1 -np 4 \
>>> -cpus-per-proc 2 -bind-to-core hostname
>>> sunpc1
>>> [sunpc1:00086] MCW rank 1 bound to socket 0[core 0-1]: [B B][. .]
>>> [sunpc1:00086] MCW rank 3 bound to socket 1[core 0-1]: [. .][B B]
>>> sunpc0
>>> [sunpc0:10929] MCW rank 0 bound to socket 0[core 0-1]: [B B][. .]
>>> sunpc0
>>> [sunpc0:10929] MCW rank 2 bound to socket 1[core 0-1]: [. .][B B]
>>> sunpc1
>>> 
>>> 
>> 
>> Yes, this works because you told us there is only ONE slot on each
>> host. As a result, we split the 4 processes across the two hosts
>> (both of which are now oversubscribed), resulting in TWO processes
>> running on each host. Since there are 4 cores on each host, and
>> you asked for 2 cores/process, we can make this work.
>> 
>> 
>>> tyr fd1026 179 cat host_sunpc0_1 
>>> sunpc0 slots=4
>>> sunpc1 slots=4
>>> 
>>> 
>>> tyr fd1026 180 mpiexec -report-bindings -hostfile host_sunpc0_1 -np 4 \
>>> -cpus-per-proc 2 -bind-to-core hostname
>> 
>> And this will of course not work. In your hostfile, you told us there
>> are FOUR slots on each host. Since the default is to map by slot, we
>> correctly mapped all four processes to the first node. We then tried
>> to bind 2 cores for each process, resulting in 8 cores - which is
>> more than you have.
>> 
>> 
>>> --
>>> An invalid physical processor ID was returned when attempting to bind
>>> an MPI process to a unique processor.
>>> 
>>> This usually means that you requested binding to more processors than
>>> exist (e.g., trying to bind N MPI processes to M processors, where N >
>>> M).  Double check that you have enough unique processors for all the
>>> MPI processes that you are launching on this host.
>>> 
>>> You job will now abort.
>>> --
>>> sunpc0
>>> [sunpc0:10964] MCW rank 0 bound to socket 0[core 0-1]: [B B][. .]
>>> sunpc0
>>> [sunpc0:10964] MCW rank 1 bound to socket 1[core 0-1]: [. .][B B]
>>> --
>>> mpiexec was unable to start the specified application as it encountered
>>> an error
>>> on node sunpc0. More information may be available above.
>>> --
>>> 4 total processes failed to start
>>> 
>>> 
>>> Perhaps this error is related to the other errors. Thank you very
>>> much for any help in advance.
>>> 
>>> 
>>> Kind regards
>>> 
>>> Siegmar
>>> 
>> 
>> 
> 




Re: [OMPI users] one more problem with process bindings on openmpi-1.6.2

2012-10-03 Thread Siegmar Gross
Hi,

I thought that "slot" is the smallest manageable entity so that I
must set "slot=4" for a dual-processor dual-core machine with one
hardware-thread per core. Today I learned about the new keyword
"sockets" for a hostfile (I didn't find it in "man orte_hosts").
How would I specify a system with two dual-core processors so that
"mpiexec -report-bindings -hostfile host_sunpc0_1 -np 4 
-cpus-per-proc 2 -bind-to-core hostname" or even
"mpiexec -report-bindings -hostfile host_sunpc0_1 -np 2 
-cpus-per-proc 4 -bind-to-core hostname" would work in the same way
as the commands below.

tyr fd1026 217 mpiexec -report-bindings -host sunpc0,sunpc1 -np 2 \
  -cpus-per-proc 4 -bind-to-core hostname
[sunpc0:11658] MCW rank 0 bound to socket 0[core 0-1]
  socket 1[core 0-1]: [B B][B B]
sunpc0
[sunpc1:00553] MCW rank 1 bound to socket 0[core 0-1]
  socket 1[core 0-1]: [B B][B B]
sunpc1


Thank you very much for your help in advance.


Kind regards

Siegmar



> > I recognized another problem with process bindings. The command
> > works if I use "-host" but breaks if I use "-hostfile" with 
> > the same machines.
> > 
> > tyr fd1026 178 mpiexec -report-bindings -host sunpc0,sunpc1 -np 4 \
> >  -cpus-per-proc 2 -bind-to-core hostname
> > sunpc1
> > [sunpc1:00086] MCW rank 1 bound to socket 0[core 0-1]: [B B][. .]
> > [sunpc1:00086] MCW rank 3 bound to socket 1[core 0-1]: [. .][B B]
> > sunpc0
> > [sunpc0:10929] MCW rank 0 bound to socket 0[core 0-1]: [B B][. .]
> > sunpc0
> > [sunpc0:10929] MCW rank 2 bound to socket 1[core 0-1]: [. .][B B]
> > sunpc1
> > 
> > 
> 
> Yes, this works because you told us there is only ONE slot on each
> host. As a result, we split the 4 processes across the two hosts
> (both of which are now oversubscribed), resulting in TWO processes
> running on each host. Since there are 4 cores on each host, and
> you asked for 2 cores/process, we can make this work.
> 
> 
> > tyr fd1026 179 cat host_sunpc0_1 
> > sunpc0 slots=4
> > sunpc1 slots=4
> > 
> > 
> > tyr fd1026 180 mpiexec -report-bindings -hostfile host_sunpc0_1 -np 4 \
> >  -cpus-per-proc 2 -bind-to-core hostname
> 
> And this will of course not work. In your hostfile, you told us there
> are FOUR slots on each host. Since the default is to map by slot, we
> correctly mapped all four processes to the first node. We then tried
> to bind 2 cores for each process, resulting in 8 cores - which is
> more than you have.
> 
> 
> > --
> > An invalid physical processor ID was returned when attempting to bind
> > an MPI process to a unique processor.
> > 
> > This usually means that you requested binding to more processors than
> > exist (e.g., trying to bind N MPI processes to M processors, where N >
> > M).  Double check that you have enough unique processors for all the
> > MPI processes that you are launching on this host.
> > 
> > You job will now abort.
> > --
> > sunpc0
> > [sunpc0:10964] MCW rank 0 bound to socket 0[core 0-1]: [B B][. .]
> > sunpc0
> > [sunpc0:10964] MCW rank 1 bound to socket 1[core 0-1]: [. .][B B]
> > --
> > mpiexec was unable to start the specified application as it encountered
> >  an error
> > on node sunpc0. More information may be available above.
> > --
> > 4 total processes failed to start
> > 
> > 
> > Perhaps this error is related to the other errors. Thank you very
> > much for any help in advance.
> > 
> > 
> > Kind regards
> > 
> > Siegmar
> > 
> 
> 



Re: [OMPI users] problem with rankfile and openmpi-1.6.2

2012-10-03 Thread Ralph Castain
I filed a bug fix for this one. However, something you should note.

If you fail to provide a "-np N" argument to mpiexec, we assume you want ALL 
available slots filled. The rankfile will contain only those procs that you 
want specifically bound. The remaining procs will be unbound.

So with your hostfile, we are going to run EIGHT processes, with ranks 0-3 
located as specified in the rankfile.

If that isn't what you want, then you should add -np 4 to your cmd line.
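
Concretely, once the fix is in place, the launch that runs exactly the four ranks placed by the rankfile would look something like the following (command line assumed, using the hostfile and rankfile quoted below):

mpiexec -np 4 -hostfile host_sunpc0_1 -rf rankfile hostname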


On Oct 3, 2012, at 3:03 AM, Siegmar Gross 
 wrote:

> Hi,
> 
> I want to test process bindings with a rankfile in openmpi-1.6.2. Both
> machines are dual-processor dual-core machines running Solaris 10 x86_64.
> 
> tyr fd1026 138 cat host_sunpc0_1 
> sunpc0 slots=4
> sunpc1 slots=4
> 
> tyr fd1026 139 cat rankfile 
> rank 0=sunpc0 slot=0:0-1,1:0-1
> rank 1=sunpc1 slot=0:0-1
> rank 2=sunpc1 slot=1:0
> rank 3=sunpc1 slot=1:1
> 
> tyr fd1026 140 mpiexec -rf rankfile hostname
> --
> All nodes which are allocated for this job are already filled.
> --
> 
> Is something wrong with my rankfile, must I add a hostfile, or is it a
> bug? I get the following error when I add a hostfile. 
> 
> 
> tyr fd1026 141 mpiexec -hostfile host_sunpc0_1 -rf rankfile hostname
> [tyr.informatik.hs-fulda.de:20227] [[27927,0],0] ORTE_ERROR_LOG:
>  Data unpack would read past end of buffer in file
>  ../../../../openmpi-1.6.2/orte/mca/odls/base/odls_base_default_fns.c
>  at line 927
> ^Cmpiexec: abort is already in progress...hit ctrl-c again to forcibly
>  terminate
> 
> 
> I get the following outputs when I use Linux instead of Solaris
> (same hardware).
> 
> tyr fd1026 146 mpiexec -rf rankfile_linux hostname
> --
> All nodes which are allocated for this job are already filled.
> --
> 
> tyr fd1026 147 mpiexec -hostfile host_linpc0_1 -rf rankfile_linux hostname
> [tyr.informatik.hs-fulda.de:20260] [[27952,0],0] ORTE_ERROR_LOG: Data unpack 
> would read past end of buffer in 
> file ../../../../openmpi-1.6.2/orte/mca/odls/base/odls_base_default_fns.c at 
> line 927
> [tyr:20260] *** Process received signal ***
> [tyr:20260] Signal: Bus Error (10)
> [tyr:20260] Signal code: Invalid address alignment (1)
> [tyr:20260] Failing at address: 7463703a2f2f3129
> /export2/prog/SunOS_sparc/openmpi-1.6.2_64_cc/lib64/libopen-rte.so.4.0.0:opal_backtrace_print+0x14
> /export2/prog/SunOS_sparc/openmpi-1.6.2_64_cc/lib64/libopen-rte.so.4.0.0:0x335b48
> /lib/sparcv9/libc.so.1:0xd88a4
> /lib/sparcv9/libc.so.1:0xcc418
> /lib/sparcv9/libc.so.1:0xcc624
> /lib/sparcv9/libc.so.1:0x64394 [ Signal 2131043744 (?)]
> /lib/sparcv9/libc.so.1:free+0x30
> /export2/prog/SunOS_sparc/openmpi-1.6.2_64_cc/lib64/libopen-rte.so.4.0.0:orte_odls_base_default_construct_child
> _list+0x20b8
> /export2/prog/SunOS_sparc/openmpi-1.6.2_64_cc/lib64/openmpi/mca_odls_default.so:0x11c80
> ...
> 
> "tyr" is a Sparc machine running Solaris 10. I get a similar error if
> I run the command on a Linux machine.
> 
> tyr fd1026 148 ssh linpc4
> linpc4 fd1026 100  mpiexec -rf rankfile_linux hostname
> --
> All nodes which are allocated for this job are already filled.
> --
> 
> linpc4 fd1026 101 mpiexec -hostfile host_linpc0_1 -rf rankfile_linux hostname
> [linpc4:08079] [[49559,0],0] ORTE_ERROR_LOG: Data unpack would read past end 
> of buffer in file 
> ../../../../openmpi-1.6.2/orte/mca/odls/base/odls_base_default_fns.c at line 
> 927
> [linpc4:08079] *** Process received signal ***
> [linpc4:08079] Signal: Segmentation fault (11)
> [linpc4:08079] Signal code: Address not mapped (1)
> [linpc4:08079] Failing at address: 0x900306368
> [linpc4:08079] [ 0] /lib64/libpthread.so.0(+0xfd00) [0x7fbe174bcd00]
> [linpc4:08079] [ 1] /lib64/libc.so.6(cfree+0x14) [0x7fbe17197d24]
> [linpc4:08079] [ 2] 
> /usr/local/openmpi-1.6.2_64_cc/lib64/libopen-rte.so.4(orte_odls_base_default_construct_child_list+0x2091)
>  
> [0x7fbe182e4d21]
> [linpc4:08079] [ 3] 
> /usr/local/openmpi-1.6.2_64_cc/lib64/openmpi/mca_odls_default.so(+0x10dba) 
> [0x7fbe15415dba]
> ...
> 
> Thank you very much for any suggestion in advance.
> 
> 
> Kind regards
> 
> Siegmar
> 




Re: [OMPI users] question to binding options in openmpi-1.6.2

2012-10-03 Thread Siegmar Gross
Hi,

thank you very much for your help. Now the command with "-npersocket"
works. Unfortunately it is not a solution for the other problem, which
I reported a few minutes ago.

tyr fd1026 191 cat host_sunpc0_1 
sunpc0 sockets=2 slots=4
sunpc1 sockets=2 slots=4

tyr fd1026 192 mpiexec -report-bindings -hostfile host_sunpc0_1 -np 4 
-cpus-per-proc 2 -bind-to-core hostname
--
An invalid physical processor ID was returned when attempting to bind
an MPI process to a unique processor.

This usually means that you requested binding to more processors than
exist (e.g., trying to bind N MPI processes to M processors, where N >
M).  Double check that you have enough unique processors for all the
MPI processes that you are launching on this host.

You job will now abort.
--
sunpc0
[sunpc0:11341] MCW rank 0 bound to socket 0[core 0-1]: [B B][. .]
sunpc0
[sunpc0:11341] MCW rank 1 bound to socket 1[core 0-1]: [. .][B B]
--
mpiexec was unable to start the specified application as it encountered an error
on node sunpc0. More information may be available above.
--
4 total processes failed to start


Perhaps you find a solution for that error as well. Thank you very much
for your help in advance.

Kind regards

Siegmar

> Okay, I looked at this and the problem isn't in the code. The
> problem is that the 1.6 series doesn't have the more sophisticated
> discovery and mapping algorithms of the 1.7 series. In this case,
> the specific problem is that the 1.6 series doesn't automatically
> detect the number of sockets on a node - you have to tell it in
> your hostfile:
> 
> foo.domain.org  sockets=2 slots=4
> 
> Otherwise, you'll get this poor error message as it tries to
> communicate that 0 sockets => zero processes.
> 
> 
> On Oct 2, 2012, at 2:44 AM, Siegmar Gross 
 wrote:
> 
> > Option "-npersocket" doesn't work, even if I reduce "-npersocket"
> > to "1". Why doesn't it find any sockets, although the above commands
> > could find both sockets?
> > 
> > mpiexec -report-bindings -host sunpc0 -np 2 -npersocket 1 hostname
> > --
> > Your job has requested a conflicting number of processes for the
> > application:
> > 
> > App: hostname
> > number of procs:  2
> > 
> > This is more processes than we can launch under the following
> > additional directives and conditions:
> > 
> > number of sockets:   0
> > npersocket:   1
> > 
> > Please revise the conflict and try again.
> > --
> 



Re: [OMPI users] one more problem with process bindings on openmpi-1.6.2

2012-10-03 Thread Ralph Castain

On Oct 3, 2012, at 6:19 AM, Siegmar Gross 
 wrote:

> Hi,
> 
> I recognized another problem with process bindings. The command
> works if I use "-host" but breaks if I use "-hostfile" with 
> the same machines.
> 
> tyr fd1026 178 mpiexec -report-bindings -host sunpc0,sunpc1 -np 4 \
>  -cpus-per-proc 2 -bind-to-core hostname
> sunpc1
> [sunpc1:00086] MCW rank 1 bound to socket 0[core 0-1]: [B B][. .]
> [sunpc1:00086] MCW rank 3 bound to socket 1[core 0-1]: [. .][B B]
> sunpc0
> [sunpc0:10929] MCW rank 0 bound to socket 0[core 0-1]: [B B][. .]
> sunpc0
> [sunpc0:10929] MCW rank 2 bound to socket 1[core 0-1]: [. .][B B]
> sunpc1
> 
> 

Yes, this works because you told us there is only ONE slot on each host. As a 
result, we split the 4 processes across the two hosts (both of which are now 
oversubscribed), resulting in TWO processes running on each host. Since there 
are 4 cores on each host, and you asked for 2 cores/process, we can make this 
work.


> tyr fd1026 179 cat host_sunpc0_1 
> sunpc0 slots=4
> sunpc1 slots=4
> 
> 
> tyr fd1026 180 mpiexec -report-bindings -hostfile host_sunpc0_1 -np 4 \
>  -cpus-per-proc 2 -bind-to-core hostname

And this will of course not work. In your hostfile, you told us there are FOUR 
slots on each host. Since the default is to map by slot, we correctly mapped 
all four processes to the first node. We then tried to bind 2 cores for each 
process, resulting in 8 cores - which is more than you have.
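
One way to keep that four-slot hostfile and still spread the ranks is to map by node rather than by slot, for example (not verified here, but consistent with the -bynode runs Siegmar posts elsewhere in this thread):

mpiexec -report-bindings -hostfile host_sunpc0_1 -np 4 -bynode \
  -cpus-per-proc 2 -bind-to-core hostname

which places two ranks (four bound cores) on each host instead of four on the first one.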


> --
> An invalid physical processor ID was returned when attempting to bind
> an MPI process to a unique processor.
> 
> This usually means that you requested binding to more processors than
> exist (e.g., trying to bind N MPI processes to M processors, where N >
> M).  Double check that you have enough unique processors for all the
> MPI processes that you are launching on this host.
> 
> You job will now abort.
> --
> sunpc0
> [sunpc0:10964] MCW rank 0 bound to socket 0[core 0-1]: [B B][. .]
> sunpc0
> [sunpc0:10964] MCW rank 1 bound to socket 1[core 0-1]: [. .][B B]
> --
> mpiexec was unable to start the specified application as it encountered
>  an error
> on node sunpc0. More information may be available above.
> --
> 4 total processes failed to start
> 
> 
> Perhaps this error is related to the other errors. Thank you very
> much for any help in advance.
> 
> 
> Kind regards
> 
> Siegmar
> 




[OMPI users] one more problem with process bindings on openmpi-1.6.2

2012-10-03 Thread Siegmar Gross
Hi,

I recognized another problem with process bindings. The command
works if I use "-host" but breaks if I use "-hostfile" with 
the same machines.

tyr fd1026 178 mpiexec -report-bindings -host sunpc0,sunpc1 -np 4 \
  -cpus-per-proc 2 -bind-to-core hostname
sunpc1
[sunpc1:00086] MCW rank 1 bound to socket 0[core 0-1]: [B B][. .]
[sunpc1:00086] MCW rank 3 bound to socket 1[core 0-1]: [. .][B B]
sunpc0
[sunpc0:10929] MCW rank 0 bound to socket 0[core 0-1]: [B B][. .]
sunpc0
[sunpc0:10929] MCW rank 2 bound to socket 1[core 0-1]: [. .][B B]
sunpc1


tyr fd1026 179 cat host_sunpc0_1 
sunpc0 slots=4
sunpc1 slots=4


tyr fd1026 180 mpiexec -report-bindings -hostfile host_sunpc0_1 -np 4 \
  -cpus-per-proc 2 -bind-to-core hostname
--
An invalid physical processor ID was returned when attempting to bind
an MPI process to a unique processor.

This usually means that you requested binding to more processors than
exist (e.g., trying to bind N MPI processes to M processors, where N >
M).  Double check that you have enough unique processors for all the
MPI processes that you are launching on this host.

You job will now abort.
--
sunpc0
[sunpc0:10964] MCW rank 0 bound to socket 0[core 0-1]: [B B][. .]
sunpc0
[sunpc0:10964] MCW rank 1 bound to socket 1[core 0-1]: [. .][B B]
--
mpiexec was unable to start the specified application as it encountered
  an error
on node sunpc0. More information may be available above.
--
4 total processes failed to start


Perhaps this error is related to the other errors. Thank you very
much for any help in advance.


Kind regards

Siegmar



Re: [OMPI users] crashes in VASP with openmpi 1.6.x

2012-10-03 Thread Noam Bernstein
Thanks to everyone who answered, in particular Ake Sandgren. It appears
to be a weird problem with ACML that somehow triggers a seg fault in
libmpi, but only when running on Opterons.  I'd still be interested in
figuring out how to get a more complete backtrace, but at least the
immediate problem is solved.


Noam


Re: [OMPI users] question to binding options in openmpi-1.6.2

2012-10-03 Thread Ralph Castain
Okay, I looked at this and the problem isn't in the code. The problem is that 
the 1.6 series doesn't have the more sophisticated discovery and mapping 
algorithms of the 1.7 series. In this case, the specific problem is that the 
1.6 series doesn't automatically detect the number of sockets on a node - you 
have to tell it in your hostfile:

foo.domain.org  sockets=2 slots=4

Otherwise, you'll get this poor error message as it tries to communicate that 0 
sockets => zero processes.


On Oct 2, 2012, at 2:44 AM, Siegmar Gross 
 wrote:

> Option "-npersocket" doesn't work, even if I reduce "-npersocket"
> to "1". Why doesn't it find any sockets, although the above commands
> could find both sockets?
> 
> mpiexec -report-bindings -host sunpc0 -np 2 -npersocket 1 hostname
> --
> Your job has requested a conflicting number of processes for the
> application:
> 
> App: hostname
> number of procs:  2
> 
> This is more processes than we can launch under the following
> additional directives and conditions:
> 
> number of sockets:   0
> npersocket:   1
> 
> Please revise the conflict and try again.
> --



Re: [OMPI users] Need solution- nodes can't find the paths.

2012-10-03 Thread John Hearns
If I may ask, which company installed this cluster for you?
Surely they will advise on how to NFS mount the storage on the compute nodes?


Re: [OMPI users] Need solution- nodes can't find the paths.

2012-10-03 Thread John Hearns
The data is large and cannot be copied to the local drives of the compute
nodes.

I understand that.
I think that you have storage attached to your cluster head node - the
'SAN storage' you refer to.
Let's call that volume /data.

All you need to do is edit the /etc/exports file on the cluster head node
and include the name of that storage area.  Just cut and paste one of
the other lines in the file, which will have a definition of the IP
address range of the cluster nodes and the mount parameters.

On all the cluster nodes, you will need to run the command 'mkdir /data'.
Then create a new /etc/fstab file with an additional line which
contains /data and the name of the cluster head node.
You will then have to update the node image with this new /etc/fstab,
or push the /etc/fstab out to all compute nodes.
Finally, run the command 'mount /data' on all compute nodes.
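
As a concrete illustration (the path, network range, and export options below are examples only; copy the real values from the existing lines in the files):

On the head node, /etc/exports gains a line such as

  /data 192.168.1.0/255.255.255.0(rw,no_root_squash,sync)

followed by 'exportfs -ra' to re-export. On each compute node, /etc/fstab gains a line such as

  headnode:/data  /data  nfs  defaults  0 0

after which 'mkdir /data' and 'mount /data' bring the storage up.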


[OMPI users] problem with rankfile and openmpi-1.6.2

2012-10-03 Thread Siegmar Gross
Hi,

I want to test process bindings with a rankfile in openmpi-1.6.2. Both
machines are dual-processor dual-core machines running Solaris 10 x86_64.

tyr fd1026 138 cat host_sunpc0_1 
sunpc0 slots=4
sunpc1 slots=4

tyr fd1026 139 cat rankfile 
rank 0=sunpc0 slot=0:0-1,1:0-1
rank 1=sunpc1 slot=0:0-1
rank 2=sunpc1 slot=1:0
rank 3=sunpc1 slot=1:1

tyr fd1026 140 mpiexec -rf rankfile hostname
--
All nodes which are allocated for this job are already filled.
--

Is something wrong with my rankfile, must I add a hostfile, or is it a
bug? I get the following error when I add a hostfile. 


tyr fd1026 141 mpiexec -hostfile host_sunpc0_1 -rf rankfile hostname
[tyr.informatik.hs-fulda.de:20227] [[27927,0],0] ORTE_ERROR_LOG:
  Data unpack would read past end of buffer in file
  ../../../../openmpi-1.6.2/orte/mca/odls/base/odls_base_default_fns.c
  at line 927
^Cmpiexec: abort is already in progress...hit ctrl-c again to forcibly
  terminate


I get the following outputs when I use Linux instead of Solaris
(same hardware).

tyr fd1026 146 mpiexec -rf rankfile_linux hostname
--
All nodes which are allocated for this job are already filled.
--

tyr fd1026 147 mpiexec -hostfile host_linpc0_1 -rf rankfile_linux hostname
[tyr.informatik.hs-fulda.de:20260] [[27952,0],0] ORTE_ERROR_LOG: Data unpack 
would read past end of buffer in 
file ../../../../openmpi-1.6.2/orte/mca/odls/base/odls_base_default_fns.c at 
line 927
[tyr:20260] *** Process received signal ***
[tyr:20260] Signal: Bus Error (10)
[tyr:20260] Signal code: Invalid address alignment (1)
[tyr:20260] Failing at address: 7463703a2f2f3129
/export2/prog/SunOS_sparc/openmpi-1.6.2_64_cc/lib64/libopen-rte.so.4.0.0:opal_backtrace_print+0x14
/export2/prog/SunOS_sparc/openmpi-1.6.2_64_cc/lib64/libopen-rte.so.4.0.0:0x335b48
/lib/sparcv9/libc.so.1:0xd88a4
/lib/sparcv9/libc.so.1:0xcc418
/lib/sparcv9/libc.so.1:0xcc624
/lib/sparcv9/libc.so.1:0x64394 [ Signal 2131043744 (?)]
/lib/sparcv9/libc.so.1:free+0x30
/export2/prog/SunOS_sparc/openmpi-1.6.2_64_cc/lib64/libopen-rte.so.4.0.0:orte_odls_base_default_construct_child
_list+0x20b8
/export2/prog/SunOS_sparc/openmpi-1.6.2_64_cc/lib64/openmpi/mca_odls_default.so:0x11c80
...

"tyr" is a Sparc machine running Solaris 10. I get a similar error if
I run the command on a Linux machine.

tyr fd1026 148 ssh linpc4
linpc4 fd1026 100  mpiexec -rf rankfile_linux hostname
--
All nodes which are allocated for this job are already filled.
--

linpc4 fd1026 101 mpiexec -hostfile host_linpc0_1 -rf rankfile_linux hostname
[linpc4:08079] [[49559,0],0] ORTE_ERROR_LOG: Data unpack would read past end of 
buffer in file 
../../../../openmpi-1.6.2/orte/mca/odls/base/odls_base_default_fns.c at line 927
[linpc4:08079] *** Process received signal ***
[linpc4:08079] Signal: Segmentation fault (11)
[linpc4:08079] Signal code: Address not mapped (1)
[linpc4:08079] Failing at address: 0x900306368
[linpc4:08079] [ 0] /lib64/libpthread.so.0(+0xfd00) [0x7fbe174bcd00]
[linpc4:08079] [ 1] /lib64/libc.so.6(cfree+0x14) [0x7fbe17197d24]
[linpc4:08079] [ 2] 
/usr/local/openmpi-1.6.2_64_cc/lib64/libopen-rte.so.4(orte_odls_base_default_construct_child_list+0x2091)
 
[0x7fbe182e4d21]
[linpc4:08079] [ 3] 
/usr/local/openmpi-1.6.2_64_cc/lib64/openmpi/mca_odls_default.so(+0x10dba) 
[0x7fbe15415dba]
...

Thank you very much for any suggestion in advance.


Kind regards

Siegmar



Re: [OMPI users] Need solution- nodes can't find the paths.

2012-10-03 Thread Syed Ahsan Ali
The data is large and cannot be copied to the local drives of the compute nodes.
The second option is good, but the thing I don't understand is: when everything
else is NFS mounted to the compute nodes, why can't it take the external SAN
drives too? I don't know how to export the SAN volume from the head node. Is
there any other solution?

On Wed, Oct 3, 2012 at 1:13 PM, John Hearns  wrote:

> You need to either copy the data to storage which the cluster nodes have
> mounted. Surely your cluster vendor included local storage?
>
> Or you can configure the cluster head node to export the SAN volume by NFS
>
>



-- 
Syed Ahsan Ali Bokhari
Electronic Engineer (EE)

Research & Development Division
Pakistan Meteorological Department H-8/4, Islamabad.
Phone # off  +92518358714
Cell # +923155145014


Re: [OMPI users] Need solution- nodes can't find the paths.

2012-10-03 Thread John Hearns
You need to either copy the data to storage which the cluster nodes have
mounted. Surely your cluster vendor included local storage?

Or you can configure the cluster head node to export the SAN volume by NFS


[OMPI users] Need solution- nodes can't find the paths.

2012-10-03 Thread Syed Ahsan Ali
Dear All

I have a Dell cluster running Platform Cluster Manager (PCM); the compute
nodes are NFS mounted with the master node. Storage (SAN) is mounted to the
installer node only. The problem is that I am running a programme which
uses data that resides on the storage, so running the program on the
master node is no problem, but when I mpirun across other
nodes they are not able to find the paths (as the storage partitions are
not mounted to the compute nodes). I have made symbolic links on the
installer node, but the compute nodes show them as red (broken) symbolic links.
Please advise how to resolve this issue.

Best Regards
Ahsan


[OMPI users] Load and link MPI Host at runtime

2012-10-03 Thread mostafa . barmshory
Hi, I'm looking for a document on the 'Run MPI At Run-time' topic. My idea is to 
load MPI and link host at run-time in a special situation. Please help. Thanks