Re: [OMPI users] Big job, InfiniBand, MPI_Alltoallv and ibv_create_qp failed

2013-08-01 Thread Paul Kapinos
Vanilla Linux OFED from the RPMs of Scientific Linux release 6.4 (Carbon) (= RHEL
6.4).

No ofed_info available :-(
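
(In case it helps: without ofed_info, the closest we can get to a version is
the RPM level of the verbs stack. A rough check, assuming the stock SL/RHEL 6.4
packages:)

$ rpm -qa | egrep 'libibverbs|librdmacm|libmlx4|rdma'   # user-space verbs stack
$ uname -r                                              # kernel; the mlx4 driver ships with it here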

On 07/31/13 16:59, Mike Dubman wrote:

Hi,
What OFED vendor and version do you use?
Regards
M


On Tue, Jul 30, 2013 at 8:42 PM, Paul Kapinos wrote:

Dear Open MPI experts,

A user at our cluster has a problem running a fairly big job:
- a job using 3024 processes (12 per node, 252 nodes) runs fine;
- a job using 4032 processes (12 per node, 336 nodes) produces the error
attached below.

Well, http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages is a
well-known one; both recommended tweakables (user limits and registered
memory size) are already at their maximum, yet a queue pair still could not be
created.

Our blind guess is that the number of completion queues is exhausted.

What happens when the value is raised from the default to the maximum?
What is the largest Open MPI job that has been seen so far?
What is the largest Open MPI job *using MPI_Alltoallv* that has been seen so far?
Is there a way to manage the size and the number of queue pairs? (XRC is not
available here.)
Is there a way to tell MPI_Alltoallv to use fewer queue pairs, even if this
could lead to a slow-down?

There is a suspicious parameter in the mlx4_core module:
$ modinfo mlx4_core | grep log_num_cq
parm:   log_num_cq:log maximum number of CQs per HCA  (int)

Is this the parameter to tweak?
What are its default and maximum values?

Any help would be welcome...

Best,

Paul Kapinos

P.S. There should be no connection problem between the nodes; a
test job with one process on each node ran successfully just before
the actual job was started, and the actual job also ran fine for a while, until
it called MPI_Alltoallv.







--------------------------------------------------------------------------
A process failed to create a queue pair. This usually means either
the device has run out of queue pairs (too many connections) or
there are insufficient resources available to allocate a queue pair
(out of memory). The latter can happen if either 1) insufficient
memory is available, or 2) no more physical memory can be registered
with the device.

For more information on memory registration see the Open MPI FAQs at:
http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages


Local host: linuxbmc1156.rz.RWTH-Aachen.DE

Local device:   mlx4_0
Queue pair type:Reliable connected (RC)

--------------------------------------------------------------------------
[linuxbmc1156.rz.RWTH-Aachen.DE][[3703,1],4021][connect/btl_openib_connect_oob.c:867:rml_recv_cb]
error in endpoint reply start connect
[linuxbmc1156.rz.RWTH-Aachen.DE:9632] *** An error occurred in MPI_Alltoallv
[linuxbmc1156.rz.RWTH-Aachen.DE:9632] *** on communicator MPI_COMM_WORLD
[linuxbmc1156.rz.RWTH-Aachen.DE:9632] *** MPI_ERR_OTHER: known error not in list
[linuxbmc1156.rz.RWTH-Aachen.DE:9632] *** MPI_ERRORS_ARE_FATAL: your MPI job
will now abort
[linuxbmc1156.rz.RWTH-Aachen.DE][[3703,1],4024][connect/btl_openib_connect_oob.c:867:rml_recv_cb]
error in endpoint reply start connect
[linuxbmc1156.rz.RWTH-Aachen.DE][[3703,1],4027][connect/btl_openib_connect_oob.c:867:rml_recv_cb]
error in endpoint reply start connect
[linuxbmc0840.rz.RWTH-Aachen.DE][[3703,1],10][connect/btl_openib_connect_oob.c:867:rml_recv_cb]
error in endpoint reply start connect
[linuxbmc0840.rz.RWTH-Aachen.DE][[3703,1],1][connect/btl_openib_connect_oob.c:867:rml_recv_cb]
error in endpoint reply start connect
[linuxbmc0840.rz.RWTH-Aachen.DE:17696] [[3703,0],0]-[[3703,1],10]
mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104)
[linuxbmc0840.rz.RWTH-Aachen.DE:17696] [[3703,0],0]-[[3703,1],8]
mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104)
[linuxbmc0840.rz.RWTH-Aachen.DE:17696] [[3703,0],0]-[[3703,1],9]
mca_oob_tcp_msg_recv: readv failed: Connection reset 

Re: [OMPI users] Big job, InfiniBand, MPI_Alltoallv and ibv_create_qp failed

2013-07-31 Thread Mike Dubman
Hi,
What OFED vendor and version do you use?
Regards
M


On Tue, Jul 30, 2013 at 8:42 PM, Paul Kapinos wrote:

> Dear Open MPI experts,
>
> A user at our cluster has a problem running a fairly big job:
> - a job using 3024 processes (12 per node, 252 nodes) runs fine;
> - a job using 4032 processes (12 per node, 336 nodes) produces the error
> attached below.
>
> Well, http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages is a
> well-known one; both recommended tweakables (user limits and registered
> memory size) are already at their maximum, yet a queue pair still could not be
> created.
>
> Our blind guess is that the number of completion queues is exhausted.
>
> What happens when the value is raised from the default to the maximum?
> What is the largest Open MPI job that has been seen so far?
> What is the largest Open MPI job *using MPI_Alltoallv* that has been seen so far?
> Is there a way to manage the size and the number of queue pairs? (XRC is not
> available here.)
> Is there a way to tell MPI_Alltoallv to use fewer queue pairs, even if this
> could lead to a slow-down?
>
> There is a suspicious parameter in the mlx4_core module:
> $ modinfo mlx4_core | grep log_num_cq
> parm:   log_num_cq:log maximum number of CQs per HCA  (int)
>
> Is this the parameter to tweak?
> What are its default and maximum values?
>
> Any help would be welcome...
>
> Best,
>
> Paul Kapinos
>
> P.S. There should be no connection problem between the nodes; a
> test job with one process on each node ran successfully just before
> the actual job was started, and the actual job also ran fine for a while, until
> it called MPI_Alltoallv.
>
>
>
>
>
>
> --------------------------------------------------------------------------
> A process failed to create a queue pair. This usually means either
> the device has run out of queue pairs (too many connections) or
> there are insufficient resources available to allocate a queue pair
> (out of memory). The latter can happen if either 1) insufficient
> memory is available, or 2) no more physical memory can be registered
> with the device.
>
> For more information on memory registration see the Open MPI FAQs at:
> http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
>
> Local host: linuxbmc1156.rz.RWTH-Aachen.DE
> Local device:   mlx4_0
> Queue pair type:Reliable connected (RC)
> --------------------------------------------------------------------------
> [linuxbmc1156.rz.RWTH-Aachen.DE][[3703,1],4021][connect/btl_openib_connect_oob.c:867:rml_recv_cb]
> error in endpoint reply start connect
> [linuxbmc1156.rz.RWTH-Aachen.DE:9632] *** An error occurred in MPI_Alltoallv
> [linuxbmc1156.rz.RWTH-Aachen.DE:9632] *** on communicator MPI_COMM_WORLD
> [linuxbmc1156.rz.RWTH-Aachen.DE:9632] *** MPI_ERR_OTHER: known error not in list
> [linuxbmc1156.rz.RWTH-Aachen.DE:9632] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
> [linuxbmc1156.rz.RWTH-Aachen.DE][[3703,1],4024][connect/btl_openib_connect_oob.c:867:rml_recv_cb]
> error in endpoint reply start connect
> [linuxbmc1156.rz.RWTH-Aachen.DE][[3703,1],4027][connect/btl_openib_connect_oob.c:867:rml_recv_cb]
> error in endpoint reply start connect
> [linuxbmc0840.rz.RWTH-Aachen.DE][[3703,1],10][connect/btl_openib_connect_oob.c:867:rml_recv_cb]
> error in endpoint reply start connect
> [linuxbmc0840.rz.RWTH-Aachen.DE][[3703,1],1][connect/btl_openib_connect_oob.c:867:rml_recv_cb]
> error in endpoint reply start connect
> [linuxbmc0840.rz.RWTH-Aachen.DE:17696] [[3703,0],0]-[[3703,1],10]
> mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104)
> [linuxbmc0840.rz.RWTH-Aachen.DE:17696] [[3703,0],0]-[[3703,1],8]
> mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104)
> [linuxbmc0840.rz.RWTH-Aachen.DE:17696] [[3703,0],0]-[[3703,1],9]
> mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104)
> [linuxbmc0840.rz.RWTH-Aachen.DE:17696] [[3703,0],0]-[[3703,1],1]
> mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104)
> [linuxbmc0840.rz.RWTH-Aachen.DE:17696]
> 9 more processes have sent help message help-mpi-btl-openib-cpc-base.txt
> / 

[OMPI users] Big job, InfiniBand, MPI_Alltoallv and ibv_create_qp failed

2013-07-30 Thread Paul Kapinos

Dear Open MPI experts,

A user at our cluster has a problem running a fairly big job:
- a job using 3024 processes (12 per node, 252 nodes) runs fine;
- a job using 4032 processes (12 per node, 336 nodes) produces the error
attached below.


Well, http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages is a
well-known one; both recommended tweakables (user limits and registered memory
size) are already at their maximum, yet a queue pair still could not be created.
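
For reference, this is how we double-checked those two settings on a compute
node (the sysfs paths assume the mlx4 driver exports its parameters there;
log_num_mtt and log_mtts_per_seg are the registered-memory knobs the FAQ refers
to):

$ ulimit -l                                           # locked-memory limit, should be "unlimited"
$ cat /sys/module/mlx4_core/parameters/log_num_mtt    # log2 of the number of MTT entries
$ cat /sys/module/mlx4_core/parameters/log_mtts_per_seg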


Our blind guess is that the number of completion queues is exhausted.
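
As a rough back-of-envelope (assuming the fully connected RC pattern that
MPI_Alltoallv ends up creating, and counting only one QP per peer connection;
the openib BTL normally opens several QPs per peer, so the real numbers are
higher):

$ echo $(( 12 * (4032 - 12) ))    # RC connections, and hence at least as many QPs, per node
48240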

What happens when the value is raised from the default to the maximum?
What is the largest Open MPI job that has been seen so far?
What is the largest Open MPI job *using MPI_Alltoallv* that has been seen so far?
Is there a way to manage the size and the number of queue pairs? (XRC is not available here.)
Is there a way to tell MPI_Alltoallv to use fewer queue pairs, even if this
could lead to a slow-down?
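
The closest knob we have found so far for the last question is the openib
BTL's receive-queue specification, but we are not sure it is the right one.
As far as we understand, the number of entries in btl_openib_receive_queues
determines how many QPs are opened per peer, so a single shared-receive-queue
(SRQ) entry should need fewer QPs and less receive-buffer memory than the
default list (the values below are purely illustrative and not tested at this
scale):

$ ompi_info --param btl openib | grep receive_queues              # show the built-in default
$ mpirun --mca btl_openib_receive_queues S,65536,256,128,32 ...   # a single SRQ spec

Corrections from anyone who has run at this scale would be very welcome.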


There is a suspicious parameter in the mlx4_core module:
$ modinfo mlx4_core | grep log_num_cq
parm:   log_num_cq:log maximum number of CQs per HCA  (int)

Is this the parameter to tweak?
What are its default and maximum values?
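
In case it matters for the answer: reading the current setting and overriding
it would presumably look like the following (assuming the parameter is exported
under /sys/module on this kernel; the value 19 is only an example, we do not
know what is safe for this HCA):

$ cat /sys/module/mlx4_core/parameters/log_num_cq    # current log2 value, if exported
# /etc/modprobe.d/mlx4_core.conf -- needs a driver reload or reboot to take effect
options mlx4_core log_num_cq=19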

Any help would be welcome...

Best,

Paul Kapinos

P.S. There should be no connection problem between the nodes; a test job with
one process on each node ran successfully just before the actual job was
started, and the actual job also ran fine for a while, until it called MPI_Alltoallv.







--
A process failed to create a queue pair. This usually means either
the device has run out of queue pairs (too many connections) or
there are insufficient resources available to allocate a queue pair
(out of memory). The latter can happen if either 1) insufficient
memory is available, or 2) no more physical memory can be registered
with the device.

For more information on memory registration see the Open MPI FAQs at:
http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages

Local host: linuxbmc1156.rz.RWTH-Aachen.DE
Local device:   mlx4_0
Queue pair type:Reliable connected (RC)
--
[linuxbmc1156.rz.RWTH-Aachen.DE][[3703,1],4021][connect/btl_openib_connect_oob.c:867:rml_recv_cb] 
error in endpoint reply start connect

[linuxbmc1156.rz.RWTH-Aachen.DE:9632] *** An error occurred in MPI_Alltoallv
[linuxbmc1156.rz.RWTH-Aachen.DE:9632] *** on communicator MPI_COMM_WORLD
[linuxbmc1156.rz.RWTH-Aachen.DE:9632] *** MPI_ERR_OTHER: known error not in list
[linuxbmc1156.rz.RWTH-Aachen.DE:9632] *** MPI_ERRORS_ARE_FATAL: your MPI job 
will now abort
[linuxbmc1156.rz.RWTH-Aachen.DE][[3703,1],4024][connect/btl_openib_connect_oob.c:867:rml_recv_cb] 
error in endpoint reply start connect
[linuxbmc1156.rz.RWTH-Aachen.DE][[3703,1],4027][connect/btl_openib_connect_oob.c:867:rml_recv_cb] 
error in endpoint reply start connect
[linuxbmc0840.rz.RWTH-Aachen.DE][[3703,1],10][connect/btl_openib_connect_oob.c:867:rml_recv_cb] 
error in endpoint reply start connect
[linuxbmc0840.rz.RWTH-Aachen.DE][[3703,1],1][connect/btl_openib_connect_oob.c:867:rml_recv_cb] 
error in endpoint reply start connect
[linuxbmc0840.rz.RWTH-Aachen.DE:17696] [[3703,0],0]-[[3703,1],10] 
mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104)
[linuxbmc0840.rz.RWTH-Aachen.DE:17696] [[3703,0],0]-[[3703,1],8] 
mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104)
[linuxbmc0840.rz.RWTH-Aachen.DE:17696] [[3703,0],0]-[[3703,1],9] 
mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104)
[linuxbmc0840.rz.RWTH-Aachen.DE:17696] [[3703,0],0]-[[3703,1],1] 
mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104)
[linuxbmc0840.rz.RWTH-Aachen.DE:17696] 9 more processes have sent help message 
help-mpi-btl-openib-cpc-base.txt / ibv_create_qp failed
[linuxbmc0840.rz.RWTH-Aachen.DE:17696] Set MCA parameter 
"orte_base_help_aggregate" to 0 to see all help / error messages
[linuxbmc0840.rz.RWTH-Aachen.DE:17696] 3 more processes have sent help message 
help-mpi-errors.txt / mpi_errors_are_fatal


--
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915


