Re: [OMPI users] Issue with : btl_openib.c (OMPI 1.4.3)

2011-01-12 Thread Shamis, Pavel
You are running OFED version 1.4.1. If it does not work, I would contact your IB vendor or the ofa-general mailing list to check which firmware/driver combination you should use. Regards, Pavel (Pasha) Shamis --- Application Performance Tools Group Computer Science and Math Division Oak Ridge

Re: [OMPI users] Issue with : btl_openib.c (OMPI 1.4.3)

2011-01-07 Thread Gilbert Grosdidier
Hello Pavel, Here is the output of the ofed_info command: == OFED-1.4.1 libibverbs: git://git.openfabrics.org/ofed_1_4/libibverbs.git ofed_1_4 commit b00dc7d2f79e0660ac40160607c9c4937a895433 libmthca:

Re: [OMPI users] Issue with : btl_openib.c (OMPI 1.4.3)

2011-01-07 Thread Jeff Squyres
+1 AFAIR (and I stopped being an IB vendor a long time ago, so I might be wrong), the _resize_cq function being there or not is not an issue of the underlying HCA; it's a function of what version of OFED you're running. On Jan 7, 2011, at 10:14 AM, Shamis, Pavel wrote: > The FW version looks

Re: [OMPI users] Issue with : btl_openib.c (OMPI 1.4.3)

2011-01-07 Thread Shamis, Pavel
The FW version looks OK, but it may be a driver issue as well. I guess that an OFED 1.4.x or 1.5.x driver should be OK. To check the driver version, you may run the ofed_info command. Regards, Pavel (Pasha) Shamis --- Application Performance Tools Group Computer Science and Math Division Oak Ridge

Re: [OMPI users] Issue with : btl_openib.c (OMPI 1.4.3)

2010-12-17 Thread Gilbert Grosdidier
John, Thanks, more info below. On 17/12/2010 17:32, John Hearns wrote: On 17 December 2010 15:47, Gilbert Grosdidier wrote: gg= I don't know, and firmware_revs does not seem to be available. The only thing I got on a worker node was with lspci: If you log into

Re: [OMPI users] Issue with : btl_openib.c (OMPI 1.4.3)

2010-12-17 Thread Gilbert Grosdidier
Hello John, First, thanks for your feedback. On 17 Dec 2010 at 16:13, John Hearns wrote: On 17 December 2010 14:45, Gilbert Grosdidier wrote: Hello, About this issue, for which I got NO feedback ;-) Gilbert, as you have an SGI cluster, have you filed a

Re: [OMPI users] Issue with : btl_openib.c (OMPI 1.4.3)

2010-12-17 Thread John Hearns
On 17 December 2010 14:45, Gilbert Grosdidier wrote: > Hello, > About this issue, for which I got NO feedback ;-) I recently spotted > in the btl_openib.c code that this error message could come from On the cluster admin node, run firmware_revs and look for the

Re: [OMPI users] Issue with : btl_openib.c (OMPI 1.4.3)

2010-12-17 Thread John Hearns
On 17 December 2010 14:45, Gilbert Grosdidier wrote: > Hello, > About this issue, for which I got NO feedback ;-) Gilbert, as you have an SGI cluster, have you filed a support request with SGI? Also, which firmware do you have installed? I have Firmware

Re: [OMPI users] Issue with : btl_openib.c (OMPI 1.4.3)

2010-12-17 Thread Gilbert Grosdidier
Hello, About this issue, for which I got NO feedback ;-) I recently spotted in the btl_openib.c code that this error message could come from a missing ConnectX HCA ibv_resize_cq function. Well ... I have not yet been able to figure out why/how this could occur, but I now have a closely related

[OMPI users] Issue with : btl_openib.c (OMPI 1.4.3)

2010-12-15 Thread Gilbert Grosdidier
Hello, Running Open MPI 1.4.3 on an SGI Altix cluster with 2048 cores, I got this error message on all cores, right at startup: btl_openib.c:211:adjust_cq] cannot resize completion queue, error: 12 What could be the culprit, please? Is there a workaround? What parameter is to be
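
A side note on reading the message above: the "error: 12" is presumably an errno-style code returned by the resize call. This is an assumption, since nothing in the thread confirms it. If it holds, the code can be decoded with a few lines of Python. On Linux, errno 12 is ENOMEM. The helper name describe_errno is made up for this sketch.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return the symbolic name and system message for an errno-style code.

    Assumption: the number printed by btl_openib.c is an errno value.
    """
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    # 12 is the code shown in the "cannot resize completion queue" message.
    print(describe_errno(12))
```

If the code really is ENOMEM, the failure would point at a resource or memory limit rather than at a missing function, which fits the later suggestions in the thread to check the driver and firmware versions.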