Re: [OMPI users] Error Polling HP CQ Status on PPC64 Linux with IB
I'm currently working with Owen on this issue and will continue to post my findings on the list.

- Galen

On Jun 29, 2006, at 7:56 AM, Jeff Squyres (jsquyres) wrote:

> Owen --
>
> Sorry, we all fell [way] behind on e-mail because many of us were at an OMPI developer's meeting last week. :-(
>
> In the interim, we have finally released Open MPI v1.1. Could you give this version a whirl and see if it fixes your problems?
>
> > -Original Message-
> > From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Owen Stampflee
> > Sent: Monday, June 19, 2006 3:19 PM
> > To: Open MPI Users
> > Cc: Kai Staats
> > Subject: Re: [OMPI users] Error Polling HP CQ Status on PPC64 Linux with IB
> >
> > Ooops, thought I mentioned that, it's 1.0.2.
> >
> > Cheers,
> > Owen
> >
> > On Mon, 2006-06-19 at 17:08 +0300, Gleb Natapov wrote:
> > > What version of OpenMPI are you using?
> > >
> > > On Mon, Jun 19, 2006 at 07:06:54AM -0700, Owen Stampflee wrote:
> > > > I'm currently working on getting OpenMPI + OpenIB 1.0 (might be an RC) working on our 8 node Xserve G5 cluster running Linux kernel version 2.6.16 and get the following errors:
> > > >
> > > > Process 1 on node-192-168-111-249
> > > > Process 0 on node-192-168-111-248
> > > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 1 for wr_id 270995584 opcode -1286736
> > > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270995868 opcode -1286736
> > > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270996152 opcode -1286736
> > > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270996436 opcode -1286736
> > > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270996720 opcode -1286736
> > > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270997004 opcode -1286736
> > > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270997288 opcode -1286736
> > > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270997572 opcode -1286736
> > > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 271077504 opcode -1286736
> > > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 271077788 opcode -1286736
> > > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 271078072 opcode -1286736
> > > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 9 for wr_id 270991488 opcode -6639584
> > > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270995584 opcode -6639584
> > > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270995868 opcode -6639584
> > > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270996152 opcode -6639584
> > > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270996436 opcode -6639584
> > > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270996720 opcode -6639584
> > > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270997004 opcode -6639584
> > > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270997288 opcode -6639584
> > > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270997572 opcode -6639584
> > > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 271077504 opcode -6639584
> > > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 271077788 opcode -6639584
> > > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 271078072 opcode -6639584
> > > > mpirun: killing job...
> > > >
> > > > ___
> > > > users mailing list
> > > > us...@open-mpi.org
> > > > http://www.open-mpi.org/mailman/listinfo.cgi/users
> > >
> > > --
> > > Gleb.
Re: [OMPI users] Error Polling HP CQ Status on PPC64 Linux with IB
Owen --

Sorry, we all fell [way] behind on e-mail because many of us were at an OMPI developer's meeting last week. :-(

In the interim, we have finally released Open MPI v1.1. Could you give this version a whirl and see if it fixes your problems?

> -Original Message-
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Owen Stampflee
> Sent: Monday, June 19, 2006 3:19 PM
> To: Open MPI Users
> Cc: Kai Staats
> Subject: Re: [OMPI users] Error Polling HP CQ Status on PPC64 Linux with IB
>
> Ooops, thought I mentioned that, it's 1.0.2.
>
> Cheers,
> Owen
>
> On Mon, 2006-06-19 at 17:08 +0300, Gleb Natapov wrote:
> > What version of OpenMPI are you using?
> >
> > On Mon, Jun 19, 2006 at 07:06:54AM -0700, Owen Stampflee wrote:
> > > I'm currently working on getting OpenMPI + OpenIB 1.0 (might be an RC) working on our 8 node Xserve G5 cluster running Linux kernel version 2.6.16 and get the following errors:
> > >
> > > Process 1 on node-192-168-111-249
> > > Process 0 on node-192-168-111-248
> > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 1 for wr_id 270995584 opcode -1286736
> > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270995868 opcode -1286736
> > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270996152 opcode -1286736
> > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270996436 opcode -1286736
> > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270996720 opcode -1286736
> > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270997004 opcode -1286736
> > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270997288 opcode -1286736
> > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270997572 opcode -1286736
> > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 271077504 opcode -1286736
> > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 271077788 opcode -1286736
> > > [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 271078072 opcode -1286736
> > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 9 for wr_id 270991488 opcode -6639584
> > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270995584 opcode -6639584
> > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270995868 opcode -6639584
> > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270996152 opcode -6639584
> > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270996436 opcode -6639584
> > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270996720 opcode -6639584
> > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270997004 opcode -6639584
> > > [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270997288 opcode -6639584
> > > [0,1,0][bt
Re: [OMPI users] Error Polling HP CQ Status on PPC64 Linux with IB
What version of OpenMPI are you using?

On Mon, Jun 19, 2006 at 07:06:54AM -0700, Owen Stampflee wrote:
> I'm currently working on getting OpenMPI + OpenIB 1.0 (might be an RC) working on our 8 node Xserve G5 cluster running Linux kernel version 2.6.16 and get the following errors:
>
> Process 1 on node-192-168-111-249
> Process 0 on node-192-168-111-248
> [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 1 for wr_id 270995584 opcode -1286736
> [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270995868 opcode -1286736
> [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270996152 opcode -1286736
> [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270996436 opcode -1286736
> [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270996720 opcode -1286736
> [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270997004 opcode -1286736
> [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270997288 opcode -1286736
> [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270997572 opcode -1286736
> [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 271077504 opcode -1286736
> [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 271077788 opcode -1286736
> [0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 271078072 opcode -1286736
> [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 9 for wr_id 270991488 opcode -6639584
> [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270995584 opcode -6639584
> [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270995868 opcode -6639584
> [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270996152 opcode -6639584
> [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270996436 opcode -6639584
> [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270996720 opcode -6639584
> [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270997004 opcode -6639584
> [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270997288 opcode -6639584
> [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270997572 opcode -6639584
> [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 271077504 opcode -6639584
> [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 271077788 opcode -6639584
> [0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 271078072 opcode -6639584
> mpirun: killing job...

--
Gleb.
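
For anyone reading the log in this thread, here is a minimal sketch of how the "error polling HP CQ" lines could be tallied per process and status. It assumes, and the thread does not confirm this, that the printed status values are the numeric codes of libibverbs' `enum ibv_wc_status`, and that the third field of the `[0,1,N]` prefix identifies the process. The names `tally_cq_errors`, `describe`, and `WC_STATUS_NAMES` are made up for this illustration and are not part of Open MPI.

```python
import re
from collections import Counter

# ASSUMPTION: the "status N" values are libibverbs enum ibv_wc_status codes.
# Only the three codes that appear in this thread's log are listed.
WC_STATUS_NAMES = {
    1: "IBV_WC_LOC_LEN_ERR",
    5: "IBV_WC_WR_FLUSH_ERR",
    9: "IBV_WC_REM_INV_REQ_ERR",
}

# Matches one log entry; \s* tolerates entries that were wrapped by a mail client.
LINE_RE = re.compile(
    r"\[(\d+),(\d+),(\d+)\]\[[^\]]*\]\s*"
    r"error polling HP CQ with status (\d+) for wr_id (\d+) opcode (-?\d+)"
)


def tally_cq_errors(log_text):
    """Count log entries per (process id, status) pair."""
    counts = Counter()
    for match in LINE_RE.finditer(log_text):
        proc = int(match.group(3))    # ASSUMPTION: third prefix field is the process
        status = int(match.group(4))
        counts[(proc, status)] += 1
    return counts


def describe(status):
    """Return the assumed symbolic name for a status code."""
    return WC_STATUS_NAMES.get(status, "unknown status %d" % status)


# A few entries copied from the log quoted in this thread.
SAMPLE = """\
[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 1 for wr_id 270995584 opcode -1286736
[0,1,1][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270995868 opcode -1286736
[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 9 for wr_id 270991488 opcode -6639584
[0,1,0][btl_openib_component.c:587:mca_btl_openib_component_progress] error polling HP CQ with status 5 for wr_id 270995584 opcode -6639584
"""

if __name__ == "__main__":
    for (proc, status), n in sorted(tally_cq_errors(SAMPLE).items()):
        print("process %d: %d x status %d (%s)" % (proc, n, status, describe(status)))
```

If that mapping holds, each process reports a single distinct error first (status 1 on process 1, status 9 on process 0), and every later entry is status 5. That is the flush error returned for outstanding work requests once a queue pair has entered the error state, so the first entry per process is probably the one worth investigating.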