Hi Wodel

The RandomAccess part of HPCC is probably causing this.

Perhaps set the PSM env. variable:

export PSM_MQ_RECVREQS_MAX=10000000

or something like that.
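Something along these lines — the value 10000000 is just a guess (raise it until the error goes away), and `-x` is how Open MPI exports an environment variable to all ranks; the paths are taken from your original command:

```shell
# Raise the PSM receive-request descriptor limit (default 1048576, per the
# error message) and make sure every rank sees it via mpirun -x.
export PSM_MQ_RECVREQS_MAX=10000000
mpirun -np 512 -x PSM_MQ_RECVREQS_MAX --mca mtl psm --hostfile hosts32 \
    /shared/build/hpcc-1.5.0b-blas-ompi-181/hpcc hpccinf.txt
```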

Alternatively launch the job using

mpirun --mca pml ob1 --host ....

to avoid use of PSM.  Performance will probably suffer with this option,
however.
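For example, reusing your original command line, the PSM-free launch would look roughly like this (a sketch — paths and hostfile are from your earlier message):

```shell
# Force the ob1 PML so Open MPI falls back to the BTL transports instead of
# the PSM MTL. Expect lower performance than native PSM on QLogic HCAs.
mpirun -np 512 --mca pml ob1 --hostfile hosts32 \
    /shared/build/hpcc-1.5.0b-blas-ompi-181/hpcc hpccinf.txt
```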

Howard
wodel youchi <wodel.you...@gmail.com> schrieb am Di. 31. Jan. 2017 um 08:27:

> Hi,
>
> I am a newbie in the HPC world.
>
> I am trying to run the HPCC benchmark on our cluster, but every time I
> start the job I get this error and then the job exits:
>
> compute017.22840 Exhausted 1048576 MQ irecv request descriptors, which
> usually indicates a user program error or insufficient request descriptors
> (PSM_MQ_RECVREQS_MAX=1048576)
> compute024.22840 Exhausted 1048576 MQ irecv request descriptors, which
> usually indicates a user program error or insufficient request descriptors
> (PSM_MQ_RECVREQS_MAX=1048576)
> compute019.22847 Exhausted 1048576 MQ irecv request descriptors, which
> usually indicates a user program error or insufficient request descriptors
> (PSM_MQ_RECVREQS_MAX=1048576)
> --------------------------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code. Per user-direction, the job has been aborted.
> --------------------------------------------------------------------------
> mpirun detected that one or more processes exited with non-zero status,
> thus causing the job to be terminated. The first process to do so was:
>   Process name: [[19601,1],272]
>   Exit code: 255
> --------------------------------------------------------------------------
>
> Platform : IBM PHPC
> OS : RHEL 6.5
> one management node
> 32 compute nodes: 16 cores, 32GB RAM, Intel/QLogic QLE7340 single-port QDR
> InfiniBand 40Gb/s
>
> I compiled hpcc against: IBM MPI, Open MPI 2.0.1 (compiled with gcc 4.4.7),
> and Open MPI 1.8.1 (compiled with gcc 4.4.7).
>
> I get the errors, but each time on different compute nodes.
>
> This is the command I used to start the job:
>
> mpirun -np 512 --mca mtl psm --hostfile hosts32 \
>     /shared/build/hpcc-1.5.0b-blas-ompi-181/hpcc hpccinf.txt
>
> Any help will be appreciated, and if you need more details, let me know.
> Thanks in advance.
>
>
> Regards.
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users