[EMAIL PROTECTED] wrote on Fri, 19 Oct 2007 10:11 -0500:
> I did the tracing that you are suggesting, this time with 1 client and
> 1 PVFS2 server. Apparently the queue has enough completion queue
> entries. The memory registration seems to be the problem (however as I
> said, on the front-end runs):
>
> [D 10:04:01.500768] PVFS2 Server version 2.6.3 starting.
> [D 10:04:01.778135] BMI_ib_initialize: init.
> [D 10:04:01.778252] openib_ib_initialize: init.
> [D 10:04:01.779038] openib_ib_initialize: max 65408 completion queue entries.
> [D 10:04:01.779380] BMI_ib_initialize: done.
> [E 10:04:01.781047] Error: openib_mem_register: ibv_register_mr.
> [E 10:04:01.781763] [bt] ./bt.A.1.mpi_io_full(error+0xf4) [0x533738]
> [E 10:04:01.781771] [bt] ./bt.A.1.mpi_io_full [0x53614a]
> [E 10:04:01.781776] [bt] ./bt.A.1.mpi_io_full [0x534214]
> [E 10:04:01.781780] [bt] ./bt.A.1.mpi_io_full [0x533166]
> [E 10:04:01.781784] [bt] ./bt.A.1.mpi_io_full [0x50a644]
> [E 10:04:01.781788] [bt] ./bt.A.1.mpi_io_full [0x504ac1]
> [E 10:04:01.781792] [bt] ./bt.A.1.mpi_io_full [0x4ce576]
> [E 10:04:01.781795] [bt] ./bt.A.1.mpi_io_full [0x4ce277]
> [E 10:04:01.781799] [bt] ./bt.A.1.mpi_io_full [0x4ed598]
> [E 10:04:01.781803] [bt] ./bt.A.1.mpi_io_full [0x4ed5d1]
> [E 10:04:01.781807] [bt] ./bt.A.1.mpi_io_full [0x4ff1b5]
> [D 10/19 10:04] PVFS2 Server: storage space created. Exiting.
> [D 10:04:01.896168] PVFS2 Server version 2.6.3 starting.
So the CQ allocation failure did not happen this time around? How
did that get fixed? 65408 still seems way too big. I still wonder what
type of silicon you have.

This MR issue might be due to the per-process locked memory limit. Look
around in the IB world for "ulimit -l" or /etc/security/limits.conf
and raise the memlock limit to something large, or unlimited.
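
To see what locked-memory limit a process actually inherits on a given node, something like the rough sketch below might help. It uses Python's standard resource module; the helper names format_limit and memlock_limits are made up for illustration and are not part of PVFS2 or the IB stack. It assumes Linux, where RLIMIT_MEMLOCK is reported in bytes and "ulimit -l" shows the same value in kB.

```python
import resource


def format_limit(value):
    """Render an rlimit value given in bytes; RLIM_INFINITY means no limit."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} kB"


def memlock_limits():
    """Return the (soft, hard) locked-memory limits of this process as strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return format_limit(soft), format_limit(hard)


if __name__ == "__main__":
    soft, hard = memlock_limits()
    # A small soft limit here is a plausible reason for memory
    # registration to fail, since registered memory must be pinned.
    print(f"locked memory limit: soft={soft} hard={hard}")
```

Run it the same way the server gets started (same user, same launcher), since a limit raised in an interactive shell is not necessarily the one a daemon or a remotely launched job sees. If the value is small, lines such as "* soft memlock unlimited" and "* hard memlock unlimited" in /etc/security/limits.conf are the usual way to raise it, followed by a fresh login.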
-- Pete