All programs are executed as the root user.

ulimit -a
time(seconds)          unlimited
file(blocks)           unlimited
data(kbytes)           unlimited
stack(kbytes)          unlimited
coredump(blocks)       0
memory(kbytes)         unlimited
locked memory(kbytes)  unlimited
process                8063
nofiles                1048576
vmemory(kbytes)        unlimited
locks                  unlimited

On Tue, Feb 24, 2009 at 11:50 PM, Dotan Barak <[email protected]> wrote:
> Do you execute your program under the root user or under any other user?
> (maybe you fail because of the ulimit value of memory which can be pinned)
>
> Dotan
>
> On Wed, Feb 25, 2009 at 7:51 AM, Phillip Wilson <[email protected]> wrote:
> > The “ibv_reg_mr()” function call fails with HCA (DID=0x634A) that uses the
> > mlx4_0 driver when the system is under load (memory and cpu). The system
> > usually has over 500MB of system memory when “ibv_reg_mr()” call fails.
> >
> > If I only run one HCA with (DID=0x6278) that uses the mthca0 driver with the
> > other tools to generate stress the “ibv_reg_mr()” call always passes. If I
> > only run the HCA with (DID=0x634A) with the other tools to generate stress
> > the “ibv_reg_mr()” call always fails; it usually takes less than 30
> > minutes for the failure to occur.
> >
> > The maximum number of memory regions requested at one time is up to 8 (32MB)
> > with two HCA dual port cards and the maximum size for a memory region is 1
> > MB.
> >
> > (i.e. ctx->mr = ibv_reg_mr(ctx->pd,
> >                            buffer, /* malloc 4MB buffer per process */
> >                            size,   /* 2 Bytes to 1MB */
> >                            IBV_ACCESS_LOCAL_WRITE);
> > )
> >
> > I modified the ibv_rc_pingpong test to use the parent-child paradigm instead
> > of the current client/server approach for my environment. The code forks a
> > parent process and a child process per port which serves the same purpose as
> > the current client/server approach. The code also forks a process to run on
> > a HCA.
> > Basically, the same code is executed on each HCA except for the user
> > libraries (libmlx4.so, libmthca.so), mlx4.ko, mthca.ko and firmware on each
> > HCA.
> >
> > Since the code in the user libraries is very similar to each other, I
> > suspect the issue is in the kernel code or HCA firmware.
> >
> > Does anyone know what kernel patch fixes this issue starting from kernel
> > 2.6.24 through 2.6.28? Has anyone else seen this issue?
> >
> > System Information:
> >
> > The system has 4GB of memory.
> >
> > uname -a
> > Linux (none) 2.6.24.02.02.08 #21 SMP Thu Feb 19 11:04:35 PST 2009 ia64 unknown
> >
> > OFED 1.2.5
> >
> > lspci -d 15b3:
> > 0000:10:00.0 InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex (Tavor compatibility mode) (rev 20)
> > 0000:c3:00.0 InfiniBand: Mellanox Technologies: Unknown device 634a (rev a0)
> >
> > lspci -d 15b3: -n
> > 0000:10:00.0 0c06: 15b3:6278 (rev 20)
> > 0000:c3:00.0 0c06: 15b3:634a (rev a0)
> >
> > ibv_devinfo -v
> > hca_id: mlx4_0
> > fw_ver: 2.5.000
> >
> > hca_id: mthca0
> > fw_ver: 4.8.930
> >
> > _______________________________________________
> > general mailing list
> > [email protected]
> > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> >
> > To unsubscribe, please visit
> > http://openib.org/mailman/listinfo/openib-general
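
The shell's ulimit output only shows the limit of the shell that launched the test. To confirm the pinned-memory limit that each forked process actually sees, a check along these lines could be called just before ibv_reg_mr(). This is only a sketch under stated assumptions: the helper names memlock_limit_allows() and report_memlock_limit() are made up for illustration and are not part of libibverbs; they rely only on getrlimit(RLIMIT_MEMLOCK), and they compare a request against the limit without accounting for memory the process has already pinned.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper (not a libibverbs call): returns 1 if a request of
 * `bytes` fits under the soft RLIMIT_MEMLOCK limit, 0 if it does not, and
 * -1 if the limit could not be read. It does NOT account for memory that
 * is already locked by this process. */
int memlock_limit_allows(size_t bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;
    return ((rlim_t)bytes <= rl.rlim_cur) ? 1 : 0;
}

/* Hypothetical helper: prints the soft and hard locked-memory limits as
 * seen by the calling process. Returns 0 on success, -1 on failure. */
int report_memlock_limit(FILE *out)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;

    if (rl.rlim_cur == RLIM_INFINITY)
        fprintf(out, "memlock soft limit: unlimited\n");
    else
        fprintf(out, "memlock soft limit: %llu bytes\n",
                (unsigned long long)rl.rlim_cur);

    if (rl.rlim_max == RLIM_INFINITY)
        fprintf(out, "memlock hard limit: unlimited\n");
    else
        fprintf(out, "memlock hard limit: %llu bytes\n",
                (unsigned long long)rl.rlim_max);

    return 0;
}
```

Calling report_memlock_limit(stderr) in each child right after fork(), and memlock_limit_allows(size) before each registration, would show whether the limit differs between processes. With "locked memory unlimited" as in the output above, both checks are expected to pass, which would point away from the ulimit as the cause.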
