On Mar 25, 2007, at 4:22 AM, Dotan Barak wrote:
Hi Troy.
I can only comment on the parts of your report that relate to the
mthca devices.
Troy Benjegerdes wrote:
We have been getting some interesting failures with ibv_reg_mr:
gcc -ggdb -libverbs -o mr-test mr-test.c
/usr/src/ibv-mr-test/mr-test
mr-test: bufsize 1048576
device # 0 name="mthca0" guid="00066a0098000464"
ibv_open_device() context=0x10012c98
ibv_alloc_pd() pd=0x10013678
alloc: 2482
ibv_reg_mr failed:: Cannot allocate memory
fw_ver: 3.3.2
max_mr_size 0xffffffffffffffff
max_mr: 131056, could only register 2482 regions
sleep 5 sec
free: 0
done
I wasn't able to reproduce this failure, but I noticed that you are
using an old FW version (the current version is 3.5.0).
With a 10MB buffer:
gcc -ggdb -libverbs -o mr-test mr-test.c
/usr/src/ibv-mr-test/mr-test
mr-test: bufsize 10485760
device # 0 name="mthca0" guid="00066a0098000464"
ibv_open_device() context=0x10012c98
ibv_alloc_pd() pd=0x10013678
alloc: 2482
ibv_reg_mr failed:: Cannot allocate memory
fw_ver: 3.3.2
max_mr_size 0xffffffffffffffff
max_mr: 131056, could only register 2482 regions
sleep 5 sec
free: 0
done
On a 64-bit machine I got a kernel oops; bug number 490 was opened in
Bugzilla and we are analyzing this failure.
And, on a PCI Express Mellanox HCA:
/afs/scl.ameslab.gov/user/troy/src/ibv-mr-test/mr-test
mr-test: bufsize 10485760
device # 0 name="mthca0" guid="0002c9020040272c"
ibv_open_device() context=0x504c00
ibv_alloc_pd() pd=0x503f30
alloc: 12277
ibv_reg_mr failed:: Cannot allocate memory
fw_ver: 5.1.0
max_mr_size 0xffffffffffffffff
max_mr: 131056, could only register 12277 regions
sleep 5 sec
free: 0
done
I'm checking this issue and will let you know what I find.
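For scale, the region counts in the runs above can be compared with the
max_mr value the device reports. The snippet below is only an
illustrative sketch and is not part of mr-test.c; the helper names
percent_of_max_mr and print_mr_shares are made up for this example, and
the numbers are copied from the output quoted above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from mr-test.c): share of the advertised
 * max_mr limit that was reached before ibv_reg_mr failed, in percent. */
double percent_of_max_mr(long registered, long max_mr)
{
    assert(registered >= 0 && max_mr > 0);
    return 100.0 * (double)registered / (double)max_mr;
}

/* Print one line per HCA; the counts come from the logs quoted above. */
void print_mr_shares(void)
{
    printf("fw 3.3.2: %.1f%% of max_mr\n", percent_of_max_mr(2482, 131056));
    printf("fw 5.1.0: %.1f%% of max_mr\n", percent_of_max_mr(12277, 131056));
}
```

Both HCAs stop well short of the advertised limit: under 2% of max_mr
with fw 3.3.2 and under 10% with fw 5.1.0.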
On the PCI Express HCA, it also looks like free memory, as reported
by "free", goes down by about 300MB once all these regions are
allocated, but the process usage as reported by top is only 20MB
total virtual size. What's going on here?
Are you talking about the "free memory" which is reported by top?
Both the free memory reported by 'top', and the free memory reported
by the 'free' command on Debian.
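As a rough back-of-the-envelope check on the numbers above, the drop in
free memory can be divided by the number of regions that were
registered. This is only a sketch: the helper names are hypothetical,
"about 300MB" is assumed here to mean 300 * 1024 * 1024 bytes, and the
result says nothing about where that memory actually goes.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: average drop in free memory per registered
 * region, in bytes. */
double bytes_per_region(double drop_bytes, long regions)
{
    assert(regions > 0);
    return drop_bytes / (double)regions;
}

/* Print the average for the PCI Express HCA run quoted above. */
void print_per_region_cost(void)
{
    /* Assumption: "about 300MB" is read as 300 MiB. */
    double drop = 300.0 * 1024.0 * 1024.0;
    /* 12277 is the number of regions registered in that run. */
    printf("%.0f bytes per region\n", bytes_per_region(drop, 12277));
}
```

Under that assumption the average comes to roughly 25KB per registered
region, none of which is visible in the 20MB virtual size that top
shows for the process.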