While running the ib_rdma_bw test below on a 32-bit platform, I am getting
unexpectedly low throughput.
Server: ib_rdma_bw -p 5019 -s 1048576 -t 500 -n 5000 -b -c
Client: ib_rdma_bw -p 5019 -s 1048576 -t 500 -n 5000 -b -c 184.108.40.206
(If the iteration count is changed to 500, I get the expected throughput.)
Looking at the code, I found that ib_rdma_bw.c in the perftest package has
the following code:
unsigned long tsize; /* Transferred size, in megabytes */
cycles_to_units = get_cpu_mhz(0) * 1000000;
printf("%d: Bandwidth average: %g MB/sec\n", pid,
tsize * iters * cycles_to_units /
(tcompleted[iters - 1] - tposted)
Here, tsize is an "unsigned long", which is 4 bytes on 32-bit
platforms and 8 bytes on 64-bit platforms.
When I run the test with a 1M data size and 5000 iterations as
above, the calculation (tsize * iters) comes to
1048576 * 5000 = 5242880000, which exceeds the 32-bit
"unsigned long" limit of 4294967295. The product wraps around
and the reported throughput is unexpectedly low.
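The wraparound can be reproduced on any machine with a small sketch. This is not code from perftest: uint32_t stands in for a 4-byte "unsigned long", and the helper names bytes_total_32 and bytes_total_64 are made up for illustration.

```c
#include <stdint.h>

/* Product as a 32-bit platform computes it: uint32_t stands in for a
 * 4-byte unsigned long, so the result wraps modulo 2^32. */
static uint32_t bytes_total_32(uint32_t tsize, uint32_t iters)
{
    return tsize * iters;
}

/* Same product in 64-bit arithmetic, as on a 64-bit platform. */
static uint64_t bytes_total_64(uint64_t tsize, uint64_t iters)
{
    return tsize * iters;
}
```

Calling bytes_total_32(1048576, 5000) gives 947912704 instead of 5242880000, while bytes_total_32(1048576, 500) gives 524288000, which still fits. That matches the observation that 500 iterations report the expected throughput.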
The correct fix should be applied in the ib_rdma_bw application: either
change the calculation from (tsize * iters * cycles_to_units) to
(cycles_to_units * tsize * iters), so that every multiplication is done
in double, or change tsize to double.
Should I go ahead and submit a patch?
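A minimal sketch of why the reordering helps, assuming cycles_to_units is a double as its use in the excerpt suggests. Multiplication groups left to right in C, so whichever operands come first decide whether the product is formed in integer or floating-point arithmetic. The function names and the cycles_elapsed parameter are hypothetical, and uint32_t again stands in for a 4-byte "unsigned long".

```c
#include <stdint.h>

/* Original order: tsize * iters is an integer product and wraps
 * before it is converted to double. */
static double avg_bw_original(uint32_t tsize, uint32_t iters,
                              double cycles_to_units, double cycles_elapsed)
{
    return tsize * iters * cycles_to_units / cycles_elapsed;
}

/* Reordered: cycles_to_units * tsize is already a double, so the
 * multiplication by iters is done in floating point as well. */
static double avg_bw_reordered(uint32_t tsize, uint32_t iters,
                               double cycles_to_units, double cycles_elapsed)
{
    return cycles_to_units * tsize * iters / cycles_elapsed;
}
```

With cycles_to_units and cycles_elapsed both set to 1.0, the first function returns 947912704 for a 1M size and 5000 iterations, and the second returns 5242880000.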
Viral Mehta, Embedded Software Engineer, www.einfochips.com
However, I do understand that a double has its limits as well if we run the test with a larger data size and more iterations.
A better way would be to calculate the bandwidth after every fixed number of iterations (say 100).
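One possible shape for such a windowed calculation is sketched below. It is only an illustration, not a proposed patch: the function windowed_bw and its parameters are hypothetical, cycles_t is redefined here as a stand-in for the cycle counter type used by perftest, and the sketch assumes the completion timestamps are strictly increasing so that no window has zero duration.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t cycles_t; /* assumption: stand-in for perftest's cycle counter type */

/*
 * Compute one bandwidth value per full window of `window` iterations.
 * tposted0 is the cycle count when the first transfer was posted and
 * tcompleted[i] is the cycle count when iteration i completed.
 * Results go into out[], which must have room for iters / window values.
 * Returns the number of values written.
 */
static size_t windowed_bw(const cycles_t *tcompleted, cycles_t tposted0,
                          size_t iters, size_t window, double tsize,
                          double cycles_to_units, double *out)
{
    size_t n = 0;
    cycles_t start = tposted0;

    if (window == 0)
        return 0;

    for (size_t end = window; end <= iters; end += window) {
        cycles_t stop = tcompleted[end - 1];
        /* Only tsize * window is multiplied here, never tsize * iters. */
        out[n++] = tsize * (double)window * cycles_to_units
                   / (double)(stop - start);
        start = stop;
    }
    return n;
}
```

Each product is bounded by the window size rather than the total iteration count, so the intermediate values stay small however long the test runs.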