Re: [PATCH] IB/srp: use multiple CPU cores more effectively

Vladislav Bolkhovitin Mon, 02 Aug 2010 06:07:59 -0700

Bart Van Assche, on 08/02/2010 12:15 PM wrote:

SRP I/O with small block sizes causes a high CPU load. Processing IB
completions in the context of a kernel thread instead of in interrupt context
makes it possible to process up to 25% more I/O operations per second. This
patch adds a kernel module parameter 'thread' that specifies whether IB
completions are processed in interrupt context or in kernel thread context.
Also, the IB receive notification processing loop is rewritten as proposed
earlier by Ralph Campbell (see also https://patchwork.kernel.org/patch/89426/).
As the measurement results below show, rewriting the IB receive notification
processing loop did not have a measurable impact on performance. Processing
IB receive notifications in thread context, however, does have a measurable
impact: workloads with I/O depth one are processed at most 10% slower and
workloads with larger I/O depths are processed up to 25% faster.


block size  number of    IOPS        IOPS      IOPS
  in bytes    threads     without     with      with
   ($bs)     ($numjobs)  this patch  thread=n  thread=y
    512           1        25,400      25,400    23,100
    512         128       122,000     122,000   153,000
   4096           1        25,000      25,000    22,700
   4096         128       122,000     121,000   157,000
  65536           1        14,300      14,400    13,600
  65536           4        36,700      36,700    36,600
 524288           1         3,470       3,430     3,420
 524288           4         5,020       5,020     4,990
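
To make the comparison between the columns easier to read, the relative
change of thread=y against the unpatched driver can be computed per row. The
following is only a small sketch: the IOPS figures are copied from the table
above, while the names RESULTS, relative_change and thread_mode_changes are
made up for illustration and are not part of the patch.

```python
# IOPS figures copied from the table above.
# Key: (block size in bytes, number of threads)
# Value: (without this patch, thread=n, thread=y)
RESULTS = {
    (512, 1): (25400, 25400, 23100),
    (512, 128): (122000, 122000, 153000),
    (4096, 1): (25000, 25000, 22700),
    (4096, 128): (122000, 121000, 157000),
    (65536, 1): (14300, 14400, 13600),
    (65536, 4): (36700, 36700, 36600),
    (524288, 1): (3470, 3430, 3420),
    (524288, 4): (5020, 5020, 4990),
}


def relative_change(baseline, value):
    """Return the change of value relative to baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


def thread_mode_changes(results=RESULTS):
    """Map each (block size, threads) row to the thread=y change in percent,
    measured against the unpatched driver."""
    return {
        key: relative_change(unpatched, thread_y)
        for key, (unpatched, _thread_n, thread_y) in results.items()
    }


if __name__ == "__main__":
    for (bs, jobs), pct in thread_mode_changes().items():
        print(f"bs={bs:>6} numjobs={jobs:>3} thread=y vs. unpatched: {pct:+.1f}%")
```

With the figures above this gives about -9% for the 512-byte and 4096-byte
single-thread rows and about +25% and +29% for the corresponding 128-thread
rows.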

Performance test used to gather the above results:
   fio --bs=${bs} --ioengine=sg --buffered=0 --size=128M --rw=read \
       --thread --numjobs=${numjobs} --loops=100 --group_reporting \
       --gtod_reduce=1 --name=${dev} --filename=${dev}
other ib_srp kernel module parameters: srp_sg_tablesize=128
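
For anyone who wants to repeat the runs, the fio invocation above can be
expanded for each (block size, threads) pair from the table. This is a sketch
only: the fio options are exactly those quoted above, but the helper name
fio_command and the device path /dev/sdX are placeholders that have to be
adapted to the local setup.

```python
import shlex

# (block size in bytes, numjobs) pairs taken from the results table.
COMBINATIONS = [
    (512, 1), (512, 128),
    (4096, 1), (4096, 128),
    (65536, 1), (65536, 4),
    (524288, 1), (524288, 4),
]


def fio_command(dev, bs, numjobs):
    """Build the fio argument list quoted above for one table row."""
    return [
        "fio", f"--bs={bs}", "--ioengine=sg", "--buffered=0", "--size=128M",
        "--rw=read", "--thread", f"--numjobs={numjobs}", "--loops=100",
        "--group_reporting", "--gtod_reduce=1",
        f"--name={dev}", f"--filename={dev}",
    ]


if __name__ == "__main__":
    dev = "/dev/sdX"  # placeholder, replace with the SRP-attached device
    for bs, numjobs in COMBINATIONS:
        # Print the commands rather than running them, so they can be reviewed.
        print(shlex.join(fio_command(dev, bs, numjobs)))
```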

How about results of "dd Xflags=direct" in different modes to find out the
lowest latency at which the driver can process 512-byte and 4K packets? Sorry,
I don't trust fio when it comes to precise latency measurements.
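
To illustrate the kind of measurement meant here (one direct read
outstanding at a time, so each read's duration is its latency), below is a
rough sketch. It is not dd itself: it is a stand-in that opens the device with
O_DIRECT and times each read. The function name read_latencies and the device
path are placeholders, and the interpreter overhead is included in the
timings, so the numbers are only an upper bound on what the driver adds.

```python
import mmap
import os
import time


def read_latencies(path, block_size, count, direct=True):
    """Issue up to `count` sequential reads of `block_size` bytes, one at a
    time, and return the per-read latencies in seconds."""
    flags = os.O_RDONLY
    if direct:
        flags |= os.O_DIRECT  # Linux only; bypasses the page cache
    fd = os.open(path, flags)
    # An anonymous mmap is page-aligned, which O_DIRECT needs for its buffer.
    buf = mmap.mmap(-1, block_size)
    latencies = []
    try:
        for _ in range(count):
            start = time.perf_counter()
            nread = os.readv(fd, [buf])
            elapsed = time.perf_counter() - start
            if nread < block_size:
                break  # end of the device or file reached
            latencies.append(elapsed)
    finally:
        buf.close()
        os.close(fd)
    return latencies


if __name__ == "__main__":
    dev = "/dev/sdX"  # placeholder, replace with the SRP-attached device
    for bs in (512, 4096):
        lat = read_latencies(dev, bs, 10000)
        if lat:
            print(f"bs={bs}: min {min(lat) * 1e6:.1f} us over {len(lat)} reads")
```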


Vlad