On 07/16/2015 09:01 AM, Ming Lei wrote:
On Thu, Jul 16, 2015 at 10:48 PM, Jens Axboe <ax...@kernel.dk> wrote:
On 07/15/2015 09:16 PM, Ming Lei wrote:

Hi,

This patchset kills two kinds of atomic operations in block
I/O accounting.

The first two patches convert the partition's atomic refcount
into a percpu refcount.

The second two patches convert partition->in_flight[] into a percpu
variable.

With this change, a ~15% throughput improvement can be observed
when running fio (randread) over null_blk in a dual-socket
environment.


I've played with this before, but always ran into the hurdle that making
part_in_flight() too expensive ended up hurting results in the end. Making

Yes, it is a bit expensive, but it runs at most once per tick for
each partition.
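
To make the "at most once per tick" point concrete, here is a minimal
userspace sketch in C. It is not the kernel code: struct part_stats,
round_stats(), sum_in_flight(), NR_CPUS and the sums_done counter are
hypothetical names made up for illustration. The assumption is that the
expensive per-CPU sum is only reached when the tick value has changed
since the last rounding.

```c
#include <assert.h>

#define NR_CPUS 4  /* hypothetical CPU count for this sketch */

struct part_stats {
    long in_flight[NR_CPUS];   /* per-CPU in-flight deltas */
    unsigned long stamp;       /* tick at which stats were last rounded */
    unsigned long io_ticks;    /* ticks during which I/O was in flight */
    unsigned long sums_done;   /* instrumentation for this sketch only */
};

/* Expensive part: walks every CPU's slot. */
static long sum_in_flight(struct part_stats *p)
{
    long sum = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        sum += p->in_flight[cpu];
    p->sums_done++;
    return sum;
}

/* Rounding is gated on the tick, so the sum runs at most once per tick. */
static void round_stats(struct part_stats *p, unsigned long now)
{
    if (now == p->stamp)
        return;                 /* already rounded during this tick */
    if (sum_in_flight(p) > 0)
        p->io_ticks += now - p->stamp;
    p->stamp = now;
}

/* 1000 calls inside tick 1, then one call in tick 2. */
static unsigned long demo_rounding(void)
{
    struct part_stats p = {0};

    p.in_flight[0] = 1;         /* one request outstanding */
    for (int i = 0; i < 1000; i++)
        round_stats(&p, 1);
    round_stats(&p, 2);
    assert(p.io_ticks == 2);    /* ticks 0->1 and 1->2 were busy */
    return p.sums_done;         /* number of full per-CPU sums performed */
}
```

In this sketch the 1001 calls trigger only two full sums, one per tick.
With a 1000 Hz tick that bound is still up to 1000 sums per second for
each busy partition.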

Yup, but that can still be 1000 per second. And up until last year, it was even worse: 7276d02e241dc. So it's not too surprising if there's more low hanging fruit :-)

If we can make the rounding more lazy, then we should go ahead and do that.

the inc/dec parts of accounting percpu is a no-brainer; unfortunately, the
summing then becomes pretty expensive. I'll run this through some testing
and see what kind of results I get.
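
The trade-off described here can be sketched in plain userspace C. This
is only an illustration under stated assumptions, not the patch itself:
struct pcpu_counter, pcpu_inc(), pcpu_dec(), pcpu_sum(), NR_CPUS and
CACHELINE are hypothetical names, and the caller passes the CPU index by
hand instead of the kernel's percpu machinery.

```c
#include <assert.h>

#define NR_CPUS   4    /* hypothetical CPU count for this sketch */
#define CACHELINE 64   /* assumed cache line size in bytes */

/* One slot per CPU, padded so slots do not share a cache line. */
struct pcpu_slot {
    long in_flight;
    char pad[CACHELINE - sizeof(long)];
};

struct pcpu_counter {
    struct pcpu_slot slot[NR_CPUS];
};

/* Fast path: touches only the given CPU's slot, no atomic operation. */
static void pcpu_inc(struct pcpu_counter *c, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    c->slot[cpu].in_flight++;
}

static void pcpu_dec(struct pcpu_counter *c, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    c->slot[cpu].in_flight--;
}

/* Slow path: visits every CPU's slot, so cost grows with the CPU count. */
static long pcpu_sum(const struct pcpu_counter *c)
{
    long sum = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        sum += c->slot[cpu].in_flight;
    return sum;
}

/* Two submissions on CPUs 0 and 1, one completion on CPU 3. */
static long demo_in_flight(void)
{
    struct pcpu_counter c = {0};

    pcpu_inc(&c, 0);
    pcpu_inc(&c, 1);
    pcpu_dec(&c, 3);   /* slot 3 goes to -1; only the total is meaningful */
    return pcpu_sum(&c);
}
```

A request may complete on a different CPU than the one that submitted
it, so individual slots can go negative and only the full sum gives the
in-flight count. That is why the read side has to walk every slot.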

The first two patches should be fine, and they alone still give a ~8%
improvement in my test.

Agree, those can go right in.

--
Jens Axboe
