From: Prakash Surya
Testing has shown the ll_inode_inode's lli_write_mutex to be a
limiting factor with single shared file write performance, when using
many writing threads on a single machine. Even if each thread is
writing to a unique portion of the file, the lli_write_mutex will
prevent no more than a single thread to ever write to the file
simultaneously.
This change attempts to remove this bottle neck, by replacing this
mutex with a range lock. This should allow multiple threads to write
to a single file simultaneously iff the threads are writing to unique
regions of the file.
Performance testing shows this change to garner a significant
performance boost to write bandwidth. Using FIO on a single machine
with 32 GB of RAM, write performance tests were run with and without
this change applied; the results are below:
+-+---+-++++
| fio v2.0.13 |Write Bandwidth (KB/s) |
+-+---+-++++
| # Tasks | GB / Task | Test 1 | Test 2 | Test 3 | Test 4 |
+-+---+-++++
|1|64 | 452446 | 454623 | 457653 | 463737 |
|2|32 | 850318 | 565373 | 602498 | 733027 |
|4|16 | 1058900 | 463546 | 529107 | 976284 |
|8| 8 | 1026300 | 468190 | 576451 | 963404 |
| 16| 4 | 1065500 | 503160 | 462902 | 830065 |
| 32| 2 | 1068600 | 462228 | 466963 | 749733 |
| 64| 1 | 991830 | 556618 | 557863 | 710912 |
+-+---+-++++
* Test 1: Lustre client running 04ec54f. File per process write
workload. This test was used as a baseline for what we
_could_ achieve in the single shared file tests if the
bottle necks were removed.
* Test 2: Lustre client running 04ec54f. Single shared file
workload, each task writing to a unique region.
* Test 3: Lustre client running 04ec54f + I0023132b. Single shared
file workload, each task writing to a unique region.
* Test 4: Lustre client running 04ec54f + this patch.
Single shared file workload, each task writing to a unique
region.
Direct IO does not use the page cache like normal IO, so
concurrent direct IO reads of the same pages are not safe.
As a result, direct IO reads must take the range lock
in ll_file_io_generic, otherwise they will attempt to work
on the same pages and hit assertions like:
(osc_request.c:1219:osc_brw_prep_request())
ASSERTION( i == 0 || pg->off > pg_prev->off ) failed:
i 3 p_c 10 pg ea00017a5208 [pri 0 ind 2771] off 16384
prev_pg ea00017a51d0 [pri 0 ind 2256] off 16384
Signed-off-by: Prakash Surya
Signed-off-by: Patrick Farrell
Signed-off-by: Jinshan Xiong
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-1669
Reviewed-on: http://review.whamcloud.com/6320
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6227
Reviewed-on: http://review.whamcloud.com/14385
Reviewed-by: Bobi Jam
Reviewed-by: Alexander Boyko
Reviewed-by: Hiroya Nozaki
Reviewed-by: Andreas Dilger
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
drivers/staging/lustre/lustre/llite/Makefile |2 +-
drivers/staging/lustre/lustre/llite/file.c | 37 +++-
.../staging/lustre/lustre/llite/llite_internal.h |5 +-
drivers/staging/lustre/lustre/llite/llite_lib.c|2 +-
drivers/staging/lustre/lustre/llite/range_lock.c | 233
drivers/staging/lustre/lustre/llite/range_lock.h | 82 +++
6 files changed, 348 insertions(+), 13 deletions(-)
create mode 100644 drivers/staging/lustre/lustre/llite/range_lock.c
create mode 100644 drivers/staging/lustre/lustre/llite/range_lock.h
diff --git a/drivers/staging/lustre/lustre/llite/Makefile
b/drivers/staging/lustre/lustre/llite/Makefile
index 2cbb1b8..1ac0940 100644
--- a/drivers/staging/lustre/lustre/llite/Makefile
+++ b/drivers/staging/lustre/lustre/llite/Makefile
@@ -1,6 +1,6 @@
obj-$(CONFIG_LUSTRE_FS) += lustre.o
lustre-y := dcache.o dir.o file.o llite_close.o llite_lib.o llite_nfs.o \
- rw.o namei.o symlink.o llite_mmap.o \
+ rw.o namei.o symlink.o llite_mmap.o range_lock.o \
xattr.o xattr_cache.o rw26.o super25.o statahead.o \
glimpse.o lcommon_cl.o lcommon_misc.o \
vvp_dev.o vvp_page.o vvp_lock.o vvp_io.o vvp_object.o vvp_req.o \
diff --git a/drivers/staging/lustre/lustre/llite/file.c
b/drivers/staging/lustre/lustre/llite/file.c
index 273b563..d9ee255 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -1126,6 +1126,7 @@ ll_file_io_generic(const struct lu_env *env, struct
vvp_io_args *args,
{
struct ll_inode_info *lli = ll_i2info(file_inode(file));
struct ll_file_data *fd = LUSTRE_FPRIVATE(file);
+ struct