[PATCH blktests v2] tests: use nproc to get number of CPUs for fio jobs

2017-06-30 Thread Johannes Thumshirn
Use nproc to get the number of CPUs for fio jobs, and introduce a _run_fio_rand_io helper for parallel IO where we don't really care about the details and just want some IO. Signed-off-by: Johannes Thumshirn --- common/fio | 7 +++ tests/block/005 | 4 +--- tests/block/006

[PATCH] block: constify attribute_group structures.

2017-06-30 Thread Arvind Yadav
attribute_groups are not supposed to change at runtime. All functions working with attribute_groups provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const. File size before: text data bss dec hex filename 11622 912 2076 14610 3912
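
The constification described above can be sketched in plain C. This is a minimal userspace illustration, not the kernel's definition: the `sketch_` types and names are hypothetical and simplified, and the only point is that a group which never changes at runtime can be declared const and passed through const pointers.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the sysfs types. */
struct sketch_attribute {
    const char *name;
};

struct sketch_attribute_group {
    const char *name;
    const struct sketch_attribute *const *attrs;   /* NULL-terminated */
};

static const struct sketch_attribute sketch_attr_a = { "size" };
static const struct sketch_attribute sketch_attr_b = { "ro" };

static const struct sketch_attribute *const sketch_attrs[] = {
    &sketch_attr_a,
    &sketch_attr_b,
    NULL,
};

/* The group is never modified after definition, so it is const. */
static const struct sketch_attribute_group sketch_group = {
    .name  = "sketch",
    .attrs = sketch_attrs,
};

/* A consumer only needs read access, so it takes a const pointer. */
static size_t sketch_count_attrs(const struct sketch_attribute_group *grp)
{
    size_t n = 0;

    while (grp->attrs[n] != NULL)
        n++;
    return n;
}
```

A const object like this can typically be placed in read-only storage by the toolchain, which is why such patches report section sizes.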

[PATCH 09/19] bcache: update bio->bi_opf bypass/writeback REQ_ flag hints

2017-06-30 Thread Eric Wheeler
Bypass if: bio->bi_opf & (REQ_RAHEAD|REQ_BACKGROUND) Writeback if: op_is_sync(bio->bi_opf) || bio->bi_opf & (REQ_META|REQ_PRIO) Signed-off-by: Eric Wheeler --- drivers/md/bcache/request.c | 3 +++ drivers/md/bcache/writeback.h | 3 ++- 2 files changed, 5
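
The two rules quoted above can be written as a pair of predicates. This is a minimal sketch under stated assumptions: the `SKETCH_REQ_*` bit values are placeholders rather than the kernel's values, and a single `SKETCH_REQ_SYNC` flag stands in for the full `op_is_sync()` test.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions for this sketch only; the kernel's differ. */
enum {
    SKETCH_REQ_SYNC       = 1 << 0,
    SKETCH_REQ_META       = 1 << 1,
    SKETCH_REQ_PRIO       = 1 << 2,
    SKETCH_REQ_RAHEAD     = 1 << 3,
    SKETCH_REQ_BACKGROUND = 1 << 4,
};

/* Bypass hint: readahead or background IO. */
static bool sketch_hint_bypass(uint32_t opf)
{
    return (opf & (SKETCH_REQ_RAHEAD | SKETCH_REQ_BACKGROUND)) != 0;
}

/* Writeback hint: synchronous IO, or metadata / high-priority IO.
 * Simplification: one flag stands in for op_is_sync(). */
static bool sketch_hint_writeback(uint32_t opf)
{
    return (opf & SKETCH_REQ_SYNC) != 0 ||
           (opf & (SKETCH_REQ_META | SKETCH_REQ_PRIO)) != 0;
}
```

How the two hints are ordered when both match is decided by the patch itself and is not shown in the truncated snippet.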

[PATCH 08/19] bcache: documentation for sysfs entries describing bcache cache hinting

2017-06-30 Thread Eric Wheeler
Signed-off-by: Eric Wheeler --- Documentation/bcache.txt | 80 1 file changed, 80 insertions(+) diff --git a/Documentation/bcache.txt b/Documentation/bcache.txt index a9259b5..c78c012 100644 ---

[PATCH 07/19] bcache: introduce bcache sysfs entries for ioprio-based bypass/writeback hints

2017-06-30 Thread Eric Wheeler
Add sysfs entries to support hinting for bypass/writeback based on the ioprio assigned to the bio. If the bio is unassigned, use current's io-context ioprio for cache writeback or bypass (configured per-process with `ionice`). Having idle IOs bypass the cache can increase performance elsewhere since

[PATCH 19/19] bcache: Update continue_at() documentation

2017-06-30 Thread Dan Carpenter
continue_at() doesn't have a return statement anymore. Signed-off-by: Dan Carpenter --- drivers/md/bcache/closure.h | 4 1 file changed, 4 deletions(-) diff --git a/drivers/md/bcache/closure.h b/drivers/md/bcache/closure.h index 1ec84ca..295b7e4 100644 ---

[PATCH 03/19] bcache: do not subtract sectors_to_gc for bypassed IO

2017-06-30 Thread Tang Junhui
Since bypassed IOs use no buckets, do not subtract their sectors from sectors_to_gc to trigger the gc thread. Signed-off-by: tang.junhui Reviewed-by: Eric Wheeler Cc: sta...@vger.kernel.org --- drivers/md/bcache/request.c | 6 +++--- 1 file changed, 3

[PATCH 15/19] bcache: fix issue of writeback rate at minimum 1 key per second

2017-06-30 Thread Tang Junhui
When there is not enough dirty data in the writeback cache, the writeback rate stays at a minimum of 1 key per second until all dirty data is cleaned; this is inefficient and also wastes energy. In this patch, when there is not enough dirty data, let the writeback rate be 0, and writeback

[PATCH 18/19] bcache: silence static checker warning

2017-06-30 Thread Dan Carpenter
In olden times, closure_return() used to have a hidden return built in. We removed the hidden return but forgot to add a new return here. If "c" were NULL we would oops on the next line, but fortunately "c" is never NULL. Let's just remove the if statement. Signed-off-by: Dan Carpenter

[PATCH 05/19] bcache: fix calling ida_simple_remove() with incorrect minor

2017-06-30 Thread Tang Junhui
bcache called ida_simple_remove() with a minor that had already been multiplied by BCACHE_MINORS, which releases the wrong minor and leaks the real one. In addition, when partition support was added to bcache, the name assignment was not updated, resulting in numbers jumping (bcache0, bcache16, bcache32...). This has
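
The index/minor mix-up can be reproduced with a toy ID allocator. This is a minimal sketch, assuming 16 minors per device (inferred from the bcache0, bcache16, bcache32 jump); the `sketch_` allocator and conversion helpers are hypothetical and are not the kernel's IDA API.

```c
#include <stdbool.h>

#define SKETCH_MINORS   16   /* assumed minors reserved per device */
#define SKETCH_MAX_IDS  64

static bool sketch_used[SKETCH_MAX_IDS];

/* Toy allocator: hand out the lowest free index, or -1 if none is left. */
static int sketch_id_alloc(void)
{
    for (int i = 0; i < SKETCH_MAX_IDS; i++) {
        if (!sketch_used[i]) {
            sketch_used[i] = true;
            return i;
        }
    }
    return -1;
}

/* Release an index; out-of-range values are ignored. */
static void sketch_id_remove(int id)
{
    if (id >= 0 && id < SKETCH_MAX_IDS)
        sketch_used[id] = false;
}

/* The allocator works on indices; the block layer works on first minors. */
static int sketch_idx_to_first_minor(int idx)
{
    return idx * SKETCH_MINORS;
}

static int sketch_first_minor_to_idx(int first_minor)
{
    return first_minor / SKETCH_MINORS;
}
```

Releasing with the scaled minor frees some unrelated slot (or nothing), while the real index stays allocated. Converting back to the index before releasing avoids both problems, and naming devices by index rather than by first minor keeps the names contiguous.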

[PATCH 14/19] bcache: Correct return value for sysfs attach errors

2017-06-30 Thread Tony Asleson
If you encounter any errors in bch_cached_dev_attach it will return a negative error code. The variable 'v' which stores the result is unsigned, thus user space sees a very large value returned for bytes written which can cause incorrect user space behavior. Utilize 1 signed variable to use
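
The signedness problem described above is easy to show outside the kernel. This is a minimal sketch with hypothetical `sketch_` helpers, none of which come from the bcache source; it assumes the common case where `unsigned int` is narrower than `long long`.

```c
#include <errno.h>

/* Hypothetical stand-in for an attach routine that fails with -EINVAL. */
static int sketch_attach(void)
{
    return -EINVAL;
}

/* Buggy pattern: the negative error code is stored in an unsigned
 * variable, so the caller sees a large positive "bytes written" count. */
static long long sketch_store_buggy(void)
{
    unsigned int v = (unsigned int)sketch_attach();

    return v;
}

/* Fixed pattern: keep the result in a signed variable so the negative
 * error code survives the trip back to the caller. */
static long long sketch_store_fixed(void)
{
    int v = sketch_attach();

    return v;
}
```

With a 32-bit `unsigned int`, the buggy variant returns a value just below 2^32 instead of a negative error, which user space then interprets as a successful write of that many bytes.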

[PATCH 12/19] bcache: update bucket_in_use periodically

2017-06-30 Thread Tang Junhui
bucket_in_use is updated in the gc thread, which is triggered by invalidating or writing sectors_to_gc dirty data. That interval is too long, so when we compare it with the threshold it is often stale, which leads to inaccurate judgment and often results in bucket depletion. Signed-off-by:

[PATCH 02/19] bcache: fix sequential large write IO bypass

2017-06-30 Thread Tang Junhui
Sequential write IOs were tested with bs=1M by FIO in writeback cache mode. These IOs were expected to be bypassed, but they were not. We debugged the code and found in check_should_bypass(): if (!congested && mode == CACHE_MODE_WRITEBACK && op_is_write(bio_op(bio)) &&

[PATCH 10/19] bcache: initialize stripe_sectors_dirty correctly for thin flash device

2017-06-30 Thread Tang Junhui
Thin flash devices do not initialize stripe_sectors_dirty correctly; this patch fixes the issue. Signed-off-by: Tang Junhui Cc: sta...@vger.kernel.org --- drivers/md/bcache/super.c | 3 ++- drivers/md/bcache/writeback.c | 8 drivers/md/bcache/writeback.h |

[PATCH 01/19] bcache: Fix leak of bdev reference

2017-06-30 Thread Jan Kara
If blkdev_get_by_path() in register_bcache() fails, we try to look up the block device using lookup_bdev() to detect which situation we are in, so that we can properly report the error. However, we never drop the reference returned to us from lookup_bdev(). Fix that. Signed-off-by: Jan Kara Cc:
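
The leak pattern described above can be sketched with a toy reference count. This is a minimal illustration with hypothetical `sketch_` names; it does not use the block layer API, and only shows that a lookup done purely to classify an error must be paired with a put.

```c
#include <stddef.h>

struct sketch_bdev {
    int refcount;
};

/* Hypothetical lookup: returns the device with one extra reference held. */
static struct sketch_bdev *sketch_lookup(struct sketch_bdev *dev)
{
    if (dev)
        dev->refcount++;
    return dev;
}

/* Drop one reference; a NULL device is ignored. */
static void sketch_put(struct sketch_bdev *dev)
{
    if (dev)
        dev->refcount--;
}

/* Error path: look the device up only to decide which error to report,
 * then drop the reference again so it does not leak.
 * Returns 1 if the device exists, 0 otherwise. */
static int sketch_classify_open_failure(struct sketch_bdev *dev)
{
    struct sketch_bdev *found = sketch_lookup(dev);
    int exists = (found != NULL);

    sketch_put(found);
    return exists;
}
```

Without the `sketch_put()` call, every failed registration would leave the count one higher than before, which is the leak the patch describes.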

[PATCH 16/19] bcache: increase the number of open buckets

2017-06-30 Thread Tang Junhui
Currently, we only allocate 6 open buckets for each cache set, but we usually attach about 10 or so backend devices to each cache set, and each bcache device is accessed by about 10 or so threads in the top application layer. So 6 open buckets are too few. It has led to that

[PATCH 17/19] bcache: fix for gc and write-back race

2017-06-30 Thread Tang Junhui
gc and write-back race with each other (see the email "bcache get stucked" I sent before): gc thread write-back thread | |bch_writeback_thread() |bch_gc_thread()| |

[PATCH 11/19] bcache: Subtract dirty sectors of thin flash from cache_sectors in calculating writeback rate

2017-06-30 Thread Tang Junhui
Since dirty sectors of thin flash cannot be used to cache data for the backend device, we should subtract them when calculating the writeback rate. Signed-off-by: Tang Junhui Cc: sta...@vger.kernel.org --- drivers/md/bcache/writeback.c | 2 +- drivers/md/bcache/writeback.h | 19

[PATCH 13/19] bcache: delete redundant calling set_gc_sectors()

2017-06-30 Thread Tang Junhui
set_gc_sectors() has been called in bch_gc_thread(), and it was called again in bch_btree_gc_finish(). The following call is unnecessary, so delete it. Signed-off-by: Tang Junhui --- drivers/md/bcache/btree.c | 1 - 1 file changed, 1 deletion(-) diff --git

[PATCH 04/19] bcache: fix wrong cache_misses statistics

2017-06-30 Thread Tang Junhui
Some missed IOs are not counted in cache_misses; this patch fixes the issue. Signed-off-by: tang.junhui Reviewed-by: Eric Wheeler Cc: sta...@vger.kernel.org --- drivers/md/bcache/request.c | 6 +- 1 file changed, 5 insertions(+), 1

[PATCH] block: Constify attribute_group structures.

2017-06-30 Thread Arvind Yadav
attribute_groups are not supposed to change at runtime. All functions working with attribute_groups provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const. File size before: text data bss dec hex filename 5302 544 0 5846 16d6

[PATCH 3/3 v4] mmc: debugfs: Move block debugfs into block module

2017-06-30 Thread Linus Walleij
If we don't have the block layer enabled, we do not present card status and extcsd in the debugfs. Debugfs is not ABI, and maintaining files of no relevance for non-block devices comes at a high maintenance cost if we shall support it with the block layer compiled out. The debugfs entries suffer

[PATCH 1/3 v4] mmc: block: Anonymize the drv op data pointer

2017-06-30 Thread Linus Walleij
We have a data pointer for the ioctl() data, but we need to pass other data along with the DRV_OP:s, so make this a void * so it can be reused. Signed-off-by: Linus Walleij --- ChangeLog v3->v4: - No changes just resending ChangeLog v2->v3: - No changes just resending

Re: LightNVM pblk: read/write of random kernel memory

2017-06-30 Thread Carl-Daniel Hailfinger
On 28.06.2017 16:58, Javier Gonzalez wrote: >> On 28 Jun 2017, at 16.33, Carl-Daniel Hailfinger >> wrote: >> >> thanks for the pointer to the github reporting page. >> I'll answer your questions here (to make then indexable by search >> engines in case someone

Re: [dm-devel] [PATCH 1/1] block: Convert hd_struct in_flight from atomic to percpu

2017-06-30 Thread Jens Axboe
On 06/30/2017 07:05 AM, Brian King wrote: > On 06/29/2017 09:17 PM, Jens Axboe wrote: >> On 06/29/2017 07:20 PM, Ming Lei wrote: >>> On Fri, Jun 30, 2017 at 2:42 AM, Jens Axboe wrote: On 06/29/2017 10:00 AM, Jens Axboe wrote: > On 06/29/2017 09:58 AM, Jens Axboe wrote:

[PATCH 03/10] lightnvm: pblk: remove unused return variable

2017-06-30 Thread Javier González
Remove unused variable. Signed-off-by: Javier González Signed-off-by: Matias Bjørling --- drivers/lightnvm/pblk-sysfs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/lightnvm/pblk-sysfs.c

[PATCH 02/10] lightnvm: pblk: fix double-free on pblk init

2017-06-30 Thread Javier González
Prevent pblk->lines being double freed in case of an error during pblk initialization. Fixes: dd2a43437337: "lightnvm: pblk: sched. metadata on write thread" Reported-by: Dan Carpenter Signed-off-by: Javier González Signed-off-by: Matias Bjørling

[PATCH 09/10] lightnvm: pblk: verify that cache read is still valid

2017-06-30 Thread Javier González
When a read is directed to the cache, we risk that the lba has been updated between the time we made the L2P table lookup and the time we actually read from the cache. We intentionally do not hold the L2P lock, so as not to block other threads. While strict ordering is not a guarantee at this level

Re: [PATCH 00/10] lightnvm: pblk fixes for 4.13

2017-06-30 Thread Jens Axboe
On 06/30/2017 09:56 AM, Javier González wrote: > Hi Jens, > > Here you have a second round of fixes for pblk. They are in essence bug > fixes including a double-free reported by Dan. > > There is also regression fix for pblk removal, which was introduced with > the new metadata scheduler. This

Re: [PATCH v8 17/18] xfs: minimal conversion to errseq_t writeback error reporting

2017-06-30 Thread Christoph Hellwig
On Fri, Jun 30, 2017 at 12:45:54PM -0400, Jeff Layton wrote: > Should I aim to do that with an individual patch for each fs, or is it > better to do a swath of them all at once in a single patch here? I'd be perfectly happy with one big patch for all the trivial conversions.

[PATCH 05/10] lightnvm: pblk: use right metadata buffer for recovery

2017-06-30 Thread Javier González
Fix bad metadata buffer assignments introduced when refactoring the metadata write path. Fixes: dd2a43437337 lightnvm: pblk: sched. metadata on write thread Signed-off-by: Javier González Signed-off-by: Matias Bjørling ---

Re: [PATCH v8 17/18] xfs: minimal conversion to errseq_t writeback error reporting

2017-06-30 Thread Jeff Layton
On Thu, 2017-06-29 at 07:12 -0700, Christoph Hellwig wrote: > Nice and simple, this looks great! > > Reviewed-by: Christoph Hellwig Thanks! I think this turned out to be a lot cleaner too. For filesystems that use filemap_write_and_wait_range today this now becomes a pretty

[PATCH 10/10] lightnvm: pblk: set line bitmap check under debug

2017-06-30 Thread Javier González
Do bitmap checks only when debug mode is enabled. The line bitmap used for mapping to physical addresses is fairly large (~512KB) and it is expensive to do these checks on the fast path. Signed-off-by: Javier González Signed-off-by: Matias Bjørling ---

[PATCH 04/10] lightnvm: pblk: schedule if data is not ready

2017-06-30 Thread Javier González
When user threads place data into the write buffer, they reserve space and do the memory copy out of the lock. As a consequence, when the write thread starts persisting data, there is a chance that it is not copied yet. In this case, avoid polling, and schedule before retrying. Signed-off-by:

[PATCH 07/10] lightnvm: pblk: remove target using async. I/Os

2017-06-30 Thread Javier González
When removing a pblk instance, pad the current line using asynchronous I/O. This reduces the removal time from ~1 minute in the worst case to a couple of seconds. Signed-off-by: Javier González Signed-off-by: Matias Bjørling ---

[PATCH 01/10] lightnvm: pblk: fix bad le64 assignations

2017-06-30 Thread Javier González
Use the right types and conversions on le64 variables. Reported by sparse. Signed-off-by: Javier González Signed-off-by: Matias Bjørling --- drivers/lightnvm/pblk-core.c | 2 +- drivers/lightnvm/pblk-gc.c | 5 -
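
The idea behind keeping little-endian values in their own type can be sketched in portable C. This is a minimal illustration, not the kernel's `__le64` machinery: `sketch_le64` and both helpers are hypothetical names, and the wrapper struct simply makes it a compile error to mix an on-media value with a CPU-order integer without converting.

```c
#include <stdint.h>

/* On-media 64-bit value, always stored least significant byte first. */
typedef struct {
    uint8_t bytes[8];
} sketch_le64;

/* CPU order to little-endian, independent of host endianness. */
static sketch_le64 sketch_cpu_to_le64(uint64_t v)
{
    sketch_le64 out;

    for (int i = 0; i < 8; i++)
        out.bytes[i] = (uint8_t)(v >> (8 * i));
    return out;
}

/* Little-endian back to CPU order. */
static uint64_t sketch_le64_to_cpu(sketch_le64 v)
{
    uint64_t out = 0;

    for (int i = 0; i < 8; i++)
        out |= (uint64_t)v.bytes[i] << (8 * i);
    return out;
}
```

Because `sketch_le64` is a distinct struct type, an assignment such as `uint64_t x = some_le64_field;` does not compile, which is roughly the class of mistake a checker like sparse flags for annotated kernel types.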

[PATCH 08/10] lightnvm: pblk: add initialization check

2017-06-30 Thread Javier González
Add a sanity check to the pblk initialization sequence in order to ensure that enough LUNs have been allocated to store the line metadata. Signed-off-by: Javier González Signed-off-by: Matias Bjørling --- drivers/lightnvm/pblk-init.c | 6 ++ 1 file

[PATCH 06/10] lightnvm: pblk: use vmalloc for GC data buffer

2017-06-30 Thread Javier González
For now, we allocate a per I/O buffer for GC data. Since the potential size of the buffer is 256KB and GC is not in the fast path, do this allocation with vmalloc. This puts less pressure on the memory allocator at no performance cost. Signed-off-by: Javier González

[PATCH 00/10] lightnvm: pblk fixes for 4.13

2017-06-30 Thread Javier González
Hi Jens, Here is a second round of fixes for pblk. They are in essence bug fixes, including a double-free reported by Dan. There is also a regression fix for pblk removal, which was introduced with the new metadata scheduler. With this fix, removing a pblk instance again takes at most 2

NVMe induced NULL deref in bt_iter()

2017-06-30 Thread Jens Axboe
Hi Max, I remembered you reporting this. I think this is a regression introduced with the scheduling, since ->rqs[] isn't static anymore. ->static_rqs[] is, but that's not indexable by the tag we find. So I think we need to guard those with a NULL check. The actual requests themselves are static,
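
The guard suggested above amounts to skipping empty slots while walking a tag-indexed array. This is a simplified sketch with hypothetical `sketch_` names; it is not the blk-mq iterator, and only illustrates that a slot which is not permanently populated must be checked before it is dereferenced.

```c
#include <stddef.h>

struct sketch_request {
    int tag;
};

/* Count assigned requests in a tag-indexed table. A slot is NULL when
 * no request is currently assigned to that tag. */
static int sketch_count_busy(struct sketch_request *const *rqs,
                             size_t nr_tags)
{
    int busy = 0;

    for (size_t tag = 0; tag < nr_tags; tag++) {
        const struct sketch_request *rq = rqs[tag];

        if (!rq)        /* guard: nothing assigned to this tag */
            continue;
        busy++;
    }
    return busy;
}
```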

[PATCH 3/3] fs: support RWF_NOWAIT for buffered reads

2017-06-30 Thread Christoph Hellwig
This is based on the old idea and code from Milosz Tanski. With the aio nowait code it becomes mostly trivial now. Signed-off-by: Christoph Hellwig --- fs/aio.c | 6 -- fs/btrfs/file.c| 9 ++--- fs/ext4/file.c | 6 +++--- fs/xfs/xfs_file.c | 11

non-blockling buffered reads V2

2017-06-30 Thread Christoph Hellwig
This series resurrects the old patches from Milosz to implement non-blocking buffered reads. Thanks to the non-blocking AIO code from Goldwyn the implementation becomes pretty much trivial. As that implementation is in the block tree I would suggest that we merge these patches through the block

[PATCH 1/3] fs: pass iocb to do_generic_file_read

2017-06-30 Thread Christoph Hellwig
And rename it to the more descriptive generic_file_buffered_read while at it. Signed-off-by: Christoph Hellwig --- mm/filemap.c | 15 --- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index 742034e56100..3df0a57cd48e 100644

[PATCH 2/3] fs: support IOCB_NOWAIT in generic_file_buffered_read

2017-06-30 Thread Christoph Hellwig
From: Milosz Tanski Allow generic_file_buffered_read to bail out early instead of waiting for the page lock or reading a page if IOCB_NOWAIT is specified. Signed-off-by: Milosz Tanski Reviewed-by: Christoph Hellwig Reviewed-by: Jeff Moyer

[PATCH 01/19] bcache: Fix leak of bdev reference

2017-06-30 Thread bcache
From: Jan Kara If blkdev_get_by_path() in register_bcache() fails, we try to look up the block device using lookup_bdev() to detect which situation we are in, so that we can properly report the error. However, we never drop the reference returned to us from lookup_bdev(). Fix that. Signed-off-by:

Re: [dm-devel] [PATCH 1/1] block: Convert hd_struct in_flight from atomic to percpu

2017-06-30 Thread Brian King
On 06/30/2017 09:08 AM, Jens Axboe wrote: Compared with the totally percpu approach, this way might help 1:M or N:M mapping, but won't help 1:1 map(NVMe), when hctx is mapped to each CPU(especially there are huge hw queues on a big system), :-( >>> >>> Not disagreeing with that,

Re: [PATCH 3/3] fs: support RWF_NOWAIT for buffered reads

2017-06-30 Thread Goldwyn Rodrigues
On 06/30/2017 01:15 PM, Christoph Hellwig wrote: > This is based on the old idea and code from Milosz Tanski. With the > aio nowait code it becomes mostly trivial now. > Looks Good. Reviewed-by: Goldwyn Rodrigues -- Goldwyn

[PATCH 19/19] bcache: Update continue_at() documentation

2017-06-30 Thread bcache
From: Dan Carpenter continue_at() doesn't have a return statement anymore. Signed-off-by: Dan Carpenter --- drivers/md/bcache/closure.h | 4 1 file changed, 4 deletions(-) diff --git a/drivers/md/bcache/closure.h

[PATCH 15/19] bcache: fix issue of writeback rate at minimum 1 key per second

2017-06-30 Thread bcache
From: Tang Junhui When there is not enough dirty data in the writeback cache, the writeback rate stays at a minimum of 1 key per second until all dirty data is cleaned; this is inefficient and also wastes energy. In this patch, when there is not enough dirty data, let the

[PATCH 08/19] bcache: documentation for sysfs entries describing bcache cache hinting

2017-06-30 Thread bcache
From: Eric Wheeler Signed-off-by: Eric Wheeler --- Documentation/bcache.txt | 80 1 file changed, 80 insertions(+) diff --git a/Documentation/bcache.txt b/Documentation/bcache.txt index

[PATCH 17/19] bcache: fix for gc and write-back race

2017-06-30 Thread bcache
From: Tang Junhui gc and write-back race with each other (see the email "bcache get stucked" I sent before): gc thread write-back thread | |bch_writeback_thread() |bch_gc_thread()

[PATCH 18/19] bcache: silence static checker warning

2017-06-30 Thread bcache
From: Dan Carpenter In olden times, closure_return() used to have a hidden return built in. We removed the hidden return but forgot to add a new return here. If "c" were NULL we would oops on the next line, but fortunately "c" is never NULL. Let's just remove the if

[PATCH 10/19] bcache: initialize stripe_sectors_dirty correctly for thin flash device

2017-06-30 Thread bcache
From: Tang Junhui Thin flash devices do not initialize stripe_sectors_dirty correctly; this patch fixes the issue. Signed-off-by: Tang Junhui Cc: sta...@vger.kernel.org --- drivers/md/bcache/super.c | 3 ++- drivers/md/bcache/writeback.c |

[PATCH 04/19] bcache: fix wrong cache_misses statistics

2017-06-30 Thread bcache
From: Tang Junhui Some missed IOs are not counted in cache_misses; this patch fixes the issue. Signed-off-by: tang.junhui Reviewed-by: Eric Wheeler Cc: sta...@vger.kernel.org --- drivers/md/bcache/request.c | 6

[PATCH 09/19] bcache: update bio->bi_opf bypass/writeback REQ_ flag hints

2017-06-30 Thread bcache
From: Eric Wheeler Bypass if: bio->bi_opf & (REQ_RAHEAD|REQ_BACKGROUND) Writeback if: op_is_sync(bio->bi_opf) || bio->bi_opf & (REQ_META|REQ_PRIO) Signed-off-by: Eric Wheeler --- drivers/md/bcache/request.c | 3 +++

[PATCH 16/19] bcache: increase the number of open buckets

2017-06-30 Thread bcache
From: Tang Junhui Currently, we only allocate 6 open buckets for each cache set, but we usually attach about 10 or so backend devices to each cache set, and each bcache device is accessed by about 10 or so threads in the top application layer. So 6

[PATCH 13/19] bcache: delete redundant calling set_gc_sectors()

2017-06-30 Thread bcache
From: Tang Junhui set_gc_sectors() has been called in bch_gc_thread(), and it was called again in bch_btree_gc_finish(). The following call is unnecessary, so delete it. Signed-off-by: Tang Junhui --- drivers/md/bcache/btree.c | 1 - 1 file

[PATCH 14/19] bcache: Correct return value for sysfs attach errors

2017-06-30 Thread bcache
From: Tony Asleson If you encounter any errors in bch_cached_dev_attach it will return a negative error code. The variable 'v' which stores the result is unsigned, thus user space sees a very large value returned for bytes written which can cause incorrect user space

[PATCH 07/19] bcache: introduce bcache sysfs entries for ioprio-based bypass/writeback hints

2017-06-30 Thread bcache
From: Eric Wheeler Add sysfs entries to support hinting for bypass/writeback based on the ioprio assigned to the bio. If the bio is unassigned, use current's io-context ioprio for cache writeback or bypass (configured per-process with `ionice`). Having idle IOs bypass the

[PATCH 12/19] bcache: update bucket_in_use periodically

2017-06-30 Thread bcache
From: Tang Junhui bucket_in_use is updated in the gc thread, which is triggered by invalidating or writing sectors_to_gc dirty data. That interval is too long, so when we compare it with the threshold it is often stale, which leads to inaccurate judgment and often

[PATCH 03/19] bcache: do not subtract sectors_to_gc for bypassed IO

2017-06-30 Thread bcache
From: Tang Junhui Since bypassed IOs use no buckets, do not subtract their sectors from sectors_to_gc to trigger the gc thread. Signed-off-by: tang.junhui Reviewed-by: Eric Wheeler Cc: sta...@vger.kernel.org ---

[PATCH 06/19] bcache: explicitly destory mutex while exiting

2017-06-30 Thread bcache
From: Liang Chen mutex_destroy() does nothing most of the time, but it is better to call it to make the code future-proof, and it also has some meaning for things like mutex debugging. Signed-off-by: Liang Chen Reviewed-by: Eric Wheeler

[PATCH 02/19] bcache: fix sequential large write IO bypass

2017-06-30 Thread bcache
From: Tang Junhui Sequential write IOs were tested with bs=1M by FIO in writeback cache mode. These IOs were expected to be bypassed, but they were not. We debugged the code and found in check_should_bypass(): if (!congested && mode ==

Re: [dm-devel] [PATCH 1/1] block: Convert hd_struct in_flight from atomic to percpu

2017-06-30 Thread Brian King
On 06/30/2017 06:26 PM, Jens Axboe wrote: > On 06/30/2017 05:23 PM, Ming Lei wrote: >> Hi Bian, >> >> On Sat, Jul 1, 2017 at 2:33 AM, Brian King wrote: >>> On 06/30/2017 09:08 AM, Jens Axboe wrote: >>> Compared with the totally percpu approach, this way might help

Re: [dm-devel] [PATCH 1/1] block: Convert hd_struct in_flight from atomic to percpu

2017-06-30 Thread Ming Lei
Hi Bian, On Sat, Jul 1, 2017 at 2:33 AM, Brian King wrote: > On 06/30/2017 09:08 AM, Jens Axboe wrote: > Compared with the totally percpu approach, this way might help 1:M or > N:M mapping, but won't help 1:1 map(NVMe), when hctx is mapped to > each

[PATCH] nvme: remove pci device if no longer present

2017-06-30 Thread Wei Zhang
This patch removes the PCI device from the kernel's topology tree if the device is no longer present. Commit ddf097ec1d44c9648c4738d7cf2819411b44253a (NVMe: Unbind driver on failure) left the PCI device in the kernel's topology upon device failure. However, this does not work well for the slot

Re: [dm-devel] [PATCH 1/1] block: Convert hd_struct in_flight from atomic to percpu

2017-06-30 Thread Jens Axboe
On 06/30/2017 08:08 AM, Jens Axboe wrote: > On 06/30/2017 07:05 AM, Brian King wrote: >> On 06/29/2017 09:17 PM, Jens Axboe wrote: >>> On 06/29/2017 07:20 PM, Ming Lei wrote: On Fri, Jun 30, 2017 at 2:42 AM, Jens Axboe wrote: > On 06/29/2017 10:00 AM, Jens Axboe wrote:

Re: [dm-devel] [PATCH 1/1] block: Convert hd_struct in_flight from atomic to percpu

2017-06-30 Thread Jens Axboe
On 06/30/2017 10:17 PM, Jens Axboe wrote: > On 06/30/2017 08:08 AM, Jens Axboe wrote: >> On 06/30/2017 07:05 AM, Brian King wrote: >>> On 06/29/2017 09:17 PM, Jens Axboe wrote: On 06/29/2017 07:20 PM, Ming Lei wrote: > On Fri, Jun 30, 2017 at 2:42 AM, Jens Axboe wrote:

Re: [dm-devel] [PATCH 1/1] block: Convert hd_struct in_flight from atomic to percpu

2017-06-30 Thread Jens Axboe
On 06/30/2017 05:23 PM, Ming Lei wrote: > Hi Bian, > > On Sat, Jul 1, 2017 at 2:33 AM, Brian King wrote: >> On 06/30/2017 09:08 AM, Jens Axboe wrote: >> Compared with the totally percpu approach, this way might help 1:M or >> N:M mapping, but won't help 1:1

Re: [dm-devel] [PATCH 1/1] block: Convert hd_struct in_flight from atomic to percpu

2017-06-30 Thread Brian King
On 06/29/2017 09:17 PM, Jens Axboe wrote: > On 06/29/2017 07:20 PM, Ming Lei wrote: >> On Fri, Jun 30, 2017 at 2:42 AM, Jens Axboe wrote: >>> On 06/29/2017 10:00 AM, Jens Axboe wrote: On 06/29/2017 09:58 AM, Jens Axboe wrote: > On 06/29/2017 02:40 AM, Ming Lei wrote: