Since bypassed IOs use no bucket, do not subtract sectors_to_gc to
trigger the gc thread.
Signed-off-by: tang.junhui
Reviewed-by: Eric Wheeler
Cc: sta...@vger.kernel.org
---
drivers/md/bcache/request.c | 6 +++---
1 file changed, 3
ver, then goto step 2).
2) Loop in bch_writeback_thread() to check if there is enough
dirty data for writeback. If there is not enough dirty data for
writing, sleep 10 seconds; otherwise, write dirty data to the
back-end device.
Signed-off-by: Tang Junhui <tang.jun...@zte.com.cn>
---
drivers/md/bcac
bcache called ida_simple_remove() with a minor that had already been
multiplied by BCACHE_MINORS, causing wrong minor release and leakage.
In addition, when adding partition support to bcache, the name assignment
was not updated, resulting in numbers jumping (bcache0, bcache16,
bcache32...). This has
-by: Tang Junhui <tang.jun...@zte.com.cn>
---
drivers/md/bcache/btree.c | 29 +++--
1 file changed, 27 insertions(+), 2 deletions(-)
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 866dcf7..77aa20b 100644
--- a/drivers/md/bcache/btree.c
+++ b/driv
Sequential write IOs were tested with bs=1M by FIO in writeback cache
mode; these IOs were expected to be bypassed, but actually they were not.
We debugged the code and found in check_should_bypass():
if (!congested &&
mode == CACHE_MODE_WRITEBACK &&
op_is_write(bio_op(bio)) &&
The thin flash device does not initialize stripe_sectors_dirty correctly;
this patch fixes the issue.
Signed-off-by: Tang Junhui <tang.jun...@zte.com.cn>
Cc: sta...@vger.kernel.org
---
drivers/md/bcache/super.c | 3 ++-
drivers/md/bcache/writeback.c | 8
drivers/md/bcache/write
ead the same backend device, so it is good
for write-back and also promotes the usage efficiency of buckets.
Signed-off-by: Tang Junhui <tang.jun...@zte.com.cn>
---
drivers/md/bcache/alloc.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/md/bcache/alloc.c b/drivers/
rite locker
This patch allocates a separate workqueue for the write-back thread to
avoid such a race.
Signed-off-by: Tang Junhui <tang.jun...@zte.com.cn>
Cc: sta...@vger.kernel.org
---
drivers/md/bcache/bcache.h| 1 +
drivers/md/bcache/super.c | 2 ++
drivers/md/bcache/writeback.c | 8 ++--
3
Since dirty sectors of thin flash cannot be used to cache data for the
backend device, we should subtract them when calculating the writeback rate.
Signed-off-by: Tang Junhui <tang.jun...@zte.com.cn>
Cc: sta...@vger.kernel.org
---
drivers/md/bcache/writeback.c | 2 +-
drivers/md/bcache/writeback.
set_gc_sectors() has already been called in bch_gc_thread(), and it was
called again in bch_btree_gc_finish(). The latter call is unnecessary, so
delete it.
Signed-off-by: Tang Junhui <tang.jun...@zte.com.cn>
---
drivers/md/bcache/btree.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/driv
Some missed IOs are not counted into cache_misses; this patch fixes this
issue.
Signed-off-by: tang.junhui
Reviewed-by: Eric Wheeler
Cc: sta...@vger.kernel.org
---
drivers/md/bcache/request.c | 6 +-
1 file changed, 5 insertions(+), 1
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Mike:
> + if (KEY_INODE(>key) != KEY_INODE(>key))
> + return false;
Please remove these redundant codes; all the keys in dc->writeback_keys
have the same KEY_INODE. It is guaranteed by refill_dirty().
Regards,
Tang
efcount bch_bucket_alloc_set()
> took:
> */
> if (KEY_PTRS())
Yes, it's useful for code reading. Thanks.
Reviewed-by: Tang Junhui <tang.jun...@zte.com.cn>
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Mike:
> How did you find this? Did the race trigger at detach or was it through
> code inspection?
I found this through code inspection.
> I need to analyze this more. It looks correct on its own, but there are
> a lot o
From: Tang Junhui <tang.jun...@zte.com.cn>
I try to execute the following command to trigger gc thread:
[root@localhost internal]# echo 1 > trigger_gc
But it does not work. I debugged the code in gc_should_run(); it works only
if invalidating or sectors_to_gc < 0. So set sectors
From: Tang Junhui <tang.jun...@zte.com.cn>
cached_dev_put() is called before setting and writing the bdev to
BDEV_STATE_CLEAN state, but after calling cached_dev_put(),
the detach work queue runs, and the bdev is also set to
BDEV_STATE_NONE state in cached_dev_detach_finish();
it may cause a race con
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Mike:
> One race I think I see: we unset the dirty bit before setting ourselves
> interruptible. Can bch_writeback_add/queue wake writeback before then
> (and then writeback sets itself interruptible and never wakes up)?
>
From: Tang Junhui <tang.jun...@zte.com.cn>
It looks good to me,
I have noted this issue before.
Thanks.
---
Tang Junhui
> If an IO operation fails, and we didn't successfully read data from the
> cache, don't writeback invalid/partial data to the backing disk.
>
> Sign
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Lyle:
Two questions:
1) In keys_contiguous(), you judge I/O contiguity on the cache device, but
not on the backing device. I think you should judge it by the backing device
(remove PTR_CACHE() and use KEY_OFFSET() instead of PTR_OFFSET()?).
2) I did n
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Mike:
For the second question, I think this modification is somewhat complex;
can't we do something simpler to resolve it? I remember there were some
patches trying to avoid too small a writeback rate. Coly, is there any
progre
From: Tang Junhui <tang.jun...@zte.com.cn>
bucket_in_use is updated in the gc thread, which is triggered by invalidating
or writing sectors_to_gc dirty data; that is a long interval. Therefore, when
we use it to compare with the threshold, it is often not timely, which leads
to inaccurate judgment and
From: Tang Junhui <tang.jun...@zte.com.cn>
Hi, everyone:
I created a bcache device on a 3.10 kernel with the 3.10 kernel's bcache, and
we had used it for a while; then we backported upstream code to the 3.10
kernel's bcache and built a new kernel module to use on the 3.10 kernel.
Is there any compatible
From: Tang Junhui <tang.jun...@zte.com.cn>
Hi, Mike
Thanks for your reminder. I'll checkpatch carefully next time.
Thanks,
Tang
From: Tang Junhui <tang.jun...@zte.com.cn>
Currently, when a cached device is detaching from the cache, the writeback
thread is not stopped and the writeback_rate_update work is not canceled.
For example, after the below command:
echo 1 >/sys/block/sdb/bcache/detach
you can still see the writeba
From: Tang Junhui <tang.jun...@zte.com.cn>
Hi, everyone:
bcache stucked when reboot system after high load.
root 1704 3.7 0.0 4164 360 ? D 14:07 0:09
/usr/lib/udev/bcache-register /dev/sdc
[] closure_sync+0x25/0x90 [bcache]
[] bch_btree_set_root+0x1f1/0x250 [
From: Tang Junhui <tang.jun...@zte.com.cn>
Hi, RuiHua:
> I have met the similar problem once.
> It looks like a deadlock between the cache device register thread and
> bcache_allocator thread.
>
> The trace info tell us the journal is full, probablely the all
From: Tang Junhui <tang.jun...@zte.com.cn>
On Tue, Nov 21, 2017 at 06:50:32PM +0800, tang.jun...@zte.com.cn wrote:
> > From: Tang Junhui <tang.jun...@zte.com.cn>
> >
> > Currently in pick_data_bucket(), though we keep multiple buckets open
> > for writes,
From: Tang Junhui <tang.jun...@zte.com.cn>
Currently in pick_data_bucket(), though we keep multiple buckets open
for writes, and try to segregate different write streams for better
cache utilization: first we look for a bucket where the last write to
it was sequential with the current
From: Tang Junhui <tang.jun...@zte.com.cn>
In such a scenario where there are some flash-only volumes and some cached
devices, when many tasks request these devices in writeback mode, the write
IOs may fall into the same bucket as below:
| cached data | flash data | cached data | cached data|
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Coly, Kent
> Correct me if I am wrong. I guess the reason why you care about flash
> only volume is because ceph users use flash only volume to store some
> metadata only on SSD ?
Yes, we store ceph metadata in flash only volume
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Mike
> Thanks, this looks much better. Can you please fix the whitespace
> issues so it gets through checkpatch cleanly?
OK, I'll resend a patch later.
Thanks,
Tang
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Coly, Mike
> > If the change can be inside bch_register_lock, it would (just) be more
> > comfortable. The code is correct, because attach/detach sysfs is created
> > after writeback_thread created and writeback_rate_updat
From: Tang Junhui <tang.jun...@zte.com.cn>
In such a scenario where there are some flash-only volumes and some cached
devices, when many tasks request these devices in writeback mode, the write
IOs may fall into the same bucket as below:
| cached data | flash data | cached data | cached data|
From: Tang Junhui <tang.jun...@zte.com.cn>
There are 3 steps to read out all journal buckets.
1) Try to get a valid journal bucket by golden ratio hash or
falling back to linear search. For example, in NNNYYYNN, each
character represents a bucket: Y represents a valid journal
bucket,
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Mike, Coly
I found this issue by code reading.
It looks serious.
Maybe I am wrong.
Please have a review.
Thanks,
Tang
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Mike
> I think the scenario you listed can't happen, because the first bucket
> we try in the hash-search is 0. If the circular buffer has wrapped,
> that will be detected immediately and we'll leave the loop with l=0.
> We sh
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Mike
> > that will be detected immediately and we'll leave the loop with l=0.
> > We should add a comment that we need to try the first index first for
> > correctness so that we don't inadvertently change this beha
From: Tang Junhui <tang.jun...@zte.com.cn>
The journal bucket is a circular buffer; the buckets can be like YYYNNNYY,
which means the first valid journal is in the 7th bucket and the latest
valid journal is in the third bucket. In this case, if we do not try the
zero index first, we
may get a valid j
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Mike & Coly
Could you please have a review of this patch?
> From: Tang Junhui <tang.jun...@zte.com.cn>
>
> In such scenario that there are some flash only volumes
> , and some cached devices, when many tasks request
From: Tang Junhui <tang.jun...@zte.com.cn>
bucket_in_use is updated in the gc thread, which is triggered by invalidating
or writing sectors_to_gc dirty data; that is a long interval. Therefore, when
we use it to compare with the threshold, it is often not timely, which leads
to inaccurate judgment and
From: Tang Junhui <tang.jun...@zte.com.cn>
Thanks to Mike and Coly for the comments.
>> + if(ca->set->avail_nbuckets > 0) {
>> + ca->set->avail_nbuckets--;
>> + bch_update_bucket_in_use(ca-&
From: Tang Junhui <tang.jun...@zte.com.cn>
Currently, when a cached device is detaching from the cache, the writeback
thread is not stopped and the writeback_rate_update work is not canceled.
For example, after the below command:
echo 1 >/sys/block/sdb/bcache/detach
you can still see the writeba
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Eric
> > I'm waiting to queue this patch pending your response to Coly. Can you
> > update the message send a v2?
>
> Hi Tang,
>
> Can you to an update message and send this in so we can get the cache miss
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Eric
> > I discussed with Tang offline, this patch is correct. But the patch
> > commit log should be improved. Now I help to work on it, should be done
> > quite soon.
>
> Has an updated commit log been made? I'
From: "tang.junhui"
Currently, cache-missed IOs are identified by s->cache_miss, but actually
there are many situations where missed IOs are not assigned a value for
s->cache_miss in cached_dev_cache_miss(), for example, a bypassed IO
(s->iop.bypass = 1), or the cache_bio
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Coly,
>When cache set is stopping, calculating writeback rate is wast of time.
>This is the purpose of the first checking, to avoid unnecessary delay
>from bcache_flash_devs_sectors_dirty() inside __update_writeback_rate().
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Mike,
I thought twice, and I feel this patch is a little complex and still not very
accurate for small backend devices. I think we can resolve it like this:
uint64_t cache_dirty_target =
div_u64(cache_sectors * dc->writebac
From: Tang Junhui <tang.jun...@zte.com.cn>
In bch_debug_init(), ret is always 0 and the return value is useless;
change it to return 0 on success after calling debugfs_create_dir(),
else return a non-zero value.
Signed-off-by: Tang Junhui <tang.jun...@zte.com.cn>
---
drive
From: Tang Junhui <tang.jun...@zte.com.cn>
>On Fri, Jan 5, 2018 at 11:29 PM, <tang.jun...@zte.com.cn> wrote:
>> From: Tang Junhui <tang.jun...@zte.com.cn>
>>
>> Hello Mike,
>>
>> I thought twice, and feel this patch is a little complex and still
From: Tang Junhui <tang.jun...@zte.com.cn>
LGTM.
Reviewed-by: Tang Junhui <tang.jun...@zte.com.cn>
>Member devices of struct cache_set is used to reference all attached
>
>bcache devices to this cache set. If it is treated
From: Tang Junhui <tang.jun...@zte.com.cn>
LGTM.
Reviewed-by: Tang Junhui <tang.jun...@zte.com.cn>
>Kernel thread routine bch_allocator_thread() references macro
>allocator_wait() to wait for a condition or quit to do_exit()
>when kthread_should_stop() is true.
>
>M
From: Tang Junhui <tang.jun...@zte.com.cn>
LGTM.
Reviewed-by: Tang Junhui <tang.jun...@zte.com.cn>
>Kernel thread routine bch_writeback_thread() has the following code block,
>
>452 set_current_state(TASK_INTE
From: Tang Junhui <tang.jun...@zte.com.cn>
When we run IO on a detached device and run iostat to show the IO status,
normally it will show like below (some fields omitted):
Device: ... avgrq-sz avgqu-sz await r_await w_await svctm %util
sdd ... 15.89 0.53 1.82 0.20
From: Tang Junhui <tang.jun...@zte.com.cn>
Sometimes the journal takes up a lot of CPU; we need statistics
to know what the journal is doing. So this patch provides
some journal statistics:
1) reclaim: how many times the journal tries to reclaim resources,
usually the journal bucket or/and t
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Coly:
Then in bch_count_io_errors(), why do we still keep this code:
> 92 unsigned errors = atomic_add_return(1 << IO_ERROR_SHIFT,
> 93 >io_errors);
&g
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Coly:
>It is because of ca->set->error_decay. When error_decay is set, bcache
>tries to do an exponential decay for error count. That is, error numbers
>is decaying against the quantity of io count, this is to avoid long tim
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Kent && Nix
>
>neither of those locks are needed - rcu_read_lock() isn't needed because we
>never
>free struct btree (except at shutdown), and we're not derefing journal there
__bch_btree_node_write(
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Kent
>The only purpose of rcu_read_lock() would be to ensure the object
>isn't freed out from under you. That's not an issue here.
>
I do not think so. In for_each_cached_btree(), we traverse all
btrees by hlist_for
From: Tang Junhui <tang.jun...@zte.com.cn>
After a long time of running random small IO writes,
I rebooted the machine, and after the machine powered on,
I found bcache got stuck; the stack is:
[root@ceph153 ~]# cat /proc/2510/task/*/stack
[] closure_sync+0x25/0x90 [bcache]
[] bch_journal+0x118
From: Tang Junhui <tang.jun...@zte.com.cn>
After a long run of random small IO writes,
I rebooted the machine, and after the machine powered on,
bcache got stuck; the stack is:
[root@ceph153 ~]# cat /proc/2510/task/*/stack
[] closure_sync+0x25/0x90 [bcache]
[] bch_journal+0x118/0x2b0 [
From: Tang Junhui <tang.jun...@zte.com.cn>
There is a machine with a very small max_sectors_kb size:
[root@ceph151 queue]# pwd
/sys/block/sdd/queue
[root@ceph151 queue]# cat max_hw_sectors_kb
256
[root@ceph151 queue]# cat max_sectors_kb
256
The performance is very low when I run big I/Os.
From: Tang Junhui <tang.jun...@zte.com.cn>
In bch_mca_scan(), the return value should not be the number of freed btree
nodes, but the number of pages of freed btree nodes.
Signed-off-by: Tang Junhui <tang.jun...@zte.com.cn>
---
drivers/md/bcache/btree.c | 2 +-
1 file changed, 1 ins
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Coly:
OK, I got your point now.
Thanks for your patience.
And there is a small issue I hope can be modified:
+#define BCACHE_DEV_WB_RUNNING 4
+#define BCACHE_DEV_RATE_DW_RUNNING 8
Would be OK just as:
+#define BCACHE_DEV_WB_R
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Coly:
This patch is somewhat difficult for me;
I think we can resolve it in a simpler way.
We can take schedule_delayed_work() under the protection of
dc->writeback_lock, and judge whether we need to re-arm this work on the queue.
st
From: Tang Junhui <tang.jun...@zte.com.cn>
In the GC thread, we record the latest GC key in gc_done, which is expected
to be used for incremental GC, but in the current code we didn't implement
it. When GC runs, front-side IO would be blocked until the GC is over, which
could be a long time if there is
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Coly:
There are some differences;
using a variable of atomic_t type cannot guarantee the atomicity of the
transaction. For example:
a thread runs in update_writeback_rate()
update_writeback_rate(){
+ if (te
From: Tang Junhui <tang.jun...@zte.com.cn>
I attached a back-end device to a cache set while the cache set was not
registered yet; the back-end device did not attach successfully, and no
error was returned:
[root]# echo 87859280-fec6-4bcc-20df7ca8f86b > /sys/block/sde/bcache/att
From: Tang Junhui <tang.jun...@zte.com.cn>
Back-end device sdm has already been attached to a cache set with ID
f67ebe1f-f8bc-4d73-bfe5-9dc88607f119; then we try to attach it to
another cache set, and it returns an error:
[root]# cd /sys/block/sdm/bcache
[root]# echo 5ccd0a63-148e-48b8-afa2-aca9cb
From: Tang Junhui <tang.jun...@zte.com.cn>
After running small-write I/O for a long time, we found that CPU occupancy
was very high and I/O performance had been reduced by about half:
[root@ceph151 internal]# top
top - 15:51:05 up 1 day, 2:43, 4 users, load average: 16.89, 15.15, 16.53
Tasks
From: Tang Junhui <tang.jun...@zte.com.cn>
> Unfortunately, this doesn't build because of nonexistent call heap_empty
> (I assume some changes to util.h got left out). I really need clean
> patches that build and are formatted properly.
>
> Mike
Oh, I am so sorry for that
From: Tang Junhui <tang.jun...@zte.com.cn>
This patch is based on "[PATCH] bcache: finish incremental GC".
Since incremental GC stops for 100 ms when front-side I/O comes, when there
are many btree nodes, if GC only processes a constant number (100) of nodes
each time, GC would
From: Tang Junhui <tang.jun...@zte.com.cn>
Stripe size is shown as zero when there is no stripe in the back-end device:
[root@ceph132 ~]# cat /sys/block/sdd/bcache/stripe_size
0.0k
Actually it should be 1T Bytes (1 << 31 sectors), but in sysfs
interface, stripe_size was changed from sectors to byt
From: Tang Junhui <tang.jun...@zte.com.cn>
LGTM; I would like it even more if MAX_WRITEBACKS_IN_PASS (5) defined a
slightly bigger value.
Reviewed-by: Tang Junhui <tang.jun...@zte.com.cn>
>Previously, there was some logic that attempted to immediately issue
>writeback of backing
From: Tang Junhui <tang.jun...@zte.com.cn>
This patch is useful for preventing the overflow of the expression
(cache_dirty_target * bdev_sectors(dc->bdev)), but it also
leads to a calculation error; for example, when there are a 1G and
100*164G cached device, it would cause the "
From: Tang Junhui <tang.jun...@zte.com.cn>
I noticed you added "closure_sync()" before assigning delay to zero in your
patch.
I think we should add it before:
delay = writeback_delay(dc, size)
otherwise we would always get a wrong value of delay after calling
writeback_delay(),
From: Tang Junhui <tang.jun...@zte.com.cn>
>I don't think so. The thing that is controlled (in current code, and
>this patch set) is the rate of issuance, not of completion (though
>issuance rate is guaranteed not to exceed completion rate, because of
>the semaphore for the m
From: Tang Junhui <tang.jun...@zte.com.cn>
>Thank you for the feedback.
>
>On Mon, Jan 1, 2018 at 10:33 PM, <tang.jun...@zte.com.cn> wrote:
>> From: Tang Junhui <tang.jun...@zte.com.cn>
>>
>> This patch is useful for preventing the over
From: Tang Junhui <tang.jun...@zte.com.cn>
>On 01/02/2018 12:53 AM, tang.jun...@zte.com.cn wrote:
>> If no front-end I/O coming, would this cause write-back IOs one by one
>> (one write-back IO issued must after the completion of the previous IO)?
>> though with ze
From: Tang Junhui <tang.jun...@zte.com.cn>
>The function cached_dev_make_request() and flash_dev_make_request() call
>generic_start_io_acct() with (struct bcache_device)->disk when they start a
>closure. Then the function bio_complete() calls generic_end_io_acct() with
>(str
From: Tang Junhui <tang.jun...@zte.com.cn>
LGTM, and I tested it; it improves the write-back performance.
[Sorry for the wrong content in the previous email]
Reviewed-by: Tang Junhui <tang.jun...@zte.com.cn>
Tested-by: Tang Junhui <tang.jun...@zte.com.cn>
> Writeback keys a
From: Tang Junhui <tang.jun...@zte.com.cn>
>>
>> More importantly,
>>> +while (!kthread_should_stop() && next) {
>>> ...
>>> +if (nk != 0 && !keys_contiguous(dc, keys[nk-1], next))
>>> +break;
&
From: Tang Junhui <tang.jun...@zte.com.cn>
I remember I have reviewed this patch before; still, there is a bug
in keys_contiguous(): since KEY_OFFSET(key) stores the end address of the
request IO, I think we should judge the contiguity of keys as below:
if (bkey_cmp(>key,
From: Tang Junhui <tang.jun...@zte.com.cn>
More importantly,
> +while (!kthread_should_stop() && next) {
> ...
> +if (nk != 0 && !keys_contiguous(dc, keys[nk-1], next))
> +break;
> +
> +size += KEY_SIZE
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Coly,
Thanks for this series.
>struct delayed_work writeback_rate_update in struct cache_dev is a delayed
>worker to call function update_writeback_rate() in period (the interval is
>defined by dc->writeback_rate_update
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Coly,
>dc->writeback_rate_update is a special delayed worker, it re-arms itself
>to run after several seconds by,
>>> schedule_delayed_work(>writeback_rate_update,
>>> dc->writebac
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Coly,
Thanks for your works!
Actually, stopping the write-back thread and the writeback_rate_update work in
bcache_device_detach() has already been done in:
https://github.com/mlyle/linux/commit/397d02e162b8ee11940a4e9f45e16fee0650d64e
Is it nessary
From: Tang Junhui <tang.jun...@zte.com.cn>
Hello Coly,
This patch is great!
One tips,
Could you replace c->io_disable with the already existing c->flags?
So we can just need to add a new macro such as CACHE_SET_IO_DISABLE.
>When too many I/Os failed on cache device, bch_
From: Tang Junhui <tang.jun...@zte.com.cn>
Hi Coly,
>It is about an implicit and interesting ordering, a simple patch with a
>lot detail behind. Let me explain why it's safe,
>
>- cancel_delayed_work_sync() is called in bcache_device_detach() when
>dc->count is 0. But
From: Tang Junhui <tang.jun...@zte.com.cn>
In btree_flush_write(), two places need to take a lock to
avoid races:
First, we need to take the RCU read lock to protect the bucket_hash
traversal, since hlist_for_each_entry_rcu() must be called under
the protection of the RCU read lock.
Second