Re: [PATCH 00/10] block: cleanup on direct access to bvec table(prepare for multipage bvec)
Most of this looks sane, but I'd really like to see it in context of the actual multipage bvec patches. Do you have an updated branch on top of these?
Re: [PATCH] locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS optional
+cc da...@fromorbit.com +cc ty...@mit.edu +cc wi...@infradead.org +cc torva...@linux-foundation.org +cc amir7...@gmail.com On 12/12/2017 4:11 PM, Byungchul Park wrote: At the moment, it's rather premature to enable CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS by default, because we face a lot of false positives for now since all locks and waiters are not classified properly yet. Until most of them get annotated properly, it'd be better to be optional. Signed-off-by: Byungchul Park--- lib/Kconfig.debug | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 2689b7c..bc099f1 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1092,8 +1092,6 @@ config PROVE_LOCKING select DEBUG_MUTEXES select DEBUG_RT_MUTEXES if RT_MUTEXES select DEBUG_LOCK_ALLOC - select LOCKDEP_CROSSRELEASE - select LOCKDEP_COMPLETIONS select TRACE_IRQFLAGS default n help @@ -1164,7 +1162,9 @@ config LOCK_STAT (CONFIG_LOCKDEP defines "acquire" and "release" events.) config LOCKDEP_CROSSRELEASE - bool + bool "Lock debugging: enable cross-locking checks in lockdep" + depends on PROVE_LOCKING + default n help This makes lockdep work for crosslock which is a lock allowed to be released in a different context from the acquisition context. @@ -1174,7 +1174,10 @@ config LOCKDEP_CROSSRELEASE detector, lockdep. config LOCKDEP_COMPLETIONS - bool + bool "Lock debugging: allow completions to use deadlock detector" + depends on PROVE_LOCKING + select LOCKDEP_CROSSRELEASE + default n help A deadlock caused by wait_for_completion() and complete() can be detected by lockdep using crossrelease feature. -- Thanks, Byungchul
[PATCH] locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS optional
At the moment, it's rather premature to enable CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS by default, because we face a lot of false positives for now since all locks and waiters are not classified properly yet. Until most of them get annotated properly, it'd be better to be optional. Signed-off-by: Byungchul Park--- lib/Kconfig.debug | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 2689b7c..bc099f1 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1092,8 +1092,6 @@ config PROVE_LOCKING select DEBUG_MUTEXES select DEBUG_RT_MUTEXES if RT_MUTEXES select DEBUG_LOCK_ALLOC - select LOCKDEP_CROSSRELEASE - select LOCKDEP_COMPLETIONS select TRACE_IRQFLAGS default n help @@ -1164,7 +1162,9 @@ config LOCK_STAT (CONFIG_LOCKDEP defines "acquire" and "release" events.) config LOCKDEP_CROSSRELEASE - bool + bool "Lock debugging: enable cross-locking checks in lockdep" + depends on PROVE_LOCKING + default n help This makes lockdep work for crosslock which is a lock allowed to be released in a different context from the acquisition context. @@ -1174,7 +1174,10 @@ config LOCKDEP_CROSSRELEASE detector, lockdep. config LOCKDEP_COMPLETIONS - bool + bool "Lock debugging: allow completions to use deadlock detector" + depends on PROVE_LOCKING + select LOCKDEP_CROSSRELEASE + default n help A deadlock caused by wait_for_completion() and complete() can be detected by lockdep using crossrelease feature. -- 1.9.1
Re: [PATCH] locking/lockdep: Add CONFIG_LOCKDEP_AGGRESSIVE
On 12/12/2017 6:06 AM, Linus Torvalds wrote: On Sun, Dec 10, 2017 at 7:50 PM, Theodore Ts'owrote: CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS can result in a large number of false positives because lockdep doesn't understand how to deal with multiple stacked loop or MD devices. Guys, can we just remove this nasty crud already? It's not working. Give it up. It was complex, it was buggy, it was slow. Sorry to hear that, but it works well and fortunately it has not been buggy until now, of course, it has been wrongly accused twice though. Furthormore, it's not slow now since it was fixed. You seem to say that w/o foundation but emotionally. The *problem* is false positives, since locks and waiters in kernel are not classified properly, at the moment, which is just a fact that is not related to cross-release stuff at all. IOW, that would be useful once all locks and waiters are classified correctly. It might take time but the classifying is a must-do we have to keep doing. I admit to make it optional for now, but I don't see why you want to remove it entierly. Now it's causing people to disable lockdep entirely, or play these kinds of games in unrelated trees. It's time to give up on bad debugging, and definitely _not_ enable it by default for PROVE_LOCKING. Ted: in the meantime, don't use PROVE_LOCKING. Just use DEBUG_LOCK_ALLOC instead. Please drop this patch from your tree. Linus -- Thanks, Byungchul
Re: [PATCH v2 1/3] scsi: Fix a scsi_show_rq() NULL pointer dereference
On Mon, Dec 11, 2017 at 10:11:29PM -0500, Martin K. Petersen wrote: > > Hi Ming, > > > This patch allocates one array for T10_PI_TYPE2_PROTECTION command, > > size of each element is SD_EXT_CDB_SIZE, and the length is > > host->can_queue, then we can retrieve one command buffer runtime > > via rq->tag. > > > > So we can avoid to allocate the command buffer runtime, also the > > recent use-after-free report[1] in scsi_show_rq() can be fixed too. > > I'm still mulling over the pros and cons of this one for 4.16+... Hi Martin, This patch can't work in case of real multiple hw queues, but can be fixed without much work. Even we can convert the big allocation into page by page allocation if there is case of huge tag space. Anyway if you think this approach is good, please let me know, and I am happy to cook V2 for review. Thanks, Ming
Re: [PATCH v2 1/3] scsi: Fix a scsi_show_rq() NULL pointer dereference
Hi Ming, > This patch allocates one array for T10_PI_TYPE2_PROTECTION command, > size of each element is SD_EXT_CDB_SIZE, and the length is > host->can_queue, then we can retrieve one command buffer runtime > via rq->tag. > > So we can avoid to allocate the command buffer runtime, also the > recent use-after-free report[1] in scsi_show_rq() can be fixed too. I'm still mulling over the pros and cons of this one for 4.16+... -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH v2 1/3] scsi: Fix a scsi_show_rq() NULL pointer dereference
Bart, > Avoid that scsi_show_rq() triggers a NULL pointer dereference if > called after sd_uninit_command(). Swap the NULL pointer assignment > and the mempool_free() call in sd_uninit_command() to make it less > likely that scsi_show_rq() triggers a use-after-free. Note: even > with these changes scsi_show_rq() can trigger a use-after-free but > that's a lesser evil than e.g. suppressing debug information for > T10-PI commands completely. This patch fixes the following oops: Applied to 4.15/scsi-fixes. Thanks, Bart. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH] locking/lockdep: Add CONFIG_LOCKDEP_AGGRESSIVE
On Mon, Dec 11, 2017 at 01:06:55PM -0800, Linus Torvalds wrote: > On Sun, Dec 10, 2017 at 7:50 PM, Theodore Ts'owrote: > > CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS can result > > in a large number of false positives because lockdep doesn't > > understand how to deal with multiple stacked loop or MD devices. > > Guys, can we just remove this nasty crud already? > > It's not working. Give it up. It was complex, it was buggy, it was slow. > > Now it's causing people to disable lockdep entirely, or play these > kinds of games in unrelated trees. > > It's time to give up on bad debugging, and definitely _not_ enable it > by default for PROVE_LOCKING. To be fair to Byungchul, I think it *can* be valid for finding some classes of bugs. It's just a disaster for anything to do with storage. I crafted this patch as something something which I thought *could* be a path forward; it disables it by default, and gives a warning about how it could cause a lot of pain for storage developers, but if other kernel devs want to use it to potentially find problem in their networking or wifi drivers --- sure, why not? Just make it be something *optional*. If people really want to make this work for storage, what I think we would need is variants of spin_lock_init(), mutex_init(), etc., which take a struct super or a struct block device, with proper documentation so that people don't have to struggle with undocumented C preprocessor macros where every single time I need to mess with lockdep annotations, I have to try figure out exactly what is a class and subclass. So in fact, what I was really hoping for was that some variant of this patch would end up in the sched tree, and get pushed to you v4.15-rcX patch as a regression fix, and I'd drop it from the ext4 tree. - Ted
Xen PV DomU running Kernel 4.14.5-1.el7.elrepo.x86_64: xl -v vcpu-set triggers domU kernel WARNING, then domU becomes unresponsive
Hi, first of all, I'm not subscribed to the linux-block@ list. Running "xl -v vcpu-set " in Dom0 triggers the warning below, then a number of commands like top or ls stall. The only domU recovery solution is to terminate it immediately using "xl destroy". I couldn't replicate it on: - CentOS 6 running kernel-2.6.32-696.16.1.el6.x86_64, kernel-lt-4.4.105-1.el6.elrepo.x86_64 - CentOS 7 running 4.9.67-1.el7.centos.x86_64 But I can replicate it consistently on: - CentOS 6 running 4.14.5-1.el6.elrepo.x86_64 - CentOS 7 running 4.14.5-1.el7.elrepo.x86_64 dom0 versions tested with similar results in the domU: - 4.6.6-6.el7 on kernel 4.9.63-29.el7.x86_64 - 4.6.3-15.el6 on kernel 4.9.37-29.el6.x86_64 Noticed behaviour: - These commands stall: top ls -l /var/tmp ls -l /tmp - Stuck in D state on the CentOS 7 domU: root 5 0.0 0.0 0 0 ?D11:20 0:00 [kworker/u8:0] root 316 0.0 0.0 0 0 ?D11:20 0:00 [jbd2/xvda1-8] root 1145 0.0 0.2 116636 4776 ?Ds 11:20 0:00 -bash root 1289 0.0 0.1 25852 2420 ?Ds 11:35 0:00 /usr/bin/systemd-tmpfiles --clean root 1290 0.0 0.1 125248 2696 pts/1D+ 11:44 0:00 ls --color=auto -l /tmp/ root 1293 0.0 0.1 125248 2568 pts/2D+ 11:44 0:00 ls --color=auto -l /var/tmp root 1296 0.0 0.2 116636 4908 pts/3Ds+ 11:44 0:00 -bash root 1358 0.0 0.1 125248 2612 pts/4D+ 11:47 0:00 ls --color=auto -l /var/tmp At a first glance it appears the issue is the domU kernel. Stack traces follow: -CentOS 6 kernel-ml-4.14.5-1.el6.elrepo.x86_64 start here- [ cut here ] WARNING: CPU: 4 PID: 60 at block/blk-mq.c:1144 __blk_mq_run_hw_queue+0x9e/0xc0 Modules linked in: intel_cstate(-) ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack libcrc32c ip6table_filter ip6_tables dm_mod dax xen_netfront crc32_pclmul crct10dif_pclmul ghash_clmulni_intel crc32c_intel pcbc aesni_intel glue_helper crypto_simd cryptd aes_x86_64 coretemp hwmon x86_pkg_temp_thermal sb_edac intel_rapl_perf pcspkr ext4 jbd2 mbcache xen_blkfront CPU: 4 PID: 60 Comm: kworker/4:1H Not tainted 4.14.5-1.el6.elrepo.x86_64 #1 Workqueue: kblockd blk_mq_run_work_fn task: 8802711a2780 task.stack: c90041af4000 RIP: e030:__blk_mq_run_hw_queue+0x9e/0xc0 RSP: e02b:c90041af7c48 EFLAGS: 00010202 RAX: 0001 RBX: 88027117fa80 RCX: 0001 RDX: 88026b053ee0 RSI: 88027351bca0 RDI: 88026b072800 RBP: c90041af7c68 R08: c90041af7eb8 R09: 8802711a2810 R10: 7ff0 R11: 0001 R12: 88026b072800 R13: e8d04d00 R14: R15: e8d04d05 FS: 2b7b7c89b700() GS:88027350() knlGS: CS: e033 DS: ES: CR0: 80050033 CR2: ff600400 CR3: 00026d953000 CR4: 00042660 Call Trace: blk_mq_run_work_fn+0x31/0x40 process_one_work+0x174/0x440 ? xen_mc_flush+0xad/0x1b0 ? schedule+0x3a/0xa0 worker_thread+0x6b/0x410 ? default_wake_function+0x12/0x20 ? __wake_up_common+0x84/0x130 ? maybe_create_worker+0x120/0x120 ? schedule+0x3a/0xa0 ? _raw_spin_unlock_irqrestore+0x16/0x20 ? maybe_create_worker+0x120/0x120 kthread+0x111/0x150 ? __kthread_init_worker+0x40/0x40 ret_from_fork+0x25/0x30 Code: 89 df e8 06 2f d9 ff 4c 89 e7 41 89 c5 e8 0b 6e 00 00 44 89 ee 48 89 df e8 20 2f d9 ff 48 8b 5d e8 4c 8b 65 f0 4c 8b 6d f8 c9 c3 <0f> ff eb aa 4c 89 e7 e8 e6 6d 00 00 48 8b 5d e8 4c 8b 65 f0 4c ---[ end trace fe2aaf4e723042fd ]--- -CentOS 6 kernel-ml-4.14.5-1.el6.elrepo.x86_64 end here- -CentOS 7 kernel-ml-4.14.5-1.el7.elrepo.x86_64 start here- [ 116.528885] [ cut here ] [ 116.528894] WARNING: CPU: 3 PID: 38 at block/blk-mq.c:1144 __blk_mq_run_hw_queue+0x89/0xa0 [ 116.528898] Modules linked in: intel_cstate(-) ip_set_hash_ip ip_set nfnetlink x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper cryptd intel_rapl_perf pcspkr nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 xen_netfront xen_blkfront crc32c_intel [ 116.528919] CPU: 3 PID: 38 Comm: kworker/3:1H Not tainted 4.14.5-1.el7.elrepo.x86_64 #1 [ 116.529007] Code: 00 e8 7c c5 45 00 4c 89 e7 e8 14 4b d7 ff 48 89 df 41 89 c5 e8 19 66 00 00 44 89 ee 4c 89 e7 e8 2e 4b d7 ff 5b 41 5c 41 5d 5d c3 <0f> ff eb b4 48 89 df e8 fb 65 00 00 5b 41 5c 41 5d 5d c3 0f ff [ 116.529034] ---[ end trace a7814e3ec9a330c6 ]--- [ 147.424117] [ cut here ] [ 147.424150] WARNING: CPU: 2 PID: 24 at block/blk-mq.c:1144 __blk_mq_run_hw_queue+0x89/0xa0 [ 147.424160] Modules linked in: ip_set_hash_ip ip_set nfnetlink x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul
Re: [PATCH V8 0/7] blk-mq support for ZBC disks
Jens, On Fri, 2017-11-24 at 16:54 -0700, Jens Axboe wrote: > On 11/24/2017 04:48 PM, Damien Le Moal wrote: > > [Full quote deleted] > > > > Hi Jens, > > > > Any comment regarding this series ? > > I understand that this would be for the 4.16 merge window, so no hurry, > > but I would like to know if I need to go back to the drawing board for > > ZBC blk-mq/scsi-mq support or if this is an acceptable solution. > > I'll give it a thorough look-over on Monday. Would you have any comment on the series ? I understand you are busy so please feel free to let me know if there is anything I can do to facilitate your review. Thank you. -- Damien Le Moal Western Digital
Re: [PATCH] locking/lockdep: Add CONFIG_LOCKDEP_AGGRESSIVE
On Sun, Dec 10, 2017 at 7:50 PM, Theodore Ts'owrote: > CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS can result > in a large number of false positives because lockdep doesn't > understand how to deal with multiple stacked loop or MD devices. Guys, can we just remove this nasty crud already? It's not working. Give it up. It was complex, it was buggy, it was slow. Now it's causing people to disable lockdep entirely, or play these kinds of games in unrelated trees. It's time to give up on bad debugging, and definitely _not_ enable it by default for PROVE_LOCKING. Ted: in the meantime, don't use PROVE_LOCKING. Just use DEBUG_LOCK_ALLOC instead. Please drop this patch from your tree. Linus
[PATCH] block/sed-opal: Fix warnings never less than zero
Fixed following smatch warnings in sed-opal.c block/sed-opal.c:2311 opal_set_new_pw() warn: unsigned 'opal_pw->session.who' is never less than zero. Unless enum opal_user interface will remain stable warning could be fixed in mentioned way by removing unnecessary check Signed-off-by: Vasyl Gomonovych--- block/sed-opal.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/block/sed-opal.c b/block/sed-opal.c index 9ed51d0c..a5f0fa2 100644 --- a/block/sed-opal.c +++ b/block/sed-opal.c @@ -,8 +,7 @@ static int opal_lock_unlock(struct opal_dev *dev, { int ret; - if (lk_unlk->session.who < OPAL_ADMIN1 || - lk_unlk->session.who > OPAL_USER9) + if (lk_unlk->session.who > OPAL_USER9) return -EINVAL; mutex_lock(>dev_lock); @@ -2308,9 +2307,7 @@ static int opal_set_new_pw(struct opal_dev *dev, struct opal_new_pw *opal_pw) }; int ret; - if (opal_pw->session.who < OPAL_ADMIN1 || - opal_pw->session.who > OPAL_USER9 || - opal_pw->new_user_pw.who < OPAL_ADMIN1 || + if (opal_pw->session.who > OPAL_USER9 || opal_pw->new_user_pw.who > OPAL_USER9) return -EINVAL; -- 1.9.1
[RFC] blk-throttle: export io_serviced_recursive, io_service_bytes_recursive
export these two interface for cgroup-v1. Signed-off-by: weiping zhang--- block/blk-throttle.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 96ad326..1d7637f 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -1512,10 +1512,20 @@ static struct cftype throtl_legacy_files[] = { .seq_show = blkg_print_stat_bytes, }, { + .name = "throttle.io_service_bytes_recursive", + .private = (unsigned long)_policy_throtl, + .seq_show = blkg_print_stat_bytes_recursive, + }, + { .name = "throttle.io_serviced", .private = (unsigned long)_policy_throtl, .seq_show = blkg_print_stat_ios, }, + { + .name = "throttle.io_serviced_recursive", + .private = (unsigned long)_policy_throtl, + .seq_show = blkg_print_stat_ios_recursive, + }, { } /* terminate */ }; -- 2.9.4
Re: [PATCH V15 00/22] mmc: Add Command Queue support
On 29 November 2017 at 16:47, Ulf Hanssonwrote: > Hi Adrian, > > On 29 November 2017 at 14:40, Adrian Hunter wrote: >> Hi >> >> Here is V15 of the hardware command queue patches without the software >> command queue patches, now using blk-mq and now with blk-mq support for >> non-CQE I/O. > > I have applied patches 1->19 for next. Deferring patch 21->23 for a while. We haven't got any reports about big regressions, I think this looks solid! So, I have decided to apply 21->23 for next as well. [...] Thanks and kind regards Uffe