Re: [PATCH 00/10] block: cleanup on direct access to bvec table(prepare for multipage bvec)

2017-12-11 Thread Christoph Hellwig
Most of this looks sane, but I'd really like to see it in context
of the actual multipage bvec patches.  Do you have an updated branch
on top of these?


Re: [PATCH] locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS optional

2017-12-11 Thread Byungchul Park

+cc da...@fromorbit.com
+cc ty...@mit.edu
+cc wi...@infradead.org
+cc torva...@linux-foundation.org
+cc amir7...@gmail.com

On 12/12/2017 4:11 PM, Byungchul Park wrote:

At the moment, it's rather premature to enable
CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS by default,
because we face a lot of false positives for now since all locks and
waiters are not classified properly yet.

Until most of them get annotated properly, it'd be better to be optional.

Signed-off-by: Byungchul Park 
---
  lib/Kconfig.debug | 11 +++
  1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 2689b7c..bc099f1 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1092,8 +1092,6 @@ config PROVE_LOCKING
select DEBUG_MUTEXES
select DEBUG_RT_MUTEXES if RT_MUTEXES
select DEBUG_LOCK_ALLOC
-   select LOCKDEP_CROSSRELEASE
-   select LOCKDEP_COMPLETIONS
select TRACE_IRQFLAGS
default n
help
@@ -1164,7 +1162,9 @@ config LOCK_STAT
 (CONFIG_LOCKDEP defines "acquire" and "release" events.)
  
  config LOCKDEP_CROSSRELEASE

-   bool
+   bool "Lock debugging: enable cross-locking checks in lockdep"
+   depends on PROVE_LOCKING
+   default n
help
 This makes lockdep work for crosslock which is a lock allowed to
 be released in a different context from the acquisition context.
@@ -1174,7 +1174,10 @@ config LOCKDEP_CROSSRELEASE
 detector, lockdep.
  
  config LOCKDEP_COMPLETIONS

-   bool
+   bool "Lock debugging: allow completions to use deadlock detector"
+   depends on PROVE_LOCKING
+   select LOCKDEP_CROSSRELEASE
+   default n
help
 A deadlock caused by wait_for_completion() and complete() can be
 detected by lockdep using crossrelease feature.



--
Thanks,
Byungchul


[PATCH] locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS optional

2017-12-11 Thread Byungchul Park
At the moment, it's rather premature to enable
CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS by default,
because we face a lot of false positives for now since all locks and
waiters are not classified properly yet.

Until most of them get annotated properly, it'd be better to be optional.

Signed-off-by: Byungchul Park 
---
 lib/Kconfig.debug | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 2689b7c..bc099f1 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1092,8 +1092,6 @@ config PROVE_LOCKING
select DEBUG_MUTEXES
select DEBUG_RT_MUTEXES if RT_MUTEXES
select DEBUG_LOCK_ALLOC
-   select LOCKDEP_CROSSRELEASE
-   select LOCKDEP_COMPLETIONS
select TRACE_IRQFLAGS
default n
help
@@ -1164,7 +1162,9 @@ config LOCK_STAT
 (CONFIG_LOCKDEP defines "acquire" and "release" events.)
 
 config LOCKDEP_CROSSRELEASE
-   bool
+   bool "Lock debugging: enable cross-locking checks in lockdep"
+   depends on PROVE_LOCKING
+   default n
help
 This makes lockdep work for crosslock which is a lock allowed to
 be released in a different context from the acquisition context.
@@ -1174,7 +1174,10 @@ config LOCKDEP_CROSSRELEASE
 detector, lockdep.
 
 config LOCKDEP_COMPLETIONS
-   bool
+   bool "Lock debugging: allow completions to use deadlock detector"
+   depends on PROVE_LOCKING
+   select LOCKDEP_CROSSRELEASE
+   default n
help
 A deadlock caused by wait_for_completion() and complete() can be
 detected by lockdep using crossrelease feature.
-- 
1.9.1



Re: [PATCH] locking/lockdep: Add CONFIG_LOCKDEP_AGGRESSIVE

2017-12-11 Thread Byungchul Park

On 12/12/2017 6:06 AM, Linus Torvalds wrote:

On Sun, Dec 10, 2017 at 7:50 PM, Theodore Ts'o  wrote:

CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS can result
in a large number of false positives because lockdep doesn't
understand how to deal with multiple stacked loop or MD devices.


Guys, can we just remove this nasty crud already?

It's not working. Give it up. It was complex, it was buggy, it was slow.


Sorry to hear that, but it works well and fortunately it has not
been buggy until now, of course, it has been wrongly accused
twice though. Furthormore, it's not slow now since it was fixed.
You seem to say that w/o foundation but emotionally.

The *problem* is false positives, since locks and waiters in
kernel are not classified properly, at the moment, which is just
a fact that is not related to cross-release stuff at all. IOW,
that would be useful once all locks and waiters are classified
correctly. It might take time but the classifying is a must-do
we have to keep doing.

I admit to make it optional for now, but I don't see why you
want to remove it entierly.

Now it's causing people to disable lockdep entirely, or play these
kinds of games in unrelated trees.

It's time to give up on bad debugging, and definitely _not_ enable it
by default for PROVE_LOCKING.

Ted: in the meantime, don't use PROVE_LOCKING. Just use
DEBUG_LOCK_ALLOC instead. Please drop this patch from your tree.

   Linus



--
Thanks,
Byungchul


Re: [PATCH v2 1/3] scsi: Fix a scsi_show_rq() NULL pointer dereference

2017-12-11 Thread Ming Lei
On Mon, Dec 11, 2017 at 10:11:29PM -0500, Martin K. Petersen wrote:
> 
> Hi Ming,
> 
> > This patch allocates one array for T10_PI_TYPE2_PROTECTION command,
> > size of each element is SD_EXT_CDB_SIZE, and the length is
> > host->can_queue, then we can retrieve one command buffer runtime
> > via rq->tag.
> >
> > So we can avoid to allocate the command buffer runtime, also the
> > recent use-after-free report[1] in scsi_show_rq() can be fixed too.
> 
> I'm still mulling over the pros and cons of this one for 4.16+...

Hi Martin,

This patch can't work in case of real multiple hw queues, but can be
fixed without much work.

Even we can convert the big allocation into page by page allocation
if there is case of huge tag space.

Anyway if you think this approach is good, please let me know, and I
am happy to cook V2 for review.


Thanks,
Ming


Re: [PATCH v2 1/3] scsi: Fix a scsi_show_rq() NULL pointer dereference

2017-12-11 Thread Martin K. Petersen

Hi Ming,

> This patch allocates one array for T10_PI_TYPE2_PROTECTION command,
> size of each element is SD_EXT_CDB_SIZE, and the length is
> host->can_queue, then we can retrieve one command buffer runtime
> via rq->tag.
>
> So we can avoid to allocate the command buffer runtime, also the
> recent use-after-free report[1] in scsi_show_rq() can be fixed too.

I'm still mulling over the pros and cons of this one for 4.16+...

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH v2 1/3] scsi: Fix a scsi_show_rq() NULL pointer dereference

2017-12-11 Thread Martin K. Petersen

Bart,

> Avoid that scsi_show_rq() triggers a NULL pointer dereference if
> called after sd_uninit_command(). Swap the NULL pointer assignment
> and the mempool_free() call in sd_uninit_command() to make it less
> likely that scsi_show_rq() triggers a use-after-free. Note: even
> with these changes scsi_show_rq() can trigger a use-after-free but
> that's a lesser evil than e.g. suppressing debug information for
> T10-PI commands completely. This patch fixes the following oops:

Applied to 4.15/scsi-fixes. Thanks, Bart.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] locking/lockdep: Add CONFIG_LOCKDEP_AGGRESSIVE

2017-12-11 Thread Theodore Ts'o
On Mon, Dec 11, 2017 at 01:06:55PM -0800, Linus Torvalds wrote:
> On Sun, Dec 10, 2017 at 7:50 PM, Theodore Ts'o  wrote:
> > CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS can result
> > in a large number of false positives because lockdep doesn't
> > understand how to deal with multiple stacked loop or MD devices.
> 
> Guys, can we just remove this nasty crud already?
> 
> It's not working. Give it up. It was complex, it was buggy, it was slow.
> 
> Now it's causing people to disable lockdep entirely, or play these
> kinds of games in unrelated trees.
> 
> It's time to give up on bad debugging, and definitely _not_ enable it
> by default for PROVE_LOCKING.

To be fair to Byungchul, I think it *can* be valid for finding some
classes of bugs.  It's just a disaster for anything to do with storage.

I crafted this patch as something something which I thought *could* be
a path forward; it disables it by default, and gives a warning about
how it could cause a lot of pain for storage developers, but if other
kernel devs want to use it to potentially find problem in their
networking or wifi drivers --- sure, why not?  Just make it be
something *optional*.

If people really want to make this work for storage, what I think we
would need is variants of spin_lock_init(), mutex_init(), etc., which
take a struct super or a struct block device, with proper
documentation so that people don't have to struggle with undocumented
C preprocessor macros where every single time I need to mess with
lockdep annotations, I have to try figure out exactly what is a class
and subclass.

So in fact, what I was really hoping for was that some variant of this
patch would end up in the sched tree, and get pushed to you v4.15-rcX
patch as a regression fix, and I'd drop it from the ext4 tree.

   - Ted


Xen PV DomU running Kernel 4.14.5-1.el7.elrepo.x86_64: xl -v vcpu-set triggers domU kernel WARNING, then domU becomes unresponsive

2017-12-11 Thread Adi Pircalabu
Hi, first of all, I'm not subscribed to the linux-block@ list. Running 
"xl -v vcpu-set  " in Dom0 triggers the warning below, then a 
number of commands like top or ls stall. The only domU recovery solution 
is to terminate it immediately using "xl destroy".


I couldn't replicate it on:
- CentOS 6 running kernel-2.6.32-696.16.1.el6.x86_64, 
kernel-lt-4.4.105-1.el6.elrepo.x86_64

- CentOS 7 running 4.9.67-1.el7.centos.x86_64

But I can replicate it consistently on:
- CentOS 6 running 4.14.5-1.el6.elrepo.x86_64
- CentOS 7 running 4.14.5-1.el7.elrepo.x86_64

dom0 versions tested with similar results in the domU:
- 4.6.6-6.el7 on kernel 4.9.63-29.el7.x86_64
- 4.6.3-15.el6 on kernel 4.9.37-29.el6.x86_64

Noticed behaviour:
- These commands stall:
top
ls -l /var/tmp
ls -l /tmp
- Stuck in D state on the CentOS 7 domU:
root 5  0.0  0.0  0 0 ?D11:20   0:00 
[kworker/u8:0]
root   316  0.0  0.0  0 0 ?D11:20   0:00 
[jbd2/xvda1-8]

root  1145  0.0  0.2 116636  4776 ?Ds   11:20   0:00 -bash
root  1289  0.0  0.1  25852  2420 ?Ds   11:35   0:00 
/usr/bin/systemd-tmpfiles --clean
root  1290  0.0  0.1 125248  2696 pts/1D+   11:44   0:00 ls 
--color=auto -l /tmp/
root  1293  0.0  0.1 125248  2568 pts/2D+   11:44   0:00 ls 
--color=auto -l /var/tmp

root  1296  0.0  0.2 116636  4908 pts/3Ds+  11:44   0:00 -bash
root  1358  0.0  0.1 125248  2612 pts/4D+   11:47   0:00 ls 
--color=auto -l /var/tmp


At a first glance it appears the issue is the domU kernel. Stack traces 
follow:


-CentOS 6 kernel-ml-4.14.5-1.el6.elrepo.x86_64 start here-
[ cut here ]
WARNING: CPU: 4 PID: 60 at block/blk-mq.c:1144 
__blk_mq_run_hw_queue+0x9e/0xc0
Modules linked in: intel_cstate(-) ipt_REJECT nf_reject_ipv4 
nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport iptable_filter ip_tables 
ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state 
nf_conntrack libcrc32c ip6table_filter ip6_tables dm_mod dax 
xen_netfront crc32_pclmul crct10dif_pclmul ghash_clmulni_intel 
crc32c_intel pcbc aesni_intel glue_helper crypto_simd cryptd aes_x86_64 
coretemp hwmon x86_pkg_temp_thermal sb_edac intel_rapl_perf pcspkr ext4 
jbd2 mbcache xen_blkfront
CPU: 4 PID: 60 Comm: kworker/4:1H Not tainted 4.14.5-1.el6.elrepo.x86_64 
#1

Workqueue: kblockd blk_mq_run_work_fn
task: 8802711a2780 task.stack: c90041af4000
RIP: e030:__blk_mq_run_hw_queue+0x9e/0xc0
RSP: e02b:c90041af7c48 EFLAGS: 00010202
RAX: 0001 RBX: 88027117fa80 RCX: 0001
RDX: 88026b053ee0 RSI: 88027351bca0 RDI: 88026b072800
RBP: c90041af7c68 R08: c90041af7eb8 R09: 8802711a2810
R10: 7ff0 R11: 0001 R12: 88026b072800
R13: e8d04d00 R14:  R15: e8d04d05
FS:  2b7b7c89b700() GS:88027350() 
knlGS:

CS:  e033 DS:  ES:  CR0: 80050033
CR2: ff600400 CR3: 00026d953000 CR4: 00042660
Call Trace:
 blk_mq_run_work_fn+0x31/0x40
 process_one_work+0x174/0x440
 ? xen_mc_flush+0xad/0x1b0
 ? schedule+0x3a/0xa0
 worker_thread+0x6b/0x410
 ? default_wake_function+0x12/0x20
 ? __wake_up_common+0x84/0x130
 ? maybe_create_worker+0x120/0x120
 ? schedule+0x3a/0xa0
 ? _raw_spin_unlock_irqrestore+0x16/0x20
 ? maybe_create_worker+0x120/0x120
 kthread+0x111/0x150
 ? __kthread_init_worker+0x40/0x40
 ret_from_fork+0x25/0x30
Code: 89 df e8 06 2f d9 ff 4c 89 e7 41 89 c5 e8 0b 6e 00 00 44 89 ee 48 
89 df e8 20 2f d9 ff 48 8b 5d e8 4c 8b 65 f0 4c 8b 6d f8 c9 c3 <0f> ff 
eb aa 4c 89 e7 e8 e6 6d 00 00 48 8b 5d e8 4c 8b 65 f0 4c

---[ end trace fe2aaf4e723042fd ]---
-CentOS 6 kernel-ml-4.14.5-1.el6.elrepo.x86_64 end here-

-CentOS 7 kernel-ml-4.14.5-1.el7.elrepo.x86_64 start here-
[  116.528885] [ cut here ]
[  116.528894] WARNING: CPU: 3 PID: 38 at block/blk-mq.c:1144 
__blk_mq_run_hw_queue+0x89/0xa0
[  116.528898] Modules linked in: intel_cstate(-) ip_set_hash_ip ip_set 
nfnetlink x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul 
ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper cryptd 
intel_rapl_perf pcspkr nfsd auth_rpcgss nfs_acl lockd grace sunrpc 
ip_tables ext4 mbcache jbd2 xen_netfront xen_blkfront crc32c_intel
[  116.528919] CPU: 3 PID: 38 Comm: kworker/3:1H Not tainted 
4.14.5-1.el7.elrepo.x86_64 #1
[  116.529007] Code: 00 e8 7c c5 45 00 4c 89 e7 e8 14 4b d7 ff 48 89 df 
41 89 c5 e8 19 66 00 00 44 89 ee 4c 89 e7 e8 2e 4b d7 ff 5b 41 5c 41 5d 
5d c3 <0f> ff eb b4 48 89 df e8 fb 65 00 00 5b 41 5c 41 5d 5d c3 0f ff

[  116.529034] ---[ end trace a7814e3ec9a330c6 ]---
[  147.424117] [ cut here ]
[  147.424150] WARNING: CPU: 2 PID: 24 at block/blk-mq.c:1144 
__blk_mq_run_hw_queue+0x89/0xa0
[  147.424160] Modules linked in: ip_set_hash_ip ip_set nfnetlink 
x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul 

Re: [PATCH V8 0/7] blk-mq support for ZBC disks

2017-12-11 Thread Damien Le Moal
Jens,

On Fri, 2017-11-24 at 16:54 -0700, Jens Axboe wrote:
> On 11/24/2017 04:48 PM, Damien Le Moal wrote:
> > [Full quote deleted]
> > 
> > Hi Jens,
> > 
> > Any comment regarding this series ?
> > I understand that this would be for the 4.16 merge window, so no hurry,
> > but I would like to know if I need to go back to the drawing board for
> > ZBC blk-mq/scsi-mq support or if this is an acceptable solution.
> 
> I'll give it a thorough look-over on Monday.

Would you have any comment on the series ?
I understand you are busy so please feel free to let me know if there is
anything I can do to facilitate your review.

Thank you.

-- 
Damien Le Moal
Western Digital

Re: [PATCH] locking/lockdep: Add CONFIG_LOCKDEP_AGGRESSIVE

2017-12-11 Thread Linus Torvalds
On Sun, Dec 10, 2017 at 7:50 PM, Theodore Ts'o  wrote:
> CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS can result
> in a large number of false positives because lockdep doesn't
> understand how to deal with multiple stacked loop or MD devices.

Guys, can we just remove this nasty crud already?

It's not working. Give it up. It was complex, it was buggy, it was slow.

Now it's causing people to disable lockdep entirely, or play these
kinds of games in unrelated trees.

It's time to give up on bad debugging, and definitely _not_ enable it
by default for PROVE_LOCKING.

Ted: in the meantime, don't use PROVE_LOCKING. Just use
DEBUG_LOCK_ALLOC instead. Please drop this patch from your tree.

  Linus


[PATCH] block/sed-opal: Fix warnings never less than zero

2017-12-11 Thread Vasyl Gomonovych
Fixed following smatch warnings in sed-opal.c
block/sed-opal.c:2311 opal_set_new_pw()
warn: unsigned 'opal_pw->session.who' is never less than zero.

Unless enum opal_user interface will remain stable warning could be fixed
in mentioned way by removing unnecessary check

Signed-off-by: Vasyl Gomonovych 
---
 block/sed-opal.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/block/sed-opal.c b/block/sed-opal.c
index 9ed51d0c..a5f0fa2 100644
--- a/block/sed-opal.c
+++ b/block/sed-opal.c
@@ -,8 +,7 @@ static int opal_lock_unlock(struct opal_dev *dev,
 {
int ret;
 
-   if (lk_unlk->session.who < OPAL_ADMIN1 ||
-   lk_unlk->session.who > OPAL_USER9)
+   if (lk_unlk->session.who > OPAL_USER9)
return -EINVAL;
 
mutex_lock(>dev_lock);
@@ -2308,9 +2307,7 @@ static int opal_set_new_pw(struct opal_dev *dev, struct 
opal_new_pw *opal_pw)
};
int ret;
 
-   if (opal_pw->session.who < OPAL_ADMIN1 ||
-   opal_pw->session.who > OPAL_USER9  ||
-   opal_pw->new_user_pw.who < OPAL_ADMIN1 ||
+   if (opal_pw->session.who > OPAL_USER9  ||
opal_pw->new_user_pw.who > OPAL_USER9)
return -EINVAL;
 
-- 
1.9.1



[RFC] blk-throttle: export io_serviced_recursive, io_service_bytes_recursive

2017-12-11 Thread weiping zhang
export these two interface for cgroup-v1.

Signed-off-by: weiping zhang 
---
 block/blk-throttle.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 96ad326..1d7637f 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1512,10 +1512,20 @@ static struct cftype throtl_legacy_files[] = {
.seq_show = blkg_print_stat_bytes,
},
{
+   .name = "throttle.io_service_bytes_recursive",
+   .private = (unsigned long)_policy_throtl,
+   .seq_show = blkg_print_stat_bytes_recursive,
+   },
+   {
.name = "throttle.io_serviced",
.private = (unsigned long)_policy_throtl,
.seq_show = blkg_print_stat_ios,
},
+   {
+   .name = "throttle.io_serviced_recursive",
+   .private = (unsigned long)_policy_throtl,
+   .seq_show = blkg_print_stat_ios_recursive,
+   },
{ } /* terminate */
 };
 
-- 
2.9.4



Re: [PATCH V15 00/22] mmc: Add Command Queue support

2017-12-11 Thread Ulf Hansson
On 29 November 2017 at 16:47, Ulf Hansson  wrote:
> Hi Adrian,
>
> On 29 November 2017 at 14:40, Adrian Hunter  wrote:
>> Hi
>>
>> Here is V15 of the hardware command queue patches without the software
>> command queue patches, now using blk-mq and now with blk-mq support for
>> non-CQE I/O.
>
> I have applied patches 1->19 for next.  Deferring patch 21->23 for a while.

We haven't got any reports about big regressions, I think this looks solid!

So, I have decided to apply 21->23 for next as well.

[...]

Thanks and kind regards
Uffe