Re: [PATCH V3 1/8] scsi: hpsa: fix selection of reply queue

2018-03-06 Thread Martin K. Petersen

Ming,

> Given both Don and Laurence have verified that patch 1 and patch 2
> does fix IO hang, could you consider to merge the two first?

Oh, and I would still need a formal Acked-by: from Don and Tested-by:
from Laurence.

Also, for 4.16/scsi-fixes I would prefer verification to be done with
just patch 1/8 and none of the subsequent changes in place. Just to make
sure we're testing the right thing.

Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH V3 1/8] scsi: hpsa: fix selection of reply queue

2018-03-06 Thread Martin K. Petersen

Hi Ming,

> Given both Don and Laurence have verified that patch 1 and patch 2
> does fix IO hang, could you consider to merge the two first?

I'm not going to merge the MR patch until Kashyap acks it.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH V3 1/8] scsi: hpsa: fix selection of reply queue

2018-03-06 Thread Ming Lei
On Tue, Mar 06, 2018 at 02:24:25PM -0500, Martin K. Petersen wrote:
> 
> Ming,
> 
> > Given both Don and Laurence have verified that patch 1 and patch 2
> > does fix IO hang, could you consider to merge the two first?
> 
> Oh, and I would still need a formal Acked-by: from Don and Tested-by:
> from Laurence.
> 
> Also, for 4.16/scsi-fixes I would prefer verification to be done with
> just patch 1/8 and none of the subsequent changes in place. Just to make
> sure we're testing the right thing.

Hi Martin,

Please consider 2/8 too since it is still a fix.

Thanks,
Ming


Re: [PATCH][RESEND] block: sed-opal: fix u64 short atom length

2018-03-06 Thread Derrick, Jonathan
Hi Jonas,

On Thu, 2018-03-01 at 14:27 +0100, Jonas Rabenstein wrote:
> The length must be given as bytes and not as 4 bit tuples.
> 
> Signed-off-by: Jonas Rabenstein  n.de>
> ---
>  block/sed-opal.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/sed-opal.c b/block/sed-opal.c
> index 36842bfa572e..d5f565e1557a 100644
> --- a/block/sed-opal.c
> +++ b/block/sed-opal.c
> @@ -562,7 +562,7 @@ static void add_token_u64(int *err, struct
> opal_dev *cmd, u64 number)
>   }
>  
>   msb = fls(number);
> - len = DIV_ROUND_UP(msb, 4);
> + len = DIV_ROUND_UP(msb, 8);

This change looks partially correct, but I believe we should be doing
fls64() on 'number' as well.

It looks like it currently coincidentally works with u64 numbers
falling in 32-bit ranges.


>  
>   if (cmd->pos >= IO_BUFFER_LENGTH - len - 1) {
>   pr_debug("Error adding u64: end of buffer.\n");

smime.p7s
Description: S/MIME cryptographic signature


Re: [PATCH][RESEND] block: sed-opal: fix response string extraction

2018-03-06 Thread Derrick, Jonathan
This looks correct.

Adding my Ack unless Scott has objections

Acked-by: Jonathan Derrick 

On Thu, 2018-03-01 at 14:26 +0100, Jonas Rabenstein wrote:
> Tokens are prefixed by a variable length of bytes. If a bytestring is
> not stored in an tiny or short atom, we have to skip more than one
> byte
> in order to have the actual bytes not prefixed by the bytes
> describing
> the actual length of the string.
> 
> Signed-off-by: Jonas Rabenstein  n.de>
> ---
>  block/sed-opal.c | 26 +++---
>  1 file changed, 23 insertions(+), 3 deletions(-)
> 
> diff --git a/block/sed-opal.c b/block/sed-opal.c
> index 525506bed399..33052d0111de 100644
> --- a/block/sed-opal.c
> +++ b/block/sed-opal.c
> @@ -876,6 +876,9 @@ static int response_parse(const u8 *buf, size_t
> length,
>  static size_t response_get_string(const struct parsed_resp *resp,
> int n,
> const char **store)
>  {
> + u8 skip;
> + const struct opal_resp_tok *token;
> +
>   *store = NULL;
>   if (!resp) {
>   pr_debug("Response is NULL\n");
> @@ -888,13 +891,30 @@ static size_t response_get_string(const struct
> parsed_resp *resp, int n,
>   return 0;
>   }
>  
> - if (resp->toks[n].type != OPAL_DTA_TOKENID_BYTESTRING) {
> + token = >toks[n];
> + if (token->type != OPAL_DTA_TOKENID_BYTESTRING) {
>   pr_debug("Token is not a byte string!\n");
>   return 0;
>   }
>  
> - *store = resp->toks[n].pos + 1;
> - return resp->toks[n].len - 1;
> + switch (token->width) {
> + case OPAL_WIDTH_TINY:
> + case OPAL_WIDTH_SHORT:
> + skip = 1;
> + break;
> + case OPAL_WIDTH_MEDIUM:
> + skip = 2;
> + break;
> + case OPAL_WIDTH_LONG:
> + skip = 4;
> + break;
> + default:
> + pr_debug("Token has invalid width!\n");
> + return 0;
> + }
> +
> + *store = token->pos + skip;
> + return token->len - skip;
>  }
>  
>  static u64 response_get_u64(const struct parsed_resp *resp, int n)

smime.p7s
Description: S/MIME cryptographic signature


Re: [PATCH v2] bcache: move closure debug file into debug direcotry

2018-03-06 Thread Michael Lyle
Sorry- I had to pull/unapply this actually.

On 03/04/2018 11:40 PM, Chengguang Xu wrote:
> -static struct dentry *debug
> +struct dentry *debug;

This conflicts with other symbols called "debug" and doesn't compile.
Please be sure that your patch set compiles before submitting.

Mike

>  
>  #ifdef CONFIG_BCACHE_DEBUG
>  
> diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
> index 1a9fdab..b784292 100644
> --- a/drivers/md/bcache/super.c
> +++ b/drivers/md/bcache/super.c
> @@ -2133,7 +2133,6 @@ static int __init bcache_init(void)
>   mutex_init(_register_lock);
>   init_waitqueue_head(_wait);
>   register_reboot_notifier();
> - closure_debug_init();
>  
>   bcache_major = register_blkdev(0, "bcache");
>   if (bcache_major < 0) {
> @@ -2145,7 +2144,7 @@ static int __init bcache_init(void)
>   if (!(bcache_wq = alloc_workqueue("bcache", WQ_MEM_RECLAIM, 0)) ||
>   !(bcache_kobj = kobject_create_and_add("bcache", fs_kobj)) ||
>   bch_request_init() ||
> - bch_debug_init(bcache_kobj) ||
> + bch_debug_init(bcache_kobj) || closure_debug_init() ||
>   sysfs_create_files(bcache_kobj, files))
>   goto err;
>  
> 



Re: [PATCH v2] bcache: move closure debug file into debug direcotry

2018-03-06 Thread Chengguang Xu
Hi Mike

I'm really sorry for the inconvenient, in my test box it can compile with no 
error,
so I didn't notice that before, I've sent modified v2 version for the problem.


> Sent: Wednesday, March 07, 2018 at 10:54 AM
> From: "Michael Lyle" 
> To: "Chengguang Xu" , tang.jun...@zte.com.cn, 
> kent.overstr...@gmail.com
> Cc: linux-bca...@vger.kernel.org, linux-block@vger.kernel.org
> Subject: Re: [PATCH v2] bcache: move closure debug file into debug direcotry
>
> Sorry- I had to pull/unapply this actually.
> 
> On 03/04/2018 11:40 PM, Chengguang Xu wrote:
> > -static struct dentry *debug
> > +struct dentry *debug;
> 
> This conflicts with other symbols called "debug" and doesn't compile.
> Please be sure that your patch set compiles before submitting.
> 
> Mike
> 
> >  
> >  #ifdef CONFIG_BCACHE_DEBUG
> >  
> > diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
> > index 1a9fdab..b784292 100644
> > --- a/drivers/md/bcache/super.c
> > +++ b/drivers/md/bcache/super.c
> > @@ -2133,7 +2133,6 @@ static int __init bcache_init(void)
> > mutex_init(_register_lock);
> > init_waitqueue_head(_wait);
> > register_reboot_notifier();
> > -   closure_debug_init();
> >  
> > bcache_major = register_blkdev(0, "bcache");
> > if (bcache_major < 0) {
> > @@ -2145,7 +2144,7 @@ static int __init bcache_init(void)
> > if (!(bcache_wq = alloc_workqueue("bcache", WQ_MEM_RECLAIM, 0)) ||
> > !(bcache_kobj = kobject_create_and_add("bcache", fs_kobj)) ||
> > bch_request_init() ||
> > -   bch_debug_init(bcache_kobj) ||
> > +   bch_debug_init(bcache_kobj) || closure_debug_init() ||
> > sysfs_create_files(bcache_kobj, files))
> > goto err;
> >  
> > 
> 
> 


[PATCH v2] bcache: move closure debug file into debug direcotry

2018-03-06 Thread Chengguang Xu
In current code closure debug file is outside of debug directory
and when unloading module there is lack of removing operation
for closure debug file, so it will cause creating error when trying
to reload  module.

This patch move closure debug file into "bcache" debug direcory
so that the file can get deleted properly.

Signed-off-by: Chengguang Xu 
---
Changes since v1:
- Rename dentry name of debug directory to "bcache_debug" from "debug" to
avoid compile error.

 drivers/md/bcache/closure.c |  9 +
 drivers/md/bcache/closure.h |  5 +++--
 drivers/md/bcache/debug.c   | 14 +++---
 drivers/md/bcache/super.c   |  3 +--
 4 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/drivers/md/bcache/closure.c b/drivers/md/bcache/closure.c
index 7f12920..c0949c9 100644
--- a/drivers/md/bcache/closure.c
+++ b/drivers/md/bcache/closure.c
@@ -157,7 +157,7 @@ void closure_debug_destroy(struct closure *cl)
 }
 EXPORT_SYMBOL(closure_debug_destroy);
 
-static struct dentry *debug;
+static struct dentry *closure_debug;
 
 static int debug_seq_show(struct seq_file *f, void *data)
 {
@@ -199,11 +199,12 @@ static int debug_seq_open(struct inode *inode, struct 
file *file)
.release= single_release
 };
 
-void __init closure_debug_init(void)
+int __init closure_debug_init(void)
 {
-   debug = debugfs_create_file("closures", 0400, NULL, NULL, _ops);
+   closure_debug = debugfs_create_file("closures",
+   0400, bcache_debug, NULL, _ops);
+   return IS_ERR_OR_NULL(closure_debug);
 }
-
 #endif
 
 MODULE_AUTHOR("Kent Overstreet ");
diff --git a/drivers/md/bcache/closure.h b/drivers/md/bcache/closure.h
index 3b9dfc9..71427eb 100644
--- a/drivers/md/bcache/closure.h
+++ b/drivers/md/bcache/closure.h
@@ -105,6 +105,7 @@
 struct closure;
 struct closure_syncer;
 typedef void (closure_fn) (struct closure *);
+extern struct dentry *bcache_debug;
 
 struct closure_waitlist {
struct llist_head   list;
@@ -185,13 +186,13 @@ static inline void closure_sync(struct closure *cl)
 
 #ifdef CONFIG_BCACHE_CLOSURES_DEBUG
 
-void closure_debug_init(void);
+int closure_debug_init(void);
 void closure_debug_create(struct closure *cl);
 void closure_debug_destroy(struct closure *cl);
 
 #else
 
-static inline void closure_debug_init(void) {}
+static inline int closure_debug_init(void) { return 0; }
 static inline void closure_debug_create(struct closure *cl) {}
 static inline void closure_debug_destroy(struct closure *cl) {}
 
diff --git a/drivers/md/bcache/debug.c b/drivers/md/bcache/debug.c
index af89408..028f7b3 100644
--- a/drivers/md/bcache/debug.c
+++ b/drivers/md/bcache/debug.c
@@ -17,7 +17,7 @@
 #include 
 #include 
 
-static struct dentry *debug;
+struct dentry *bcache_debug;
 
 #ifdef CONFIG_BCACHE_DEBUG
 
@@ -232,11 +232,11 @@ static int bch_dump_release(struct inode *inode, struct 
file *file)
 
 void bch_debug_init_cache_set(struct cache_set *c)
 {
-   if (!IS_ERR_OR_NULL(debug)) {
+   if (!IS_ERR_OR_NULL(bcache_debug)) {
char name[50];
snprintf(name, 50, "bcache-%pU", c->sb.set_uuid);
 
-   c->debug = debugfs_create_file(name, 0400, debug, c,
+   c->debug = debugfs_create_file(name, 0400, bcache_debug, c,
   _set_debug_ops);
}
 }
@@ -245,13 +245,13 @@ void bch_debug_init_cache_set(struct cache_set *c)
 
 void bch_debug_exit(void)
 {
-   if (!IS_ERR_OR_NULL(debug))
-   debugfs_remove_recursive(debug);
+   if (!IS_ERR_OR_NULL(bcache_debug))
+   debugfs_remove_recursive(bcache_debug);
 }
 
 int __init bch_debug_init(struct kobject *kobj)
 {
-   debug = debugfs_create_dir("bcache", NULL);
+   bcache_debug = debugfs_create_dir("bcache", NULL);
 
-   return IS_ERR_OR_NULL(debug);
+   return IS_ERR_OR_NULL(bcache_debug);
 }
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 1a9fdab..b784292 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -2133,7 +2133,6 @@ static int __init bcache_init(void)
mutex_init(_register_lock);
init_waitqueue_head(_wait);
register_reboot_notifier();
-   closure_debug_init();
 
bcache_major = register_blkdev(0, "bcache");
if (bcache_major < 0) {
@@ -2145,7 +2144,7 @@ static int __init bcache_init(void)
if (!(bcache_wq = alloc_workqueue("bcache", WQ_MEM_RECLAIM, 0)) ||
!(bcache_kobj = kobject_create_and_add("bcache", fs_kobj)) ||
bch_request_init() ||
-   bch_debug_init(bcache_kobj) ||
+   bch_debug_init(bcache_kobj) || closure_debug_init() ||
sysfs_create_files(bcache_kobj, files))
goto err;
 
-- 
1.8.3.1



Re: [PATCH V3 8/8] scsi: megaraid: improve scsi_mq performance via .host_tagset

2018-03-06 Thread Ming Lei
On Wed, Feb 28, 2018 at 08:28:48PM +0530, Kashyap Desai wrote:
> Ming -
> 
> Quick testing on my setup -  Performance slightly degraded (4-5% drop)for
> megaraid_sas driver with this patch. (From 1610K IOPS it goes to 1544K)
> I confirm that after applying this patch, we have #queue = #numa node.
> 
> ls -l
> /sys/devices/pci:80/:80:02.0/:83:00.0/host10/target10:2:23/10:
> 2:23:0/block/sdy/mq
> total 0
> drwxr-xr-x. 18 root root 0 Feb 28 09:53 0
> drwxr-xr-x. 18 root root 0 Feb 28 09:53 1
> 
> 
> I would suggest to skip megaraid_sas driver changes using shared_tagset
> until and unless there is obvious gain. If overall interface of using
> shared_tagset is commit in kernel tree, we will investigate (megaraid_sas
> driver) in future about real benefit of using it.

Hi Kashyap,

Now I have put patches for removing operating on scsi_host->host_busy
in V4[1], especially which are done in the following 3 patches:

9221638b9bc9 scsi: avoid to hold host_busy for scsi_mq
1ffc8c0ffbe4 scsi: read host_busy via scsi_host_busy()
e453d3983243 scsi: introduce scsi_host_busy()


Could you run your test on V4 and see if IOPS can be improved on
megaraid_sas?


[1] https://github.com/ming1/linux/commits/v4.16-rc-host-tags-v4

Thanks,
Ming


Re: [PATCH V3 1/8] scsi: hpsa: fix selection of reply queue

2018-03-06 Thread Martin K. Petersen

Ming,

> Please consider 2/8 too since it is still a fix.

I still need the driver maintainer to ack the change.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH V2] block: null_blk: fix 'Invalid parameters' when loading module

2018-03-06 Thread Jens Axboe
On 3/5/18 9:07 PM, Ming Lei wrote:
> On ARM64, the default page size has been 64K on some distributions, and
> we should allow ARM64 people to play null_blk.
> 
> This patch fixes the issue by extend page bitmap size for supporting
> other non-4KB PAGE_SIZE.

Applied, thanks.

-- 
Jens Axboe



Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-06 Thread Oliver
On Tue, Mar 6, 2018 at 12:14 PM, Logan Gunthorpe  wrote:
>
> On 05/03/18 05:49 PM, Oliver wrote:
>>
>> It's in arch/powerpc/kernel/io.c as _memcpy_toio() and it has two full
>> barriers!
>>
>> Awesome!
>>
>> Our io.h indicates that our iomem accessors are designed to provide x86ish
>> strong ordering of accesses to MMIO space. The git log indicates
>> arch/powerpc/kernel/io.c has barely been touched in the last decade so
>> odds are most of that code was written in the elder days when people
>> were less aware of ordering issues. It might just be overly conservative
>> by today's standards, but maybe not (see below).
>
>
> Yes, that seems overly conservative.
>
>> (I'm not going to suggest ditching the lwsync trick. mpe is not going
>> to take that patch
>> without a really good reason)
>
>
> Well, that's pretty gross. Is this not exactly the situation mmiowb() is
> meant to solve? See [1].

Yep, mmiowb() is supposed to be used in this situation. According to BenH,
author of that io_sync hack, we implement the stronger semantics
so that we don't break existing drivers that assume spin_unlock() does
order i/o even though it's not supposed to. At a guess the x86 version of
spin_unlock() happens to do that so the rest of us need to either live
with it or fix all the bugs :)

> Though, you're right in principle. Even if power was similar to other
> systems in this way, it's still a risk that if these pages get passed
> somewhere in the kernel that uses a spin lock like that without an mmiowb()
> call, then it's going to have a bug. For now, the risk is pretty low as we
> know exactly where all the p2pmem pages will be used but if it gets into
> other places, all bets are off.

Yeah this was my main concern with the whole approach. For ioremap()ed
memory we have the __iomem annotation to help with tracking when we
need to be careful, but we'd lose that entirely here.

> I did do some work trying to make a safe
> version of io-pages and also trying to change from pages to pfn_t in large
> areas but neither approach seemed likely to get any traction in the
> community, at least not in the near term.

It's a tricky problem. HMM with DEVICE_PRIVATE might be more
palatable than the pfn_t conversion since there would still be struct pages
backing the device memory. That said, there are probably other issues with
device private memory not being part of the linear mapping, but HMM
provides some assistance there.

Oliver


Re: Hangs in balance_dirty_pages with arm-32 LPAE + highmem

2018-03-06 Thread Tetsuo Handa
2kB inactive_anon:13452kB active_file:14556kB 
inactive_file:14452kB unevictable:0kB writepending:42740kB present:1048576kB 
managed:951188kB mlocked:0kB kernel_stack:22400kB pagetables:37304kB bounce:0kB 
free_pcp:248kB local_pcp:0kB free_cma:0kB
[  783.437211] Node 0 Normal free:16920kB min:17320kB low:21648kB high:25976kB 
active_anon:570132kB inactive_anon:13452kB active_file:14756kB 
inactive_file:13888kB unevictable:0kB writepending:42740kB present:1048576kB 
managed:951188kB mlocked:0kB kernel_stack:22304kB pagetables:37304kB bounce:0kB 
free_pcp:312kB local_pcp:32kB free_cma:0kB
[ 1242.729271] Node 0 Normal free:16920kB min:17320kB low:21648kB high:25976kB 
active_anon:570132kB inactive_anon:13452kB active_file:14072kB 
inactive_file:14304kB unevictable:0kB writepending:42740kB present:1048576kB 
managed:951188kB mlocked:0kB kernel_stack:22128kB pagetables:37304kB bounce:0kB 
free_pcp:440kB local_pcp:48kB free_cma:0kB
[ 1412.248884] Node 0 Normal free:16920kB min:17320kB low:21648kB high:25976kB 
active_anon:570132kB inactive_anon:13452kB active_file:14332kB 
inactive_file:14280kB unevictable:0kB writepending:42740kB present:1048576kB 
managed:951188kB mlocked:0kB kernel_stack:22128kB pagetables:37304kB bounce:0kB 
free_pcp:440kB local_pcp:48kB free_cma:0kB
[ 1549.795514] Node 0 Normal free:16920kB min:17320kB low:21648kB high:25976kB 
active_anon:570132kB inactive_anon:13452kB active_file:14416kB 
inactive_file:14272kB unevictable:0kB writepending:42740kB present:1048576kB 
managed:951188kB mlocked:0kB kernel_stack:22128kB pagetables:37304kB bounce:0kB 
free_pcp:440kB local_pcp:48kB free_cma:0kB

Complete log is http://I-love.SAKURA.ne.jp/tmp/serial-20180306.txt.xz .
Config is http://I-love.SAKURA.ne.jp/tmp/config-4.16-rc4 .