Re: Re: [PATCH] bcache: fix error count in memory shrink

2018-03-05 Thread tang . junhui
Hi Mike--- >>Hi Tang Junhui--- >> >>I'm not really sure about this one. It changes the semantics of the >>amount of work done-- nr_to_scan now means number of things to free >>instead of the number to check. >> >The code seems to be designed that way: sc->nr_to_scan marks how many btree >nodes to

[PATCH] [v2] bcache: fix using of loop variable in memory shrink

2018-03-05 Thread tang . junhui
From: Tang Junhui In bch_mca_scan(), there is some confusion and a logical error in the use of loop variables. In this patch, we clarify them as: 1) nr: the number of btree nodes that need to be scanned, which decreases after we scan a btree node, and should not be less than 0;

Re: [PATCH] bcache: fix error count in memory shrink

2018-03-05 Thread tang . junhui
Hi Mike--- >Hi Tang Junhui--- > >I'm not really sure about this one. It changes the semantics of the >amount of work done-- nr_to_scan now means number of things to free >instead of the number to check. > The code seems to be designed that way: sc->nr_to_scan marks how many btree nodes to scan in

[PATCH V2] block: null_blk: fix 'Invalid parameters' when loading module

2018-03-05 Thread Ming Lei
On ARM64, the default page size has been 64K on some distributions, and we should allow ARM64 people to play with null_blk. This patch fixes the issue by extending the page bitmap size to support non-4KB PAGE_SIZE. Cc: Bart Van Assche Cc: Shaohua Li Cc:

Re: [PATCH] block: null_blk: fix 'Invalid parameters' failure when loading module

2018-03-05 Thread Ming Lei
On Mon, Mar 05, 2018 at 03:57:07PM +, Bart Van Assche wrote: > On Sat, 2018-03-03 at 10:24 +0800, Ming Lei wrote: > > struct nullb_page { > > struct page *page; > > - unsigned long bitmap; > > + unsigned long bitmap[DIV_ROUND_UP(MAP_SZ, sizeof(unsigned long) * 8)]; > > }; > > Could

Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-05 Thread Logan Gunthorpe
On 05/03/18 05:49 PM, Oliver wrote: It's in arch/powerpc/kernel/io.c as _memcpy_toio() and it has two full barriers! Awesome! Our io.h indicates that our iomem accessors are designed to provide x86ish strong ordering of accesses to MMIO space. The git log indicates arch/powerpc/kernel/io.c

Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-05 Thread Oliver
On Tue, Mar 6, 2018 at 4:10 AM, Logan Gunthorpe wrote: > > > On 05/03/18 09:00 AM, Keith Busch wrote: >> >> On Mon, Mar 05, 2018 at 12:33:29PM +1100, Oliver wrote: >>> >>> On Thu, Mar 1, 2018 at 10:40 AM, Logan Gunthorpe >>> wrote: @@ -429,10

Re: EXT4 Oops (Re: [PATCH V15 06/22] mmc: block: Add blk-mq support)

2018-03-05 Thread Dmitry Osipenko
On 01.03.2018 19:04, Theodore Ts'o wrote: > On Thu, Mar 01, 2018 at 10:55:37AM +0200, Adrian Hunter wrote: >> On 27/02/18 11:28, Adrian Hunter wrote: >>> On 26/02/18 23:48, Dmitry Osipenko wrote: But still something is wrong... I've been getting occasional EXT4 Ooops's, like the

Re: [PATCH v2 04/10] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-03-05 Thread Logan Gunthorpe
On 05/03/18 03:28 PM, Bjorn Helgaas wrote: If you put the #ifdef right here, then it's easier to read because we can see that "oh, this is a special and uncommon case that I can probably ignore". Makes sense. I'll do that. Thanks, Logan

Re: [PATCH v2 04/10] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-03-05 Thread Bjorn Helgaas
On Thu, Mar 01, 2018 at 12:13:10PM -0700, Logan Gunthorpe wrote: > > > On 01/03/18 11:02 AM, Bjorn Helgaas wrote: > > > void pci_enable_acs(struct pci_dev *dev) > > > { > > > + if (pci_p2pdma_disable_acs(dev)) > > > + return; > > > > This doesn't read naturally to me. I do see that

Re: Some bcache patches need reveiw

2018-03-05 Thread Michael Lyle
Hey Tang Junhui-- You're right. I'm sorry, these had slipped through the cracks / I had lost track of them. I also have the rest of Coly's patchset to work through. I was able to apply the first two, and will continue to work on my for-next branch. Mike On 03/05/2018 02:39 AM,

Re: [PATCH] bcache: fix error count in memory shrink

2018-03-05 Thread Michael Lyle
Hi Tang Junhui--- I'm not really sure about this one. It changes the semantics of the amount of work done-- nr_to_scan now means number of things to free instead of the number to check. If the system is under severe memory pressure, and most of the cache is essential/actively used, this could

Re: [PATCH] bcache: fix error return value in memory shrink

2018-03-05 Thread Michael Lyle
LGTM, applied On 01/30/2018 07:30 PM, tang.jun...@zte.com.cn wrote: > From: Tang Junhui > > In bch_mca_scan(), the return value should not be the number of freed btree > nodes, but the number of pages of freed btree nodes. > > Signed-off-by: Tang Junhui

Re: [PATCH] bcache: fix incorrect sysfs output value of strip size

2018-03-05 Thread Michael Lyle
LGTM, applied. On 02/10/2018 06:30 PM, tang.jun...@zte.com.cn wrote: > From: Tang Junhui > > Stripe size is shown as zero when there is no stripe in the backend device: > [root@ceph132 ~]# cat /sys/block/sdd/bcache/stripe_size > 0.0k > > Actually it should be 1T Bytes (1 << 31

Re: [PATCH 0/2] Duplicate UUID fixes for 4.16

2018-03-05 Thread Jens Axboe
On 3/5/18 2:41 PM, Michael Lyle wrote: > Jens-- > > Sorry for a couple of last-minute changes. Marc Merlin noticed an issue > with disk imaging where if multiple devices were present with the same > bcache UUID that a kernel panic could result. Tang Junhui fixed this. > I found a related data

Re: [PATCH] bcache: don't attach backing with duplicate UUID

2018-03-05 Thread Michael Lyle
Hi Tang Junhui--- Thanks for your review. I just sent it upstream (with your change) to Jens. Mike On 03/04/2018 05:07 PM, tang.jun...@zte.com.cn wrote: > Hello Mike > > I send the email from my personal mailbox(110950...@qq.com), it may be fail, > so I resend this email from my office

[PATCH 1/2] bcache: fix crashes in duplicate cache device register

2018-03-05 Thread Michael Lyle
From: Tang Junhui The kernel crashed when registering a duplicate cache device; the call trace is below: [ 417.643790] CPU: 1 PID: 16886 Comm: bcache-register Tainted: G W OE4.15.5-amd64-preempt-sysrq-20171018 #2 [ 417.643861] Hardware name: LENOVO

[PATCH 2/2] bcache: don't attach backing with duplicate UUID

2018-03-05 Thread Michael Lyle
This can happen e.g. during disk cloning. This is an incomplete fix: it does not catch duplicate UUIDs earlier when things are still unattached. It does not unregister the device. Further changes to cope better with this are planned but conflict with Coly's ongoing improvements to handling

[PATCH 0/2] Duplicate UUID fixes for 4.16

2018-03-05 Thread Michael Lyle
Jens-- Sorry for a couple of last-minute changes. Marc Merlin noticed an issue with disk imaging where if multiple devices were present with the same bcache UUID that a kernel panic could result. Tang Junhui fixed this. I found a related data corruption issue with duplicate backing devices.

Re: Hangs in balance_dirty_pages with arm-32 LPAE + highmem

2018-03-05 Thread Laura Abbott
On 02/26/2018 06:28 AM, Michal Hocko wrote: On Fri 23-02-18 11:51:41, Laura Abbott wrote: Hi, The Fedora arm-32 build VMs have a somewhat long standing problem of hanging when running mkfs.ext4 with a bunch of processes stuck in D state. This has been seen as far back as 4.13 but is still

Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-05 Thread Jason Gunthorpe
On Mon, Mar 05, 2018 at 01:42:12PM -0700, Keith Busch wrote: > On Mon, Mar 05, 2018 at 01:10:53PM -0700, Jason Gunthorpe wrote: > > So when reading the above mlx code, we see the first wmb() being used > > to ensure that CPU stores to cachable memory are visible to the DMA > > triggered by the

Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-05 Thread Keith Busch
On Mon, Mar 05, 2018 at 01:10:53PM -0700, Jason Gunthorpe wrote: > So when reading the above mlx code, we see the first wmb() being used > to ensure that CPU stores to cachable memory are visible to the DMA > triggered by the doorbell ring. IIUC, we don't need a similar barrier for NVMe to ensure

Re: [PATCH v2 00/10] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-03-05 Thread Stephen Bates
>Yes i need to document that some more in hmm.txt... Hi Jermone, thanks for the explanation. Can I suggest you update hmm.txt with what you sent out? > I am about to send RFC for nouveau, i am still working out some bugs. Great. I will keep an eye out for it. An example user of hmm will

Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-05 Thread Logan Gunthorpe
On 05/03/18 01:10 PM, Jason Gunthorpe wrote: So when reading the above mlx code, we see the first wmb() being used to ensure that CPU stores to cachable memory are visible to the DMA triggered by the doorbell ring. Oh, yes, that makes sense. Disregard my previous email as I was wrong. Logan

Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-05 Thread Logan Gunthorpe
On 05/03/18 12:57 PM, Sagi Grimberg wrote: Keith, while we're on this, regardless of cmb, is SQE memcopy and DB update ordering always guaranteed? If you look at mlx4 (rdma device driver) that works exactly the same as nvme you will find: -- qp->sq.head += nreq;

Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-05 Thread Jason Gunthorpe
On Mon, Mar 05, 2018 at 09:57:27PM +0200, Sagi Grimberg wrote: > Keith, while we're on this, regardless of cmb, is SQE memcopy and DB update > ordering always guaranteed? > > If you look at mlx4 (rdma device driver) that works exactly the same as > nvme you will find: > -- >

Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-05 Thread Sagi Grimberg
- if (nvmeq->sq_cmds_io) - memcpy_toio(&nvmeq->sq_cmds_io[tail], cmd, sizeof(*cmd)); - else - memcpy(&nvmeq->sq_cmds[tail], cmd, sizeof(*cmd)); + memcpy(&nvmeq->sq_cmds[tail], cmd, sizeof(*cmd)); Hmm, how safe is replacing memcpy_toio() with regular memcpy()? On PPC

Re: [PATCH] lightnvm: pblk: refactor init/exit sequences

2018-03-05 Thread Matias Bjørling
On 03/05/2018 03:18 PM, Javier González wrote: On 5 Mar 2018, at 15.16, Matias Bjørling wrote: On 03/05/2018 02:45 PM, Javier González wrote: On 5 Mar 2018, at 14.38, Matias Bjørling wrote: On 03/01/2018 08:29 PM, Javier González wrote: On 1 Mar 2018, at

Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-05 Thread Logan Gunthorpe
On 05/03/18 11:02 AM, Sinan Kaya wrote: writel has a barrier inside on ARM64. https://elixir.bootlin.com/linux/latest/source/arch/arm64/include/asm/io.h#L143 Yes, and no barrier inside memcpy_toio as it uses __raw_writes. This should be sufficient as we are only accessing addresses that

Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-05 Thread Logan Gunthorpe
On 05/03/18 09:00 AM, Keith Busch wrote: On Mon, Mar 05, 2018 at 12:33:29PM +1100, Oliver wrote: On Thu, Mar 1, 2018 at 10:40 AM, Logan Gunthorpe wrote: @@ -429,10 +429,7 @@ static void __nvme_submit_cmd(struct nvme_queue *nvmeq, { u16 tail = nvmeq->sq_tail;

Re: vgdisplay hang on iSCSI session

2018-03-05 Thread Bart Van Assche
On Mon, 2018-03-05 at 10:44 +0100, Jean-Louis Dupond wrote: > Maybe a long shot, but could it be fixed by > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/drivers/block/loop.c?h=v4.9.86=56bc086358cac1a2949783646eabd57447b9d672 > > ? > Or shouldn't that fix such

Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-05 Thread Keith Busch
On Mon, Mar 05, 2018 at 12:33:29PM +1100, Oliver wrote: > On Thu, Mar 1, 2018 at 10:40 AM, Logan Gunthorpe wrote: > > @@ -429,10 +429,7 @@ static void __nvme_submit_cmd(struct nvme_queue *nvmeq, > > { > > u16 tail = nvmeq->sq_tail; > > > - if

Re: [PATCH] block: null_blk: fix 'Invalid parameters' failure when loading module

2018-03-05 Thread Bart Van Assche
On Sat, 2018-03-03 at 10:24 +0800, Ming Lei wrote: > struct nullb_page { > struct page *page; > - unsigned long bitmap; > + unsigned long bitmap[DIV_ROUND_UP(MAP_SZ, sizeof(unsigned long) * 8)]; > }; Could DECLARE_BITMAP() have been used here? Thanks, Bart.
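For readers following Bart's question, here is a minimal userspace sketch of why the two spellings should give the same array size. The macros are stand-ins modelled on the kernel helpers of the same names; the SKETCH_* constants (64K pages, 512-byte sectors, two spare bits) and both struct names are assumptions made for illustration and are not taken from the patch.

```c
#include <limits.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel helpers of the same names. */
#define BITS_PER_LONG      (sizeof(unsigned long) * CHAR_BIT)
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define BITS_TO_LONGS(nr)  DIV_ROUND_UP(nr, BITS_PER_LONG)
#define DECLARE_BITMAP(name, bits) unsigned long name[BITS_TO_LONGS(bits)]

/* Assumed for illustration only: 64K pages, 512-byte sectors, 2 spare bits. */
#define SKETCH_PAGE_SIZE    65536UL
#define SKETCH_SECTOR_SHIFT 9
#define SKETCH_MAP_SZ       ((SKETCH_PAGE_SIZE >> SKETCH_SECTOR_SHIFT) + 2)

struct page; /* opaque in this sketch */

/* Open-coded sizing, in the style of the quoted hunk. */
struct nullb_page_open {
	struct page *page;
	unsigned long bitmap[DIV_ROUND_UP(SKETCH_MAP_SZ, sizeof(unsigned long) * 8)];
};

/* The same member declared through DECLARE_BITMAP(). */
struct nullb_page_macro {
	struct page *page;
	DECLARE_BITMAP(bitmap, SKETCH_MAP_SZ);
};

/* Number of bits the DECLARE_BITMAP() variant can hold. */
static size_t sketch_bitmap_bits(void)
{
	return sizeof(((struct nullb_page_macro *)0)->bitmap) * CHAR_BIT;
}
```

Under these assumptions the bitmap needs 130 bits, which no longer fits in a single unsigned long; both declarations round up to the same number of longs, so the macro form would mainly be a readability choice.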

[PATCH v3] blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()

2018-03-05 Thread Joseph Qi
We've triggered a WARNING in blk_throtl_bio() when throttling writeback io, which complains blkg->refcnt is already 0 when calling blkg_get(), and then kernel crashes with invalid page request. After investigating this issue, we've found it is caused by a race between blkcg_bio_issue_check() and

RE: [PATCH V3 1/8] scsi: hpsa: fix selection of reply queue

2018-03-05 Thread Don Brace
> -Original Message- > From: Kashyap Desai [mailto:kashyap.de...@broadcom.com] > Sent: Monday, March 05, 2018 1:24 AM > To: Laurence Oberman ; Don Brace > ; Ming Lei > Cc: Jens Axboe ;

Re: [PATCH] lightnvm: pblk: refactor init/exit sequences

2018-03-05 Thread Javier González
> On 5 Mar 2018, at 15.16, Matias Bjørling wrote: > > On 03/05/2018 02:45 PM, Javier González wrote: >>> On 5 Mar 2018, at 14.38, Matias Bjørling wrote: >>> >>> On 03/01/2018 08:29 PM, Javier González wrote: > On 1 Mar 2018, at 19.49, Matias Bjørling

Re: [PATCH] lightnvm: pblk: refactor init/exit sequences

2018-03-05 Thread Matias Bjørling
On 03/05/2018 02:45 PM, Javier González wrote: On 5 Mar 2018, at 14.38, Matias Bjørling wrote: On 03/01/2018 08:29 PM, Javier González wrote: On 1 Mar 2018, at 19.49, Matias Bjørling wrote: On 03/01/2018 04:59 PM, Javier González wrote: Refactor init and

Re: [PATCH] lightnvm: pblk: refactor init/exit sequences

2018-03-05 Thread Javier González
> On 5 Mar 2018, at 14.38, Matias Bjørling wrote: > > On 03/01/2018 08:29 PM, Javier González wrote: >>> On 1 Mar 2018, at 19.49, Matias Bjørling wrote: >>> >>> On 03/01/2018 04:59 PM, Javier González wrote: Refactor init and exit sequences to eliminate

Re: [PATCH] lightnvm: pblk: refactor init/exit sequences

2018-03-05 Thread Matias Bjørling
On 03/01/2018 08:29 PM, Javier González wrote: On 1 Mar 2018, at 19.49, Matias Bjørling wrote: On 03/01/2018 04:59 PM, Javier González wrote: Refactor init and exit sequences to eliminate dependencies among init modules and improve readability. Signed-off-by: Javier

Re: [PATCH 01/12] lightnvm: simplify geometry structure.

2018-03-05 Thread Javier González
> On 5 Mar 2018, at 14.07, Matias Bjørling wrote: > > On 03/02/2018 04:21 PM, Javier González wrote: >> Currently, the device geometry is stored redundantly in the nvm_id and >> nvm_geo structures at a device level. Moreover, when instantiating >> targets on a specific number

Re: [PATCH 01/12] lightnvm: simplify geometry structure.

2018-03-05 Thread Matias Bjørling
On 03/02/2018 04:21 PM, Javier González wrote: Currently, the device geometry is stored redundantly in the nvm_id and nvm_geo structures at a device level. Moreover, when instantiating targets on a specific number of LUNs, these structures are replicated and manually modified to fit the instance

[PATCH] block, bfq: keep peak_rate estimation within range 1..2^32-1

2018-03-05 Thread Konstantin Khlebnikov
Rate should never overflow or become zero because it is used as divider. This patch accumulates it with saturation. Signed-off-by: Konstantin Khlebnikov --- block/bfq-iosched.c |8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git
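The idea of accumulating with saturation can be sketched in plain C as follows. This is not the code from the patch: the helper name, its parameters, and the 1/divisor smoothing step are assumptions chosen only to show how an estimate can be kept within 1..2^32-1 so that it stays safe to use as a divider.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the patch itself: blend a new rate sample into
 * the running estimate with weight 1/divisor, keeping the result within
 * 1..UINT32_MAX so it can always be used as a divider.
 */
static uint32_t sketch_update_peak_rate(uint32_t peak_rate, uint64_t sample,
					uint32_t divisor)
{
	uint64_t acc;

	if (divisor == 0)
		divisor = 1;	/* guard for the sketch */

	/* Do the arithmetic in 64 bits so intermediate values cannot wrap. */
	acc = (uint64_t)peak_rate * (divisor - 1) / divisor;
	sample /= divisor;

	/* Saturating add: clamp to the maximum instead of wrapping around. */
	if (sample > UINT32_MAX - acc)
		acc = UINT32_MAX;
	else
		acc += sample;

	/* Never let the estimate reach zero. */
	return acc ? (uint32_t)acc : 1;
}
```

With a divisor of 8, an estimate of 800 and a sample of 1600 give 700 + 200 = 900; an all-zero input is lifted to 1, and an oversized sample pins the result at UINT32_MAX.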

loop: properly declare rotational flag of underlying device

2018-03-05 Thread Holger Hoffstätte
The loop driver has always declared the rotational flag of its device as rotational, even when the device of the mapped file is nonrotational, as is the case with SSDs or on tmpfs. This can confuse filesystem tools which are SSD-aware; in my case I frequently forget to tell mkfs.btrfs that my

Some bcache patches need reveiw

2018-03-05 Thread tang . junhui
Hello Mike I sent some patches some time ago, with no response. The patches below are very simple; could you or anybody else review them? [PATCH] bcache: fix incorrect sysfs output value of strip size [PATCH] bcache: fix error return value in memory shrink [PATCH] bcache: fix error count in

Re: vgdisplay hang on iSCSI session

2018-03-05 Thread Jean-Louis Dupond
Hi, Maybe a long shot, but could it be fixed by https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/drivers/block/loop.c?h=v4.9.86=56bc086358cac1a2949783646eabd57447b9d672 ? Or shouldn't that fix this kind of issue? It seems like some kind of race condition, cause

[PATCH v2] bcache: move closure debug file into debug direcotry

2018-03-05 Thread tang . junhui
LGTM. Reviewed-by: Tang Junhui > In current code closure debug file is outside of debug directory > and when unloading module there is lack of removing operation > for closure debug file, so it will cause creating error when trying > to reload module. > > This patch