Add bsg_job_put() and bsg_job_get() so we don't need to export
bsg_destroy_job() anymore.
Signed-off-by: Johannes Thumshirn
Reviewed-by: Hannes Reinecke
---
block/bsg-lib.c | 17 ++---
drivers/scsi/scsi_transport_fc.c | 4 ++--
This basically passes the pci_irq_get_affinity information up through
virtio via an optional get_vq_affinity method. It is only implemented
by the PCI backend for now, and only when we use per-virtqueue IRQs.
Signed-off-by: Christoph Hellwig
---
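As a rough sketch of the shape this takes (abbreviated, and the virtio_pci
field names here are assumptions from memory, not a quote of the patch),
the new optional method and a PCI backend implementation could look like:

    const struct cpumask *(*get_vq_affinity)(struct virtio_device *vdev,
                                             int index);

    /* PCI backend: hand back the affinity of the per-VQ MSI-X vector,
     * or NULL when per-virtqueue vectors are not in use. */
    static const struct cpumask *vp_get_vq_affinity(struct virtio_device *vdev,
                                                    int index)
    {
        struct virtio_pci_device *vp_dev = to_vp_device(vdev);

        if (!vp_dev->per_vq_vectors)
            return NULL;
        return pci_irq_get_affinity(vp_dev->pci_dev,
                                    vp_dev->vqs[index]->msix_vector);
    }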
Use automatic IRQ affinity assignment in the virtio layer if available,
and build the blk-mq queues based on it.
Signed-off-by: Christoph Hellwig
---
drivers/block/virtio_blk.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git
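The virtio_blk side is roughly the following sketch (locals abbreviated;
the virtio_find_vqs() helper with a descriptor argument is the form from
that era). The zero-initialized irq_affinity descriptor asks the PCI layer
to spread the I/O virtqueue vectors over all CPUs:

    static int init_vq(struct virtio_blk *vblk, unsigned int num_vqs,
                       struct virtqueue *vqs[], vq_callback_t *callbacks[],
                       const char * const names[])
    {
        struct irq_affinity desc = { 0, };

        /* Create the vqs with vector affinity set up from the start. */
        return virtio_find_vqs(vblk->vdev, num_vqs, vqs, callbacks,
                               names, &desc);
    }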
bsg_softirq_done() and fc_bsg_softirq_done() are copies of each other, so
ditch the fc-specific one.
Signed-off-by: Johannes Thumshirn
Reviewed-by: Hannes Reinecke
---
block/bsg-lib.c | 3 ++-
drivers/scsi/scsi_transport_fc.c | 15
Unexport bsg_softirq_done() again; we don't need it outside of bsg-lib.c
anymore now that scsi_transport_fc is a pure bsg-lib client.
Signed-off-by: Johannes Thumshirn
Reviewed-by: Hannes Reinecke
---
block/bsg-lib.c | 3 +--
include/linux/bsg-lib.h |
vp_request_msix_vectors is only called by vp_try_to_find_vqs, which already
calls vp_free_vectors through vp_del_vqs in the failure case.
Signed-off-by: Christoph Hellwig
---
drivers/virtio/virtio_pci_common.c | 1 -
1 file changed, 1 deletion(-)
diff --git
This lets the IRQ layer handle dispatching IRQs to separate handlers for the
case where we don't have per-VQ MSI-X vectors, and allows us to greatly
simplify the code based on the assumption that we always have interrupt
vector 0 (legacy INTx or config interrupt for MSI-X) available, and
any other
Hi Michael,
this series contains a couple of cleanups for the virtio_pci interrupt
handling code, including a switch to the new pci_alloc_irq_vectors
helper, and support for automatic affinity by the PCI layer if the
consumers ask for it. It then converts virtio_blk over to this
functionality.
Add a struct irq_affinity pointer to the find_vqs methods, which if set
is used to tell the PCI layer to create the MSI-X vectors for our I/O
virtqueues with the proper affinity from the start. Compared to
after-the-fact affinity hints this gives us an instantly working setup and
allows to
Signed-off-by: Christoph Hellwig
---
drivers/virtio/virtio_pci_common.c | 61 +-
1 file changed, 27 insertions(+), 34 deletions(-)
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index 93700c5..f6c5499 100644
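Abbreviated, the changed method looks like the sketch below (a NULL
descriptor keeps the old behaviour); the PCI backend can then feed the
descriptor straight into pci_alloc_irq_vectors_affinity() when allocating
the MSI-X vectors:

    int (*find_vqs)(struct virtio_device *vdev, unsigned int nvqs,
                    struct virtqueue *vqs[], vq_callback_t *callbacks[],
                    const char * const names[],
                    struct irq_affinity *desc);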
Add reference counting to 'struct bsg_job' so we can implement a request
timeout handler for bsg_jobs, which is needed for Fibre Channel.
Signed-off-by: Johannes Thumshirn
Reviewed-by: Hannes Reinecke
---
block/bsg-lib.c | 7 +--
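A minimal sketch of the refcounting scheme, assuming kref (the real
struct bsg_job has more members than shown, and the actual patch may
differ in detail):

    struct bsg_job {
        struct kref kref;
        /* ... */
    };

    static void bsg_destroy_job(struct kref *kref)
    {
        struct bsg_job *job = container_of(kref, struct bsg_job, kref);

        /* free request/reply payloads, then the job itself */
        kfree(job);
    }

    void bsg_job_put(struct bsg_job *job)
    {
        kref_put(&job->kref, bsg_destroy_job);
    }

    int bsg_job_get(struct bsg_job *job)
    {
        return kref_get_unless_zero(&job->kref);
    }

A timeout handler can then take a reference before touching the job, so
the job cannot be freed underneath it.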
fc_destroy_bsgjob() and bsg_destroy_job() are now 1:1 copies, so use the
latter. As bsg_destroy_job() comes from bsg-lib we need to select it in Kconfig
once CONFIG_SCSI_FC_ATTRS is active.
Signed-off-by: Johannes Thumshirn
Reviewed-by: Hannes Reinecke
---
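One plausible Kconfig wiring for this (BLK_DEV_BSGLIB is the existing
bsg-lib symbol; the rest of the entry is abbreviated and may not match
the tree exactly):

    config SCSI_FC_ATTRS
        tristate "FC Transport Attributes"
        depends on SCSI
        select BLK_DEV_BSGLIB if BLOCK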
On Wed, Nov 16, 2016 at 06:47:22PM +0200, Sagi Grimberg wrote:
> > + if (__blkdev_issue_zeroout(req->ns->bdev, sector, nr_sector,
> > + GFP_KERNEL, &bio, false))
> > + status = NVME_SC_INTERNAL | NVME_SC_DNR;
> > +
> > + if (bio) {
> > +
On 11/17/2016 05:50 AM, Christoph Hellwig wrote:
On Wed, Nov 16, 2016 at 09:43:57PM -0700, Jens Axboe wrote:
OK, I'm getting reasonably happy with it now:
http://git.kernel.dk/cgit/linux-block/log/?h=for-4.10/hch-dio
Sharing the FUA setting to avoid the manual flush between the two, and
On 11/17/2016 05:46 AM, Christoph Hellwig wrote:
On Wed, Nov 16, 2016 at 01:02:05PM -0700, Jens Axboe wrote:
I'm fine with this approach, but I would still REALLY like to see
blkdev_bio_end_io() split in two, one for sync and one for async. That
would be a lot cleaner, imho.
Do you still
On Thu, Nov 17, 2016 at 05:12:51AM -0800, Christoph Hellwig wrote:
> Hi Scott,
>
> I took a look at the code and here are some very high level comments:
>
> - we only call into block_device_operations.sec_ops from the ioctl
> handlers. So instead of adding it to the block layer I'd rather
>
On Thu, Nov 17, 2016 at 07:22:15AM -0800, Christoph Hellwig wrote:
> > @@ -0,0 +1,58 @@
> > +/*
> > + * Copyright © 2016 Intel Corporation
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a
> > + * copy of this software and associated documentation files (the
NBD can become contended on its single connection. We have to serialize all
writes and we can only process one read response at a time. Fix this by
allowing userspace to provide multiple connections to a single nbd device. This
coupled with block-mq drastically increases performance in
On 11/17/2016 01:17 PM, Christoph Hellwig wrote:
Ok, the updated tree is here:
http://git.infradead.org/users/hch/block.git/shortlog/refs/heads/new-dio
This moves your O_SYNC patch earlier and makes it not directly
assign bi_opf.
Then comes the main direct I/O path with the various folds, and
On Thu, Nov 17, 2016 at 01:30:12PM -0700, Jens Axboe wrote:
> The last patch seems to have some issues:
>
> + bool is_sync = is_sync = is_sync_kiocb(iocb);
>
> and it's reverting part of the dio->is_sync that was part of my union
> waiter/iocb change. So I'll leave that out for now, for
On 11/17/2016 02:04 PM, Christoph Hellwig wrote:
On Thu, Nov 17, 2016 at 01:30:12PM -0700, Jens Axboe wrote:
The last patch seems to have some issues:
+ bool is_sync = is_sync = is_sync_kiocb(iocb);
and it's reverting part of the dio->is_sync that was part of my union
waiter/iocb
We ran into a funky issue, where someone doing 256K buffered reads saw
128K requests at the device level. Turns out it is read-ahead capping
the request size, since we use 128K as the default setting. This doesn't
make a lot of sense - if someone is issuing 256K reads, they should see
256K reads,
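A hypothetical helper to illustrate the intended behaviour (the name and
placement are invented for illustration; the real change lives in the
read-ahead sizing code): the effective limit becomes the larger of the
device default and what the reader explicitly asked for.

    static unsigned long effective_ra_limit(struct file_ra_state *ra,
                                            unsigned long req_pages)
    {
        /* the 128K default no longer caps an explicit 256K read */
        return max_t(unsigned long, ra->ra_pages, req_pages);
    }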
> @@ -0,0 +1,58 @@
> +/*
> + * Copyright © 2016 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including
On 11/17/2016 06:25 AM, Jens Axboe wrote:
On 11/17/2016 05:50 AM, Christoph Hellwig wrote:
On Wed, Nov 16, 2016 at 09:43:57PM -0700, Jens Axboe wrote:
OK, I'm getting reasonably happy with it now:
http://git.kernel.dk/cgit/linux-block/log/?h=for-4.10/hch-dio
Sharing the FUA setting to avoid
Hi Scott,
I took a look at the code and here are some very high level comments:
- we only call into block_device_operations.sec_ops from the ioctl
handlers. So instead of adding it to the block layer I'd rather
structure the code so that the driver itself calls a new common
On Thu, Nov 17, 2016 at 10:36:14AM -0700, Scott Bauer wrote:
> On Thu, Nov 17, 2016 at 05:12:51AM -0800, Christoph Hellwig wrote:
> > Hi Scott,
> >
> > I took a look at the code and here are some very high level comments:
> >
> > - we only call into block_device_operations.sec_ops from the
On Thu, Nov 17, 2016 at 08:03:46AM -0700, Jens Axboe wrote:
> > Looks like I forgot to push out the latest changes, that is gone. We're
> > just using + 1 now to see if we should use the multibio function or not.
>
> Just to clarify, the correct tip is:
Re-checking for-4.10/dio now everything
On Wed, Nov 16, 2016 at 01:02:05PM -0700, Jens Axboe wrote:
> I'm fine with this approach, but I would still REALLY like to see
> blkdev_bio_end_io() split in two, one for sync and one for async. That
> would be a lot cleaner, imho.
Do you still want that? It's not in your latest tree even
On Thu, Nov 17, 2016 at 11:28:07AM -0800, Christoph Hellwig wrote:
> On Thu, Nov 17, 2016 at 10:36:14AM -0700, Scott Bauer wrote:
> >
> > I want some further clarification, if you don't mind. We call sec_ops
> > inside the actual logic for the opal code, which is only accessible via the
> >
Ok, the updated tree is here:
http://git.infradead.org/users/hch/block.git/shortlog/refs/heads/new-dio
This moves your O_SYNC patch earlier and makes it not directly
assign bi_opf.
Then comes the main direct I/O path with the various folds, and
a use after free fix ported from the iomap code in
On Thu, Nov 17, 2016 at 10:36:14AM -0700, Scott Bauer wrote:
>
> I want some further clarification, if you don't mind. We call sec_ops
> inside the actual logic for the opal code, which is only accessible via the
> ioctls, is that what you were meaning? When you say "the driver calls"
> do you
From: Chaitanya Kulkarni
This adds a new block layer operation to zero out a range of
LBAs. This allows us to implement zeroing for devices that don't use
either discard with a predictable zero pattern or WRITE SAME of zeroes.
The prominent example of that is NVMe with
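A hedged sketch of issuing the new operation directly (the helper name
and structure are illustrative, not the series' code; kernels of this era
carry bi_bdev on the bio, and write-zeroes bios carry a size but no data
pages):

    static int zero_range(struct block_device *bdev, sector_t sector,
                          sector_t nr_sects)
    {
        struct bio *bio = bio_alloc(GFP_KERNEL, 0);

        if (!bio)
            return -ENOMEM;
        bio->bi_iter.bi_sector = sector;
        bio->bi_iter.bi_size = nr_sects << 9;   /* no payload needed */
        bio->bi_bdev = bdev;
        bio_set_op_attrs(bio, REQ_OP_WRITE_ZEROES, 0);
        return submit_bio_wait(bio);
    }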
This patch just adds a 'failfast' per-device flag which can be stored
in v0.90 or v1.x metadata.
The flag is not used yet but the intent is that it can be used for
mirrored (raid1/raid10) arrays where low latency is more important
than keeping all devices on-line.
Setting the flag for a device
This can only be supported on personalities which ensure
that md_error() never causes an array to enter the 'failed'
state, i.e. if marking a device Faulty would cause some
data to be inaccessible, the device status is left as
non-Faulty. This is true for RAID1 and RAID10.
If we get a
If a device is marked FailFast, and it is not the only
device we can read from, we mark the bio as MD_FAILFAST.
If this does fail, we don't try read repair but just
allow failure.
If it was the last device, it doesn't get marked Faulty so
the retry happens on the same device - this time
When writing to a fastfail device, we use MD_FASTFAIL unless
it is the only device being written to. For
resync/recovery, assume there was a working device to read
from so always use MD_FASTFAIL.
If a write for resync/recovery fails, we just fail the
device - there is not much else to do.
If a
When writing to a fastfail device we use MD_FASTFAIL unless
it is the only device being written to.
For resync/recovery, assume there was a working device to
read from so always use REQ_FASTFAIL_DEV.
If a write for resync/recovery fails, we just fail the
device - there is not much else to do.
Hi,
I've been sitting on these patches for a while because although they
solve a real problem, it is a fairly limited use-case, and I don't
really like some of the details.
So I'm posting them as RFC in the hope that a different perspective
might help me like them better, or find a better
Somewhere around
Commit: 20d0189b1012 ("block: Introduce new bio_split()")
and
Commit: 4b1faf931650 ("block: Kill bio_pair_split()")
in 3.14 we lost the call to trace_block_split() from bio_split().
Commit: cda22646adaa ("block: add call to split trace point")
in 4.5 added it back for
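For reference, the hunk that cda22646adaa added to blk_queue_split() is
roughly the following (quoted from memory, so treat it as a sketch):

    if (split) {
        bio_chain(split, *bio);
        trace_block_split(q, split, (*bio)->bi_iter.bi_sector);
        generic_make_request(*bio);
        *bio = split;
    }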
If a device is marked FailFast and it is not the only device
we can read from, we mark the bio with REQ_FAILFAST_* flags.
If this does fail, we don't try read repair but just allow
failure. If it was the last device it doesn't fail of
course, so the retry happens on the same device - this time
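An illustrative sketch of the read-path marking (MD_FAILFAST expands to
REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT in md; the helper and its
arguments are stand-ins, not md's actual code):

    static void maybe_set_failfast(struct bio *read_bio,
                                   struct md_rdev *rdev,
                                   int readable_devices)
    {
        /* Only fail fast when a retry can go to another mirror. */
        if (test_bit(FailFast, &rdev->flags) && readable_devices > 1)
            read_bio->bi_opf |= MD_FAILFAST;
    }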
> "Chaitanya" == Chaitanya Kulkarni writes:
Chaitanya> This adds a new block layer operation to zero out a range of
Chaitanya> LBAs. This allows us to implement zeroing for devices that don't
Chaitanya> use either discard with a predictable zero pattern or WRITE
Incorporated the comments and sent a new patch.
On Wed, Nov 16, 2016 at 6:48 PM, Martin K. Petersen wrote:
>> "Keith" == Keith Busch writes:
>
> Keith> Your maximum bi_size exceeds the 2 bytes an NVMe Write Zeroes
> Keith> command provides
On Thu, Nov 17, 2016 at 02:17:11PM -0800, Chaitanya Kulkarni wrote:
> From: Chaitanya Kulkarni
>
> This adds a new block layer operation to zero out a range of
> LBAs. This allows us to implement zeroing for devices that don't use
> either discard with a predictable
(Seeing that it was me who initiated those patches I guess I should
speak up here)
On 11/18/2016 06:16 AM, NeilBrown wrote:
> Hi,
>
> I've been sitting on these patches for a while because although they
> solve a real problem, it is a fairly limited use-case, and I don't
> really like some of
On Friday, November 18, 2016 5:23 AM Jens Axboe wrote:
>
> We ran into a funky issue, where someone doing 256K buffered reads saw
> 128K requests at the device level. Turns out it is read-ahead capping
> the request size, since we use 128K as the default setting. This doesn't
> make a lot of