Any comments? Getting rid of this driver which was never wired up
at all would help with some of the pending block work..
On Thu, Apr 06, 2017 at 01:28:46PM +0200, Christoph Hellwig wrote:
> This driver was added in 2008, but as far as I can tell we never had a
> single platform that actually
Hi Nic,
this patch looks fine, and I'll include it for the next post. I'll
move some of the explanation in this mail into the patch, though.
On Mon, 2017-04-10 at 18:08 +0200, Christoph Hellwig wrote:
> Use the pscsi driver to support arbitrary command passthrough
> instead.
>
The people who are actively using iblock_execute_write_same_direct() are
doing so in the context of ESX VAAI BlockZero, together with
EXTENDED_COPY and
> Il giorno 11 apr 2017, alle ore 23:47, Tejun Heo ha scritto:
>
> Hello,
>
> On Tue, Apr 11, 2017 at 03:43:01PM +0200, Paolo Valente wrote:
>> From: Arianna Avanzini
>>
>> Add complete support for full hierarchical scheduling, with a cgroups
>>
On Wed, Apr 12, 2017 at 7:58 AM, Bart Van Assche
wrote:
> Although blk_execute_rq_nowait() asks blk_mq_sched_insert_request()
> to run the queue, the function that should run the queue
> (__blk_mq_delay_run_hw_queue()) skips hardware queues for which
> .tags == NULL.
On Tue, Apr 11, 2017 at 06:18:36PM, Bart Van Assche wrote:
> On Tue, 2017-04-11 at 14:03 -0400, Mike Snitzer wrote:
> > Rather than working so hard to use DM code against me, your argument
> > should be: "blk-mq drivers X, Y and Z rerun the hw queue; this is a well
> > established pattern"
>
Although blk_execute_rq_nowait() asks blk_mq_sched_insert_request()
to run the queue, the function that should run the queue
(__blk_mq_delay_run_hw_queue()) skips hardware queues for which
.tags == NULL. Since blk_mq_free_tag_set() clears .tags this means
if blk_execute_rq_nowait() is called after
On Wed, 2017-04-12 at 00:13 +0200, Javier González wrote:
> please point out to any other tools/concerns you may have.
Hello Javier,
Do you already have an account at https://scan.coverity.com/? Any Linux
kernel developer can get an account for free. A full Coverity scan of
Linus' tree is
Hi Bart,
> On 11 Apr 2017, at 17.19, Bart Van Assche wrote:
>
> On Tue, 2017-04-11 at 16:31 +0200, Javier González wrote:
>> Changes since v4:
>> * Rebase on top of Matias' for-4.12/core
>> * Fix type implicit conversions reported by sparse (reported by Bart Van
>>
Hello,
On Tue, Apr 11, 2017 at 03:43:01PM +0200, Paolo Valente wrote:
> From: Arianna Avanzini
>
> Add complete support for full hierarchical scheduling, with a cgroups
> interface. Full hierarchical scheduling is implemented through the
> 'entity' abstraction: both
The blk-mq debugfs attributes are removed after blk_cleanup_queue()
has finished. Since running a queue after it has entered the
"dead" state is not allowed, disallow this explicitly. This patch
prevents an attempt to run a dead queue from triggering a kernel crash.
Signed-off-by: Bart Van Assche
Hello Jens,
Please consider the six patches in this series for kernel v4.12. The
first patch in this series is a bug fix for code that has already
been queued for kernel v4.12. The second patch implements a change
requested by Omar. Patches 3-6 are blk-mq debugfs enhancements.
Thanks,
Bart.
Move the "state" attribute from the top level to the "mq" directory
as requested by Omar.
Signed-off-by: Bart Van Assche
Cc: Omar Sandoval
Cc: Hannes Reinecke
---
block/blk-mq-debugfs.c | 9 +
1 file changed, 1 insertion(+), 8
Show the SCSI CDB, .eh_eflags and .result for pending SCSI commands
in /sys/kernel/debug/block/*/mq/*/dispatch and */rq_list.
Signed-off-by: Bart Van Assche
Cc: Martin K. Petersen
Cc: James Bottomley
Show the operation name, .cmd_flags and .rq_flags as names instead
of numbers.
Signed-off-by: Bart Van Assche
Cc: Omar Sandoval
Cc: Hannes Reinecke
---
block/blk-mq-debugfs.c | 72 +++---
1
This patch does not change any functionality but makes it possible
to produce a single line of output with multiple flag-to-name
translations.
Signed-off-by: Bart Van Assche
Cc: Omar Sandoval
Cc: Hannes Reinecke
---
On Tue, 2017-04-11 at 19:37 +0200, Paolo Valente wrote:
> Just pushed:
> https://github.com/Algodev-github/bfq-mq/tree/add-bfq-mq-logical
Thanks!
But are you aware that the code on that branch doesn't build?
$ make all
[ ... ]
ERROR: "bfq_mark_bfqq_busy" [block/bfq-wf2q.ko] undefined!
ERROR:
On Tue, 2017-04-11 at 14:03 -0400, Mike Snitzer wrote:
> Rather than working so hard to use DM code against me, your argument
> should be: "blk-mq drivers X, Y and Z rerun the hw queue; this is a well
> established pattern"
>
> I see drivers/nvme/host/fc.c:nvme_fc_start_fcp_op() does. But that
On Tue, Apr 11 2017 at 1:51pm -0400,
Bart Van Assche wrote:
> On Tue, 2017-04-11 at 13:47 -0400, Mike Snitzer wrote:
> > Other drivers will very likely be caught out by
> > this blk-mq quirk in the future.
>
> Hello Mike,
>
> Are you aware that the requirement
On Mon, 2017-04-10 at 09:54 -0600, Jens Axboe wrote:
> void blk_mq_stop_hw_queue(struct blk_mq_hw_ctx *hctx)
> {
> - cancel_work(&hctx->run_work);
> + cancel_delayed_work(&hctx->run_work);
> cancel_delayed_work(&hctx->delay_work);
> set_bit(BLK_MQ_S_STOPPED, &hctx->state);
> }
Hello Jens,
I would
On Tue, 2017-04-11 at 13:47 -0400, Mike Snitzer wrote:
> Other drivers will very likely be caught out by
> this blk-mq quirk in the future.
Hello Mike,
Are you aware that the requirement that blk-mq drivers rerun the queue after
having returned BLK_MQ_RQ_QUEUE_BUSY is a requirement that is
On Tue, Apr 11 2017 at 12:26pm -0400,
Bart Van Assche wrote:
> On Tue, 2017-04-11 at 12:09 -0400, Mike Snitzer wrote:
> > This has no place in dm-mq (or any blk-mq
> > driver). If it is needed it should be elevated to blk-mq core to
> > trigger
> Il giorno 11 apr 2017, alle ore 16:37, Bart Van Assche
> ha scritto:
>
> On Tue, 2017-04-11 at 15:42 +0200, Paolo Valente wrote:
>> new patch series, addressing (both) issues raised by Bart [1].
>
> Hello Paolo,
>
> Is there a git tree available somewhere with
On 04/10/2017 11:07 AM, Christoph Hellwig wrote:
> Now that we are using REQ_OP_WRITE_ZEROES for all zeroing needs in the
> kernel there is very little use left for REQ_OP_WRITE_SAME. We only
> have two callers left, and both just export optional protocol features
> to remote systems: DRBD and
On Mon, Apr 10, 2017 at 11:55:43AM +0200, Paolo Valente wrote:
>
> > Il giorno 10 apr 2017, alle ore 11:05, Andreas Herrmann
> > ha scritto:
> >
> > Hi Paolo,
> >
> > I've looked at your WIP branch as of 4.11.0-bfq-mq-rc4-00155-gbce0818
> > and did some fio tests to
On Tue, 2017-04-11 at 12:09 -0400, Mike Snitzer wrote:
> This has no place in dm-mq (or any blk-mq
> driver). If it is needed it should be elevated to blk-mq core to
> trigger blk_mq_delay_run_hw_queue() when BLK_MQ_RQ_QUEUE_BUSY is
> returned from blk_mq_ops' .queue_rq.
Hello Mike,
If the
On Fri, Apr 07 2017 at 2:16pm -0400,
Bart Van Assche wrote:
> While running the srp-test software I noticed that request
> processing stalls sporadically at the beginning of a test, namely
> when mkfs is run against a dm-mpath device. Every time when that
> happened
On Tue, 11 Apr 2017 11:38:27 +0200, Jan Kara wrote:
> when testing my fix for 0-day reports with writeback throttling I came
> across somewhat unexpected behavior with user interface of writeback
> throttling. So currently if CFQ is used as an IO scheduler, we disable
> writeback throttling
On Tue, 2017-04-11 at 16:31 +0200, Javier González wrote:
> Changes since v4:
> * Rebase on top of Matias' for-4.12/core
> * Fix type implicit conversions reported by sparse (reported by Bart Van
> Assche)
> * Make error and debug statistics long atomic variables.
Hello Javier,
Thanks for the
On 04/11/2017 04:31 PM, Javier González wrote:
This patch introduces pblk, a host-side translation layer for
Open-Channel SSDs to expose them like block devices. The translation
layer allows data placement decisions, and I/O scheduling to be
managed by the host, enabling users to optimize the
On Tue, 2017-04-11 at 15:42 +0200, Paolo Valente wrote:
> new patch series, addressing (both) issues raised by Bart [1].
Hello Paolo,
Is there a git tree available somewhere with these patches and without
the single queue BFQ scheduler?
Thanks,
Bart.
On 04/11/2017 04:18 PM, Javier González wrote:
sector_t is always unsigned, therefore avoid < 0 checks on it.
Signed-off-by: Javier González
---
drivers/lightnvm/rrpc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/lightnvm/rrpc.c
On 04/11/2017 04:18 PM, Javier González wrote:
Convert sprintf calls to strlcpy in order to make a possible buffer
overflow more obvious.
Signed-off-by: Javier González
---
drivers/lightnvm/core.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git
On 04/11/2017 04:18 PM, Javier González wrote:
Clean unused variable on lightnvm core.
Signed-off-by: Javier González
---
drivers/lightnvm/core.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c
index eb9ab1a..258007a
Changes since v4:
* Rebase on top of Matias' for-4.12/core
* Fix type implicit conversions reported by sparse (reported by Bart Van
Assche)
* Make error and debug statistics long atomic variables.
Changes since v3:
* Apply Bart's feedback [1]
* Implement dynamic L2P optimizations for > 32-bit
From: Goldwyn Rodrigues
If IOCB_NOWAIT is set, bail if the i_rwsem is not lockable
immediately.
If IOMAP_NOWAIT is set, return EAGAIN in xfs_file_iomap_begin
if it needs allocation either due to file extension, writing to a hole,
or COW or waiting for other DIOs to finish.
From: Goldwyn Rodrigues
IOCB_NOWAIT translates to IOMAP_NOWAIT for iomaps.
This is used by XFS in the XFS patch.
---
fs/iomap.c| 2 ++
include/linux/iomap.h | 1 +
2 files changed, 3 insertions(+)
diff --git a/fs/iomap.c b/fs/iomap.c
index
From: Goldwyn Rodrigues
Return EAGAIN if any of the following checks fail
+ i_rwsem is not lockable
+ NODATACOW or PREALLOC is not set
+ Cannot nocow at the desired location
+ Writing beyond end of file which is not allocated
Signed-off-by: Goldwyn Rodrigues
From: Goldwyn Rodrigues
Return EAGAIN if any of the following checks fail for direct I/O:
+ i_rwsem is lockable
+ Writing beyond end of file (will trigger allocation)
+ Blocks are not allocated at the write location
Signed-off-by: Goldwyn Rodrigues
From: Goldwyn Rodrigues
A new flag BIO_NOWAIT is introduced to identify bios
originating from an iocb with IOCB_NOWAIT. This flag indicates
that the bio should fail immediately if a request cannot be
made, instead of retrying.
To facilitate this, QUEUE_FLAG_NOWAIT is set to devices
which
From: Goldwyn Rodrigues
The check is in generic_file_write_iter(), which most filesystems
reach through fsops.write_iter(). Filesystems that define their own
.write_iter() instead perform the check in that function, which is
called
for
From: Goldwyn Rodrigues
Find out if the write will trigger a wait due to writeback. If yes,
return -EAGAIN.
This introduces a new function filemap_range_has_page() which
returns true if the file's mapping has a page within the range
mentioned.
Return -EINVAL for buffered
From: Goldwyn Rodrigues
This flag informs kernel to bail out if an AIO request will block
for reasons such as file allocations, or a writeback triggered,
or would block while allocating requests while performing
direct I/O.
Unfortunately, aio_flags is not checked for
From: Goldwyn Rodrigues
RWF_* flags are used for preadv2/pwritev2 calls. Port them for
use in aio operations as well. For this, aio_rw_flags is
introduced in struct iocb (using aio_reserved1) which will
carry these flags.
This is a precursor to the nowait AIO calls.
Note, the
Formerly known as non-blocking AIO.
This series adds nonblocking feature to asynchronous I/O writes.
io_submit() can be delayed for a number of reasons:
- Block allocation for files
- Data writebacks for direct I/O
- Sleeping because of waiting to acquire i_rwsem
- Congested block
Convert sprintf calls to strlcpy in order to make a possible buffer
overflow more obvious.
Signed-off-by: Javier González
---
drivers/lightnvm/core.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c
Last small lightnvm fixes for lightnvm core. These are motivated by
Bart's comments on pblk's patch.
Javier González (3):
lightnvm: clean unused variable
lightnvm: fix type checks on rrpc
lightnvm: convert sprintf into strlcpy
drivers/lightnvm/core.c | 9 +++--
Clean unused variable on lightnvm core.
Signed-off-by: Javier González
---
drivers/lightnvm/core.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c
index eb9ab1a..258007a 100644
--- a/drivers/lightnvm/core.c
+++
sector_t is always unsigned, therefore avoid < 0 checks on it.
Signed-off-by: Javier González
---
drivers/lightnvm/rrpc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/lightnvm/rrpc.c b/drivers/lightnvm/rrpc.c
index 5dba544..cf0e28a 100644
Convert sprintf calls to strlcpy in order to make a possible buffer
overflow more obvious.
Signed-off-by: Javier González
---
drivers/lightnvm/core.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c
On 04/11/2017 03:29 AM, Jan Kara wrote:
> When CFQ calls wbt_disable_default(), it will call
> blk_stat_remove_callback() to stop gathering IO statistics for the
> purposes of writeback throttling. Later, when request_queue is
> unregistered, wbt_exit() will call blk_stat_remove_callback() again
>
This patch deals with two sources of unfairness, which can also cause
high latencies and throughput loss. The first source is related to
write requests. Write requests tend to starve read requests, basically
because, on the one hand, writes are slower than reads, whereas, on
the other hand, storage
Unless the maximum budget B_max that BFQ can assign to a queue is set
explicitly by the user, BFQ automatically updates B_max. In
particular, BFQ dynamically sets B_max to the number of sectors that
can be read, at the current estimated peak rate, during the maximum
time, T_max, allowed before a
The feedback-loop algorithm used by BFQ to compute queue (process)
budgets is basically a set of three update rules, one for each of the
main reasons why a queue may be expired. If many processes suddenly
switch from sporadic I/O to greedy and sequential I/O, then these
rules are quite slow to
Hi Bart,
> On 10 Apr 2017, at 22.35, Bart Van Assche wrote:
>
> On 04/10/2017 11:36 AM, Javier González wrote:
>> Changes since v3:
>> * Apply Bart's feedback [1]
>
> Thanks for having addressed these comments. But please also make sure
> that the pblk driver builds
I/O schedulers typically allow NCQ-capable drives to prefetch I/O
requests, as NCQ boosts the throughput exactly by prefetching and
internally reordering requests.
Unfortunately, as discussed in detail and shown experimentally in [1],
this may cause fairness and latency guarantees to be violated.
To guarantee a low latency also to the I/O requests issued by soft
real-time applications, this patch introduces a further heuristic,
which weight-raises (in the sense explained in the previous patch)
also the queues associated with applications deemed soft real-time.
To be deemed as soft
This patch introduces a simple heuristic to load applications quickly,
and to perform the I/O requested by interactive applications just as
quickly. To this purpose, both a newly-created queue and a queue
associated with an interactive application (we explain in a moment how
BFQ decides whether
From: Arianna Avanzini
A set of processes may happen to perform interleaved reads, i.e.,
read requests whose union would give rise to a sequential read pattern.
There are two typical cases: first, processes reading fixed-size chunks
of data at a fixed distance from
This patch is basically the counterpart, for NCQ-capable rotational
devices, of the previous patch. Exactly as the previous patch does on
flash-based devices and for any workload, this patch disables device
idling on rotational devices, but only for random I/O. In fact, only
with these queues
From: Arianna Avanzini
Many popular I/O-intensive services or applications spawn or
reactivate many parallel threads/processes during short time
intervals. Examples are systemd during boot or git grep. These
services or applications benefit mostly from a high
When a bfq queue is set in service and when it is merged, a reference
to the I/O context associated with the queue is taken. This reference
is then released when the queue is deselected from service or
split. More precisely, the release of the reference is postponed to
when the scheduler lock is
This patch boosts the throughput on NCQ-capable flash-based devices,
while still preserving latency guarantees for interactive and soft
real-time applications. The throughput is boosted by just not idling
the device when the in-service queue remains empty, even if the queue
is sync and has a
Hi,
new patch series, addressing (both) issues raised by Bart [1].
Thanks,
Paolo
[1] https://lkml.org/lkml/2017/3/31/393
Arianna Avanzini (4):
block, bfq: add full hierarchical scheduling and cgroups support
block, bfq: add Early Queue Merge (EQM)
block, bfq: reduce idling only in
> Il giorno 02 apr 2017, alle ore 12:02, kbuild test robot ha
> scritto:
>
> Hi Paolo,
>
> [auto build test ERROR on block/for-next]
> [also build test ERROR on v4.11-rc4 next-20170331]
> [if your patch is applied to the wrong git tree, please drop us a note to
> help improve
On 04/10/2017 08:51 PM, Javier González wrote:
Prefix the nvm_free static function with a missing static keyword.
Signed-off-by: Javier González
---
drivers/lightnvm/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/lightnvm/core.c
On 04/07/2017 08:31 PM, Javier González wrote:
The NVMe I/O command control bits are 16 bytes, but are interpreted as
32 bytes in the lightnvm user I/O data path.
Signed-off-by: Javier González
---
drivers/nvme/host/lightnvm.c | 2 +-
1 file changed, 1 insertion(+), 1
On 04/07/2017 08:31 PM, Javier González wrote:
The dev->lun_map bits are cleared twice if a target init error occurs.
First in the target clean routine, and then again in the nvm_tgt_create
error path. Make sure that they are only cleared once by extending
nvm_remove_tgt_dev() with a clear
On 04/07/2017 08:31 PM, Javier González wrote:
Target initialization has two responsibilities: creating the target
partition and instantiating the target. This patch makes it possible to create a
factory partition (e.g., do not trigger recovery on the given target).
This is useful for target development
On 04/07/2017 08:31 PM, Javier González wrote:
Reorder disk allocation such that the disk structure can be put
safely.
Signed-off-by: Javier González
---
drivers/lightnvm/core.c | 14 +++---
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git
Hi,
when testing my fix for 0-day reports with writeback throttling I came
across somewhat unexpected behavior with user interface of writeback
throttling. So currently if CFQ is used as an IO scheduler, we disable
writeback throttling because they don't go well together. However, when a user
has
When CFQ calls wbt_disable_default(), it will call
blk_stat_remove_callback() to stop gathering IO statistics for the
purposes of writeback throttling. Later, when request_queue is
unregistered, wbt_exit() will call blk_stat_remove_callback() again
which will try to delete callback from the list
> Il giorno 10 apr 2017, alle ore 18:56, Bart Van Assche
> ha scritto:
>
> On Fri, 2017-03-31 at 14:47 +0200, Paolo Valente wrote:
>> [ ... ]
>
> Hello Paolo,
>
> Is the git tree that is available at https://github.com/Algodev-github/bfq-mq
> appropriate for
> Il giorno 10 apr 2017, alle ore 17:15, Bart Van Assche
> ha scritto:
>
> On Mon, 2017-04-10 at 11:55 +0200, Paolo Valente wrote:
>> That said, if you do always want maximum throughput, even at the
>> expense of latency, then just switch off low-latency heuristics,
> Il giorno 10 apr 2017, alle ore 11:55, Paolo Valente
> ha scritto:
>
>>
>> Il giorno 10 apr 2017, alle ore 11:05, Andreas Herrmann
>> ha scritto:
>>
>> Hi Paolo,
>>
>> I've looked at your WIP branch as of 4.11.0-bfq-mq-rc4-00155-gbce0818
>>
> On 10 Apr 2017, at 20.56, Bart Van Assche wrote:
>
> On Mon, 2017-04-10 at 20:51 +0200, Javier González wrote:
>> Convert sprintf calls to snprintf in order to make a possible buffer
>> overflow more obvious.
>>
>> Signed-off-by: Javier González