On Fri, Feb 09, 2018 at 10:28:23AM +0530, Kashyap Desai wrote:
> > -Original Message-
> > From: Ming Lei [mailto:ming@redhat.com]
> > Sent: Thursday, February 8, 2018 10:23 PM
> > To: Hannes Reinecke
> > Cc: Kashyap Desai; Jens Axboe; linux-block@vger.kernel.org; Christoph
> > Hellwig;
> -Original Message-
> From: Ming Lei [mailto:ming@redhat.com]
> Sent: Thursday, February 8, 2018 10:23 PM
> To: Hannes Reinecke
> Cc: Kashyap Desai; Jens Axboe; linux-block@vger.kernel.org; Christoph
> Hellwig; Mike Snitzer; linux-s...@vger.kernel.org; Arun Easi; Omar Sandoval;
> Marti
On Thu, Feb 08, 2018 at 12:59:47PM -0600, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues
>
> Since we can return less than count in case of partial direct
> writes, remove the ASSERT.
>
> Signed-off-by: Goldwyn Rodrigues
Looks ok,
Reviewed-by: Darrick J. Wong
--D
Hi Tejun,
On 18/2/8 23:23, Tejun Heo wrote:
> Hello, Joseph.
>
> On Thu, Feb 08, 2018 at 10:29:43AM +0800, Joseph Qi wrote:
>> So you mean checking css->refcnt to prevent the further use of
>> blkg_get? I think it makes sense.
>
> Yes.
>
>> IMO, we should use css_tryget_online instead, and righ
Hi,
I'd like to attend LSF/MM to talk about the state of I/O scheduling in
blk-mq. I can present some results about mq-deadline and Kyber on our
production workloads. I'd also like to talk about what's next, in
particular, improvements and features for Kyber. Finally, I'd like to
participate in Mi
From: Goldwyn Rodrigues
In case direct I/O encounters an error midway, it returns the error.
Instead it should be returning the number of bytes transferred so far.
Test case for filesystems (with ENOSPC):
1. Create an almost full filesystem
2. Create a file, say /mnt/lastfile, until the filesyst
From: Goldwyn Rodrigues
Since we can return less than count in case of partial direct
writes, remove the ASSERT.
Signed-off-by: Goldwyn Rodrigues
---
fs/xfs/xfs_file.c | 6 --
1 file changed, 6 deletions(-)
Changes since v6:
- Reordered to before direct write fix
diff --git a/fs/xfs/xfs
On Thu, 2018-02-08 at 18:38 +0100, Danil Kipnis wrote:
> thanks for the link to the article. To the best of my understanding,
> the guys suggest to authenticate the devices first and only then
> authenticate the users who use the devices in order to get access to a
> corporate service. They also me
On Thu, Feb 08, 2018 at 05:48:32PM +, Bart Van Assche wrote:
> On Thu, 2018-02-08 at 09:40 -0800, t...@kernel.org wrote:
> > Heh, sorry about not being clear. What I'm trying to say is that
> > scmd->device != NULL && device->host == NULL. Or was this what you
> > were saying all along?
>
>
On Thu, 2018-02-08 at 09:40 -0800, t...@kernel.org wrote:
> Heh, sorry about not being clear. What I'm trying to say is that
> scmd->device != NULL && device->host == NULL. Or was this what you
> were saying all along?
What I agree with is that the request pointer (req argument) is stored in %rd
On Thu, Feb 08, 2018 at 05:37:46PM +, Bart Van Assche wrote:
> On Thu, 2018-02-08 at 09:19 -0800, t...@kernel.org wrote:
> > Hello, Bart.
> >
> > On Thu, Feb 08, 2018 at 05:10:45PM +, Bart Van Assche wrote:
> > > I think "dereferencing a pointer" means reading the memory location that
> >
On Wed, Feb 7, 2018 at 6:32 PM, Bart Van Assche wrote:
> On Wed, 2018-02-07 at 18:18 +0100, Roman Penyaev wrote:
>> So the question is: are there real life setups where
>> some of the local IB network members can be untrusted?
>
> Hello Roman,
>
> You may want to read more about the latest evoluti
On Thu, 2018-02-08 at 09:19 -0800, t...@kernel.org wrote:
> Hello, Bart.
>
> On Thu, Feb 08, 2018 at 05:10:45PM +, Bart Van Assche wrote:
> > I think "dereferencing a pointer" means reading the memory location that
> > pointer points at? Anyway, I think we both interpret the crash report
On Sat, 2018-02-03 at 12:21 +0800, Ming Lei wrote:
> diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
> index 61deab0b5a5a..a68323fa0c02 100644
> --- a/block/blk-mq-tag.h
> +++ b/block/blk-mq-tag.h
> @@ -11,10 +11,14 @@ struct blk_mq_tags {
> unsigned int nr_tags;
> unsigned int nr_
Hello, Bart.
On Thu, Feb 08, 2018 at 05:10:45PM +, Bart Van Assche wrote:
> I think "dereferencing a pointer" means reading the memory location that
> pointer points at? Anyway, I think we both interpret the crash report in
> the same way, namely that it means that scmd->device == NULL.
On Thu, 2018-02-08 at 09:00 -0800, t...@kernel.org wrote:
> On Thu, Feb 08, 2018 at 04:31:43PM +, Bart Van Assche wrote:
> > The crash is reported at address scsi_times_out+0x17 == scsi_times_out+23.
> > The instruction at that address tries to dereference scsi_cmnd.device (%rax).
> > The
Hello, Bart.
On Thu, Feb 08, 2018 at 04:31:43PM +, Bart Van Assche wrote:
> > That sounds more like a scsi hotplug bug than an issue in the timeout
> > code unless we messed up @req pointer to begin with.
>
> I don't think that this is related to SCSI hotplugging: this crash does not
> occur
On Thu, Feb 08, 2018 at 08:00:29AM +0100, Hannes Reinecke wrote:
> On 02/07/2018 03:14 PM, Kashyap Desai wrote:
> >> -Original Message-
> >> From: Ming Lei [mailto:ming@redhat.com]
> >> Sent: Wednesday, February 7, 2018 5:53 PM
> >> To: Hannes Reinecke
> >> Cc: Kashyap Desai; Jens Axboe
On Thu, 2018-02-08 at 07:39 -0800, t...@kernel.org wrote:
> On Thu, Feb 08, 2018 at 01:09:57AM +, Bart Van Assche wrote:
> > On Wed, 2018-02-07 at 23:48 +, Bart Van Assche wrote:
> > > With this patch applied I see requests for which it seems like the
> > > timeout handler did not ge
I think it'd be simpler to have blk_poll set it back to running if
need_resched is true rather than repeat this pattern across all the
callers:
---
diff --git a/block/blk-mq.c b/block/blk-mq.c
index df93102e2149..40285fe1c8ad 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3164,6 +3164,7 @@
Currently bcache does not handle backing device failure: if a backing
device is offline and disconnected from the system, its bcache device can still
be accessible. If the bcache device is in writeback mode, I/O requests can
even succeed if the requests hit the cache device. That is to say, when and
how b
If a bcache device is configured to writeback mode, current code does not
handle write I/O errors on backing devices properly.
In writeback mode, a write request is written to the cache device, and
later flushed to the backing device. If I/O failed when writing from
cache device to the backing device
When there are too many I/O errors on cache device, current bcache code
will retire the whole cache set, and detach all bcache devices. But the
detached bcache devices are not stopped, which is problematic when bcache
is in writeback mode.
If the retired cache set has dirty data of backing devices
In order to catch I/O errors of a backing device, a separate bi_end_io
callback is required. Then a per backing device counter can record the
number of I/O errors and retire the backing device if the counter reaches a
per backing device I/O error limit.
This patch adds backing_request_endio() to bcache bac
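The counter-and-limit idea described in this preview can be illustrated roughly as follows. This is only a userspace sketch, not the code from the patch: the struct, its fields, and backing_endio_sketch() are hypothetical names made up for illustration, and the real bcache callback works on struct bio and kernel data structures.

```c
#include <stdbool.h>

/* Hypothetical per backing device state; not the real bcache structures. */
struct backing_dev_sketch {
	unsigned int io_errors;   /* I/O errors seen so far */
	unsigned int error_limit; /* per backing device I/O error limit */
	bool retired;             /* set once the limit is reached */
};

/*
 * Hypothetical completion callback: status is 0 on success and a
 * negative errno-style value on failure. Each failure bumps the
 * counter; the device is marked retired when the limit is reached.
 */
static void backing_endio_sketch(struct backing_dev_sketch *dc, int status)
{
	if (status == 0 || dc->retired)
		return;

	if (++dc->io_errors >= dc->error_limit)
		dc->retired = true;
}
```

With error_limit set to 3, two failed completions leave the device active and the third one retires it; successful completions never touch the counter.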
From: Tang Junhui
When we run I/O on a detached device and run iostat to show I/O status,
it normally shows output like below (some fields omitted):
Device: ... avgrq-sz avgqu-sz await r_await w_await svctm %util
sdd     ...    15.89     0.53  1.82    0.20    2.23  1.81 52.30
bcache0..
When too many I/Os failed on cache device, bch_cache_set_error() is called
in the error handling code path to retire whole problematic cache set. If
new I/O requests continue to come and take refcount dc->count, the cache
set won't be retired immediately, this is a problem.
Furthermore, there are
struct delayed_work writeback_rate_update in struct cache_dev is a delayed
worker to call function update_writeback_rate() periodically (the interval is
defined by dc->writeback_rate_update_seconds).
When a metadata I/O error happens on cache device, bcache error handling
routine bch_cache_set_error(
In patch "bcache: fix cached_dev->count usage for bch_cache_set_error()",
cached_dev_get() is called when creating dc->writeback_thread, and
cached_dev_put() is called when exiting dc->writeback_thread. This
modification works well unless people detach the bcache device manually by
'echo 1 > /s
When bcache metadata I/O fails, bcache will call bch_cache_set_error()
to retire the whole cache set. The expected behavior to retire a cache
set is to unregister the cache set, and unregister all backing device
attached to this cache set, then remove sysfs entries of the cache set
and all attached
Hi maintainers and folks,
This patch set tries to improve bcache device failure handling, including
cache device and backing device failures.
The basic idea to handle failed cache device is,
- Unregister cache set
- Detach all backing devices which are attached to this cache set
- Stop all the det
Hello,
On Thu, Feb 08, 2018 at 11:39:19AM +0800, xuejiufei wrote:
> Hi Tejun,
>
> Could you please kindly review this patch or give some advice?
I don't have anything against it but let's wait for Shaohua.
Thanks.
--
tejun
On Thu, Feb 08, 2018 at 07:39:40AM -0800, t...@kernel.org wrote:
> That sounds more like a scsi hotplug but than an issue in the timeout
^bug
> code unless we messed up @req pointer to begin with.
--
tejun
On Thu, Feb 08, 2018 at 01:09:57AM +, Bart Van Assche wrote:
> On Wed, 2018-02-07 at 23:48 +, Bart Van Assche wrote:
> > With this patch applied I see requests for which it seems like the timeout
> > handler did not get invoked: [ ... ]
>
> I just noticed the following in the system l
On Mon, 2018-02-05 at 23:20 +0800, Ming Lei wrote:
> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> index 55c0a745b427..385bbec73804 100644
> --- a/block/blk-mq-sched.c
> +++ b/block/blk-mq-sched.c
> @@ -81,6 +81,17 @@ static bool blk_mq_sched_restart_hctx(struct blk_mq_hw_ctx *hctx)
On Sun, May 30, 2083 at 09:51:06AM +0530, Nitesh Shetty wrote:
> This removes the dependency on interrupts to wake up task. Set task
> state as TASK_RUNNING, if need_resched() returns true,
> while polling for IO completion.
> Earlier, polling task used to sleep, relying on interrupt to wake it up.
Hello, Joseph.
On Thu, Feb 08, 2018 at 10:29:43AM +0800, Joseph Qi wrote:
> So you mean checking css->refcnt to prevent the further use of
> blkg_get? I think it makes sense.
Yes.
> IMO, we should use css_tryget_online instead, and rightly after taking
Not really. An offline css still can have
This removes the dependency on interrupts to wake up task. Set task
state as TASK_RUNNING, if need_resched() returns true,
while polling for IO completion.
Earlier, polling task used to sleep, relying on interrupt to wake it up.
This made some IO take very long when interrupt-coalescing is enabled
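The behaviour described in this preview can be modelled in userspace roughly as follows. This is a simulation under stated assumptions, not the kernel change itself: struct poll_sim, sim_io_done(), sim_need_resched() and poll_sketch() are hypothetical stand-ins, and "ticks" replace real polling iterations.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel task states. */
enum task_state_sketch { SKETCH_TASK_RUNNING, SKETCH_TASK_INTERRUPTIBLE };

/* Simulated environment for one polled I/O. */
struct poll_sim {
	int completion_tick; /* tick at which the simulated I/O completes */
	int resched_tick;    /* tick at which rescheduling is requested, -1 for never */
};

static bool sim_io_done(const struct poll_sim *s, int tick)
{
	return tick >= s->completion_tick;
}

static bool sim_need_resched(const struct poll_sim *s, int tick)
{
	return s->resched_tick >= 0 && tick >= s->resched_tick;
}

/*
 * Poll for up to max_ticks. Returns true if the I/O completed while
 * polling. *state is the task state on exit: when polling stops because
 * rescheduling was requested, the state is set back to running, so the
 * task does not depend on an interrupt to be woken.
 */
static bool poll_sketch(const struct poll_sim *s, int max_ticks,
			enum task_state_sketch *state)
{
	*state = SKETCH_TASK_INTERRUPTIBLE;

	for (int tick = 0; tick < max_ticks; tick++) {
		if (sim_io_done(s, tick)) {
			*state = SKETCH_TASK_RUNNING;
			return true;
		}
		if (sim_need_resched(s, tick)) {
			*state = SKETCH_TASK_RUNNING;
			return false;
		}
	}
	return false;
}
```

If the loop runs out of ticks without either event, the state stays interruptible, which corresponds to the earlier behaviour of sleeping until an interrupt arrives.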
> On 5 Feb 2018, at 13.15, Matias Bjørling wrote:
>
> Implement the geometry data structures for 2.0 and enable a drive
> to be identified as one, including exposing the appropriate 2.0
> sysfs entries.
>
> Signed-off-by: Matias Bjørling
> ---
> drivers/lightnvm/core.c | 2 +-
> drivers/n
On 02/06/2018 12:54 PM, hans.ml.holmb...@owltronix.com wrote:
From: Hans Holmberg
When pblk receives a sync, all data up to that point in the write buffer
must be committed to persistent storage, and as flash memory comes with a
minimal write size there is a significant cost involved both in ter
On 02/08/2018 10:35 AM, Javier Gonzalez wrote:
On 5 Feb 2018, at 13.15, Matias Bjørling wrote:
Hi,
A couple of patches for 2.0 support for the lightnvm subsystem. They
form the basis for integrating 2.0 support.
For the rest of the support, Javier has code that implements report
chunk and set
> On 5 Feb 2018, at 13.15, Matias Bjørling wrote:
>
> Hi,
>
> A couple of patches for 2.0 support for the lightnvm subsystem. They
> form the basis for integrating 2.0 support.
>
> For the rest of the support, Javier has code that implements report
> chunk and sets up the LBA format data struct