On Wed, 2018-05-09 at 13:50 -0600, Jens Axboe wrote:
> On 5/9/18 12:31 PM, Mike Galbraith wrote:
> > On Wed, 2018-05-09 at 11:01 -0600, Jens Axboe wrote:
> >> On 5/9/18 10:57 AM, Mike Galbraith wrote:
> >>
> >>>>> Confirmed. Impressive high speed bug stomping. [...]
On Wed, 2018-05-09 at 11:01 -0600, Jens Axboe wrote:
> On 5/9/18 10:57 AM, Mike Galbraith wrote:
>
> >>> Confirmed. Impressive high speed bug stomping.
> >>
> >> Well, that's good news. Can I get you to try this patch?
> >
> > Sure thing. [...]
On Wed, 2018-05-09 at 09:18 -0600, Jens Axboe wrote:
> On 5/8/18 10:11 PM, Mike Galbraith wrote:
> > On Tue, 2018-05-08 at 19:09 -0600, Jens Axboe wrote:
> >>
> >> Alright, I managed to reproduce it. What I think is happening is that
> >> BFQ is limiting the inflight case to something less than the wake
> >> batch for sbitmap, which can lead to stalls. [...]
On Tue, 2018-05-08 at 14:37 -0600, Jens Axboe wrote:
>
> - sdd has nothing pending, yet has 6 active waitqueues.
sdd is where ccache storage lives, so that should have been the only
activity on that drive: I was building source on sdb, and doing nothing
else that utilizes sdd.
-Mike
On Tue, 2018-05-08 at 19:09 -0600, Jens Axboe wrote:
>
> Alright, I managed to reproduce it. What I think is happening is that
> BFQ is limiting the inflight case to something less than the wake
> batch for sbitmap, which can lead to stalls. I don't have time to test
> this tonight, but perhaps you [...]
On Tue, 2018-05-08 at 08:55 -0600, Jens Axboe wrote:
>
> All the block debug files are empty...
Sigh. Take 2, this time cat debug files, having turned block tracing
off before doing anything else (so trace bits in dmesg.txt should end
AT the stall).
-Mike
Attachment: dmesg.xz
On Tue, 2018-05-08 at 06:51 +0200, Mike Galbraith wrote:
>
> I'm deadlined ATM, but will get to it.
(Bah, even a zombie can type ccache -C; make -j8 and stare...)
kbuild again hung on the first go (yay), and post hang data written to
sdd1 survived (kernel source lives in sdb3).
On Mon, 2018-05-07 at 20:02 +0200, Paolo Valente wrote:
>
>
> > Is there a reproducer?
Just building fat config kernels works for me. It was highly
non-deterministic, but reproduced quickly twice in a row with Paolo's hack.
> Ok Mike, I guess it's your turn now, for at least a stack trace.
On Mon, 2018-05-07 at 11:27 +0200, Paolo Valente wrote:
>
>
> Where is the bug?
Hm, seems potent pain-killers and C don't mix all that well.
On Sun, 2018-05-06 at 09:42 +0200, Paolo Valente wrote:
>
> diff --git a/block/bfq-mq-iosched.c b/block/bfq-mq-iosched.c
> index 118f319af7c0..6662efe29b69 100644
> --- a/block/bfq-mq-iosched.c
> +++ b/block/bfq-mq-iosched.c
> @@ -525,8 +525,13 @@ static void bfq_limit_depth(unsigned int op, struc
On Mon, 2018-05-07 at 04:43 +0200, Mike Galbraith wrote:
> On Sun, 2018-05-06 at 09:42 +0200, Paolo Valente wrote:
> >
> > I've attached a compressed patch (to avoid possible corruption from my
> > mailer). I'm little confident, but no pain, no gain, right?
On Sun, 2018-05-06 at 09:42 +0200, Paolo Valente wrote:
>
> I've attached a compressed patch (to avoid possible corruption from my
> mailer). I'm little confident, but no pain, no gain, right?
>
> If possible, apply this patch on top of the fix I proposed in this
> thread, just to eliminate possible [...]
On Sat, 2018-05-05 at 12:39 +0200, Paolo Valente wrote:
>
> BTW, if you didn't run out of patience with this permanent issue yet,
> I was thinking of two or three changes to try again to trigger your failure
> reliably.
Sure, fire away, I'll happily give the annoying little bugger
opportunities to show [...]
On Fri, 2018-05-04 at 21:46 +0200, Mike Galbraith wrote:
> Tentatively, I suspect you've just fixed the nasty stalls I reported a
> while back.
Oh well, so much for optimism. It took a lot, but just hung.
Tentatively, I suspect you've just fixed the nasty stalls I reported a
while back. Not a hint of stall as yet (should have shown itself by
now), spinning rust buckets are being all they can be, box feels good.
Later mq-deadline (I hope to eventually forget the module dependency
eternities we've s[...]
On Fri, 2018-02-09 at 14:21 +0100, Oleksandr Natalenko wrote:
>
> In addition to this I think it should be worth considering CC'ing Greg
> to pull this fix into 4.15 stable tree.
This isn't one he can cherry-pick; some munging is required, in which case
he usually wants a properly tested backport.
On Wed, 2018-02-07 at 12:12 +0100, Paolo Valente wrote:
> Just to be certain, before submitting a new patch: you changed *only*
> the BUG_ON at line 4742, on top of my instrumentation patch.
Nah, I completely rewrote it with only a little help from an ouija
board to compensate for missing (all) k[...]
On Wed, 2018-02-07 at 11:27 +0100, Paolo Valente wrote:
>
> 2. Could you please turn that BUG_ON into:
> if (!(rq->rq_flags & RQF_ELVPRIV))
> return;
> and see what happens?
That seems to make it forget how to make boom.
-Mike
On Wed, 2018-02-07 at 11:27 +0100, Paolo Valente wrote:
>
> 1. Could you paste a stack trace for this OOPS, just to understand how we
> get there?
[ 442.421058] kernel BUG at block/bfq-iosched.c:4742!
[ 442.421762] invalid opcode: [#1] SMP PTI
[ 442.422436] Dumping ftrace buffer:
[ 442.4 [...]
On Wed, 2018-02-07 at 10:45 +0100, Paolo Valente wrote:
>
> > Il giorno 07 feb 2018, alle ore 10:23, Mike Galbraith ha
> > scritto:
> >
> > On Wed, 2018-02-07 at 10:08 +0100, Paolo Valente wrote:
> >>
> >> The first piece of information I need is whether this failure happens
> >> even without "BFQ hierarchical scheduling support".
On Wed, 2018-02-07 at 10:08 +0100, Paolo Valente wrote:
>
> The first piece of information I need is whether this failure happens
> even without "BFQ hierarchical scheduling support".
I presume you mean BFQ_GROUP_IOSCHED, which I do not have enabled.
-Mike
On Tue, 2018-02-06 at 13:43 +0100, Holger Hoffstätte wrote:
>
> A much more interesting question to me is why there is kyber in the middle. :)
Yeah, odd given that, per sysfs, I have zero devices using kyber.
-Mike
On Tue, 2018-02-06 at 13:26 +0100, Paolo Valente wrote:
>
> ok, right in the middle of bfq this time ... Was this the first OOPS in your
> kernel log?
Yeah.
On Tue, 2018-02-06 at 13:16 +0100, Oleksandr Natalenko wrote:
> Hi.
>
> 06.02.2018 12:57, Mike Galbraith wrote:
> > Not me. Box seems to be fairly sure that it is bfq. Twice again box
> > went belly up on me in fairly short order with bfq, but seemed fine
> > with [...]
On Tue, 2018-02-06 at 10:38 +0100, Paolo Valente wrote:
>
> Hi Mike,
> as you can imagine, I didn't get any failure in my pre-submission
> tests on this patch. In addition, it is not that easy to link this
> patch, which just adds some internal bfq housekeeping in case of a
> requeue, with a corr[...]
On Tue, 2018-02-06 at 09:37 +0100, Oleksandr Natalenko wrote:
> Hi.
>
> 06.02.2018 08:56, Mike Galbraith wrote:
> > I was doing kbuilds, and it blew up on me twice. Switching back to cfq
> > seemed to confirm it was indeed the patch causing trouble, but that's [...]
On Tue, 2018-02-06 at 08:44 +0100, Oleksandr Natalenko wrote:
> Hi, Paolo.
>
> I can confirm that this patch fixes the cfdisk hang for me. I've also tried
> to trigger the issue Mike has encountered, but with no luck (maybe I
> wasn't insistent enough; I was just doing dd on a usb-storage device in the [...]
Hi Paolo,
I applied this to master.today, flipped udev back to bfq and took it
for a spin. Unfortunately, box fairly quickly went boom under load.
[ 454.739975] [ cut here ]
[ 454.739979] list_add corruption. prev->next should be next
(5f99a42a), but was [...]
On Thu, 2017-12-14 at 22:54 +0100, Peter Zijlstra wrote:
> On Thu, Dec 14, 2017 at 09:42:48PM +, Bart Van Assche wrote:
>
> > Some time ago the block layer was changed to handle timeouts in thread
> > context
> > instead of interrupt context. See also commit 287922eb0b18 ("block: defer
> > timeouts to a workqueue").
On Sun, 2017-12-03 at 17:47 -0700, Jens Axboe wrote:
> On 12/03/2017 05:44 PM, Eric Biggers wrote:
> >
> >>> #syz fix: blktrace: fix trace mutex deadlock
> >>
> >> This is fixed in current -git.
> >>
> >
> > I know, but syzbot needed to be told what commit fixes the bug.
> > See https://github.co
On Thu, 2017-08-31 at 19:12 +0200, Paolo Valente wrote:
> > Il giorno 31 ago 2017, alle ore 19:06, Mike Galbraith ha
> > scritto:
> >
> > On Thu, 2017-08-31 at 15:42 +0100, Mel Gorman wrote:
> >> On Thu, Aug 31, 2017 at 08:46:28AM +0200, Paolo Valente wrote:
On Thu, 2017-08-31 at 15:42 +0100, Mel Gorman wrote:
> On Thu, Aug 31, 2017 at 08:46:28AM +0200, Paolo Valente wrote:
> > [SECOND TAKE, with just the name of one of the tester fixed]
> >
> > Hi,
> > while testing the read-write unfairness issues reported by Mel, I
> > found BFQ failing to guarantee [...]
On Tue, 2017-08-08 at 18:50 +0200, Mike Galbraith wrote:
> On Tue, 2017-08-08 at 09:44 -0700, Greg KH wrote:
> >
> > Should these go back farther than 4.12? Looks like they apply cleanly
> > to 4.9, didn't look older than that...
>
> I met prerequisites at 4.11, but I wasn't patching anything remotely
> resembling virgin source.
On Tue, 2017-08-08 at 09:44 -0700, Greg KH wrote:
>
> Should these go back farther than 4.12? Looks like they apply cleanly
> to 4.9, didn't look older than that...
I met prerequisites at 4.11, but I wasn't patching anything remotely
resembling virgin source.
-Mike
On Tue, 2017-08-08 at 09:22 -0700, Greg KH wrote:
> On Sun, Jul 30, 2017 at 03:50:15PM +0200, Oleksandr Natalenko wrote:
> > Hello Mike et al.
> >
> > On neděle 30. července 2017 7:12:31 CEST Mike Galbraith wrote:
> > > FWIW, first thing I'd do is update [...]
On Sat, 2017-07-29 at 17:27 +0200, Oleksandr Natalenko wrote:
> Hello Jens, Christoph.
>
> Unfortunately, even with "block: disable runtime-pm for blk-mq" patch applied
> blk-mq breaks suspend to RAM for me. It is reproducible on my laptop as well
> as in a VM.
>
> I use a complex disk layout inv[...]
On Thu, 2017-06-08 at 10:17 +0800, James Wang wrote:
> This condition check existed before commit b5dd2f6047ca ("block: loop:
> improve performance via blk-mq"). When MQ support was added to the loop
> device, the check was removed because the '->lo_thread' member was
> removed. Upstream then added '->worker[...]