RE: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements")

2017-02-15 Thread Dexuan Cui
> From: h...@lst.de [mailto:h...@lst.de] > Sent: Wednesday, February 15, 2017 00:35 > > I tested today's linux-next (next-20170214) + the 2 patches just now and > got > > a weird result: > > sometimes the VM still hung with a new calltrace (BUG: spinlock bad > > magic), but sometimes the VM did

Re: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements")

2017-02-14 Thread h...@lst.de
> I tested today's linux-next (next-20170214) + the 2 patches just now and got > a weird result: > sometimes the VM still hung with a new calltrace (BUG: spinlock bad > magic), but sometimes the VM did boot up despite the new calltrace! > > Attached is the log of a "good" boot. > > It looks

RE: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements")

2017-02-14 Thread Dexuan Cui
) ads...@microsoft.com>; Chris Valean (Cloudbase Solutions SRL) chv...@microsoft.com> > Subject: Re: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock > when scheduling workqueue elements") > > On Tue, Feb 14, 2017 at 02:46:41PM +, Dexuan Cui

Re: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements")

2017-02-14 Thread h...@lst.de
On Tue, Feb 14, 2017 at 02:46:41PM +, Dexuan Cui wrote: > > From: h...@lst.de [mailto:h...@lst.de] > > Sent: Tuesday, February 14, 2017 22:29 > > To: Dexuan Cui <de...@microsoft.com> > > Subject: Re: Boot regression (was "Re: [PATCH] genhd: Do not hold event

RE: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements")

2017-02-14 Thread Dexuan Cui
> From: h...@lst.de [mailto:h...@lst.de] > Sent: Tuesday, February 14, 2017 22:29 > To: Dexuan Cui <de...@microsoft.com> > Subject: Re: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock > when scheduling workqueue elements") > > Ok, thanks for

Re: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements")

2017-02-14 Thread h...@lst.de
Ok, thanks for testing. Can you try the patch below? It fixes a clear problem which was partially papered over before the commit you bisected to, although it can't explain why blk-mq still works. From e4a66856fa2d92c0298000de658365f31bea60cd Mon Sep 17 00:00:00 2001 From: Christoph Hellwig

RE: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements")

2017-02-14 Thread Dexuan Cui
> From: h...@lst.de [mailto:h...@lst.de] > > Hi Dexuan, > > can you try the hack below for now? I disable the TUR call from > sd_check_events, which I think your VM is hanging on. The checks > it does on the sense data look a bit fishy, but so far I've not > identified a possible root cause. >

RE: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements")

2017-02-08 Thread Dexuan Cui
artin.peter...@oracle.com>; h...@lst.de; linux- > ker...@vger.kernel.org; linux-block@vger.kernel.org; j...@kernel.org > Subject: Re: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock > when scheduling workqueue elements") > > On Wed, Feb 08, 2017 at 10

Re: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements")

2017-02-08 Thread h...@lst.de
On Wed, Feb 08, 2017 at 10:43:59AM -0700, Jens Axboe wrote: > I've changed the subject line, this issue has nothing to do with the > issue that Hannes was attempting to fix. Nothing really useful in the thread. Dexuan, can you throw in some prints to see which command times out?

Boot regression (was "Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements")

2017-02-08 Thread Jens Axboe
h...@lst.de; linux-ker...@vger.kernel.org; linux-block@vger.kernel.org; >> j...@kernel.org >> Subject: Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue >> elements >> >> On 02/06/2017 11:29 PM, Dexuan Cui wrote: >>>> From: linux-bl

RE: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements

2017-02-08 Thread Dexuan Cui
rnel.org; > j...@kernel.org > Subject: Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue > elements > > On 02/06/2017 11:29 PM, Dexuan Cui wrote: > >> From: linux-block-ow...@vger.kernel.org [mailto:linux-block- > >> ow...@vger.kernel.org] On Behal

Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements

2017-02-07 Thread Jens Axboe
On 02/06/2017 11:29 PM, Dexuan Cui wrote: >> From: linux-block-ow...@vger.kernel.org [mailto:linux-block- >> ow...@vger.kernel.org] On Behalf Of Dexuan Cui >> with the linux-next kernel. >> >> I can boot the guest with linux-next's next-20170130 without any issue, >> but since next-20170131 I

Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements

2017-02-06 Thread Bart Van Assche
On Tue, 2017-02-07 at 02:23 +, Dexuan Cui wrote: > Any news on this thread? > > The issue is still blocking Linux from booting up normally in my test. :-( > > Have we identified the faulty patch? > If so, at least I can try to revert it to boot up. It's interesting that you have a

RE: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements

2017-02-06 Thread Dexuan Cui
> Cc: h...@lst.de; linux-ker...@vger.kernel.org; linux-block@vger.kernel.org; > j...@kernel.org > Subject: RE: [PATCH] genhd: Do not hold event lock when scheduling workqueue > elements > > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel- > > ow...@

RE: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements

2017-02-03 Thread Dexuan Cui
x-ker...@vger.kernel.org; linux-block@vger.kernel.org; > j...@kernel.org > Subject: Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue > elements > > On 01/31/2017 01:31 AM, Bart Van Assche wrote: > > On Wed, 2017-01-18 at 10:48 +0100, Hannes Reinecke wrote: >

Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements

2017-01-31 Thread Hannes Reinecke
On 01/31/2017 01:31 AM, Bart Van Assche wrote: > On Wed, 2017-01-18 at 10:48 +0100, Hannes Reinecke wrote: >> @@ -1488,26 +1487,13 @@ static unsigned long disk_events_poll_jiffies(struct >> gendisk *disk) >> void disk_block_events(struct gendisk *disk) >> { >> struct disk_events *ev =

Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements

2017-01-30 Thread Bart Van Assche
On Wed, 2017-01-18 at 10:48 +0100, Hannes Reinecke wrote: > @@ -1488,26 +1487,13 @@ static unsigned long disk_events_poll_jiffies(struct > gendisk *disk) >  void disk_block_events(struct gendisk *disk) >  { > struct disk_events *ev = disk->ev; > -   unsigned long flags; > -   bool

[PATCH] genhd: Do not hold event lock when scheduling workqueue elements

2017-01-18 Thread Hannes Reinecke
When scheduling workqueue elements the callback function might be called directly, so holding the event lock is potentially dangerous as it might lead to a deadlock: [ 989.542827] INFO: task systemd-udevd:459 blocked for more than 480 seconds. [ 989.609721] Not tainted 4.10.0-rc4+ #546 [