Re: blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)

2014-09-17 Thread Christian Borntraeger
On 09/12/2014 10:09 PM, Christian Borntraeger wrote: On 09/12/2014 01:54 PM, Ming Lei wrote: On Thu, Sep 11, 2014 at 6:26 PM, Christian Borntraeger borntrae...@de.ibm.com wrote: Folks, we have seen the following bug with 3.16 as a KVM guest. I suspect the blk-mq rework that happened

Re: blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)

2014-09-17 Thread Ming Lei
On Wed, Sep 17, 2014 at 3:59 PM, Christian Borntraeger borntrae...@de.ibm.com wrote: On 09/12/2014 10:09 PM, Christian Borntraeger wrote: On 09/12/2014 01:54 PM, Ming Lei wrote: On Thu, Sep 11, 2014 at 6:26 PM, Christian Borntraeger borntrae...@de.ibm.com wrote: Folks, we have seen the

Re: blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)

2014-09-17 Thread Ming Lei
On Wed, 17 Sep 2014 14:00:34 +0200 David Hildenbrand d...@linux.vnet.ibm.com wrote: Does anyone have an idea? The request itself is completely filled with cc That is very weird, the 'rq' is got from hctx->tags, and rq should be valid, and rq->q shouldn't have been changed even

Re: blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)

2014-09-17 Thread David Hildenbrand
Does anyone have an idea? The request itself is completely filled with cc That is very weird, the 'rq' is got from hctx->tags, and rq should be valid, and rq->q shouldn't have been changed even though it was double free or double allocation. I am currently asking myself if

Re: blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)

2014-09-17 Thread David Hildenbrand
On Wed, 17 Sep 2014 14:00:34 +0200 David Hildenbrand d...@linux.vnet.ibm.com wrote: Does anyone have an idea? The request itself is completely filled with cc That is very weird, the 'rq' is got from hctx->tags, and rq should be valid, and rq->q shouldn't have been changed

Re: blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)

2014-09-17 Thread Jens Axboe
On 2014-09-17 07:52, Ming Lei wrote: On Wed, 17 Sep 2014 14:00:34 +0200 David Hildenbrand d...@linux.vnet.ibm.com wrote: Does anyone have an idea? The request itself is completely filled with cc That is very weird, the 'rq' is got from hctx->tags, and rq should be valid, and rq->q shouldn't

Re: blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)

2014-09-17 Thread Ming Lei
On Wed, Sep 17, 2014 at 10:22 PM, Jens Axboe ax...@kernel.dk wrote: Another way would be to ensure that the timeout handler doesn't touch hw_ctx or tag_sets that aren't fully initialized yet. But I think this is safer/cleaner. That may not be easy or enough to check if hw_ctx/tag_sets are

Re: blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)

2014-09-17 Thread David Hildenbrand
On Wed, Sep 17, 2014 at 10:22 PM, Jens Axboe ax...@kernel.dk wrote: Another way would be to ensure that the timeout handler doesn't touch hw_ctx or tag_sets that aren't fully initialized yet. But I think this is safer/cleaner. That may not be easy or enough to check if hw_ctx/tag_sets

Re: blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)

2014-09-17 Thread Jens Axboe
On 09/17/2014 01:09 PM, David Hildenbrand wrote: 0. That should already be sufficient to hinder blk_mq_tag_to_rq and the calling method to do the wrong thing. Yes, clearing rq->cmd_flags should be enough. And looks better to move rq initialization to __blk_mq_free_request() too, otherwise

Re: blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)

2014-09-17 Thread Ming Lei
On Thu, Sep 18, 2014 at 3:09 AM, David Hildenbrand d...@linux.vnet.ibm.com wrote: On Wed, Sep 17, 2014 at 10:22 PM, Jens Axboe ax...@kernel.dk wrote: Another way would be to ensure that the timeout handler doesn't touch hw_ctx or tag_sets that aren't fully initialized yet. But I think

Re: blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)

2014-09-12 Thread Christian Borntraeger
On 09/11/2014 12:26 PM, Christian Borntraeger wrote: Folks, we have seen the following bug with 3.16 as a KVM guest. I suspect the blk-mq rework that happened between 3.15 and 3.16, but it can be something completely different. [ 65.992022] Unable to handle kernel pointer

Re: blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)

2014-09-12 Thread Christian Borntraeger
On 09/12/2014 01:54 PM, Ming Lei wrote: On Thu, Sep 11, 2014 at 6:26 PM, Christian Borntraeger borntrae...@de.ibm.com wrote: Folks, we have seen the following bug with 3.16 as a KVM guest. I suspect the blk-mq rework that happened between 3.15 and 3.16, but it can be something completely

blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)

2014-09-11 Thread Christian Borntraeger
Folks, we have seen the following bug with 3.16 as a KVM guest. I suspect the blk-mq rework that happened between 3.15 and 3.16, but it can be something completely different. [ 65.992022] Unable to handle kernel pointer dereference in virtual kernel address space [ 65.992187] failing