Re: [PATCH] nvme: utilize two queue maps, one for reads and one for writes

2018-11-15 Thread Jens Axboe
On 11/15/18 3:46 PM, Guenter Roeck wrote:
> On Thu, Nov 15, 2018 at 03:12:48PM -0700, Jens Axboe wrote:
>> On 11/15/18 3:06 PM, Guenter Roeck wrote:
>>> On Thu, Nov 15, 2018 at 12:43:40PM -0700, Jens Axboe wrote:
 On 11/15/18 12:40 PM, Jens Axboe wrote:
> On 11/15/18 12:38 PM, Guenter Roeck wrote:
>> On Thu, Nov 15, 2018 at 12:29:04PM -0700, Jens Axboe wrote:
>>> On 11/15/18 12:11 PM, Guenter Roeck wrote:
 On Wed, Nov 14, 2018 at 10:12:44AM -0700, Jens Axboe wrote:
>
> I think the below patch should fix it.
>

 I spoke too early. sparc64, next-20181115:

 [   14.204370] nvme nvme0: pci function :02:00.0
 [   14.249956] nvme nvme0: Removing after probe failure status: -5
 [   14.263496] [ cut here ]
 [   14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 
 __free_irq+0xa4/0x320
 [   14.264265] Trying to free already-free IRQ 9
 [   14.264519] Modules linked in:
 [   14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 
 4.20.0-rc2-next-20181115 #1
 [   14.26] Workqueue: nvme-reset-wq nvme_reset_work
 [   14.265899] Call Trace:
 [   14.266118]  [0046944c] __warn+0xcc/0x100
 [   14.266375]  [004694b0] warn_slowpath_fmt+0x30/0x40
 [   14.266635]  [004d4ce4] __free_irq+0xa4/0x320
 [   14.266867]  [004d4ff8] free_irq+0x38/0x80
 [   14.267092]  [007b1874] pci_free_irq+0x14/0x40
 [   14.267327]  [008a5444] nvme_dev_disable+0xe4/0x520
 [   14.267576]  [008a69b8] nvme_reset_work+0x138/0x1c60
 [   14.267827]  [00488dd0] process_one_work+0x230/0x6e0
 [   14.268079]  [004894f4] worker_thread+0x274/0x520
 [   14.268321]  [00490624] kthread+0xe4/0x120
 [   14.268544]  [004060c4] ret_from_fork+0x1c/0x2c
 [   14.268825]  []   (null)
 [   14.269089] irq event stamp: 32796
 [   14.269350] hardirqs last  enabled at (32795): [<00b624a4>] 
 _raw_spin_unlock_irqrestore+0x24/0x80
 [   14.269757] hardirqs last disabled at (32796): [<00b622f4>] 
 _raw_spin_lock_irqsave+0x14/0x60
 [   14.270566] softirqs last  enabled at (32780): [<00b64c18>] 
 __do_softirq+0x238/0x520
 [   14.271206] softirqs last disabled at (32729): [<0042ceec>] 
 do_softirq_own_stack+0x2c/0x40
 [   14.272288] ---[ end trace cb79ccd2a0a03f3c ]---

 Looks like an error during probe followed by an error cleanup problem.
>>>
>>> Did it previously probe fine? Or is the new thing just the fact that
>>> we spew a warning on trying to free a non-existing vector?
>>>
>> This works fine in mainline, if that is your question.
>
> Yeah, as soon as I sent the other email I realized that. Let me send
> you a quick patch.

 How's this?


 diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
 index ffbab5b01df4..fd73bfd2d1be 100644
 --- a/drivers/nvme/host/pci.c
 +++ b/drivers/nvme/host/pci.c
 @@ -2088,15 +2088,11 @@ static int nvme_setup_irqs(struct nvme_dev *dev, 
 int nr_io_queues)
affd.nr_sets = 1;
  
/*
 -   * Need IRQs for read+write queues, and one for the admin queue.
 -   * If we can't get more than one vector, we have to share the
 -   * admin queue and IO queue vector. For that case, don't add
 -   * an extra vector for the admin queue, or we'll continue
 -   * asking for 2 and get -ENOSPC in return.
 +   * If we got a failure and we're down to asking for just
 +   * 1 + 1 queues, just ask for a single vector. We'll share
 +   * that between the single IO queue and the admin queue.
 */
 -  if (result == -ENOSPC && nr_io_queues == 1)
 -  nr_io_queues = 1;
 -  else
 +  if (!(result < 0 && nr_io_queues == 1))
nr_io_queues = irq_sets[0] + irq_sets[1] + 1;
  
>>>
>>> Unfortunately, the code doesn't even get here because the call of
>>> pci_alloc_irq_vectors_affinity in the first iteration fails with
>>> -EINVAL, which results in an immediate return with -EIO.
>>
>> Oh yeah... How about this then?
>>
> Yes, this one works (at least on sparc64). Do I need to test
> on other architectures as well?

Should be fine, hopefully... Thanks for testing!

>> @@ -2111,6 +2107,9 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int 
>> nr_io_queues)
>>  if (!nr_io_queues)
>>  return result;
>>  continue;
>> +} else if (result == -EINVAL) {
> 
> Add an 
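
The follow-up hunk quoted above breaks off right after the new -EINVAL branch. As a rough sketch only, not the actual follow-up patch (the wrapper name is hypothetical; pdev, affd, result and nr_io_queues mirror the names in the hunks quoted earlier), the retry logic under discussion amounts to: on -ENOSPC drop an I/O queue and retry, and on -EINVAL (no MSI at all, as on this sparc64 machine) retry once with a single vector shared between the admin queue and one I/O queue.

#include <linux/interrupt.h>
#include <linux/pci.h>

/* Hypothetical wrapper, for illustration only. */
static int setup_irqs_retry_sketch(struct pci_dev *pdev,
				   struct irq_affinity *affd,
				   unsigned int nr_io_queues)
{
	int result;

	do {
		result = pci_alloc_irq_vectors_affinity(pdev, nr_io_queues,
				nr_io_queues,
				PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, affd);
		if (result == -ENOSPC && nr_io_queues > 1) {
			/* Not enough vectors: ask for one I/O queue less. */
			nr_io_queues--;
			continue;
		} else if (result == -EINVAL && nr_io_queues > 1) {
			/*
			 * No MSI/MSI-X at all: retry with a single vector,
			 * shared by the admin queue and one I/O queue.
			 */
			nr_io_queues = 1;
			continue;
		}
		break;
	} while (1);

	return result;
}

PCI_IRQ_ALL_TYPES includes PCI_IRQ_LEGACY, so the final retry can be satisfied by a single legacy INTx vector.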

Re: [PATCH] nvme: utilize two queue maps, one for reads and one for writes

2018-11-15 Thread Guenter Roeck
On Thu, Nov 15, 2018 at 03:12:48PM -0700, Jens Axboe wrote:
> On 11/15/18 3:06 PM, Guenter Roeck wrote:
> > On Thu, Nov 15, 2018 at 12:43:40PM -0700, Jens Axboe wrote:
> >> On 11/15/18 12:40 PM, Jens Axboe wrote:
> >>> On 11/15/18 12:38 PM, Guenter Roeck wrote:
>  On Thu, Nov 15, 2018 at 12:29:04PM -0700, Jens Axboe wrote:
> > On 11/15/18 12:11 PM, Guenter Roeck wrote:
> >> On Wed, Nov 14, 2018 at 10:12:44AM -0700, Jens Axboe wrote:
> >>>
> >>> I think the below patch should fix it.
> >>>
> >>
> >> I spoke too early. sparc64, next-20181115:
> >>
> >> [   14.204370] nvme nvme0: pci function :02:00.0
> >> [   14.249956] nvme nvme0: Removing after probe failure status: -5
> >> [   14.263496] [ cut here ]
> >> [   14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 
> >> __free_irq+0xa4/0x320
> >> [   14.264265] Trying to free already-free IRQ 9
> >> [   14.264519] Modules linked in:
> >> [   14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 
> >> 4.20.0-rc2-next-20181115 #1
> >> [   14.26] Workqueue: nvme-reset-wq nvme_reset_work
> >> [   14.265899] Call Trace:
> >> [   14.266118]  [0046944c] __warn+0xcc/0x100
> >> [   14.266375]  [004694b0] warn_slowpath_fmt+0x30/0x40
> >> [   14.266635]  [004d4ce4] __free_irq+0xa4/0x320
> >> [   14.266867]  [004d4ff8] free_irq+0x38/0x80
> >> [   14.267092]  [007b1874] pci_free_irq+0x14/0x40
> >> [   14.267327]  [008a5444] nvme_dev_disable+0xe4/0x520
> >> [   14.267576]  [008a69b8] nvme_reset_work+0x138/0x1c60
> >> [   14.267827]  [00488dd0] process_one_work+0x230/0x6e0
> >> [   14.268079]  [004894f4] worker_thread+0x274/0x520
> >> [   14.268321]  [00490624] kthread+0xe4/0x120
> >> [   14.268544]  [004060c4] ret_from_fork+0x1c/0x2c
> >> [   14.268825]  []   (null)
> >> [   14.269089] irq event stamp: 32796
> >> [   14.269350] hardirqs last  enabled at (32795): [<00b624a4>] 
> >> _raw_spin_unlock_irqrestore+0x24/0x80
> >> [   14.269757] hardirqs last disabled at (32796): [<00b622f4>] 
> >> _raw_spin_lock_irqsave+0x14/0x60
> >> [   14.270566] softirqs last  enabled at (32780): [<00b64c18>] 
> >> __do_softirq+0x238/0x520
> >> [   14.271206] softirqs last disabled at (32729): [<0042ceec>] 
> >> do_softirq_own_stack+0x2c/0x40
> >> [   14.272288] ---[ end trace cb79ccd2a0a03f3c ]---
> >>
> >> Looks like an error during probe followed by an error cleanup problem.
> >
> > Did it previously probe fine? Or is the new thing just the fact that
> > we spew a warning on trying to free a non-existing vector?
> >
>  This works fine in mainline, if that is your question.
> >>>
> >>> Yeah, as soon as I sent the other email I realized that. Let me send
> >>> you a quick patch.
> >>
> >> How's this?
> >>
> >>
> >> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> >> index ffbab5b01df4..fd73bfd2d1be 100644
> >> --- a/drivers/nvme/host/pci.c
> >> +++ b/drivers/nvme/host/pci.c
> >> @@ -2088,15 +2088,11 @@ static int nvme_setup_irqs(struct nvme_dev *dev, 
> >> int nr_io_queues)
> >>affd.nr_sets = 1;
> >>  
> >>/*
> >> -   * Need IRQs for read+write queues, and one for the admin queue.
> >> -   * If we can't get more than one vector, we have to share the
> >> -   * admin queue and IO queue vector. For that case, don't add
> >> -   * an extra vector for the admin queue, or we'll continue
> >> -   * asking for 2 and get -ENOSPC in return.
> >> +   * If we got a failure and we're down to asking for just
> >> +   * 1 + 1 queues, just ask for a single vector. We'll share
> >> +   * that between the single IO queue and the admin queue.
> >> */
> >> -  if (result == -ENOSPC && nr_io_queues == 1)
> >> -  nr_io_queues = 1;
> >> -  else
> >> +  if (!(result < 0 && nr_io_queues == 1))
> >>nr_io_queues = irq_sets[0] + irq_sets[1] + 1;
> >>  
> > 
> > Unfortunately, the code doesn't even get here because the call of
> > pci_alloc_irq_vectors_affinity in the first iteration fails with
> > -EINVAL, which results in an immediate return with -EIO.
> 
> Oh yeah... How about this then?
> 
Yes, this one works (at least on sparc64). Do I need to test
on other architectures as well?

Thanks,
Guenter

> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index ffbab5b01df4..4d161daa9c3a 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2088,15 +2088,11 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int 
> nr_io_queues)
>   affd.nr_sets = 1;
>  
>   /*
> -  * Need IRQs 

Re: [PATCH] nvme: utilize two queue maps, one for reads and one for writes

2018-11-15 Thread Jens Axboe
On 11/15/18 3:06 PM, Guenter Roeck wrote:
> On Thu, Nov 15, 2018 at 12:43:40PM -0700, Jens Axboe wrote:
>> On 11/15/18 12:40 PM, Jens Axboe wrote:
>>> On 11/15/18 12:38 PM, Guenter Roeck wrote:
 On Thu, Nov 15, 2018 at 12:29:04PM -0700, Jens Axboe wrote:
> On 11/15/18 12:11 PM, Guenter Roeck wrote:
>> On Wed, Nov 14, 2018 at 10:12:44AM -0700, Jens Axboe wrote:
>>>
>>> I think the below patch should fix it.
>>>
>>
>> I spoke too early. sparc64, next-20181115:
>>
>> [   14.204370] nvme nvme0: pci function :02:00.0
>> [   14.249956] nvme nvme0: Removing after probe failure status: -5
>> [   14.263496] [ cut here ]
>> [   14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 
>> __free_irq+0xa4/0x320
>> [   14.264265] Trying to free already-free IRQ 9
>> [   14.264519] Modules linked in:
>> [   14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 
>> 4.20.0-rc2-next-20181115 #1
>> [   14.26] Workqueue: nvme-reset-wq nvme_reset_work
>> [   14.265899] Call Trace:
>> [   14.266118]  [0046944c] __warn+0xcc/0x100
>> [   14.266375]  [004694b0] warn_slowpath_fmt+0x30/0x40
>> [   14.266635]  [004d4ce4] __free_irq+0xa4/0x320
>> [   14.266867]  [004d4ff8] free_irq+0x38/0x80
>> [   14.267092]  [007b1874] pci_free_irq+0x14/0x40
>> [   14.267327]  [008a5444] nvme_dev_disable+0xe4/0x520
>> [   14.267576]  [008a69b8] nvme_reset_work+0x138/0x1c60
>> [   14.267827]  [00488dd0] process_one_work+0x230/0x6e0
>> [   14.268079]  [004894f4] worker_thread+0x274/0x520
>> [   14.268321]  [00490624] kthread+0xe4/0x120
>> [   14.268544]  [004060c4] ret_from_fork+0x1c/0x2c
>> [   14.268825]  []   (null)
>> [   14.269089] irq event stamp: 32796
>> [   14.269350] hardirqs last  enabled at (32795): [<00b624a4>] 
>> _raw_spin_unlock_irqrestore+0x24/0x80
>> [   14.269757] hardirqs last disabled at (32796): [<00b622f4>] 
>> _raw_spin_lock_irqsave+0x14/0x60
>> [   14.270566] softirqs last  enabled at (32780): [<00b64c18>] 
>> __do_softirq+0x238/0x520
>> [   14.271206] softirqs last disabled at (32729): [<0042ceec>] 
>> do_softirq_own_stack+0x2c/0x40
>> [   14.272288] ---[ end trace cb79ccd2a0a03f3c ]---
>>
>> Looks like an error during probe followed by an error cleanup problem.
>
> Did it previously probe fine? Or is the new thing just the fact that
> we spew a warning on trying to free a non-existing vector?
>
 This works fine in mainline, if that is your question.
>>>
>>> Yeah, as soon as I sent the other email I realized that. Let me send
>>> you a quick patch.
>>
>> How's this?
>>
>>
>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>> index ffbab5b01df4..fd73bfd2d1be 100644
>> --- a/drivers/nvme/host/pci.c
>> +++ b/drivers/nvme/host/pci.c
>> @@ -2088,15 +2088,11 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int 
>> nr_io_queues)
>>  affd.nr_sets = 1;
>>  
>>  /*
>> - * Need IRQs for read+write queues, and one for the admin queue.
>> - * If we can't get more than one vector, we have to share the
>> - * admin queue and IO queue vector. For that case, don't add
>> - * an extra vector for the admin queue, or we'll continue
>> - * asking for 2 and get -ENOSPC in return.
>> + * If we got a failure and we're down to asking for just
>> + * 1 + 1 queues, just ask for a single vector. We'll share
>> + * that between the single IO queue and the admin queue.
>>   */
>> -if (result == -ENOSPC && nr_io_queues == 1)
>> -nr_io_queues = 1;
>> -else
>> +if (!(result < 0 && nr_io_queues == 1))
>>  nr_io_queues = irq_sets[0] + irq_sets[1] + 1;
>>  
> 
> Unfortunately, the code doesn't even get here because the call of
> pci_alloc_irq_vectors_affinity in the first iteration fails with
> -EINVAL, which results in an immediate return with -EIO.

Oh yeah... How about this then?


diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index ffbab5b01df4..4d161daa9c3a 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2088,15 +2088,11 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int 
nr_io_queues)
affd.nr_sets = 1;
 
/*
-* Need IRQs for read+write queues, and one for the admin queue.
-* If we can't get more than one vector, we have to share the
-* admin queue and IO queue vector. For that case, don't add
-* an extra vector for the admin queue, or we'll continue
-* asking for 2 and get -ENOSPC in return.

Re: [PATCH] nvme: utilize two queue maps, one for reads and one for writes

2018-11-15 Thread Guenter Roeck
On Thu, Nov 15, 2018 at 12:43:40PM -0700, Jens Axboe wrote:
> On 11/15/18 12:40 PM, Jens Axboe wrote:
> > On 11/15/18 12:38 PM, Guenter Roeck wrote:
> >> On Thu, Nov 15, 2018 at 12:29:04PM -0700, Jens Axboe wrote:
> >>> On 11/15/18 12:11 PM, Guenter Roeck wrote:
>  On Wed, Nov 14, 2018 at 10:12:44AM -0700, Jens Axboe wrote:
> >
> > I think the below patch should fix it.
> >
> 
>  I spoke too early. sparc64, next-20181115:
> 
>  [   14.204370] nvme nvme0: pci function :02:00.0
>  [   14.249956] nvme nvme0: Removing after probe failure status: -5
>  [   14.263496] [ cut here ]
>  [   14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 
>  __free_irq+0xa4/0x320
>  [   14.264265] Trying to free already-free IRQ 9
>  [   14.264519] Modules linked in:
>  [   14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 
>  4.20.0-rc2-next-20181115 #1
>  [   14.26] Workqueue: nvme-reset-wq nvme_reset_work
>  [   14.265899] Call Trace:
>  [   14.266118]  [0046944c] __warn+0xcc/0x100
>  [   14.266375]  [004694b0] warn_slowpath_fmt+0x30/0x40
>  [   14.266635]  [004d4ce4] __free_irq+0xa4/0x320
>  [   14.266867]  [004d4ff8] free_irq+0x38/0x80
>  [   14.267092]  [007b1874] pci_free_irq+0x14/0x40
>  [   14.267327]  [008a5444] nvme_dev_disable+0xe4/0x520
>  [   14.267576]  [008a69b8] nvme_reset_work+0x138/0x1c60
>  [   14.267827]  [00488dd0] process_one_work+0x230/0x6e0
>  [   14.268079]  [004894f4] worker_thread+0x274/0x520
>  [   14.268321]  [00490624] kthread+0xe4/0x120
>  [   14.268544]  [004060c4] ret_from_fork+0x1c/0x2c
>  [   14.268825]  []   (null)
>  [   14.269089] irq event stamp: 32796
>  [   14.269350] hardirqs last  enabled at (32795): [<00b624a4>] 
>  _raw_spin_unlock_irqrestore+0x24/0x80
>  [   14.269757] hardirqs last disabled at (32796): [<00b622f4>] 
>  _raw_spin_lock_irqsave+0x14/0x60
>  [   14.270566] softirqs last  enabled at (32780): [<00b64c18>] 
>  __do_softirq+0x238/0x520
>  [   14.271206] softirqs last disabled at (32729): [<0042ceec>] 
>  do_softirq_own_stack+0x2c/0x40
>  [   14.272288] ---[ end trace cb79ccd2a0a03f3c ]---
> 
>  Looks like an error during probe followed by an error cleanup problem.
> >>>
> >>> Did it previously probe fine? Or is the new thing just the fact that
> >>> we spew a warning on trying to free a non-existing vector?
> >>>
> >> This works fine in mainline, if that is your question.
> > 
> > Yeah, as soon as I sent the other email I realized that. Let me send
> > you a quick patch.
> 
> How's this?
> 
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index ffbab5b01df4..fd73bfd2d1be 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2088,15 +2088,11 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int 
> nr_io_queues)
>   affd.nr_sets = 1;
>  
>   /*
> -  * Need IRQs for read+write queues, and one for the admin queue.
> -  * If we can't get more than one vector, we have to share the
> -  * admin queue and IO queue vector. For that case, don't add
> -  * an extra vector for the admin queue, or we'll continue
> -  * asking for 2 and get -ENOSPC in return.
> +  * If we got a failure and we're down to asking for just
> +  * 1 + 1 queues, just ask for a single vector. We'll share
> +  * that between the single IO queue and the admin queue.
>*/
> - if (result == -ENOSPC && nr_io_queues == 1)
> - nr_io_queues = 1;
> - else
> + if (!(result < 0 && nr_io_queues == 1))
>   nr_io_queues = irq_sets[0] + irq_sets[1] + 1;
>  

Unfortunately, the code doesn't even get here because the call of
pci_alloc_irq_vectors_affinity in the first iteration fails with
-EINVAL, which results in an immediate return with -EIO.

Guenter

>   result = pci_alloc_irq_vectors_affinity(pdev, nr_io_queues,
> 
> -- 
> Jens Axboe
> 
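
As a minimal sketch of the failure mode described here, with a hypothetical helper name (this is not the nvme code itself): the very first vector allocation fails with -EINVAL on a platform without MSI, the setup path returns -EIO on that failure, and probe then tears the device down; that is the "Removing after probe failure status: -5" (-5 == -EIO) line in the log above.

#include <linux/errno.h>
#include <linux/pci.h>

/* Hypothetical helper, for illustration only. */
static int setup_irqs_sketch(struct pci_dev *pdev, unsigned int nr_io_queues)
{
	/* Admin + I/O vectors, MSI/MSI-X only: fails outright without MSI. */
	int result = pci_alloc_irq_vectors_affinity(pdev, 1 + nr_io_queues,
			1 + nr_io_queues,
			PCI_IRQ_MSIX | PCI_IRQ_MSI | PCI_IRQ_AFFINITY,
			NULL);

	if (result <= 0)
		return -EIO;	/* -EINVAL from a no-MSI platform ends up here */

	return result;
}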


Re: [PATCH] nvme: utilize two queue maps, one for reads and one for writes

2018-11-15 Thread Jens Axboe
On 11/15/18 12:40 PM, Jens Axboe wrote:
> On 11/15/18 12:38 PM, Guenter Roeck wrote:
>> On Thu, Nov 15, 2018 at 12:29:04PM -0700, Jens Axboe wrote:
>>> On 11/15/18 12:11 PM, Guenter Roeck wrote:
 On Wed, Nov 14, 2018 at 10:12:44AM -0700, Jens Axboe wrote:
>
> I think the below patch should fix it.
>

 I spoke too early. sparc64, next-20181115:

 [   14.204370] nvme nvme0: pci function :02:00.0
 [   14.249956] nvme nvme0: Removing after probe failure status: -5
 [   14.263496] [ cut here ]
 [   14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 
 __free_irq+0xa4/0x320
 [   14.264265] Trying to free already-free IRQ 9
 [   14.264519] Modules linked in:
 [   14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 
 4.20.0-rc2-next-20181115 #1
 [   14.26] Workqueue: nvme-reset-wq nvme_reset_work
 [   14.265899] Call Trace:
 [   14.266118]  [0046944c] __warn+0xcc/0x100
 [   14.266375]  [004694b0] warn_slowpath_fmt+0x30/0x40
 [   14.266635]  [004d4ce4] __free_irq+0xa4/0x320
 [   14.266867]  [004d4ff8] free_irq+0x38/0x80
 [   14.267092]  [007b1874] pci_free_irq+0x14/0x40
 [   14.267327]  [008a5444] nvme_dev_disable+0xe4/0x520
 [   14.267576]  [008a69b8] nvme_reset_work+0x138/0x1c60
 [   14.267827]  [00488dd0] process_one_work+0x230/0x6e0
 [   14.268079]  [004894f4] worker_thread+0x274/0x520
 [   14.268321]  [00490624] kthread+0xe4/0x120
 [   14.268544]  [004060c4] ret_from_fork+0x1c/0x2c
 [   14.268825]  []   (null)
 [   14.269089] irq event stamp: 32796
 [   14.269350] hardirqs last  enabled at (32795): [<00b624a4>] 
 _raw_spin_unlock_irqrestore+0x24/0x80
 [   14.269757] hardirqs last disabled at (32796): [<00b622f4>] 
 _raw_spin_lock_irqsave+0x14/0x60
 [   14.270566] softirqs last  enabled at (32780): [<00b64c18>] 
 __do_softirq+0x238/0x520
 [   14.271206] softirqs last disabled at (32729): [<0042ceec>] 
 do_softirq_own_stack+0x2c/0x40
 [   14.272288] ---[ end trace cb79ccd2a0a03f3c ]---

 Looks like an error during probe followed by an error cleanup problem.
>>>
>>> Did it previously probe fine? Or is the new thing just the fact that
>>> we spew a warning on trying to free a non-existing vector?
>>>
>> This works fine in mainline, if that is your question.
> 
> Yeah, as soon as I sent the other email I realized that. Let me send
> you a quick patch.

How's this?


diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index ffbab5b01df4..fd73bfd2d1be 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2088,15 +2088,11 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int 
nr_io_queues)
affd.nr_sets = 1;
 
/*
-* Need IRQs for read+write queues, and one for the admin queue.
-* If we can't get more than one vector, we have to share the
-* admin queue and IO queue vector. For that case, don't add
-* an extra vector for the admin queue, or we'll continue
-* asking for 2 and get -ENOSPC in return.
+* If we got a failure and we're down to asking for just
+* 1 + 1 queues, just ask for a single vector. We'll share
+* that between the single IO queue and the admin queue.
 */
-   if (result == -ENOSPC && nr_io_queues == 1)
-   nr_io_queues = 1;
-   else
+   if (!(result < 0 && nr_io_queues == 1))
nr_io_queues = irq_sets[0] + irq_sets[1] + 1;
 
result = pci_alloc_irq_vectors_affinity(pdev, nr_io_queues,

-- 
Jens Axboe
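
As a worked example of the accounting in the hunk above (the numbers are illustrative, not taken from the log): with irq_sets[0] read queues and irq_sets[1] write queues, the driver asks for irq_sets[0] + irq_sets[1] + 1 vectors, the extra one being a dedicated admin vector; only in the degenerate single-queue case does it drop to one shared vector.

	/*
	 * Illustrative numbers only:
	 *
	 *   irq_sets[0] = 2, irq_sets[1] = 2
	 *     -> nr_io_queues = 2 + 2 + 1 = 5 vectors (4 I/O + 1 admin)
	 *
	 *   after enough -ENOSPC retries, one I/O queue remains
	 *     -> ask for a single vector, shared by admin and I/O queue
	 */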



Re: [PATCH] nvme: utilize two queue maps, one for reads and one for writes

2018-11-15 Thread Jens Axboe
On 11/15/18 12:38 PM, Guenter Roeck wrote:
> On Thu, Nov 15, 2018 at 12:29:04PM -0700, Jens Axboe wrote:
>> On 11/15/18 12:11 PM, Guenter Roeck wrote:
>>> On Wed, Nov 14, 2018 at 10:12:44AM -0700, Jens Axboe wrote:

 I think the below patch should fix it.

>>>
>>> I spoke too early. sparc64, next-20181115:
>>>
>>> [   14.204370] nvme nvme0: pci function :02:00.0
>>> [   14.249956] nvme nvme0: Removing after probe failure status: -5
>>> [   14.263496] [ cut here ]
>>> [   14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 
>>> __free_irq+0xa4/0x320
>>> [   14.264265] Trying to free already-free IRQ 9
>>> [   14.264519] Modules linked in:
>>> [   14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 
>>> 4.20.0-rc2-next-20181115 #1
>>> [   14.26] Workqueue: nvme-reset-wq nvme_reset_work
>>> [   14.265899] Call Trace:
>>> [   14.266118]  [0046944c] __warn+0xcc/0x100
>>> [   14.266375]  [004694b0] warn_slowpath_fmt+0x30/0x40
>>> [   14.266635]  [004d4ce4] __free_irq+0xa4/0x320
>>> [   14.266867]  [004d4ff8] free_irq+0x38/0x80
>>> [   14.267092]  [007b1874] pci_free_irq+0x14/0x40
>>> [   14.267327]  [008a5444] nvme_dev_disable+0xe4/0x520
>>> [   14.267576]  [008a69b8] nvme_reset_work+0x138/0x1c60
>>> [   14.267827]  [00488dd0] process_one_work+0x230/0x6e0
>>> [   14.268079]  [004894f4] worker_thread+0x274/0x520
>>> [   14.268321]  [00490624] kthread+0xe4/0x120
>>> [   14.268544]  [004060c4] ret_from_fork+0x1c/0x2c
>>> [   14.268825]  []   (null)
>>> [   14.269089] irq event stamp: 32796
>>> [   14.269350] hardirqs last  enabled at (32795): [<00b624a4>] 
>>> _raw_spin_unlock_irqrestore+0x24/0x80
>>> [   14.269757] hardirqs last disabled at (32796): [<00b622f4>] 
>>> _raw_spin_lock_irqsave+0x14/0x60
>>> [   14.270566] softirqs last  enabled at (32780): [<00b64c18>] 
>>> __do_softirq+0x238/0x520
>>> [   14.271206] softirqs last disabled at (32729): [<0042ceec>] 
>>> do_softirq_own_stack+0x2c/0x40
>>> [   14.272288] ---[ end trace cb79ccd2a0a03f3c ]---
>>>
>>> Looks like an error during probe followed by an error cleanup problem.
>>
>> Did it previously probe fine? Or is the new thing just the fact that
>> we spew a warning on trying to free a non-existing vector?
>>
> This works fine in mainline, if that is your question.

Yeah, as soon as I sent the other email I realized that. Let me send
you a quick patch.

-- 
Jens Axboe



Re: [PATCH] nvme: utilize two queue maps, one for reads and one for writes

2018-11-15 Thread Jens Axboe
On 11/15/18 12:36 PM, Guenter Roeck wrote:
> On Thu, Nov 15, 2018 at 11:11:26AM -0800, Guenter Roeck wrote:
>> On Wed, Nov 14, 2018 at 10:12:44AM -0700, Jens Axboe wrote:
>>>
>>> I think the below patch should fix it.
>>>
>>
>> I spoke too early. sparc64, next-20181115:
>>
>> [   14.204370] nvme nvme0: pci function :02:00.0
>> [   14.249956] nvme nvme0: Removing after probe failure status: -5
>> [   14.263496] [ cut here ]
>> [   14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 
>> __free_irq+0xa4/0x320
>> [   14.264265] Trying to free already-free IRQ 9
>> [   14.264519] Modules linked in:
>> [   14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 
>> 4.20.0-rc2-next-20181115 #1
>> [   14.26] Workqueue: nvme-reset-wq nvme_reset_work
>> [   14.265899] Call Trace:
>> [   14.266118]  [0046944c] __warn+0xcc/0x100
>> [   14.266375]  [004694b0] warn_slowpath_fmt+0x30/0x40
>> [   14.266635]  [004d4ce4] __free_irq+0xa4/0x320
>> [   14.266867]  [004d4ff8] free_irq+0x38/0x80
>> [   14.267092]  [007b1874] pci_free_irq+0x14/0x40
>> [   14.267327]  [008a5444] nvme_dev_disable+0xe4/0x520
>> [   14.267576]  [008a69b8] nvme_reset_work+0x138/0x1c60
>> [   14.267827]  [00488dd0] process_one_work+0x230/0x6e0
>> [   14.268079]  [004894f4] worker_thread+0x274/0x520
>> [   14.268321]  [00490624] kthread+0xe4/0x120
>> [   14.268544]  [004060c4] ret_from_fork+0x1c/0x2c
>> [   14.268825]  []   (null)
>> [   14.269089] irq event stamp: 32796
>> [   14.269350] hardirqs last  enabled at (32795): [<00b624a4>] 
>> _raw_spin_unlock_irqrestore+0x24/0x80
>> [   14.269757] hardirqs last disabled at (32796): [<00b622f4>] 
>> _raw_spin_lock_irqsave+0x14/0x60
>> [   14.270566] softirqs last  enabled at (32780): [<00b64c18>] 
>> __do_softirq+0x238/0x520
>> [   14.271206] softirqs last disabled at (32729): [<0042ceec>] 
>> do_softirq_own_stack+0x2c/0x40
>> [   14.272288] ---[ end trace cb79ccd2a0a03f3c ]---
>>
>> Looks like an error during probe followed by an error cleanup problem.
>>
> On sparc64, pci_alloc_irq_vectors_affinity() returns -EINVAL (possibly
> because the controller doesn't support MSI).
> 
> [   16.554753] nvme nvme0: pci function :02:00.0
> [   16.622894] nvme :02:00.0: pre alloc: nr_io_queues: 2 result: 0
> [   16.623814] nvme :02:00.0: post alloc: nr_io_queues: 2 result: -22
> [   16.625047] nvme nvme0: Removing after probe failure status: -5
> 
> ... and, as a result, allocating a single (legacy) interrupt isn't even tried.
> 
> I didn't try to track down the cleanup failure.

OK, then this isn't a new failure in terms of whether the nvme device will
work; it's just a cleanup issue.

That's less severe than the previous hang :-)

-- 
Jens Axboe
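
A minimal sketch of the kind of guard that avoids this cleanup warning, using a hypothetical queue structure and flag (this is not the actual nvme fix): only call pci_free_irq() for a vector that was really requested, so an early probe failure never reaches __free_irq() for an IRQ that was never allocated.

#include <linux/pci.h>

/* Hypothetical stand-in for the driver's per-queue state. */
struct queue_sketch {
	struct pci_dev *pdev;
	unsigned int cq_vector;
	bool irq_allocated;	/* set only after pci_request_irq() succeeds */
};

static void queue_free_irq_sketch(struct queue_sketch *q)
{
	if (!q->irq_allocated)
		return;		/* probe bailed out before the IRQ was set up */
	pci_free_irq(q->pdev, q->cq_vector, q);
	q->irq_allocated = false;
}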



Re: [PATCH] nvme: utilize two queue maps, one for reads and one for writes

2018-11-15 Thread Guenter Roeck
On Thu, Nov 15, 2018 at 12:29:04PM -0700, Jens Axboe wrote:
> On 11/15/18 12:11 PM, Guenter Roeck wrote:
> > On Wed, Nov 14, 2018 at 10:12:44AM -0700, Jens Axboe wrote:
> >>
> >> I think the below patch should fix it.
> >>
> > 
> > I spoke too early. sparc64, next-20181115:
> > 
> > [   14.204370] nvme nvme0: pci function :02:00.0
> > [   14.249956] nvme nvme0: Removing after probe failure status: -5
> > [   14.263496] [ cut here ]
> > [   14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 
> > __free_irq+0xa4/0x320
> > [   14.264265] Trying to free already-free IRQ 9
> > [   14.264519] Modules linked in:
> > [   14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 
> > 4.20.0-rc2-next-20181115 #1
> > [   14.26] Workqueue: nvme-reset-wq nvme_reset_work
> > [   14.265899] Call Trace:
> > [   14.266118]  [0046944c] __warn+0xcc/0x100
> > [   14.266375]  [004694b0] warn_slowpath_fmt+0x30/0x40
> > [   14.266635]  [004d4ce4] __free_irq+0xa4/0x320
> > [   14.266867]  [004d4ff8] free_irq+0x38/0x80
> > [   14.267092]  [007b1874] pci_free_irq+0x14/0x40
> > [   14.267327]  [008a5444] nvme_dev_disable+0xe4/0x520
> > [   14.267576]  [008a69b8] nvme_reset_work+0x138/0x1c60
> > [   14.267827]  [00488dd0] process_one_work+0x230/0x6e0
> > [   14.268079]  [004894f4] worker_thread+0x274/0x520
> > [   14.268321]  [00490624] kthread+0xe4/0x120
> > [   14.268544]  [004060c4] ret_from_fork+0x1c/0x2c
> > [   14.268825]  []   (null)
> > [   14.269089] irq event stamp: 32796
> > [   14.269350] hardirqs last  enabled at (32795): [<00b624a4>] 
> > _raw_spin_unlock_irqrestore+0x24/0x80
> > [   14.269757] hardirqs last disabled at (32796): [<00b622f4>] 
> > _raw_spin_lock_irqsave+0x14/0x60
> > [   14.270566] softirqs last  enabled at (32780): [<00b64c18>] 
> > __do_softirq+0x238/0x520
> > [   14.271206] softirqs last disabled at (32729): [<0042ceec>] 
> > do_softirq_own_stack+0x2c/0x40
> > [   14.272288] ---[ end trace cb79ccd2a0a03f3c ]---
> > 
> > Looks like an error during probe followed by an error cleanup problem.
> 
> Did it previously probe fine? Or is the new thing just the fact that
> we spew a warning on trying to free a non-existing vector?
> 
This works fine in mainline, if that is your question.

Guenter


Re: [PATCH] nvme: utilize two queue maps, one for reads and one for writes

2018-11-15 Thread Guenter Roeck
On Thu, Nov 15, 2018 at 11:11:26AM -0800, Guenter Roeck wrote:
> On Wed, Nov 14, 2018 at 10:12:44AM -0700, Jens Axboe wrote:
> > 
> > I think the below patch should fix it.
> > 
> 
> I spoke too early. sparc64, next-20181115:
> 
> [   14.204370] nvme nvme0: pci function :02:00.0
> [   14.249956] nvme nvme0: Removing after probe failure status: -5
> [   14.263496] [ cut here ]
> [   14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 
> __free_irq+0xa4/0x320
> [   14.264265] Trying to free already-free IRQ 9
> [   14.264519] Modules linked in:
> [   14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 
> 4.20.0-rc2-next-20181115 #1
> [   14.26] Workqueue: nvme-reset-wq nvme_reset_work
> [   14.265899] Call Trace:
> [   14.266118]  [0046944c] __warn+0xcc/0x100
> [   14.266375]  [004694b0] warn_slowpath_fmt+0x30/0x40
> [   14.266635]  [004d4ce4] __free_irq+0xa4/0x320
> [   14.266867]  [004d4ff8] free_irq+0x38/0x80
> [   14.267092]  [007b1874] pci_free_irq+0x14/0x40
> [   14.267327]  [008a5444] nvme_dev_disable+0xe4/0x520
> [   14.267576]  [008a69b8] nvme_reset_work+0x138/0x1c60
> [   14.267827]  [00488dd0] process_one_work+0x230/0x6e0
> [   14.268079]  [004894f4] worker_thread+0x274/0x520
> [   14.268321]  [00490624] kthread+0xe4/0x120
> [   14.268544]  [004060c4] ret_from_fork+0x1c/0x2c
> [   14.268825]  []   (null)
> [   14.269089] irq event stamp: 32796
> [   14.269350] hardirqs last  enabled at (32795): [<00b624a4>] 
> _raw_spin_unlock_irqrestore+0x24/0x80
> [   14.269757] hardirqs last disabled at (32796): [<00b622f4>] 
> _raw_spin_lock_irqsave+0x14/0x60
> [   14.270566] softirqs last  enabled at (32780): [<00b64c18>] 
> __do_softirq+0x238/0x520
> [   14.271206] softirqs last disabled at (32729): [<0042ceec>] 
> do_softirq_own_stack+0x2c/0x40
> [   14.272288] ---[ end trace cb79ccd2a0a03f3c ]---
> 
> Looks like an error during probe followed by an error cleanup problem.
> 
On sparc64, pci_alloc_irq_vectors_affinity() returns -EINVAL (possibly
because the controller doesn't support MSI).

[   16.554753] nvme nvme0: pci function :02:00.0
[   16.622894] nvme :02:00.0: pre alloc: nr_io_queues: 2 result: 0
[   16.623814] nvme :02:00.0: post alloc: nr_io_queues: 2 result: -22
[   16.625047] nvme nvme0: Removing after probe failure status: -5

... and, as a result, allocating a single (legacy) interrupt isn't even tried.

I didn't try to track down the cleanup failure.
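
For reference, this is the kind of single-vector fallback that never gets
attempted in the log above. A minimal sketch, assuming the affinity-set
request has already failed; the helper name is hypothetical and this is not
the actual driver change:

#include <linux/pci.h>

/*
 * Hypothetical sketch only: when the affinity-set allocation is rejected
 * (here with -EINVAL on sparc64), retry without affinity sets and let
 * PCI_IRQ_ALL_TYPES fall back to MSI or a single legacy IRQ.
 */
static int nvme_irq_fallback(struct pci_dev *pdev)
{
        int ret;

        /* Ask for exactly one vector of any type (MSI-X, MSI, or INTx). */
        ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
        if (ret < 0)
                return ret;

        /* The admin queue and the single IO queue would share this vector. */
        return 1;
}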

Guenter


Re: [PATCH] nvme: utilize two queue maps, one for reads and one for writes

2018-11-15 Thread Jens Axboe
On 11/15/18 12:11 PM, Guenter Roeck wrote:
> On Wed, Nov 14, 2018 at 10:12:44AM -0700, Jens Axboe wrote:
>>
>> I think the below patch should fix it.
>>
> 
> I spoke too early. sparc64, next-20181115:
> 
> [   14.204370] nvme nvme0: pci function :02:00.0
> [   14.249956] nvme nvme0: Removing after probe failure status: -5
> [   14.263496] [ cut here ]
> [   14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 
> __free_irq+0xa4/0x320
> [   14.264265] Trying to free already-free IRQ 9
> [   14.264519] Modules linked in:
> [   14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 
> 4.20.0-rc2-next-20181115 #1
> [   14.26] Workqueue: nvme-reset-wq nvme_reset_work
> [   14.265899] Call Trace:
> [   14.266118]  [0046944c] __warn+0xcc/0x100
> [   14.266375]  [004694b0] warn_slowpath_fmt+0x30/0x40
> [   14.266635]  [004d4ce4] __free_irq+0xa4/0x320
> [   14.266867]  [004d4ff8] free_irq+0x38/0x80
> [   14.267092]  [007b1874] pci_free_irq+0x14/0x40
> [   14.267327]  [008a5444] nvme_dev_disable+0xe4/0x520
> [   14.267576]  [008a69b8] nvme_reset_work+0x138/0x1c60
> [   14.267827]  [00488dd0] process_one_work+0x230/0x6e0
> [   14.268079]  [004894f4] worker_thread+0x274/0x520
> [   14.268321]  [00490624] kthread+0xe4/0x120
> [   14.268544]  [004060c4] ret_from_fork+0x1c/0x2c
> [   14.268825]  []   (null)
> [   14.269089] irq event stamp: 32796
> [   14.269350] hardirqs last  enabled at (32795): [<00b624a4>] 
> _raw_spin_unlock_irqrestore+0x24/0x80
> [   14.269757] hardirqs last disabled at (32796): [<00b622f4>] 
> _raw_spin_lock_irqsave+0x14/0x60
> [   14.270566] softirqs last  enabled at (32780): [<00b64c18>] 
> __do_softirq+0x238/0x520
> [   14.271206] softirqs last disabled at (32729): [<0042ceec>] 
> do_softirq_own_stack+0x2c/0x40
> [   14.272288] ---[ end trace cb79ccd2a0a03f3c ]---
> 
> Looks like an error during probe followed by an error cleanup problem.

Did it previously probe fine? Or is the new thing just the fact that
we spew a warning on trying to free a non-existent vector?
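
For context on the warning itself, the hazard is teardown releasing a vector
that setup never requested. A generic sketch with hypothetical names (not the
nvme code), assuming some flag records whether the IRQ was ever requested:

#include <linux/pci.h>

/* Hypothetical per-queue bookkeeping; the real driver tracks this differently. */
struct demo_irq_state {
        bool requested;
        unsigned int vector;
};

static void demo_teardown_irq(struct pci_dev *pdev, struct demo_irq_state *st,
                              void *dev_id)
{
        /*
         * If probe bailed out before the IRQ was requested, there is nothing
         * to free; calling pci_free_irq() anyway is what trips the
         * "Trying to free already-free IRQ" WARN seen in the trace above.
         */
        if (!st->requested)
                return;

        pci_free_irq(pdev, st->vector, dev_id);
        st->requested = false;
}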

-- 
Jens Axboe



Re: [PATCH] nvme: utilize two queue maps, one for reads and one for writes

2018-11-15 Thread Guenter Roeck
On Wed, Nov 14, 2018 at 10:12:44AM -0700, Jens Axboe wrote:
> 
> I think the below patch should fix it.
> 

I spoke too early. sparc64, next-20181115:

[   14.204370] nvme nvme0: pci function :02:00.0
[   14.249956] nvme nvme0: Removing after probe failure status: -5
[   14.263496] [ cut here ]
[   14.263913] WARNING: CPU: 0 PID: 15 at kernel/irq/manage.c:1597 
__free_irq+0xa4/0x320
[   14.264265] Trying to free already-free IRQ 9
[   14.264519] Modules linked in:
[   14.264961] CPU: 0 PID: 15 Comm: kworker/u2:1 Not tainted 
4.20.0-rc2-next-20181115 #1
[   14.26] Workqueue: nvme-reset-wq nvme_reset_work
[   14.265899] Call Trace:
[   14.266118]  [0046944c] __warn+0xcc/0x100
[   14.266375]  [004694b0] warn_slowpath_fmt+0x30/0x40
[   14.266635]  [004d4ce4] __free_irq+0xa4/0x320
[   14.266867]  [004d4ff8] free_irq+0x38/0x80
[   14.267092]  [007b1874] pci_free_irq+0x14/0x40
[   14.267327]  [008a5444] nvme_dev_disable+0xe4/0x520
[   14.267576]  [008a69b8] nvme_reset_work+0x138/0x1c60
[   14.267827]  [00488dd0] process_one_work+0x230/0x6e0
[   14.268079]  [004894f4] worker_thread+0x274/0x520
[   14.268321]  [00490624] kthread+0xe4/0x120
[   14.268544]  [004060c4] ret_from_fork+0x1c/0x2c
[   14.268825]  []   (null)
[   14.269089] irq event stamp: 32796
[   14.269350] hardirqs last  enabled at (32795): [<00b624a4>] 
_raw_spin_unlock_irqrestore+0x24/0x80
[   14.269757] hardirqs last disabled at (32796): [<00b622f4>] 
_raw_spin_lock_irqsave+0x14/0x60
[   14.270566] softirqs last  enabled at (32780): [<00b64c18>] 
__do_softirq+0x238/0x520
[   14.271206] softirqs last disabled at (32729): [<0042ceec>] 
do_softirq_own_stack+0x2c/0x40
[   14.272288] ---[ end trace cb79ccd2a0a03f3c ]---

Looks like an error during probe followed by an error cleanup problem.

Guenter


Re: [PATCH] nvme: utilize two queue maps, one for reads and one for writes

2018-11-15 Thread Jens Axboe
On 11/15/18 11:28 AM, Guenter Roeck wrote:
> Hi Jens,
> 
>> I think the below patch should fix it.
>>
> Sorry I wasn't able to test this earlier. Looks like it does
> fix the problem; the problem is no longer seen in next-20181115.
> Minor comment below.

That's fine, thanks for testing!

>>  /*
>> - * Need IRQs for read+write queues, and one for the admin queue
>> + * Need IRQs for read+write queues, and one for the admin queue.
>> + * If we can't get more than one vector, we have to share the
>> + * admin queue and IO queue vector. For that case, don't add
>> + * an extra vector for the admin queue, or we'll continue
>> + * asking for 2 and get -ENOSPC in return.
>>   */
>> -nr_io_queues = irq_sets[0] + irq_sets[1] + 1;
>> +if (result == -ENOSPC && nr_io_queues == 1)
>> +nr_io_queues = 1;
> 
> Setting nr_io_queues to 1 when it already is set to 1 doesn't really do
> anything. Is this for clarification?

Guess that does look a bit odd; the alternative would be to flip the
condition, but I think this one is easier to read.
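
For reference, the flipped form being discussed would read roughly as below;
a sketch based on the hunk quoted above, not necessarily the exact wording
that ends up in the driver:

        /*
         * Inverted check: keep nr_io_queues at 1 only when the previous
         * attempt failed with -ENOSPC and we are already down to a single
         * IO queue; otherwise size the request from both IRQ sets plus one
         * vector for the admin queue.
         */
        if (!(result == -ENOSPC && nr_io_queues == 1))
                nr_io_queues = irq_sets[0] + irq_sets[1] + 1;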

-- 
Jens Axboe



Re: [PATCH] nvme: utilize two queue maps, one for reads and one for writes

2018-11-15 Thread Guenter Roeck
Hi Jens,

On Wed, Nov 14, 2018 at 10:12:44AM -0700, Jens Axboe wrote:
> On 11/13/18 9:52 PM, Guenter Roeck wrote:
> > On Tue, Nov 13, 2018 at 05:51:08PM -0700, Jens Axboe wrote:
> >> On 11/13/18 5:41 PM, Guenter Roeck wrote:
> >>> Hi,
> >>>
> >>> On Wed, Oct 31, 2018 at 08:36:31AM -0600, Jens Axboe wrote:
>  NVMe does round-robin between queues by default, which means that
>  sharing a queue map for both reads and writes can be problematic
>  in terms of read servicing. It's much easier to flood the queue
>  with writes and reduce the read servicing.
> 
>  Implement two queue maps, one for reads and one for writes. The
>  write queue count is configurable through the 'write_queues'
>  parameter.
> 
>  By default, we retain the previous behavior of having a single
>  queue set, shared between reads and writes. Setting 'write_queues'
>  to a non-zero value will create two queue sets, one for reads and
>  one for writes, the latter using the configurable number of
>  queues (hardware queue counts permitting).
> 
>  Reviewed-by: Hannes Reinecke 
>  Reviewed-by: Keith Busch 
>  Signed-off-by: Jens Axboe 
> >>>
> >>> This patch causes hangs when running recent versions of
> >>> -next with several architectures; see the -next column at
> >>> kerneltests.org/builders for details.  Bisect log below; this
> >>> was run with qemu on alpha. Reverting this patch as well as
> >>> "nvme: add separate poll queue map" fixes the problem.
> >>
> >> I don't see anything related to what hung, the trace, and so on.
> >> Can you clue me in? Where are the test results with dmesg?
> >>
> > alpha just stalls during boot. parisc reports a hung task
> > in nvme_reset_work. sparc64 reports EIO when instantiating
> > the nvme driver, called from nvme_reset_work, and then stalls.
> > In all three cases, reverting the two mentioned patches fixes
> > the problem.
> 
> I think the below patch should fix it.
> 
Sorry I wasn't able to test this earlier. Looks like it does
fix the problem; the problem is no longer seen in next-20181115.
Minor comment below.

Guenter

> > https://kerneltests.org/builders/qemu-parisc-next/builds/173/steps/qemubuildcommand_1/logs/stdio
> > 
> > is an example log for parisc.
> > 
> > I didn't check if the other boot failures (ppc looks bad)
> > have the same root cause.
> > 
> >> How to reproduce?
> >>
> > parisc:
> > 
> > qemu-system-hppa -kernel vmlinux -no-reboot \
> > -snapshot -device nvme,serial=foo,drive=d0 \
> > -drive file=rootfs.ext2,if=none,format=raw,id=d0 \
> > -append 'root=/dev/nvme0n1 rw rootwait panic=-1 console=ttyS0,115200 ' \
> > -nographic -monitor null
> > 
> > alpha:
> > 
> > qemu-system-alpha -M clipper -kernel arch/alpha/boot/vmlinux -no-reboot \
> > -snapshot -device nvme,serial=foo,drive=d0 \
> > -drive file=rootfs.ext2,if=none,format=raw,id=d0 \
> > -append 'root=/dev/nvme0n1 rw rootwait panic=-1 console=ttyS0' \
> > -m 128M -nographic -monitor null -serial stdio
> > 
> > sparc64:
> > 
> > qemu-system-sparc64 -M sun4u -cpu 'TI UltraSparc IIi' -m 512 \
> > -snapshot -device nvme,serial=foo,drive=d0,bus=pciB \
> > -drive file=rootfs.ext2,if=none,format=raw,id=d0 \
> > -kernel arch/sparc/boot/image -no-reboot \
> > -append 'root=/dev/nvme0n1 rw rootwait panic=-1 console=ttyS0' \
> > -nographic -monitor none
> > 
> > The root file systems are available from the respective subdirectories
> > of:
> > 
> > https://github.com/groeck/linux-build-test/tree/master/rootfs
> 
> This is useful, thanks! I haven't tried it yet, but I was able to
> reproduce on x86 with MSI turned off.
> 
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 8df868afa363..6c03461ad988 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2098,7 +2098,7 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int 
> nr_io_queues)
>   .nr_sets = ARRAY_SIZE(irq_sets),
>   .sets = irq_sets,
>   };
> - int result;
> + int result = 0;
>  
>   /*
>* For irq sets, we have to ask for minvec == maxvec. This passes
> @@ -2113,9 +2113,16 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int 
> nr_io_queues)
>   affd.nr_sets = 1;
>  
>   /*
> -  * Need IRQs for read+write queues, and one for the admin queue
> +  * Need IRQs for read+write queues, and one for the admin queue.
> +  * If we can't get more than one vector, we have to share the
> +  * admin queue and IO queue vector. For that case, don't add
> +  * an extra vector for the admin queue, or we'll continue
> +  * asking for 2 and get -ENOSPC in return.
>*/
> - nr_io_queues = irq_sets[0] + irq_sets[1] + 1;
> + if (result == -ENOSPC && nr_io_queues == 1)
> + nr_io_queues = 1;

Setting nr_io_queues to 1 when it already is set to 1 doesn't really do
anything. Is this for clarification?

Re: [PATCH] nvme: utilize two queue maps, one for reads and one for writes

2018-11-14 Thread Jens Axboe
On 11/13/18 9:52 PM, Guenter Roeck wrote:
> On Tue, Nov 13, 2018 at 05:51:08PM -0700, Jens Axboe wrote:
>> On 11/13/18 5:41 PM, Guenter Roeck wrote:
>>> Hi,
>>>
>>> On Wed, Oct 31, 2018 at 08:36:31AM -0600, Jens Axboe wrote:
 NVMe does round-robin between queues by default, which means that
 sharing a queue map for both reads and writes can be problematic
 in terms of read servicing. It's much easier to flood the queue
 with writes and reduce the read servicing.

 Implement two queue maps, one for reads and one for writes. The
 write queue count is configurable through the 'write_queues'
 parameter.

 By default, we retain the previous behavior of having a single
 queue set, shared between reads and writes. Setting 'write_queues'
 to a non-zero value will create two queue sets, one for reads and
 one for writes, the latter using the configurable number of
 queues (hardware queue counts permitting).

 Reviewed-by: Hannes Reinecke 
 Reviewed-by: Keith Busch 
 Signed-off-by: Jens Axboe 
>>>
>>> This patch causes hangs when running recent versions of
>>> -next with several architectures; see the -next column at
>>> kerneltests.org/builders for details.  Bisect log below; this
>>> was run with qemu on alpha. Reverting this patch as well as
>>> "nvme: add separate poll queue map" fixes the problem.
>>
>> I don't see anything related to what hung, the trace, and so on.
>> Can you clue me in? Where are the test results with dmesg?
>>
> alpha just stalls during boot. parisc reports a hung task
> in nvme_reset_work. sparc64 reports EIO when instantiating
> the nvme driver, called from nvme_reset_work, and then stalls.
> In all three cases, reverting the two mentioned patches fixes
> the problem.

I think the below patch should fix it.

> https://kerneltests.org/builders/qemu-parisc-next/builds/173/steps/qemubuildcommand_1/logs/stdio
> 
> is an example log for parisc.
> 
> I didn't check if the other boot failures (ppc looks bad)
> have the same root cause.
> 
>> How to reproduce?
>>
> parisc:
> 
> qemu-system-hppa -kernel vmlinux -no-reboot \
>   -snapshot -device nvme,serial=foo,drive=d0 \
>   -drive file=rootfs.ext2,if=none,format=raw,id=d0 \
>   -append 'root=/dev/nvme0n1 rw rootwait panic=-1 console=ttyS0,115200 ' \
>   -nographic -monitor null
> 
> alpha:
> 
> qemu-system-alpha -M clipper -kernel arch/alpha/boot/vmlinux -no-reboot \
>   -snapshot -device nvme,serial=foo,drive=d0 \
>   -drive file=rootfs.ext2,if=none,format=raw,id=d0 \
>   -append 'root=/dev/nvme0n1 rw rootwait panic=-1 console=ttyS0' \
>   -m 128M -nographic -monitor null -serial stdio
> 
> sparc64:
> 
> qemu-system-sparc64 -M sun4u -cpu 'TI UltraSparc IIi' -m 512 \
>   -snapshot -device nvme,serial=foo,drive=d0,bus=pciB \
>   -drive file=rootfs.ext2,if=none,format=raw,id=d0 \
>   -kernel arch/sparc/boot/image -no-reboot \
>   -append 'root=/dev/nvme0n1 rw rootwait panic=-1 console=ttyS0' \
>   -nographic -monitor none
> 
> The root file systems are available from the respective subdirectories
> of:
> 
> https://github.com/groeck/linux-build-test/tree/master/rootfs

This is useful, thanks! I haven't tried it yet, but I was able to
reproduce on x86 with MSI turned off.


diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 8df868afa363..6c03461ad988 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2098,7 +2098,7 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int 
nr_io_queues)
.nr_sets = ARRAY_SIZE(irq_sets),
.sets = irq_sets,
};
-   int result;
+   int result = 0;
 
/*
 * For irq sets, we have to ask for minvec == maxvec. This passes
@@ -2113,9 +2113,16 @@ static int nvme_setup_irqs(struct nvme_dev *dev, int 
nr_io_queues)
affd.nr_sets = 1;
 
/*
-* Need IRQs for read+write queues, and one for the admin queue
+* Need IRQs for read+write queues, and one for the admin queue.
+* If we can't get more than one vector, we have to share the
+* admin queue and IO queue vector. For that case, don't add
+* an extra vector for the admin queue, or we'll continue
+* asking for 2 and get -ENOSPC in return.
 */
-   nr_io_queues = irq_sets[0] + irq_sets[1] + 1;
+   if (result == -ENOSPC && nr_io_queues == 1)
+   nr_io_queues = 1;
+   else
+   nr_io_queues = irq_sets[0] + irq_sets[1] + 1;
 
result = pci_alloc_irq_vectors_affinity(pdev, nr_io_queues,
nr_io_queues,
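
As a stripped-down illustration of the retry pattern the hunk above sits in
(hypothetical names, plain pci_alloc_irq_vectors() instead of the affinity-set
variant, so only a sketch): seed result so the first pass skips the -ENOSPC
special case, then shrink the request whenever the PCI core reports it cannot
supply that many vectors.

#include <linux/errno.h>
#include <linux/pci.h>

static int demo_alloc_vectors(struct pci_dev *pdev, unsigned int nr_io_queues)
{
        int result = 0;

        for (;;) {
                unsigned int nvecs;

                /* On retry with one IO queue left, share a single vector. */
                if (result == -ENOSPC && nr_io_queues == 1)
                        nvecs = 1;
                else
                        nvecs = nr_io_queues + 1;  /* +1 for the admin queue */

                result = pci_alloc_irq_vectors(pdev, nvecs, nvecs,
                                               PCI_IRQ_ALL_TYPES);
                if (result != -ENOSPC)
                        return result;  /* success, or an unrelated error */

                /*
                 * Not enough vectors: drop an IO queue, or give up once the
                 * minimal single-vector request has also been rejected.
                 */
                if (nr_io_queues > 1)
                        nr_io_queues--;
                else if (nvecs == 1)
                        return result;
        }
}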

-- 
Jens Axboe



Re: [PATCH] nvme: utilize two queue maps, one for reads and one for writes

2018-11-13 Thread Guenter Roeck
On Tue, Nov 13, 2018 at 05:51:08PM -0700, Jens Axboe wrote:
> On 11/13/18 5:41 PM, Guenter Roeck wrote:
> > Hi,
> > 
> > On Wed, Oct 31, 2018 at 08:36:31AM -0600, Jens Axboe wrote:
> >> NVMe does round-robin between queues by default, which means that
> >> sharing a queue map for both reads and writes can be problematic
> >> in terms of read servicing. It's much easier to flood the queue
> >> with writes and reduce the read servicing.
> >>
> >> Implement two queue maps, one for reads and one for writes. The
> >> write queue count is configurable through the 'write_queues'
> >> parameter.
> >>
> >> By default, we retain the previous behavior of having a single
> >> queue set, shared between reads and writes. Setting 'write_queues'
> >> to a non-zero value will create two queue sets, one for reads and
> >> one for writes, the latter using the configurable number of
> >> queues (hardware queue counts permitting).
> >>
> >> Reviewed-by: Hannes Reinecke 
> >> Reviewed-by: Keith Busch 
> >> Signed-off-by: Jens Axboe 
> > 
> > This patch causes hangs when running recent versions of
> > -next with several architectures; see the -next column at
> > kerneltests.org/builders for details.  Bisect log below; this
> > was run with qemu on alpha. Reverting this patch as well as
> > "nvme: add separate poll queue map" fixes the problem.
> 
> I don't see anything related to what hung, the trace, and so on.
> Can you clue me in? Where are the test results with dmesg?
> 
alpha just stalls during boot. parisc reports a hung task
in nvme_reset_work. sparc64 reports EIO when instantiating
the nvme driver, called from nvme_reset_work, and then stalls.
In all three cases, reverting the two mentioned patches fixes
the problem.

https://kerneltests.org/builders/qemu-parisc-next/builds/173/steps/qemubuildcommand_1/logs/stdio

is an example log for parisc.

I didn't check if the other boot failures (ppc looks bad)
have the same root cause.

> How to reproduce?
> 
parisc:

qemu-system-hppa -kernel vmlinux -no-reboot \
-snapshot -device nvme,serial=foo,drive=d0 \
-drive file=rootfs.ext2,if=none,format=raw,id=d0 \
-append 'root=/dev/nvme0n1 rw rootwait panic=-1 console=ttyS0,115200 ' \
-nographic -monitor null

alpha:

qemu-system-alpha -M clipper -kernel arch/alpha/boot/vmlinux -no-reboot \
-snapshot -device nvme,serial=foo,drive=d0 \
-drive file=rootfs.ext2,if=none,format=raw,id=d0 \
-append 'root=/dev/nvme0n1 rw rootwait panic=-1 console=ttyS0' \
-m 128M -nographic -monitor null -serial stdio

sparc64:

qemu-system-sparc64 -M sun4u -cpu 'TI UltraSparc IIi' -m 512 \
-snapshot -device nvme,serial=foo,drive=d0,bus=pciB \
-drive file=rootfs.ext2,if=none,format=raw,id=d0 \
-kernel arch/sparc/boot/image -no-reboot \
-append 'root=/dev/nvme0n1 rw rootwait panic=-1 console=ttyS0' \
-nographic -monitor none

The root file systems are available from the respective subdirectories
of:

https://github.com/groeck/linux-build-test/tree/master/rootfs

Guenter


Re: [PATCH] nvme: utilize two queue maps, one for reads and one for writes

2018-11-13 Thread Jens Axboe
On 11/13/18 5:41 PM, Guenter Roeck wrote:
> Hi,
> 
> On Wed, Oct 31, 2018 at 08:36:31AM -0600, Jens Axboe wrote:
>> NVMe does round-robin between queues by default, which means that
>> sharing a queue map for both reads and writes can be problematic
>> in terms of read servicing. It's much easier to flood the queue
>> with writes and reduce the read servicing.
>>
>> Implement two queue maps, one for reads and one for writes. The
>> write queue count is configurable through the 'write_queues'
>> parameter.
>>
>> By default, we retain the previous behavior of having a single
>> queue set, shared between reads and writes. Setting 'write_queues'
>> to a non-zero value will create two queue sets, one for reads and
>> one for writes, the latter using the configurable number of
>> queues (hardware queue counts permitting).
>>
>> Reviewed-by: Hannes Reinecke 
>> Reviewed-by: Keith Busch 
>> Signed-off-by: Jens Axboe 
> 
> This patch causes hangs when running recent versions of
> -next with several architectures; see the -next column at
> kerneltests.org/builders for details.  Bisect log below; this
> was run with qemu on alpha. Reverting this patch as well as
> "nvme: add separate poll queue map" fixes the problem.

I don't see anything related to what hung, the trace, and so on.
Can you clue me in? Where are the test results with dmesg?

How to reproduce?

-- 
Jens Axboe



Re: [PATCH] nvme: utilize two queue maps, one for reads and one for writes

2018-11-13 Thread Guenter Roeck
Hi,

On Wed, Oct 31, 2018 at 08:36:31AM -0600, Jens Axboe wrote:
> NVMe does round-robin between queues by default, which means that
> sharing a queue map for both reads and writes can be problematic
> in terms of read servicing. It's much easier to flood the queue
> with writes and reduce the read servicing.
> 
> Implement two queue maps, one for reads and one for writes. The
> write queue count is configurable through the 'write_queues'
> parameter.
> 
> By default, we retain the previous behavior of having a single
> queue set, shared between reads and writes. Setting 'write_queues'
> to a non-zero value will create two queue sets, one for reads and
> one for writes, the latter using the configurable number of
> queues (hardware queue counts permitting).
> 
> Reviewed-by: Hannes Reinecke 
> Reviewed-by: Keith Busch 
> Signed-off-by: Jens Axboe 

This patch causes hangs when running recent versions of
-next with several architectures; see the -next column at
kerneltests.org/builders for details.  Bisect log below; this
was run with qemu on alpha. Reverting this patch as well as
"nvme: add separate poll queue map" fixes the problem.

Guenter

---
# bad: [220dcf1c6fc97f8873b6d9fe121b80decd4b71a8] Add linux-next specific files 
for 20181113
# good: [ccda4af0f4b92f7b4c308d3acc262f4a7e3affad] Linux 4.20-rc2
git bisect start 'HEAD' 'v4.20-rc2'
# good: [b5ae1d7e1bd7cf5dfdef94da5cb69c60c911] Merge remote-tracking branch 
'crypto/master'
git bisect good b5ae1d7e1bd7cf5dfdef94da5cb69c60c911
# bad: [be877b847ffe96fb18e4925f05d5c2eb12b6b9f1] Merge remote-tracking branch 
'block/for-next'
git bisect bad be877b847ffe96fb18e4925f05d5c2eb12b6b9f1
# good: [088122c5028636643d566c89cfdc621beed79974] Merge remote-tracking branch 
'drm-intel/for-linux-next'
git bisect good 088122c5028636643d566c89cfdc621beed79974
# good: [3577f74d5ae898c34252c5cdc096a681910a919f] Merge remote-tracking branch 
'drm-misc/for-linux-next'
git bisect good 3577f74d5ae898c34252c5cdc096a681910a919f
# good: [72008e28c7bf500a03f09929176bd025de94861b] Merge remote-tracking branch 
'sound-asoc/for-next'
git bisect good 72008e28c7bf500a03f09929176bd025de94861b
# bad: [d1e36282b0bbd5de6a9c4d5275e94ef3b3438f48] block: add REQ_HIPRI and 
inherit it from IOCB_HIPRI
git bisect bad d1e36282b0bbd5de6a9c4d5275e94ef3b3438f48
# good: [4316b79e4321d4140164e42f228778e5bc66c84f] block: kill legacy parts of 
timeout handling
git bisect good 4316b79e4321d4140164e42f228778e5bc66c84f
# good: [a0fedc857dff457e123aeaa2039d28ac90e543df] Merge branch 'irq/for-block' 
of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into for-4.21/block
git bisect good a0fedc857dff457e123aeaa2039d28ac90e543df
# good: [b3c661b15d5ab11d982e58bee23e05c1780528a1] blk-mq: support multiple 
hctx maps
git bisect good b3c661b15d5ab11d982e58bee23e05c1780528a1
# good: [67cae4c948a5311121905a2a8740c50daf7f6478] blk-mq: cleanup and improve 
list insertion
git bisect good 67cae4c948a5311121905a2a8740c50daf7f6478
# good: [843477d4cc5c4bb4e346c561ecd3b9d0bd67e8c8] blk-mq: initial support for 
multiple queue maps
git bisect good 843477d4cc5c4bb4e346c561ecd3b9d0bd67e8c8
# bad: [3b6592f70ad7b4c24dd3eb2ac9bbe3353d02c992] nvme: utilize two queue maps, 
one for reads and one for writes
git bisect bad 3b6592f70ad7b4c24dd3eb2ac9bbe3353d02c992
# first bad commit: [3b6592f70ad7b4c24dd3eb2ac9bbe3353d02c992] nvme: utilize 
two queue maps, one for reads and one for writes

> ---
>  drivers/nvme/host/pci.c | 200 
> +++-
>  1 file changed, 181 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 49ad854d1b91..1987df13b73e 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -74,11 +74,29 @@ static int io_queue_depth = 1024;
>  module_param_cb(io_queue_depth, &io_queue_depth_ops, &io_queue_depth, 0644);
>  MODULE_PARM_DESC(io_queue_depth, "set io queue depth, should >= 2");
>  
> +static int queue_count_set(const char *val, const struct kernel_param *kp);
> +static const struct kernel_param_ops queue_count_ops = {
> + .set = queue_count_set,
> + .get = param_get_int,
> +};
> +
> +static int write_queues;
> +module_param_cb(write_queues, &queue_count_ops, &write_queues, 0644);
> +MODULE_PARM_DESC(write_queues,
> + "Number of queues to use for writes. If not set, reads and writes "
> + "will share a queue set.");
> +
>  struct nvme_dev;
>  struct nvme_queue;
>  
>  static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown);
>  
> +enum {
> + NVMEQ_TYPE_READ,
> + NVMEQ_TYPE_WRITE,
> + NVMEQ_TYPE_NR,
> +};
> +
>  /*
>   * Represents an NVM Express device.  Each nvme_dev is a PCI function.
>   */
> @@ -92,6 +110,7 @@ struct nvme_dev {
>   struct dma_pool *prp_small_pool;
>   unsigned online_queues;
>   unsigned max_qid;
> + unsigned io_queues[NVMEQ_TYPE_NR];
>   unsigned int num_vecs;
>   int q_depth;
>   u32 db_stride;
> @@ -134,6 +153,17 @@ static 
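
The setter body behind queue_count_ops is not part of the quoted hunks. A
hypothetical sketch of what such a module_param_cb() callback typically looks
like follows; the actual bounds enforced by queue_count_set() are an
assumption here:

#include <linux/errno.h>
#include <linux/kernel.h>
#include <linux/moduleparam.h>

static int queue_count_set(const char *val, const struct kernel_param *kp)
{
        int n, ret;

        /*
         * Parse and sanity-check the value, then store it with the stock
         * integer setter (reads go through param_get_int, as declared in
         * the queue_count_ops block quoted above).
         */
        ret = kstrtoint(val, 10, &n);
        if (ret)
                return ret;
        if (n < 0)
                return -EINVAL;

        return param_set_int(val, kp);
}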
