On 02/08/18 13:33, Ming Lei wrote:
> On Thu, Aug 02, 2018 at 01:09:31PM +0300, Adrian Hunter wrote:
>> On 31/07/18 19:25, Ming Lei wrote:
>>> Hi Peter,
>>>
>>> On Tue, Jul 31, 2018 at 08:47:45AM -0400, Peter Geis wrote:
>>>> Good Morning,
>>>>
>>>> On 07/30/2018 09:38 PM, Ming Lei wrote:
>>>>> Hi Peter,
>>>>>
>>>>> Thanks for collecting the log.
>>>>>
>>>>> On Mon, Jul 30, 2018 at 02:55:42PM -0400, Peter Geis wrote:
>>>>>>
>>>>>>
>>>>>> On 07/28/2018 09:37 AM, Ming Lei wrote:
>>>>>
>>>>> ...
>>>>>
>>>>>> [ 10.887209] systemd--112 0.n.1 2411122us : blk_mq_make_request: make rq -1
>>>>>> [ 10.890274] kworker/-98 0...1 2411506us : blk_mq_free_request: complete: rq -1
>>>>>> [ 10.893313] systemd--107 0...1 2412025us : blk_mq_make_request: make rq -1
>>>>>> [ 10.896354] systemd--107 0.... 2412323us : mmc_mq_queue_rq: queue rq -1, 0
>>>>>> [ 10.899388] systemd--107 0.... 2412327us : blk_mq_try_issue_list_directly: issue direct: rq -1, ret 0
>>>>>> [ 10.902463] (direxec-111 1...1 2413829us : blk_mq_make_request: make rq -1
>>>>>> [ 10.905513] systemd--114 1...1 2415159us : blk_mq_make_request: make rq -1
>>>>>
>>>>> Above is the most interesting part of the log. MMC sets the hw queue depth
>>>>> to 1, and you are using the none scheduler, which means the max number of
>>>>> in-flight requests should be one, but the above log shows that there may
>>>>> be 3 in-flight requests.
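
The reasoning behind that claim is that a request can only be dispatched while
it holds one of the hw queue's driver tags, so the tag-set depth bounds the
number of requests that can be in flight at once. A minimal userspace sketch of
that idea (an illustration only, not the kernel's sbitmap/blk-mq-tag code):

	#include <stdbool.h>
	#include <stdio.h>

	#define HW_QUEUE_DEPTH 1	/* the depth Ming describes above */

	static bool tag_in_use[HW_QUEUE_DEPTH];

	/* A request may only be issued to the driver while it owns a tag. */
	static int get_driver_tag(void)
	{
		for (int i = 0; i < HW_QUEUE_DEPTH; i++) {
			if (!tag_in_use[i]) {
				tag_in_use[i] = true;
				return i;
			}
		}
		return -1;	/* no tag free: the request has to wait */
	}

	static void put_driver_tag(int tag)
	{
		tag_in_use[tag] = false;
	}

	int main(void)
	{
		int a = get_driver_tag();	/* first request gets tag 0 */
		int b = get_driver_tag();	/* second request gets -1 */

		printf("rq1 tag=%d, rq2 tag=%d\n", a, b);
		put_driver_tag(a);		/* completion frees the tag */
		printf("rq2 retried, tag=%d\n", get_driver_tag());
		return 0;
	}

With a depth of 1, three simultaneously in-flight requests should not be
possible, which is why the trace above looks wrong.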
>>>>
>>>> That's odd, I have CFQ set as the default; is something changing this
>>>> during boot?
>>>> CONFIG_CFQ_GROUP_IOSCHED=y
>>>> CONFIG_DEFAULT_IOSCHED="cfq"
>>>
>>> No, mmc has now been converted to blk-mq, and the default mq io scheduler
>>> should have been mq-deadline for mmc, but I am not sure why it is none in
>>> your case.
>>>
>>>>
>>>>>
>>>>> That seems really weird, but it shouldn't be related to my two patches,
>>>>> which don't change the tag allocation behaviour at all. What may matter,
>>>>> however, is that the patch speeds up request dispatch. Maybe there is a
>>>>> bug in lib/sbitmap.c or block/blk-mq-tag.c.
>>>>>
>>>>> Unfortunately rq->tag wasn't shown in the log because I forgot to dump it
>>>>> in the debug patch, so could you apply the following new debug patch and
>>>>> provide the log again? BTW, please attach the ftrace log to the reply mail
>>>>> directly, so it can be parsed and inspected easily.
>>>>
>>>> I have resynced to the latest linux-next and applied your new patch.
>>>> The log is attached.
>>>>
>>>> Of note, it took several boots this time before it would progress to the
>>>> point where I could grab the log.
>>>> Instead it was blocking the moment RW was requested.
>>>> Also of note, it seems only the eMMC is affected; the SD card (mmcblk2)
>>>> does not trigger any errors.
>>>
>>> From the log you captured, it seems there are three requests (33, 34, 60)
>>> blocked, which should have been inside mmc, but that still needs to be
>>> confirmed. Could you apply the following one-line patch on top of the last
>>> debug patch and collect the log again?
>>>
>>> diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
>>> index d35f265cd5e0..ac2ffc5a8ed4 100644
>>> --- a/drivers/mmc/core/queue.c
>>> +++ b/drivers/mmc/core/queue.c
>>> @@ -321,6 +321,8 @@ static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
>>>  				   !host->hold_retune;
>>>  	}
>>>  
>>> +	if (blk_queue_debug(q))
>>> +		trace_printk("mmc before issue rq %d %d\n", req->internal_tag, req->tag);
>>>  	blk_mq_start_request(req);
>>>  
>>>  	issued = mmc_blk_mq_issue_rq(mq, req);
>>>
>>> Also see the comment in mmc_mq_queue_rq():
>>>
>>> /*
>>>  * We use BLK_MQ_F_BLOCKING and have only 1 hardware queue, which means
>>>  * requests will not be dispatched in parallel.
>>>  */
>>>
>>> which isn't correct: blk-mq has multiple sw queues, mmc sets the queue
>>> depth to > 1, and requests may be dispatched to the single hw queue in
>>> parallel.
>>>
>>> Adrian, Ulf Hansson and anyone else, could you take a look at the warning
>>> of WARN_ON(host->cmd) in sdhci_send_command()? It seems you only allow one
>>> command to be queued at a time, but I am not sure how you guarantee that.
>>
>> We didn't guarantee it, but it didn't happen before "blk-mq: issue directly
>> if hw queue isn't busy in case of 'none'".
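
For illustration, the failure mode being asked about is the classic one-slot
check-then-claim race. A toy userspace model (nothing here is sdhci code; the
single pointer merely stands in for host->cmd) shows why two dispatch contexts
hitting the host at the same time can trip a check like WARN_ON(host->cmd):

	#include <pthread.h>
	#include <stdio.h>
	#include <unistd.h>

	static void * volatile host_cmd;	/* one command slot, like host->cmd */

	static void *issue_cmd(void *name)
	{
		/* Unserialised check-then-claim, done from two contexts */
		if (host_cmd)
			fprintf(stderr, "%s: slot already busy (the WARN_ON case)\n",
				(char *)name);
		host_cmd = name;		/* claim the slot */
		usleep(10000);			/* "command executing" */
		host_cmd = NULL;		/* command done */
		return NULL;
	}

	int main(void)
	{
		pthread_t t1, t2;

		pthread_create(&t1, NULL, issue_cmd, "ctx0");
		usleep(1000);			/* let ctx0 claim the slot first */
		pthread_create(&t2, NULL, issue_cmd, "ctx1");
		pthread_join(t1, NULL);
		pthread_join(t2, NULL);
		return 0;
	}

With only one request ever dispatched at a time the second context never
exists, which is presumably why the warning only started showing up once the
issue-directly patch changed the timing.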
>
> OK, thanks for clarifying that; as I mentioned, what matters is the timing
> change.
>
>> We did consider adding a mutex; refer to
>> https://lore.kernel.org/lkml/CAPDyKFr8tiJXSL-weQjGJ3DfRrfv8ZAFY8=zeclngse_43s...@mail.gmail.com/
>>
>> However, do you think the following might do?
>
> If dispatching in parallel isn't supported, I am just wondering why not set
> the hw queue depth to 1? That would be a simple way to fix this issue.
First, it isn't 1. It is 2 for devices with no command queue because we
prepare a request while the previous one completes. Otherwise it is the
command queue depth.
Secondly, we expect an elevator to be used; the elevator needs a decent number
of requests to work with, and the default number of requests is currently tied
to the queue depth.
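
To put that in code terms, the depth selection described above looks roughly
like this (an illustrative sketch only, not the actual mmc code;
mmc_queue_depth_sketch() is a made-up helper name):

	/* Hypothetical helper mirroring the description above. */
	static int mmc_queue_depth_sketch(struct mmc_card *card)
	{
		/* Command queue: use the depth the device advertises */
		if (card->ext_csd.cmdq_en)
			return card->ext_csd.cmdq_depth;

		/*
		 * No command queue: depth 2, so the next request can be
		 * prepared while the previous one completes.
		 */
		return 2;
	}

And because the elevator's request pool is currently sized from that same tag
set depth, shrinking the depth to 1 would also starve the elevator.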
>
>>
>>
>> diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
>> index 648eb6743ed5..6edffeed9953 100644
>> --- a/drivers/mmc/core/queue.c
>> +++ b/drivers/mmc/core/queue.c
>> @@ -238,10 +238,6 @@ static void mmc_mq_exit_request(struct blk_mq_tag_set *set, struct request *req,
>>  	mmc_exit_request(mq->queue, req);
>>  }
>>  
>> -/*
>> - * We use BLK_MQ_F_BLOCKING and have only 1 hardware queue, which means requests
>> - * will not be dispatched in parallel.
>> - */
>>  static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
>>  				    const struct blk_mq_queue_data *bd)
>>  {
>> @@ -264,7 +260,7 @@ static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
>>  
>>  	spin_lock_irq(q->queue_lock);
>>  
>> -	if (mq->recovery_needed) {
>> +	if (mq->recovery_needed || mq->busy) {
>>  		spin_unlock_irq(q->queue_lock);
>>  		return BLK_STS_RESOURCE;
>>  	}
>> @@ -291,6 +287,9 @@ static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
>>  		break;
>>  	}
>>  
>> +	/* Parallel dispatch of requests is not supported at the moment */
>> +	mq->busy = true;
>> +
>>  	mq->in_flight[issue_type] += 1;
>>  	get_card = (mmc_tot_in_flight(mq) == 1);
>>  	cqe_retune_ok = (mmc_cqe_qcnt(mq) == 1);
>> @@ -333,9 +332,12 @@ static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
>>  		mq->in_flight[issue_type] -= 1;
>>  		if (mmc_tot_in_flight(mq) == 0)
>>  			put_card = true;
>> +		mq->busy = false;
>>  		spin_unlock_irq(q->queue_lock);
>>  		if (put_card)
>>  			mmc_put_card(card, &mq->ctx);
>> +	} else {
>> +		WRITE_ONCE(mq->busy, false);
>>  	}
>>  
>>  	return ret;
>> diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
>> index 17e59d50b496..9bf3c9245075 100644
>> --- a/drivers/mmc/core/queue.h
>> +++ b/drivers/mmc/core/queue.h
>> @@ -81,6 +81,7 @@ struct mmc_queue {
>>  	unsigned int		cqe_busy;
>>  #define MMC_CQE_DCMD_BUSY	BIT(0)
>>  #define MMC_CQE_QUEUE_FULL	BIT(1)
>> +	bool			busy;
>>  	bool			use_cqe;
>>  	bool			recovery_needed;
>>  	bool			in_recovery;
>
> Sorry, I am not familiar with the mmc code, so I can't comment on the above
> patch.
Right, but if we return BLK_STS_RESOURCE to avoid parallel dispatch, do we
need to worry about ensuring the queue gets run later?
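
For reference, the pattern other blk-mq drivers use when they return
BLK_STS_RESOURCE for a condition that is not guaranteed to be cleared by a
completion on the same hw queue is to kick the queue themselves, e.g. with
blk_mq_delay_run_hw_queue(). A rough sketch of how that could be bolted onto
the hunk above (illustrative only; the 3 ms delay is arbitrary, and whether it
is needed at all depends on blk-mq's RESTART handling for this case):

	spin_lock_irq(q->queue_lock);

	if (mq->recovery_needed || mq->busy) {
		spin_unlock_irq(q->queue_lock);
		/*
		 * Sketch: make sure the request is retried even if nothing
		 * else re-runs the hw queue in the meantime.
		 */
		blk_mq_delay_run_hw_queue(hctx, 3);
		return BLK_STS_RESOURCE;
	}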