On Thu, Oct 25, 2012 at 8:32 PM, Per Förlin <[email protected]> wrote:
> On 10/25/2012 03:28 PM, Konstantin Dorfman wrote:
>> On 10/24/2012 07:07 PM, Per Förlin wrote:
>>> On 10/24/2012 11:41 AM, Konstantin Dorfman wrote:
>>>> Hello Per,
>>>>
>>>> On Mon, October 22, 2012 1:02 am, Per Forlin wrote:
>>>>>> When mmcqd reports on completion of a request there should be
>>>>>> a context switch to allow the insertion of the next read ahead BIOs
>>>>>> to the block layer. Since the mmcqd tries to fetch another request
>>>>>> immediately after the completion of the previous request it gets NULL
>>>>>> and starts waiting for the completion of the previous request.
>>>>>> This wait on completion gives the FS the opportunity to insert the next
>>>>>> request but the MMC layer is already blocked on the previous request
>>>>>> completion and is not aware of the new request waiting to be fetched.
>>>>> I thought that I could trigger a context switch in order to give
>>>>> execution time for FS to add the new request to the MMC queue.
>>>>> I made a simple hack to call yield() in case the request gets NULL. I
>>>>> thought it may give the FS layer enough time to add a new request to
>>>>> the MMC queue. This would not delay the MMC transfer since the yield()
>>>>> is done in parallel with an ongoing transfer. Anyway it was just meant
>>>>> to be a simple test.
>>>>>
>>>>> One yield was not enough. Just as a sanity check I added an msleep as
>>>>> well, and that was enough to let the FS add a new request.
>>>>> Would it be possible to gain throughput by delaying the fetch of the
>>>>> new request, to avoid unnecessary NULL requests?
>>>>>
>>>>> If (ongoing request is read AND size is max read ahead AND new request
>>>>> is NULL) yield();
>>>>>
>>>>> BR
>>>>> Per
>>>> We did the same experiment and it does not give the maximum possible
>>>> performance. There is no guarantee that the context switch manually
>>>> triggered by the MMC layer comes just in time: if it comes too early,
>>>> the next fetch still returns NULL; if it comes too late, we miss the
>>>> chance to fetch/prepare the new request.
>>>>
>>>> Any delay in fetching the new request after it has arrived hurts
>>>> throughput and latency.
>>>>
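
To make the timing argument above concrete, here is a minimal sketch of a
single delayed fetch attempt. It is only a toy model: fetch_after_delay and
the arrival times are hypothetical, and nothing here comes from the MMC code.

```python
from typing import Optional


def fetch_after_delay(arrival_us: int, delay_us: int) -> Optional[int]:
    """Model one fetch attempt made delay_us after the previous fetch.

    Hypothetical helper, not kernel code. Returns None when the fetch runs
    before the request has arrived (the NULL case); otherwise returns how
    long (in microseconds) the request sat in the queue unfetched.
    """
    if delay_us < arrival_us:
        return None  # yielded too early: the queue is still empty
    return delay_us - arrival_us  # yielded too late: time lost before prepare


# Assumed arrival times; the real ones depend on the FS and the scheduler.
arrivals_us = [200, 1000, 3500]
for delay_us in (500, 2000):
    outcomes = [fetch_after_delay(a, delay_us) for a in arrivals_us]
    print(delay_us, outcomes)
```

No single fixed delay gives both "not None" and "zero wait" for every
arrival time, which is why a fixed yield or sleep cannot be tuned to fit.
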
>>>> The solution we are talking about here will fix not only the FS
>>>> read-ahead case, but will also remove the penalty of the MMC context
>>>> waiting on completion while a new request arrives.
>>>>
>>>> Thanks,
>>>>
>>> It seems strange that the block layer cannot keep up with relatively slow
>>> flash media devices. There must be a limit on the number of outstanding
>>> requests towards MMC.
>>> I need to make up my mind whether it is best to address this issue in the
>>> MMC framework or in the block layer. I have started to look into the block
>>> layer code but it will take some time to dig out the relevant parts.
>>>
>>> BR
>>> Per
>>>
>> The root cause of the issue is that the current design is an incomplete
>> implementation of the well-known producer-consumer solution (the producer
>> is the block layer, the consumer is the mmc layer).
>> The classic definition states that the buffer has a fixed size; in our
>> case we have a queue, so the Producer is always able to put a new request
>> into the queue.
>> The Consumer context is blocked when both buffers (curr and prev) are busy
>> (the first has started its execution on the bus, the second is fetched and
>> waiting for the first).
> This happens, but I thought that the block layer would continue to add
> requests to the MMC queue while the consumer is busy.
> When the consumer fetches a request from the queue again there should be
> several requests available in the queue, but there is only one.
>
>> The Producer context is considered to be blocked when the FS (or other
>> bio sources) has no requests to put into the queue.
> Does the block layer ever wait for outstanding requests to finish? Could this
> be another reason why the producer doesn't add new requests to the MMC queue?
>

Actually there could be a lot of reasons why the block layer or CFQ would not
have inserted the request into the queue. E.g. you can see a lot of exit paths
where blk_peek_request returns NULL even though there could be a request
pending in one of the CFQ request queues.

Essentially you need to check what makes blk_fetch_request
(cfq_dispatch_requests()) return NULL when there is an element in the
queue, if it does at all.
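
As a rough illustration of how a dispatcher can return nothing while
requests are still queued, here is a toy user-space model. It is not the
CFQ code: ToyIdlingScheduler, its idle-window rule and all names are
assumptions, loosely inspired by a scheduler idling on its active queue.

```python
from collections import deque


class ToyIdlingScheduler:
    """Hypothetical model: per-owner queues plus an idle window.

    While the idle window for the active owner is open, dispatch() returns
    None even if other owners have requests pending.
    """

    def __init__(self, idle_window):
        self.idle_window = idle_window
        self.queues = {}      # owner -> deque of pending requests
        self.active = None    # owner currently being served
        self.idle_until = 0   # time until which we wait for the active owner

    def add(self, owner, request):
        self.queues.setdefault(owner, deque()).append(request)

    def pending(self):
        return sum(len(q) for q in self.queues.values())

    def dispatch(self, now):
        # Serve the active owner first and re-arm its idle window.
        q = self.queues.get(self.active)
        if q:
            self.idle_until = now + self.idle_window
            return q.popleft()
        # Active queue is empty: keep waiting for it while the window is open.
        if self.active is not None and now < self.idle_until:
            return None  # other queues may hold requests, yet nothing is returned
        # Window expired (or no active owner): switch to any non-empty queue.
        for owner, q in self.queues.items():
            if q:
                self.active = owner
                self.idle_until = now + self.idle_window
                return q.popleft()
        return None


sched = ToyIdlingScheduler(idle_window=8)
sched.add("A", "a1")
sched.add("B", "b1")
first = sched.dispatch(now=0)    # serves owner A
second = sched.dispatch(now=1)   # idling on A although B is pending
third = sched.dispatch(now=9)    # window expired, switches to B
print(first, second, third)
```

In this model the caller sees an empty result at now=1 while pending() is
still 1, which is the kind of exit path worth looking for in the real code.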

>> To maximize performance, two notifications should be used:
>> 1. Producer notifies Consumer about a new item to process.
>> 2. Consumer notifies Producer about free space.
>>
>> In our case the 2nd notification is not needed since, as I said before,
>> there is always free space in the queue.
>> The 1st notification does not exist, i.e. the block layer has no way to
>> notify the mmc layer that a new request has arrived.
>>
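
The missing first notification can be sketched in user space with a
condition variable. This is only a model of the idea under discussion, not
a proposed patch: RequestQueue, submit, complete and wait_event are
hypothetical names, and the kernel would use its own wait primitives.

```python
import threading
from collections import deque


class RequestQueue:
    """Hypothetical model: the consumer sleeps on ONE condition that is
    signalled both by request completion and by new-request arrival."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = deque()
        self._done = False  # set when the in-flight request completes

    def submit(self, request):
        # Producer side (block layer in the discussion): queue and notify.
        with self._cond:
            self._pending.append(request)
            self._cond.notify()

    def complete(self):
        # Completion side (host controller in the discussion).
        with self._cond:
            self._done = True
            self._cond.notify()

    def wait_event(self, timeout=None):
        # Consumer side: wake on whichever event comes first.
        with self._cond:
            self._cond.wait_for(
                lambda: bool(self._pending) or self._done, timeout
            )
            if self._pending:
                return ("new", self._pending.popleft())
            if self._done:
                self._done = False
                return ("done", None)
            return ("timeout", None)


queue = RequestQueue()
# A request arrives while the consumer is waiting on the in-flight one.
threading.Timer(0.05, queue.submit, args=("read-ahead",)).start()
event = queue.wait_event(timeout=2.0)  # woken by the arrival, not completion
queue.complete()
after = queue.wait_event(timeout=2.0)
print(event, after)
```

Because the consumer is woken by the arrival itself, it can fetch and
prepare the new request while the previous one is still on the bus, with no
dependence on how a yield or sleep happens to be timed.
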
>> What you are suggesting resolves one specific case, where the behavior of
>> the FS READ_AHEAD mechanism causes delays in producing new requests.
>> You can probably resolve this specific case, but do you have a guarantee
>> that it is the only case that causes delays between new request events?
>> Flash memory devices these days are constantly improving on all levels:
>> NAND, firmware, bus speed and host controller capabilities. This makes any
>> yield/sleep/timeout solution only a temporary hack.
> I never meant yield or sleep to be a permanent fix. I was only curious how
> it would affect the performance, in order to gain a better understanding of
> the root cause.
> My impression is that even if the SD card is very slow you will see the same
> effect. The behavior of the block layer in this case is not related to the
> speed of the flash memory.
> On a slow card the MMC queue runs empty just like it does for a fast eMMC.
> According to you, the block layer should have a better chance to feed the MMC
> queue if the card is slow (more time for the block layer to prepare the next
> requests).
>
> BR
> Per