Re: [PATCH V9 00/15] mmc: Add Command Queue support

2017-10-11 Thread Ulf Hansson
On 27 September 2017 at 00:25, Ulf Hansson wrote:
> On 22 September 2017 at 14:36, Adrian Hunter wrote:
>> Hi
>>
>> Here is V9 of the hardware command queue patches without the software
>> command queue patches, now using blk-mq and now with blk-mq support for
>> non-CQE I/O.
>>
>> HW CMDQ offers 25% - 50% better random multi-threaded I/O.  I see a slight
>> 2% drop in sequential read speed but no change to sequential write.
>>
>> Non-CQE blk-mq showed a 3% decrease in sequential read performance.  This
>> seemed to be coming from the inferior latency of running work items compared
>> with a dedicated thread.  Hacking blk-mq workqueue to be unbound reduced the
>> performance degradation from 3% to 1%.
>>
>> While we should look at changing blk-mq to give better workqueue performance,
>> a bigger gain is likely to be made by adding a new host API to enable the
>> next already-prepared request to be issued directly from within the ->done()
>> callback of the current request.
>
> I have looked at patches 1->8, and they look good to me, so I have
> applied them for next. I will do my best to review the rest asap;
> however, I am currently traveling, so in the worst case it will have to
> wait until next week.

Adrian, I decided to drop patch 6, the one which adds the Kconfig
option for blk-mq, for now. Please re-post it when you get to the point
of sending a new version of the series.

Kind regards
Uffe


Re: [PATCH V9 00/15] mmc: Add Command Queue support

2017-09-26 Thread Ulf Hansson
On 22 September 2017 at 14:36, Adrian Hunter wrote:
> Hi
>
> Here is V9 of the hardware command queue patches without the software
> command queue patches, now using blk-mq and now with blk-mq support for
> non-CQE I/O.
>
> HW CMDQ offers 25% - 50% better random multi-threaded I/O.  I see a slight
> 2% drop in sequential read speed but no change to sequential write.
>
> Non-CQE blk-mq showed a 3% decrease in sequential read performance.  This
> seemed to be coming from the inferior latency of running work items compared
> with a dedicated thread.  Hacking blk-mq workqueue to be unbound reduced the
> performance degradation from 3% to 1%.
>
> While we should look at changing blk-mq to give better workqueue performance,
> a bigger gain is likely to be made by adding a new host API to enable the
> next already-prepared request to be issued directly from within the ->done()
> callback of the current request.

I have looked at patches 1->8, and they look good to me, so I have
applied them for next. I will do my best to review the rest asap;
however, I am currently traveling, so in the worst case it will have to
wait until next week.

Thanks and kind regards!
Uffe

>
>
> Changes since V8:
> Re-based
>   mmc: core: Introduce host claiming by context
> Slightly simplified as per Ulf
>   mmc: core: Export mmc_retune_hold_now() and mmc_retune_release()
> New patch.
>   mmc: block: Add CQE and blk-mq support
> Fix missing ->post_req() on the error path
>
> Changes since V7:
> Re-based
>   mmc: core: Introduce host claiming by context
> Slightly simplified
>   mmc: core: Add parameter use_blk_mq
> New patch.
>   mmc: core: Remove unnecessary host claim
> New patch.
>   mmc: core: Export mmc_start_bkops()
> New patch.
>   mmc: core: Export mmc_start_request()
> New patch.
>   mmc: block: Add CQE and blk-mq support
> Add blk-mq support for non-CQE requests
>
> Changes since V6:
>   mmc: core: Introduce host claiming by context
> New patch.
>   mmc: core: Move mmc_start_areq() declaration
> Dropped because it has been applied
>   mmc: block: Fix block status codes
> Dropped because it has been applied
>   mmc: host: Add CQE interface
> Dropped because it has been applied
>   mmc: core: Turn off CQE before sending commands
> Dropped because it has been applied
>   mmc: block: Factor out mmc_setup_queue()
> New patch.
>   mmc: block: Add CQE support
> Drop legacy support and add blk-mq support
>
> Changes since V5:
> Re-based
>   mmc: core: Add mmc_retune_hold_now()
> Dropped because it has been applied
>   mmc: core: Add members to mmc_request and mmc_data for CQE's
> Dropped because it has been applied
>   mmc: core: Move mmc_start_areq() declaration
> New patch at Ulf's request
>   mmc: block: Fix block status codes
> Another un-related patch
>   mmc: host: Add CQE interface
> Move recovery_notifier() callback to struct mmc_request
>   mmc: core: Add support for handling CQE requests
> Roll __mmc_cqe_request_done() into mmc_cqe_request_done()
> Move function declarations requested by Ulf
>   mmc: core: Remove unused MMC_CAP2_PACKED_CMD
> Dropped because it has been applied
>   mmc: block: Add CQE support
> Add explanation to commit message
> Adjustment for changed recovery_notifier() callback
>   mmc: cqhci: support for command queue enabled host
> Adjustment for changed recovery_notifier() callback
>   mmc: sdhci-pci: Add CQHCI support for Intel GLK
> Add DCMD capability for Intel controllers except GLK
>
> Changes since V4:
>   mmc: core: Add mmc_retune_hold_now()
> Add explanation to commit message.
>   mmc: host: Add CQE interface
> Add comments to callback declarations.
>   mmc: core: Turn off CQE before sending commands
> Add explanation to commit message.
>   mmc: core: Add support for handling CQE requests
> Add comments as requested by Ulf.
>   mmc: core: Remove unused MMC_CAP2_PACKED_CMD
> New patch.
>   mmc: mmc: Enable Command Queuing
> Adjust for removal of MMC_CAP2_PACKED_CMD.
> Add a comment about Packed Commands.
>   mmc: mmc: Enable CQE's
> Remove un-necessary check for MMC_CAP2_CQE
>   mmc: block: Use local variables in mmc_blk_data_prep()
> New patch.
>   mmc: block: Prepare CQE data
> Adjust due to "mmc: block: Use local variables in mmc_blk_data_prep()"
> Remove priority setting.
> Add explanation to commit message.
>   mmc: cqhci: support for command queue enabled host
> Fix transfer descriptor setting in cqhci_set_tran_desc() for 32-bit DMA
>
> Changes since V3:
> Adjusted 

[PATCH V9 00/15] mmc: Add Command Queue support

2017-09-22 Thread Adrian Hunter
Hi

Here is V9 of the hardware command queue patches without the software
command queue patches, now using blk-mq and now with blk-mq support for
non-CQE I/O.

HW CMDQ offers 25% - 50% better random multi-threaded I/O.  I see a slight
2% drop in sequential read speed but no change to sequential write.

Non-CQE blk-mq showed a 3% decrease in sequential read performance.  This
seemed to be coming from the inferior latency of running work items compared
with a dedicated thread.  Hacking blk-mq workqueue to be unbound reduced the
performance degradation from 3% to 1%.

While we should look at changing blk-mq to give better workqueue performance,
a bigger gain is likely to be made by adding a new host API to enable the
next already-prepared request to be issued directly from within the ->done()
callback of the current request.
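
A rough user-space illustration of that last idea follows. It is only a
sketch under stated assumptions: struct sketch_request, sketch_issue(),
sketch_done() and run_sketch() are hypothetical names that do not exist in
the MMC core or in this series, and real code would also have to deal with
locking, interrupt context and error handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a request; not a real MMC core structure. */
struct sketch_request {
	int id;
	void (*done)(struct sketch_request *req);
	struct sketch_request *next_prepared;	/* already prepared, or NULL */
};

/* Counts how many requests were handed to the (pretend) host. */
static int issued_count;

/* Pretend to hand a request to the host controller. */
static void sketch_issue(struct sketch_request *req)
{
	assert(req != NULL);
	issued_count++;
	printf("issue request %d\n", req->id);
}

/*
 * Completion callback. Instead of deferring to a work item, issue the
 * already-prepared successor directly from here.
 */
static void sketch_done(struct sketch_request *req)
{
	printf("done request %d\n", req->id);
	if (req->next_prepared)
		sketch_issue(req->next_prepared);
}

/* Drive two chained requests and return how many were issued. */
int run_sketch(void)
{
	struct sketch_request r2 = { 2, sketch_done, NULL };
	struct sketch_request r1 = { 1, sketch_done, &r2 };

	issued_count = 0;
	sketch_issue(&r1);
	r1.done(&r1);	/* completing request 1 issues request 2 */
	r2.done(&r2);	/* nothing prepared after request 2 */
	return issued_count;
}
```

Calling run_sketch() from a small main() returns 2: request 2 is issued
from within the completion of request 1, with no work item in between.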


Changes since V8:
Re-based
  mmc: core: Introduce host claiming by context
      Slightly simplified as per Ulf
  mmc: core: Export mmc_retune_hold_now() and mmc_retune_release()
      New patch.
  mmc: block: Add CQE and blk-mq support
      Fix missing ->post_req() on the error path

Changes since V7:
Re-based
  mmc: core: Introduce host claiming by context
      Slightly simplified
  mmc: core: Add parameter use_blk_mq
      New patch.
  mmc: core: Remove unnecessary host claim
      New patch.
  mmc: core: Export mmc_start_bkops()
      New patch.
  mmc: core: Export mmc_start_request()
      New patch.
  mmc: block: Add CQE and blk-mq support
      Add blk-mq support for non-CQE requests

Changes since V6:
  mmc: core: Introduce host claiming by context
      New patch.
  mmc: core: Move mmc_start_areq() declaration
      Dropped because it has been applied
  mmc: block: Fix block status codes
      Dropped because it has been applied
  mmc: host: Add CQE interface
      Dropped because it has been applied
  mmc: core: Turn off CQE before sending commands
      Dropped because it has been applied
  mmc: block: Factor out mmc_setup_queue()
      New patch.
  mmc: block: Add CQE support
      Drop legacy support and add blk-mq support

Changes since V5:
Re-based
  mmc: core: Add mmc_retune_hold_now()
      Dropped because it has been applied
  mmc: core: Add members to mmc_request and mmc_data for CQE's
      Dropped because it has been applied
  mmc: core: Move mmc_start_areq() declaration
      New patch at Ulf's request
  mmc: block: Fix block status codes
      Another un-related patch
  mmc: host: Add CQE interface
      Move recovery_notifier() callback to struct mmc_request
  mmc: core: Add support for handling CQE requests
      Roll __mmc_cqe_request_done() into mmc_cqe_request_done()
      Move function declarations requested by Ulf
  mmc: core: Remove unused MMC_CAP2_PACKED_CMD
      Dropped because it has been applied
  mmc: block: Add CQE support
      Add explanation to commit message
      Adjustment for changed recovery_notifier() callback
  mmc: cqhci: support for command queue enabled host
      Adjustment for changed recovery_notifier() callback
  mmc: sdhci-pci: Add CQHCI support for Intel GLK
      Add DCMD capability for Intel controllers except GLK

Changes since V4:
  mmc: core: Add mmc_retune_hold_now()
      Add explanation to commit message.
  mmc: host: Add CQE interface
      Add comments to callback declarations.
  mmc: core: Turn off CQE before sending commands
      Add explanation to commit message.
  mmc: core: Add support for handling CQE requests
      Add comments as requested by Ulf.
  mmc: core: Remove unused MMC_CAP2_PACKED_CMD
      New patch.
  mmc: mmc: Enable Command Queuing
      Adjust for removal of MMC_CAP2_PACKED_CMD.
      Add a comment about Packed Commands.
  mmc: mmc: Enable CQE's
      Remove un-necessary check for MMC_CAP2_CQE
  mmc: block: Use local variables in mmc_blk_data_prep()
      New patch.
  mmc: block: Prepare CQE data
      Adjust due to "mmc: block: Use local variables in mmc_blk_data_prep()"
      Remove priority setting.
      Add explanation to commit message.
  mmc: cqhci: support for command queue enabled host
      Fix transfer descriptor setting in cqhci_set_tran_desc() for 32-bit DMA

Changes since V3:
Adjusted ...blk_end_request...() for new block status codes
Fixed CQHCI transaction descriptor for "no DCMD" case

Changes since V2:
Dropped patches that have been applied.
Re-based
Added "mmc: sdhci-pci: Add CQHCI support for Intel GLK"

Changes since V1:

"Share mmc request array between partitions" is dependent
on changes in "Introduce queue semantics", so added that
and block fixes:

Added "Fix is_waiting_last_req set incorrectly"
Added "Fix cmd error reset failure path"
Added