On 12/06/2019 19:25, Bjorn Andersson wrote:

> On Wed 12 Jun 09:24 PDT 2019, Marc Gonzalez wrote:
> 
>> On 05/06/2019 01:24, Bjorn Andersson wrote:
>>
>>> After issuing a PHY_START request to the QMP, the hardware documentation
>>> states that the software should wait for the PCS_READY_STATUS to become 1.
>>>
>>> With the introduction of c9b589791fc1 ("phy: qcom: Utilize UFS reset
>>> controller") an additional 1ms delay was introduced between the start
>>> request and the check of the status bit. This greatly increases the
>>> chances of the hardware actually becoming ready before the status
>>> bit is read.
>>>
>>> The result can be seen in that UFS PHY enabling is now reported as a
>>> failure in 10% of the boots on SDM845, which is a clear regression from
>>> the previous rare/occasional failure.
>>>
>>> This patch fixes the "break condition" of the poll to check for the
>>> correct state of the status bit.
>>>
>>> Unfortunately PCIe on 8996 and 8998 does not specify the mask_pcs_ready
>>> register, which means that the code checks a bit that's always 0. So the
>>> patch also fixes these, in order to not regress these targets.
>>>
>>> Cc: sta...@vger.kernel.org
>>> Cc: Evan Green <evgr...@chromium.org>
>>> Cc: Marc Gonzalez <marc.w.gonza...@free.fr>
>>> Cc: Vivek Gautam <vivek.gau...@codeaurora.org>
>>> Fixes: 73d7ec899bd8 ("phy: qcom-qmp: Add msm8998 PCIe QMP PHY support")
>>> Fixes: e78f3d15e115 ("phy: qcom-qmp: new qmp phy driver for qcom-chipsets")
>>> Signed-off-by: Bjorn Andersson <bjorn.anders...@linaro.org>
>>> ---
>>>
>>> @Kishon, this is a regression spotted in v5.2-rc1, so please consider applying
>>> this towards v5.2.
>>>
>>>  drivers/phy/qualcomm/phy-qcom-qmp.c | 4 +++-
>>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.c b/drivers/phy/qualcomm/phy-qcom-qmp.c
>>> index cd91b4179b10..43abdfd0deed 100644
>>> --- a/drivers/phy/qualcomm/phy-qcom-qmp.c
>>> +++ b/drivers/phy/qualcomm/phy-qcom-qmp.c
>>> @@ -1074,6 +1074,7 @@ static const struct qmp_phy_cfg msm8996_pciephy_cfg = {
>>>  
>>>     .start_ctrl             = PCS_START | PLL_READY_GATE_EN,
>>>     .pwrdn_ctrl             = SW_PWRDN | REFCLK_DRV_DSBL,
>>> +   .mask_pcs_ready         = PHYSTATUS,
>>>     .mask_com_pcs_ready     = PCS_READY,
>>>  
>>>     .has_phy_com_ctrl       = true,
>>> @@ -1253,6 +1254,7 @@ static const struct qmp_phy_cfg msm8998_pciephy_cfg = {
>>>  
>>>     .start_ctrl             = SERDES_START | PCS_START,
>>>     .pwrdn_ctrl             = SW_PWRDN | REFCLK_DRV_DSBL,
>>> +   .mask_pcs_ready         = PHYSTATUS,
>>>     .mask_com_pcs_ready     = PCS_READY,
>>>  };
>>>  
>>> @@ -1547,7 +1549,7 @@ static int qcom_qmp_phy_enable(struct phy *phy)
>>>     status = pcs + cfg->regs[QPHY_PCS_READY_STATUS];
>>>     mask = cfg->mask_pcs_ready;
>>>  
>>> -   ret = readl_poll_timeout(status, val, !(val & mask), 1,
>>> +   ret = readl_poll_timeout(status, val, val & mask, 1,
>>>                              PHY_INIT_COMPLETE_TIMEOUT);
>>>     if (ret) {
>>>             dev_err(qmp->dev, "phy initialization timed-out\n");
>>
>> Your patch made me realize that:
>> msm8998_pciephy_cfg.has_phy_com_ctrl = false
>> thus
>> msm8998_pciephy_cfg.mask_com_pcs_ready is useless, AFAICT.
> 
> While 8998 has a COM block, it does (among other things) not have a
> ready bit. So afaict has_phy_com_ctrl = false is correct.

Pfff... Working blind without the HPG sucks...

> The addition of mask_pcs_ready is part of resolving the regression in
> 5.2, so I suggest that we remove mask_com_pcs_ready separately.

I agree that it should be done separately.
I'll send a patch on top of yours.

>> (I copied msm8996_pciephy_cfg for msm8998_pciephy_cfg)
>>
>> Does msm8996_pciephy_cfg really need both mask_pcs_ready AND
>> mask_com_pcs_ready?
> 
> 8996 has a COM block and it contains both the control bits and the
> status bits, so that looks correct.

Thanks for checking.

>> I'll test your patch tomorrow.
> 
> I appreciate that.

Here are my observations for an 8998 board:

1) If I apply only the readl_poll_timeout() fix (not the mask_pcs_ready fixup),
qcom_pcie_probe() fails with a timeout in phy_init.
=> this is in line with your regression analysis.

2) Your patch also fixes a long-standing bug in UFS init whereby sending
lots of information to the console during phy init would lead to an
incorrectly diagnosed time-out.
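
For reference, both observations can be reproduced with a toy model of the
poll loop. This is only a sketch under stated assumptions, not the driver
code: fake_readl(), poll_status(), run_case(), check_poll_model() and
MODEL_READY_BIT are hypothetical names, the "register" is a plain struct
that reports the ready bit after a given number of reads, and the ready bit
is assumed to become 1 once the PHY is up, as the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ready bit for this model; not taken from the driver headers. */
#define MODEL_READY_BIT (1u << 6)

/* Toy status register: reads return 0 until the "hardware" becomes ready. */
struct fake_status {
	int reads_until_ready;
};

static uint32_t fake_readl(struct fake_status *s)
{
	if (s->reads_until_ready > 0) {
		s->reads_until_ready--;
		return 0;
	}
	return MODEL_READY_BIT;
}

/*
 * Poll up to max_reads times. With old_cond the loop stops when the masked
 * bit is clear, otherwise it stops when the masked bit is set.
 * Returns 0 on success and -1 on timeout.
 */
static int poll_status(struct fake_status *s, uint32_t mask, bool old_cond,
		       int max_reads)
{
	for (int i = 0; i < max_reads; i++) {
		uint32_t val = fake_readl(s);
		bool done = old_cond ? !(val & mask) : (val & mask) != 0;

		if (done)
			return 0;
	}
	return -1;
}

/* Run one scenario against a fresh fake register. */
static int run_case(int reads_until_ready, uint32_t mask, bool old_cond)
{
	struct fake_status s = { .reads_until_ready = reads_until_ready };

	return poll_status(&s, mask, old_cond, 100);
}

void check_poll_model(void)
{
	/* Old condition, PHY already ready at the first read: bogus timeout. */
	assert(run_case(0, MODEL_READY_BIT, true) == -1);
	/* Old condition, PHY not ready yet: "success" is reported too early. */
	assert(run_case(3, MODEL_READY_BIT, true) == 0);
	/* Old condition, mask left at 0: passes without checking anything. */
	assert(run_case(0, 0, true) == 0);
	/* New condition, mask left at 0: always times out (observation 1). */
	assert(run_case(0, 0, false) == -1);
	/* New condition, mask set: succeeds whether or not the PHY was early. */
	assert(run_case(3, MODEL_READY_BIT, false) == 0);
	assert(run_case(0, MODEL_READY_BIT, false) == 0);
}
```

Calling check_poll_model() from a small main() exercises all six cases. The
first case is the one that matches 2): anything that delays the first read
(the extra 1ms, or heavy console output) lets the bit reach 1 before the
poll starts, and the old condition then never sees it clear.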

Good stuff!

Reviewed-by: Marc Gonzalez <marc.w.gonza...@free.fr>
Tested-by: Marc Gonzalez <marc.w.gonza...@free.fr>

Regards.
