Quoting Kahola, Mika (2023-08-29 06:35:17-03:00)
>> -----Original Message-----
>> From: Intel-gfx <[email protected]> On Behalf Of
>> Sripada, Radhakrishna
>> Sent: Tuesday, August 29, 2023 1:54 AM
>> To: Sousa, Gustavo <[email protected]>; [email protected]
>> Subject: Re: [Intel-gfx] [PATCH] drm/i915/cx0: Check and increase msgbus
>> timeout threshold
>>
>> Hi Gustavo,
>>
>> > -----Original Message-----
>> > From: Sousa, Gustavo <[email protected]>
>> > Sent: Monday, August 28, 2023 2:46 PM
>> > To: Sripada, Radhakrishna <[email protected]>; intel-
>> > [email protected]
>> > Subject: RE: [Intel-gfx] [PATCH] drm/i915/cx0: Check and increase
>> > msgbus timeout threshold
>> >
>> > Quoting Sripada, Radhakrishna (2023-08-28 17:30:19-03:00)
>> > >Hi Gustavo,
>> >
>> > Hi, RK.
>> >
>> > Thanks for the feedback! Please, see my replies below.
>> >
>> > >
>> > >> -----Original Message-----
>> > >> From: Intel-gfx <[email protected]> On Behalf
>> > >> Of
>> > Gustavo
>> > >> Sousa
>> > >> Sent: Monday, August 28, 2023 10:34 AM
>> > >> To: [email protected]
>> > >> Subject: [Intel-gfx] [PATCH] drm/i915/cx0: Check and increase
>> > >> msgbus
>> > timeout
>> > >> threshold
>> > >>
>> > >> We have experienced timeout issues when accessing C20 SRAM registers.
>> > >This wording is misleading. It seems to imply that we directly access
>> > >SRAM registers via msg bus but the reads/writes go through
>> > >intermediate registers and it is this read/write that is failing. Adding
>> > >extra clarity here would be helpful.
>> >
>> > Hm... Okay. I thought that would already be implied to someone
>> > familiar with the code. What about:
>> >
>> > s/when accessing/when going through the sequence to access/
>> This is better to indicate that it is not direct access.
>>
>> >
>> > ?
>> >
>> > >
>> > >> Experimentation showed that bumping the message bus timer threshold
>> > >> helped on getting display Type-C connection on the C20 PHY to work.
>> > >>
>> > >> While the timeout is still under investigation with the HW team,
>> > >> having logic to allow forward progress (with the proper warnings) seems
>> > >> useful.
>> > >> Thus, let's bump the threshold when a timeout is detected.
>> > >>
>> > >> The maximum value of 0x200 pclk cycles was somewhat arbitrary - 2x
>> > >> the default value. That value was successfully tested on real
>> > >> hardware that was displaying timeouts otherwise.
>> > >>
>> > >> BSpec: 65156
>> > >> Signed-off-by: Gustavo Sousa <[email protected]>
>> > >> ---
>> > >> drivers/gpu/drm/i915/display/intel_cx0_phy.c | 41
>> > >> +++++++++++++++++++ .../gpu/drm/i915/display/intel_cx0_phy_regs.h
>> > >> | 12 ++++++
>> > >> 2 files changed, 53 insertions(+)
>> > >>
>> > >> diff --git a/drivers/gpu/drm/i915/display/intel_cx0_phy.c
>> > >> b/drivers/gpu/drm/i915/display/intel_cx0_phy.c
>> > >> index dd489b50ad60..55d2333032e3 100644
>> > >> --- a/drivers/gpu/drm/i915/display/intel_cx0_phy.c
>> > >> +++ b/drivers/gpu/drm/i915/display/intel_cx0_phy.c
>> > >> @@ -5,6 +5,7 @@
>> > >>
>> > >> #include <linux/log2.h>
>> > >> #include <linux/math64.h>
>> > >> +#include <linux/minmax.h>
>> > >> #include "i915_reg.h"
>> > >> #include "intel_cx0_phy.h"
>> > >> #include "intel_cx0_phy_regs.h"
>> > >> @@ -29,6 +30,8 @@
>> > >> #define INTEL_CX0_LANE1 BIT(1)
>> > >> #define INTEL_CX0_BOTH_LANES (INTEL_CX0_LANE1 |
>> > >> INTEL_CX0_LANE0)
>> > >>
>> > >> +#define INTEL_CX0_MSGBUS_TIMER_VAL_MAX 0x200
>> > >> +
>> > >> bool intel_is_c10phy(struct drm_i915_private *i915, enum phy phy)
>> > >> {
>> > >> if (DISPLAY_VER_FULL(i915) == IP_VER(14, 0) && phy <
>> > >> PHY_C) @@ -119,6 +122,43 @@ static void intel_cx0_bus_reset(struct
>> > drm_i915_private
>> > >> *i915, enum port port, i
>> > >> intel_clear_response_ready_flag(i915, port, lane); }
>> > >>
>> > >> +/*
>> > >> + * Check if there was a timeout detected by the hardware in the
>> > >> +message bus
>> > >> + * and bump the threshold if so.
>
>Just a thought but wouldn't it be simpler if we just set the timeout to it's
>maximum and discard the check here?
I'm not sure which exactly check you are talking about here:
1) The check on the XELPDP_PORT_MSGBUS_TIMER_TIMED_OUT.
I think this could give us useful debugging info, for example, if our code
thinks the message bus timed out while the bus hardware did not say so. I
would prefer to keep this one, if you are okay with it.
2) The check on the current value of the threshold and the exponential
increase up to the maximum.
I would agree that I did a bit of over-engineering here. I guess I could
simply to a rmw using INTEL_CX0_MSGBUS_TIMER_VAL_MAX directly as you
suggested; and maybe rename INTEL_CX0_MSGBUS_TIMER_VAL_MAX to
INTEL_CX0_MSGBUS_TIMER_BUMPED_VAL, just not to give the impression that
0x200 is the maximum accepted by the hardware.
What do you think?
--
Gustavo Sousa
>
>-Mika-
>
>> > >> + */
>> > >> +static void intel_cx0_bus_check_and_bump_timer(struct
>> > >> +drm_i915_private
>> > >> *i915,
>> > >> + enum port port, int
>> > >> +lane) {
>> > >> + enum phy phy = intel_port_to_phy(i915, port);
>> > >> + i915_reg_t reg;
>> > >> + u32 val;
>> > >> + u32 timer_val;
>> > >> +
>> > >> + reg = XELPDP_PORT_MSGBUS_TIMER(port, lane);
>> > >> + val = intel_de_read(i915, reg);
>> > >> +
>> > >> + if (!(val & XELPDP_PORT_MSGBUS_TIMER_TIMED_OUT)) {
>> > >> + drm_warn(&i915->drm,
>> > >> + "PHY %c lane %d: hardware did not detect
>> > >> + a
>> > >> timeout\n",
>> > >> + phy_name(phy), lane);
>> > >> + return;
>> > >> + }
>> > >> +
>> > >> + timer_val =
>> > >> REG_FIELD_GET(XELPDP_PORT_MSGBUS_TIMER_VAL_MASK, val);
>> > >> +
>> > >> + if (timer_val == INTEL_CX0_MSGBUS_TIMER_VAL_MAX)
>> > >> + return;
>> > >> +
>> > >> + timer_val = min(2 * timer_val,
>> > >> (u32)INTEL_CX0_MSGBUS_TIMER_VAL_MAX);
>> > > ^ is this cast necessary?
>> >
>> > Yes. I remember getting a warning without it. Let me remove it to
>> > remember...
>> Got it thanks for the quick check.
>>
>> >
>> > ...done! I think it complains because the arguments are of different types:
>> >
>> > In file included from drivers/gpu/drm/i915/display/intel_cx0_phy.c:8:
>> > drivers/gpu/drm/i915/display/intel_cx0_phy.c: In function
>> > ‘intel_cx0_bus_check_and_bump_timer’:
>> > ./include/linux/minmax.h:20:35: error: comparison of distinct
>> > pointer types lacks a cast [-Werror]
>> > 20 | (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
>> > | ^~
>> > ./include/linux/minmax.h:26:18: note: in expansion of macro
>> > ‘__typecheck’
>> > 26 | (__typecheck(x, y) && __no_side_effects(x, y))
>> > | ^~~~~~~~~~~
>> > ./include/linux/minmax.h:36:31: note: in expansion of macro
>> > ‘__safe_cmp’
>> > 36 | __builtin_choose_expr(__safe_cmp(x, y), \
>> > | ^~~~~~~~~~
>> > ./include/linux/minmax.h:67:25: note: in expansion of macro
>> > ‘__careful_cmp’
>> > 67 | #define min(x, y) __careful_cmp(x, y, <)
>> > | ^~~~~~~~~~~~~
>> > drivers/gpu/drm/i915/display/intel_cx0_phy.c:152:21: note: in
>> > expansion of macro ‘min’
>> > 152 | timer_val = min(2 * timer_val,
>> > INTEL_CX0_MSGBUS_TIMER_VAL_MAX);
>> > |
>> >
>> > >
>> > >> + val &= ~XELPDP_PORT_MSGBUS_TIMER_VAL_MASK;
>> > >> + val |= REG_FIELD_PREP(XELPDP_PORT_MSGBUS_TIMER_VAL_MASK,
>> > >> timer_val);
>> > >We do not use REG_FIELD_PREP in the function. Instead we use it in
>> > >register definition in intel_cx0_phy_regs.h.
>> >
>> > I think it makes sense have definitions using REG_FIELD_PREP() in
>> > header files and use those definitions in .c files for fields whose
>> > values are are enumerated.
>> >
>> > Now, for fields without enumeration, like this one in discussion, I
>> > think it is fine to use REG_FIELD_PREP() directly. The MASK is already
>> > exposed anyway...
>> >
>> > Besides, it seems we already have lines of code in *.c files using the
>> > pattern
>> > above:
>> >
>> > git grep REG_FIELD_PREP drm-tip/drm-tip --
>> > ':/drivers/gpu/drm/i915/**/*.c'
>> >
>> > Let me know what you think. I could add a
>> > XELPDP_PORT_MSGBUS_TIMER_VAL() macro if you think it is necessary. My
>> > personal take is that using REG_FIELD_PREP() for this case is fine.
>> Let us conform with the norms for the macro-fields used in this file instead
>> of starting a new pattern.
>>
>> --Radhakrishna(RK) Sripada
>> >
>> > --
>> > Gustavo Sousa
>> >
>> > >
>> > >Thanks,
>> > >Radhakrishna Sripada
>> > >
>> > >> +
>> > >> + drm_dbg_kms(&i915->drm,
>> > >> + "PHY %c lane %d: increasing msgbus timer
>> > >> + threshold to
>> > >> %#x\n",
>> > >> + phy_name(phy), lane, timer_val);
>> > >> + intel_de_write(i915, reg, val); }
>> > >> +
>> > >> static int intel_cx0_wait_for_ack(struct drm_i915_private *i915,
>> > >> enum port
>> > port,
>> > >> int command, int lane, u32 *val)
>> > >> { @@ -132,6 +172,7 @@ static int intel_cx0_wait_for_ack(struct
>> > >> drm_i915_private *i915, enum port port,
>> > >>
>> > >> XELPDP_MSGBUS_TIMEOUT_SLOW,
>> > >> val)) {
>> > >> drm_dbg_kms(&i915->drm, "PHY %c Timeout waiting
>> > >> for message ACK. Status: 0x%x\n",
>> > >> phy_name(phy), *val);
>> > >> + intel_cx0_bus_check_and_bump_timer(i915, port,
>> > >> + lane);
>> > >> intel_cx0_bus_reset(i915, port, lane);
>> > >> return -ETIMEDOUT;
>> > >> }
>> > >> diff --git a/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h
>> > >> b/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h
>> > >> index cb5d1be2ba19..17cc3385aabb 100644
>> > >> --- a/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h
>> > >> +++ b/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h
>> > >> @@ -110,6 +110,18 @@
>> > >> #define CX0_P4PG_STATE_DISABLE 0xC
>> > >> #define CX0_P2_STATE_RESET 0x2
>> > >>
>> > >> +#define _XELPDP_PORT_MSGBUS_TIMER_LN0_A
>> > >> 0x640d8
>> > >> +#define _XELPDP_PORT_MSGBUS_TIMER_LN0_B
>> > >> 0x641d8
>> > >> +#define _XELPDP_PORT_MSGBUS_TIMER_LN0_USBC1 0x16f258
>> > >> +#define _XELPDP_PORT_MSGBUS_TIMER_LN0_USBC2 0x16f458
>> > >> +#define XELPDP_PORT_MSGBUS_TIMER(port, lane)
>> > >> _MMIO(_PICK_EVEN_2RANGES(port, PORT_TC1, \
>> > >> +
>> > >> _XELPDP_PORT_MSGBUS_TIMER_LN0_A, \
>> > >> +
>> > >> _XELPDP_PORT_MSGBUS_TIMER_LN0_B, \
>> > >> +
>> > >> _XELPDP_PORT_MSGBUS_TIMER_LN0_USBC1, \
>> > >> +
>> > >> _XELPDP_PORT_MSGBUS_TIMER_LN0_USBC2) + (lane) * 4)
>> > >> +#define XELPDP_PORT_MSGBUS_TIMER_TIMED_OUT REG_BIT(31)
>> > >> +#define XELPDP_PORT_MSGBUS_TIMER_VAL_MASK
>> > >> REG_GENMASK(23, 0)
>> > >> +
>> > >> #define _XELPDP_PORT_CLOCK_CTL_A 0x640E0
>> > >> #define _XELPDP_PORT_CLOCK_CTL_B 0x641E0
>> > >> #define _XELPDP_PORT_CLOCK_CTL_USBC1 0x16F260
>> > >> --
>> > >> 2.41.0
>> > >