Hi Lijo,

Thanks for the review. I want to make sure the trade-off is genuinely
favourable before standing by the patch.

The case for ACK polling rests on a couple of assumptions:

- The AT24CM02 datasheet specifies a maximum write-cycle time of 10 ms,
  but the typical time is considerably shorter (often in the 2-5 ms
  range). With a fixed msleep(10), we always pay the worst-case cost,
  whereas ACK polling allows the write path to proceed as soon as the
  device is ready.

- With sleep_us = 200, the polling loop runs at most ~50 times over the
  full 10 ms window. If the write typically completes in around 3 ms,
  this reduces to ~15 polling attempts. Whether this is beneficial
  depends on the per-transaction FW overhead, as you pointed out.

That said, if the firmware I2C path has non-trivial per-message overhead
(e.g., more than ~100 us per round-trip), or if bus contention is a
concern on these platforms, then the fixed delay could indeed be the
better choice.

Do you have a sense of the typical FW round-trip latency on this path?
That would help settle this.

Thanks,
Kunal

-----Original Message-----
From: Lazar, Lijo <[email protected]>
Sent: 01 June 2026 18:35
To: Devanand Zodape, Kunal <[email protected]>; [email protected]
Cc: [email protected]; [email protected]; Deucher, Alexander <[email protected]>; Koenig, Christian <[email protected]>; David Airlie <[email protected]>; Simona Vetter <[email protected]>; Kumar1, Rahul <[email protected]>; Gupta, Prateek1 <[email protected]>
Subject: Re: [PATCH v2] drm/amdgpu: use ACK polling for page-write completion

On 01-Jun-26 4:53 PM, Kunal Zodape wrote:
> The EEPROM write path currently waits a fixed 10 ms after each page
> write to cover the maximum write-cycle time.
>
> Replace the fixed delay with ACK polling so the driver can continue as
> soon as the EEPROM finishes its internal write cycle. Since the SMU
> I2C adapter used for these EEPROM accesses does not support
> zero-length transfers, poll readiness with an offset-only dummy write.
>
> Keep the existing 10 ms timeout as the upper bound for the polling loop.
>
> Tested on MI200 (ALDEBARAN) with ras_eeprom_reset confirming clean
> write/read-back with no I2C errors.

The current sleep logic may be better than sending a dummy transfer
through firmware. That has the overhead of FW message logic and other
clients accessing the i2c bus. The original comments in the code logic
are valid for optimization only if the driver has direct access to the
i2c bus.

Thanks,
Lijo

>
> Suggested-by: Jani Nikula <[email protected]>
> Signed-off-by: Kunal Zodape <[email protected]>
> ---
> v2: Use read_poll_timeout() instead of open-coded ktime + do-while loop
>     as suggested
>
>  drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c | 27 +++++++++++++++-------
>  1 file changed, 19 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
> index 8cd69836dd99..9dc538073bb8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eeprom.c
> @@ -21,6 +21,7 @@
>   *
>   */
>
> +#include <linux/iopoll.h>
>  #include "amdgpu_eeprom.h"
>  #include "amdgpu.h"
>
> @@ -153,15 +154,25 @@ static int __amdgpu_eeprom_xfer(struct i2c_adapter
> *i2c_adap, u32 eeprom_addr,
>  			break;
>
>  		if (!read) {
> -			/* According to EEPROM specs the length of the
> -			 * self-writing cycle, tWR (tW), is 10 ms.
> -			 *
> -			 * TODO: Use polling on ACK, aka Acknowledge
> -			 * Polling, to minimize waiting for the
> -			 * internal write cycle to complete, as it is
> -			 * usually smaller than tWR (tW).
> +			int ret;
> +
> +			/* Poll for ACK to detect when the self-timed
> +			 * internal write cycle has completed, as per
> +			 * Acknowledge Polling described in the AT24CM02
> +			 * datasheet, Section 7.4. The SMU I2C adapter
> +			 * used by these EEPROM paths does not support
> +			 * zero-length messages, so use an offset-only
> +			 * dummy write to probe for the ACK. The address
> +			 * pointer update is harmless because each real
> +			 * transfer reprograms it before use.
>  			 */
> -			msleep(10);
> +			ret = read_poll_timeout(i2c_transfer, r,
> +						r == 1,
> +						200, 10 * USEC_PER_MSEC,
> +						false,
> +						i2c_adap, &msgs[0], 1);
> +			if (ret)
> +				break;
>  		}
>  	}
>
> --
> 2.17.1
>
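
As a rough aid for the trade-off discussed at the top of this thread,
the polling arithmetic can be sketched in plain C. This is only a
back-of-the-envelope model, not driver code: the function names and the
xfer_us parameter (the firmware round-trip cost per probe, which is the
open question in this thread) are hypothetical, each sleep is assumed to
last exactly sleep_us, and the initial probe at t = 0 is ignored.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rough estimate of how many ACK probes fit into window_us when every
 * iteration costs one sleep (sleep_us) plus one firmware round-trip
 * (xfer_us). Hypothetical helper for illustration; not part of amdgpu.
 */
uint32_t poll_attempts(uint32_t window_us, uint32_t sleep_us,
		       uint32_t xfer_us)
{
	uint32_t period_us = sleep_us + xfer_us;

	/* Avoid dividing by zero for a degenerate zero-cost loop. */
	if (period_us == 0)
		return 0;

	return window_us / period_us;
}

/* Estimated time spent inside firmware I2C transactions while polling. */
uint32_t fw_busy_us(uint32_t window_us, uint32_t sleep_us, uint32_t xfer_us)
{
	return poll_attempts(window_us, sleep_us, xfer_us) * xfer_us;
}

/* Re-checks the figures quoted in the mail under a zero-overhead assumption. */
void poll_model_selfcheck(void)
{
	assert(poll_attempts(10000, 200, 0) == 50); /* full 10 ms window */
	assert(poll_attempts(3000, 200, 0) == 15);  /* write done at 3 ms */
}
```

With an assumed 100 us round-trip, a write that completes at 3 ms costs
about 10 probes under this model, i.e. roughly 1 ms of firmware I2C time
in exchange for finishing about 7 ms earlier than msleep(10). Whether
that exchange is worthwhile depends on the measured FW latency and on
how busy the bus is with other clients.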
