Hi,

Jacob Keller <[email protected]> writes:

> There have been sporadic reports of PTM timeouts using i225/i226 devices
>
> These timeouts have been root caused to:
>
> 1) Manipulating the PTM status register while PTM is enabled and triggered
> 2) The hardware retrying too quickly when an inappropriate response is
>    received from the upstream device
>
> The issue can be reproduced with the following:
>
> $ sudo phc2sys -R 1000 -O 0 -i tsn0 -m
>
> Note: 1000 Hz (-R 1000) is unrealistically large, but provides a way to
> quickly reproduce the issue.
>
> PHC2SYS exits with:
>
> "ioctl PTP_OFFSET_PRECISE: Connection timed out" when the PTM transaction
> fails
>
> The first patch in this series also resolves an issue reported by Corinna
> Vinschen relating to kdump:
>
>   This patch also fixes a hang in igc_probe() when loading the igc
>   driver in the kdump kernel on systems supporting PTM.
>
>   The igc driver running in the base kernel enables PTM trigger in
>   igc_probe(). Therefore the driver is always in PTM trigger mode,
>   except in brief periods when manually triggering a PTM cycle.
>
>   When a crash occurs, the NIC is reset while PTM trigger is enabled.
>   Due to a hardware problem, the NIC is subsequently in a bad busmaster
>   state and doesn't handle register reads/writes. When running
>   igc_probe() in the kdump kernel, the first register access to a NIC
>   register hangs driver probing and ultimately breaks kdump.
>
>   With this patch, igc has PTM trigger disabled most of the time,
>   and the trigger is only enabled for very brief (10 - 100 us) periods
>   when manually triggering a PTM cycle. Chances that a crash occurs
>   during a PTM trigger are not zero, but extremely reduced.
>
> Signed-off-by: Jacob Keller <[email protected]>
> ---
> Changes in v4:
> - Jacob taking over sending v4 due to lack of time on Chris's part.
> - Updated commit messages based on review feedback from v3
> - Updated commit titles to slightly more imperative wording
> - Link to v3:
>   https://lore.kernel.org/r/[email protected]
> Changes in v3:
> - Added mutex_destroy() to clean up PTM lock.
> - Added missing checks for PTP enabled flag called from igc_main.c.
> - Cleanup PTP module if probe fails.
> - Wrap all access to PTM registers with PTM lock/unlock.
> - Link to v2:
>   https://lore.kernel.org/netdev/[email protected]/
> Changes in v2:
> - Removed patch modifying PTM retry loop count.
> - Moved PTM mutex initialization from igc_reset() to igc_ptp_init(), called
>   once during igc_probe().
> - Link to v1:
>   https://lore.kernel.org/netdev/[email protected]/
>
> ---
> Christopher S M Hall (6):
>       igc: fix PTM cycle trigger logic
>       igc: increase wait time before retrying PTM
>       igc: move ktime snapshot into PTM retry loop
>       igc: handle the IGC_PTP_ENABLED flag correctly
>       igc: cleanup PTP module if probe fails
>       igc: add lock preventing multiple simultaneous PTM transactions
>

For the series:

Acked-by: Vinicius Costa Gomes <[email protected]>


Cheers,
--
Vinicius
