Any pointers on how to triage/investigate the problem further?
I'm willing to dig deeper into the driver, but I won't be able to do it without
some tips on where to start looking and what to look for.
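
For anyone who wants to repeat the test quoted below without commenting lines
out, here is a minimal userland sketch of the interrupt selection as I
understand it. It is only a model, not the driver code: choose_intr() and the
RE_DISABLE_MSI switch are hypothetical names of mine, and the msi_ok/legacy_ok
arguments stand in for pci_intr_map_msi() and pci_intr_map() returning 0.

```c
/* Hypothetical switch: build with -DRE_DISABLE_MSI=1 to force legacy INTx. */
#ifndef RE_DISABLE_MSI
#define RE_DISABLE_MSI 0
#endif

enum intr_kind { INTR_NONE, INTR_MSI, INTR_LEGACY };

/*
 * Userland model of the "Allocate interrupt" step in re_pci_attach().
 * msi_ok and legacy_ok are assumptions standing in for the kernel's
 * pci_intr_map_msi() and pci_intr_map() succeeding (returning 0).
 */
enum intr_kind
choose_intr(int disable_msi, int msi_ok, int legacy_ok)
{
	if (!disable_msi && msi_ok)
		return INTR_MSI;	/* the driver would set RL_FLAG_MSI here */
	if (legacy_ok)
		return INTR_LEGACY;	/* fall back to a legacy interrupt */
	return INTR_NONE;		/* ": couldn't map interrupt" case */
}
```

Calling choose_intr(RE_DISABLE_MSI, 1, 1) gives INTR_MSI by default and
INTR_LEGACY when the switch is set to 1, which should match what my
commented-out version does.
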
Thanks,
Anatoli
On 17/7/20 18:49, Anatoli wrote:
> Hi Stuart,
>
> Thanks for your suggestion.
>
> Unfortunately, it had no effect. I made this change:
>
> index 5cc8bd6862b..47e639fdc6c 100644
> --- sys/dev/pci/if_re_pci.c
> +++ sys/dev/pci/if_re_pci.c
> @@ -156,9 +156,10 @@ re_pci_attach(struct device *parent, struct device *self, void *aux)
> }
>
> /* Allocate interrupt */
> - if (pci_intr_map_msi(pa, &ih) == 0)
> - sc->rl_flags |= RL_FLAG_MSI;
> - else if (pci_intr_map(pa, &ih) != 0) {
> +// if (pci_intr_map_msi(pa, &ih) == 0)
> +// sc->rl_flags |= RL_FLAG_MSI;
> +// else
> + if (pci_intr_map(pa, &ih) != 0) {
> printf(": couldn't map interrupt\n");
> return;
> }
>
> recompiled the kernel and booted it.
>
> Upon cable disconnect and reconnect, the re0 NIC entered the "hibernate"
> state and came back only after outgoing pings, as before. There was also
> a "re0: watchdog timeout" message in dmesg.
>
> I suppose that on inactivity (or on detecting "no carrier" status) it
> somehow disables interrupts, or they time out and become disabled. That
> would explain why it doesn't see any incoming packets, and why sending
> outgoing packets somehow reactivates the interrupts. Does that make sense?
>
> What else could I try?
>
> Thanks,
> Anatoli
>
>
> On 17/7/20 07:12, Stuart Henderson wrote:
>> As hinted in the Reddit post, try disabling MSI. Unlikely to be the
>> permanent fix but it will give more information.
>>
>> In if_re_pci.c:
>>
>> /* Allocate interrupt */
>> if (pci_intr_map_msi(pa, &ih) == 0)
>> 	sc->rl_flags |= RL_FLAG_MSI;
>> else if (pci_intr_map(pa, &ih) != 0) {
>> 	printf(": couldn't map interrupt\n");
>> 	return;
>> }
>>
>> Remove "if (pci_intr_map_msi ... else" and keep "if (pci_intr_map(..."
>>
>> There are plenty of systems with re(4) that don't have this problem; it
>> definitely doesn't affect every machine and/or every re(4).
>>
>