在 2025/7/26 15:51, Lukas Wunner 写道:
On Fri, Jul 25, 2025 at 04:09:21PM -0500, Bjorn Helgaas wrote:
@@ -319,8 +319,7 @@ int pciehp_check_link_status(struct controller *ctrl)
                return -1;
        }
- pcie_capability_read_word(pdev, PCI_EXP_LNKSTA2, &linksta2);
-       __pcie_update_link_speed(ctrl->pcie->port->subordinate, lnk_status, 
linksta2);
+       pcie_update_link_speed(ctrl->pcie->port->subordinate, PCIE_HOTPLUG);

It kind of bugs me that the hot-add flow reads LNKSTA three times and
generates both pci_hp_event LINK_UP and link_event tracepoints:

   pciehp_handle_presence_or_link_change
     link_active = pciehp_check_link_active()
       pcie_capability_read_word(PCI_EXP_LNKSTA)
     if (link_active)
       ctrl_info(ctrl, "Slot(%s): Link Up\n")

This LNKSTA read decides whether to bring up the slot.
It can't be eliminated.

       trace_pci_hp_event(PCI_HOTPLUG_LINK_UP)
       pciehp_enable_slot
         __pciehp_enable_slot
           board_added
             pciehp_check_link_status
               pcie_capability_read_word(PCI_EXP_LNKSTA)

This is sort of a final check whether the link is (still) active
before commencing device enumeration.  Doesn't look like it can
safely be eliminated either.

               pcie_update_link_speed
                 pcie_capability_read_word(PCI_EXP_LNKSTA)
                 pcie_capability_read_word(PCI_EXP_LNKSTA2)
                 trace_pcie_link_event(<REASON>)

This third register read is introduced by the present patch and is
indeed somewhat a step back, given that pciehp_check_link_status()
currently deliberately calls __pcie_update_link_speed() to pass
the already read LNKSTA value.

Hi Lukas, and Bjorn:

Thanks for the excellent technical analysis!

You're absolutely right. I introduced an unnecessary regression by
adding a third LNKSTA read when pciehp_check_link_status() already has
the LNKSTA value and was specifically designed to pass it to
__pcie_update_link_speed().


I'm wondering if the tracepoint can be moved down to
__pcie_update_link_speed()?

Yes, that's a much better approach. Will fix it in next version.


And maybe we need both a bare LINK_UP event and a link_event with all
the details, but again it seems a little weird to me that there are
two tracepoints when there's really only one event and we know all the
link_event information from the very first LNKSTA read.

One of the reasons is that a "Link Down" event would have to
contain dummy values for link speed etc, so it seemed cleaner
to separate the hotplug event from the link speed event.

Thanks,

Lukas

I agree with Lukas and I completely agree with this separation. The two
tracepoints serve different purposes:

- pci_hp_event: Pure hotplug state changes (LINK_UP/LINK_DOWN,
  CARD_PRESENT/CARD_NOT_PRESENT)
- pcie_link_event: Actual link parameter information when meaningful
  values exist

For LINK_DOWN events, we don't have meaningful speed/width values, so
forcing them into a single tracepoint would indeed require dummy/invalid
values, making the interface confusing.

Thanks for the clear technical guidance and for catching my regression!

Best regards,
Shuai

Reply via email to