Hot plug is an important feature, used both for fail-safe handling of datacenter devices and for SR-IOV live migration in SDN/NFV. It brings higher flexibility and continuity to networking services in multiple industry use cases. So let us see what DPDK, as an important networking framework, can do to help users implement a hot plug solution.

The framework already has a general device event detection mechanism, the failsafe driver, the bonding driver and a hot plug/unplug API, and apps can use these to develop their own hot plug solution.

Consider the case of hot unplug. It can happen when a hardware device is physically removed, or when software disables it. The app needs to call the ethdev API to detach the device, which unplugs the device at the bus level and makes access to the device invalid. The problem is that removing the device from the software lists is not instantaneous. If the data (fast) path still reads or writes the device during that window, it causes an MMIO error and the app crashes.

We already have a fail-safe driver (or app) + RTE_ETH_EVENT_INTR_RMV + kernel core driver solution to handle this, but there is still no fail-safe driver (or app) + RTE_DEV_EVENT_REMOVE + PCIe PMD driver failure handling solution. So there is a gap in the DPDK hot plug solution right now.

Also, the kernel only guarantees hot plug on the kernel side, not on the user mode side. Firstly, we can hardly have a gatekeeper for every MMIO access across multiple PMD drivers. Secondly, no third-party tools such as udev/driverctl specifically cover this hot plug failure processing. Thirdly, it is doubtful whether each app can feasibly implement this for multiple user mode PMD drivers.

This patch set proposes a general hot plug failure handling mechanism in the DPDK framework. It aims to guarantee that, when a hot unplug occurs, the system does not crash and the app does not break, so that user space can stop normally, release any relevant resources, and then cleanly unplug the device at the bus level.

The mechanism works as follows: first, the app enables the device event monitor and registers the hot plug event callback before running the data path. Once a hot unplug occurs, the mechanism detects the removal event and performs the failure handling accordingly.
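
As an illustration of the ordering just described (monitor enabled and callback registered before the data path runs, with the removal event stopping forwarding), here is a minimal, self-contained mock in plain C. It does not use the DPDK API. Every name in it (hp_monitor, hp_callback_register, port_ctx, data_path_poll and so on) is hypothetical and only stands in for the real event monitor and callback registration.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the device event types; not DPDK definitions. */
enum hp_event { HP_EVENT_ADD, HP_EVENT_REMOVE };
typedef void (*hp_event_cb)(const char *dev_name, enum hp_event ev, void *arg);

/* Mock event monitor holding a single callback. */
struct hp_monitor {
	hp_event_cb cb;
	void *cb_arg;
	int running;
};

/* Per-port state shared between the callback and the data path. */
struct port_ctx {
	atomic_int forwarding; /* 1 while the data path may touch the device */
	int rx_polls;          /* number of polls that reached the device */
};

void hp_monitor_start(struct hp_monitor *m)
{
	m->running = 1;
}

int hp_callback_register(struct hp_monitor *m, hp_event_cb cb, void *arg)
{
	if (cb == NULL)
		return -1;
	m->cb = cb;
	m->cb_arg = arg;
	return 0;
}

/* Simulates the monitor detecting an event and notifying the app. */
void hp_monitor_deliver(struct hp_monitor *m, const char *dev, enum hp_event ev)
{
	if (m->running && m->cb != NULL)
		m->cb(dev, ev, m->cb_arg);
}

/* App callback: on removal, stop forwarding before anything else. */
void on_dev_event(const char *dev_name, enum hp_event ev, void *arg)
{
	struct port_ctx *port = arg;

	(void)dev_name;
	if (ev == HP_EVENT_REMOVE)
		atomic_store(&port->forwarding, 0);
}

/* One data path iteration; returns -1 once forwarding has been stopped. */
int data_path_poll(struct port_ctx *port)
{
	if (!atomic_load(&port->forwarding))
		return -1;
	port->rx_polls++; /* a real data path would access device MMIO here */
	return 0;
}
```

A flag like this only narrows the window. An access already in flight when the device disappears can still fault, which is why the failure handling cannot rely on the callback alone.
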
To do that, the following functionality is introduced.
- Add a new bus ops "handle_hot_unplug" to handle bus read/write errors. It is bus-specific, and each kind of bus can implement its own logic.
- Implement the PCI bus specific ops "pci_handle_hot_unplug". Based on the failure address, it remaps memory for the corresponding unplugged device. This covers the data path, as well as any unexpected access from the control path, when a hot unplug occurs.
- Implement a new sigbus handler, which is registered when device event monitoring starts. The handler is per process. By the nature of signal delivery, the sigbus error may be received by either a control path thread or a data path thread, but in every case it goes to the common sigbus handler. Once an MMIO sigbus error occurs, it triggers the hot unplug operation above. The handler checks whether the sigbus was caused by a hot unplug. If not, it reports the exception as the original sigbus handler would. If so, it does the memory remapping.

For the control path and the igb_uio release:
- When a device is hot unplugged, the kernel releases the device resources on the kernel side: for example, the fd sys file disappears and the irq is released. If the igb_uio driver still tries to release these resources at that point, it causes a kernel crash. On the other hand, some things, such as disabling the interrupt, are not handled automatically on the kernel side. If they are not handled, the stale state left behind affects the interrupt resource when it is used by other devices. So the igb_uio driver has to check the hot plug status and do the corresponding processing. This patch proposes to add a structure rte_udev_state to rte_uio_pci_dev in the igb_uio kernel driver, which records the state of the uio device, such as probed/opened/released/removed/unplug.
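
The sigbus handling listed above (check whether the faulting address belongs to a known device mapping, remap it if so, otherwise fall back to the original handler) can be sketched in plain POSIX C as below. This is only a sketch under stated assumptions and is not the code from this patch set: the dev_region table and all hotplug_* names are hypothetical, the remap target is simply anonymous zero-filled memory, and mmap() is called from the signal handler although POSIX does not list it as async-signal-safe.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical record of one device's mapped MMIO window. */
struct dev_region {
	void *addr;
	size_t len;
	int unplugged; /* set once the window has been remapped */
};

#define MAX_REGIONS 8
static struct dev_region regions[MAX_REGIONS];
static int nb_regions;
static struct sigaction old_sigbus; /* disposition in place before install */

/* Remember a page-aligned device mapping so faults in it can be recognised. */
int hotplug_register_region(void *addr, size_t len)
{
	long pg = sysconf(_SC_PAGESIZE);

	if (nb_regions == MAX_REGIONS || len == 0 || pg <= 0 ||
	    (uintptr_t)addr % (uintptr_t)pg != 0)
		return -1;
	regions[nb_regions].addr = addr;
	regions[nb_regions].len = len;
	regions[nb_regions].unplugged = 0;
	nb_regions++;
	return 0;
}

/* Return the region containing addr, or NULL if the fault is not ours. */
struct dev_region *hotplug_find_region(const void *addr)
{
	uintptr_t a = (uintptr_t)addr;

	for (int i = 0; i < nb_regions; i++) {
		uintptr_t base = (uintptr_t)regions[i].addr;

		if (a >= base && a - base < regions[i].len)
			return &regions[i];
	}
	return NULL;
}

/* Replace the dead mapping with anonymous memory at the same address. */
int hotplug_remap_region(struct dev_region *r)
{
	void *p = mmap(r->addr, r->len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

	if (p == MAP_FAILED)
		return -1;
	r->unplugged = 1;
	return 0;
}

static void sigbus_handler(int sig, siginfo_t *info, void *ctx)
{
	struct dev_region *r = hotplug_find_region(info->si_addr);

	(void)sig;
	(void)ctx;
	if (r != NULL && hotplug_remap_region(r) == 0)
		return; /* the faulting access is retried on the new mapping */

	/* Not a hot unplug fault: restore the previous disposition and
	 * return, so the retried access faults again under it. */
	sigaction(SIGBUS, &old_sigbus, NULL);
}

/* Install the process-wide handler, saving the previous one. */
int hotplug_sigbus_install(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = sigbus_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGBUS, &sa, &old_sigbus);
}
```

After the remap, reads from the window return zeros and writes land in ordinary memory, so the data path keeps running without touching the removed device until the app has stopped and detached the port.
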
When the igb_uio driver detects an unexpected removal caused by a hot unplug, it disables the interrupt resource accordingly, and it skips the part of the release that the kernel has already handled, to avoid a double free or null pointer kernel crash.

The mechanism can be used by the fail-safe driver and by apps that want a hot plug solution. Take testpmd as an example:
- Enable device event monitor->device unplug->failure handle->stop forwarding->stop port->close port->detach port.
This process does not break the running app/fail-safe driver, and does not affect other, unrelated devices. The app can then plug the device back in and restart the data path as follows:
- Device plug in->bind igb_uio driver->attach device->start port->start forwarding.

patchset history:
v9->v8:
refine commit log to be more readable.

v8->v7:
refine errno process in sigbus handler.
refine igb uio release process

v7->v6:
delete some unused part

v6->v5:
refine some description about bus ops
refine commit log
add some entry check.

v5->v4:
split patches to focus on the failure handle, remove the event usage by testpmd to another patch.
change the hotplug failure handler name
refine the sigbus handle logic
add lock for udev state in igb uio driver

v4->v3:
split patches to be small and clear
change to use new parameter "--hotplug-mode" in testpmd to identify the eal hotplug and ethdev hotplug

v3->v2:
change bus ops name to bus_hotplug_handler.
add new API and bus ops of bus_signal_handler
distinguish handle generic sigbus and hotplug sigbus

v2->v1(v21):
refine some doc and commit log
fix igb uio kernel issue for control path failure
rebase testpmd code

Since the hot plug solution has been discussed publicly over several rounds, the scope has changed and the patch set has been split many times.
Coming to the recent RFC and feature design, this patch set focuses only on the hot unplug failure handler. To make the topic clearer and more focused, the previous patch set history is summarized as "v1(v21)", and tracking continues from v2 here.

"v1(v21)" == v21 as below:
v21->v20:
split function in hot unplug ops
sync failure handle to fix multiple process issue
fix attach port issue for multiple devices case.
combine rmv callback function to be only one.

v20->v19:
clean the code
refine the remap logic for multiple device.
remove the auto binding

v19->v18:
note for limitation of multiple hotplug, fix some typo, squeeze patch.

v18->v15:
add document, add signal bus handler, refine the code to be more clear.

For the prior patch history, please check the patch set "add device event monitor framework".

Jeff Guo (7):
  bus: add hotplug failure handler
  bus/pci: implement hotplug failure handler ops
  bus: add sigbus handler
  bus/pci: implement sigbus handler operation
  bus: add helper to handle sigbus
  eal: add failure handle mechanism for hotplug
  igb_uio: fix unexpected remove issue for hotplug

 drivers/bus/pci/pci_common.c            |  77 ++++++++++++++++++++++
 drivers/bus/pci/pci_common_uio.c        |  33 ++++++++++
 drivers/bus/pci/private.h               |  12 ++++
 kernel/linux/igb_uio/igb_uio.c          |  69 +++++++++++++++----
 lib/librte_eal/common/eal_common_bus.c  |  43 ++++++++++++
 lib/librte_eal/common/eal_private.h     |  12 ++++
 lib/librte_eal/common/include/rte_bus.h |  33 ++++++++++
 lib/librte_eal/linuxapp/eal/eal_dev.c   | 113 +++++++++++++++++++++++++++++++-
 8 files changed, 377 insertions(+), 15 deletions(-)

--
2.7.4