Hi all,

I am using Linux 4.4.74 with ixgbe 10G Ethernet cards, in the following network topology:

                           Switch
                      +---------------+
     Computer 1       |               |       Computer 2
  +--------------+    |               |    +--------------+
  |   ixgbe-eth1 |<-->| P1.1     P1.2 |<-->| ixgbe-eth1   |
  |              |    |               |    |              |
  |   ixgbe-eth2 |<-->| P2.1     P2.2 |<-->| ixgbe-eth2   |
  +--------------+    |               |    +--------------+
                      +---------------+

I run the following test:

1) Send traffic on both links
   (ixgbe-eth1 <-> P1.1 P1.2 <-> ixgbe-eth1) and
   (ixgbe-eth2 <-> P2.1 P2.2 <-> ixgbe-eth2).
2) Suddenly turn off both ports P1.1 and P1.2.
3) The message "Reset adapter" appears in the kernel log:

    [81999.016151] ixgbe 0000:01:00.1 eth1: Reset adapter

My first question is: is the "Reset adapter" log normal behavior in this case, or is it a bug in ixgbe?

Looking at the ixgbe driver code, I understand the flow to be:

    workqueue thread
      -> ixgbe_service_task()
      -> ixgbe_reset_subtask()
      -> ixgbe_reinit_locked()
      -> ixgbe_up()
      -> ixgbe_up_complete()
      -> ixgbe_non_sfp_link_config
      -> hw->mac.ops.setup_link()
      -> ixgbe_setup_mac_link_smartspeed()

In ixgbe_setup_mac_link_smartspeed(), mdelay() is called 15 times in the first loop and 6 times in the second loop:

    ...
    for () {
        ...
        for () {
            ...
            mdelay(100);
            ...
        }
        ...
    }
    ...
    for () {
        ...
        mdelay(100);
        ...
    }

I added jiffies-based instrumentation to measure the total time and the core each loop runs on:

First loop:

    Nov 9 03:00:52 ixgbe_setup_mac_link_smartspeed(1) on cpu 11 sleeping for 1488 ms, 3 retries, 5 check_links

Second loop:

    Nov 9 03:00:53 ixgbe_setup_mac_link_smartspeed(2) on cpu 11 sleeping for 592 ms, 6 check_links

Total time:

    Nov 9 03:00:53 ixgbe: **** ixgbe_reset_subtask run in cores 11 and take 2452 ms

I tested the same hardware with two Linux distros and printed the core and the time for each loop iteration. I saw that:

1) When the scheduler does not move the work to another core, for example when it always runs on core 11:

    runs on core 11; mdelay(); runs on core 11; mdelay(); ...

   an application whose affinity is set to that same core is starved, because mdelay() busy-waits and keeps the CPU occupied. ixgbe_wq is created with create_singlethread_workqueue(), which creates an unbound workqueue, so we cannot know in advance which core the work will run on, and therefore which core's application will be starved.

2) When the work does move between cores, the starvation does not happen, for example:

    runs on core 10; mdelay(100); runs on core 11; mdelay(); ...

So my second question is: do you have any suggestion for solving the starvation problem? Which kernel config parameter could make mdelay() interact with the scheduler in a way that starves other tasks on the same core?

The tests run on the same hardware with different Linux distros and kernels, but in both cases I compiled the same ixgbe driver source code, taken from kernel version 4.4.74.

I checked the code: mdelay() calls udelay(), with MAX_UDELAY_MS = 5:

    #ifndef mdelay
    #define mdelay(n) (\
        (__builtin_constant_p(n) && (n)<=MAX_UDELAY_MS) ? udelay((n)*1000) : \
        ({unsigned long __ms=(n); while (__ms--) udelay(1000);}))
    #endif

Thanks and regards,
Canh
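
P.S. To make the delay arithmetic concrete, here is a minimal user-space sketch (it is not driver code). It reuses the mdelay() macro quoted above, but replaces udelay() with a hypothetical counting stub so that it runs outside the kernel and returns immediately. The helper names reset_counters(), verify_expansion() and simulate_smartspeed_delays_us() are made up for illustration, and the 15 and 6 iteration counts are simply the ones mentioned above. It assumes GCC or Clang, because the macro uses a statement expression.

```c
#include <assert.h>

/* Same threshold as in the macro quoted above. */
#define MAX_UDELAY_MS 5

/* Hypothetical stand-in for the kernel's udelay(): it records the
 * requested delay instead of spinning, so the sketch runs instantly. */
static unsigned long udelay_calls;
static unsigned long udelay_total_us;

static void udelay(unsigned long usecs)
{
	udelay_calls++;
	udelay_total_us += usecs;
}

/* mdelay() as quoted above (relies on a GNU statement expression). */
#define mdelay(n) (\
	(__builtin_constant_p(n) && (n) <= MAX_UDELAY_MS) ? udelay((n) * 1000) : \
	({ unsigned long __ms = (n); while (__ms--) udelay(1000); }))

static void reset_counters(void)
{
	udelay_calls = 0;
	udelay_total_us = 0;
}

/* Check how a single mdelay() call expands. */
static void verify_expansion(void)
{
	reset_counters();
	mdelay(100);                       /* 100 > MAX_UDELAY_MS: loop path */
	assert(udelay_calls == 100);       /* 100 back-to-back udelay(1000) */
	assert(udelay_total_us == 100000);

	reset_counters();
	mdelay(5);                         /* small delay */
	assert(udelay_total_us == 5000);   /* 5 ms requested in total */
}

/* Total busy-wait requested by 15 + 6 calls of mdelay(100). */
static unsigned long simulate_smartspeed_delays_us(void)
{
	int i;

	reset_counters();
	for (i = 0; i < 15; i++)           /* first loop */
		mdelay(100);
	for (i = 0; i < 6; i++)            /* second loop */
		mdelay(100);
	return udelay_total_us;
}
```

Called from a main(), simulate_smartspeed_delays_us() returns 2100000, i.e. about 2.1 s of requested busy-waiting split into 2100 calls of udelay(1000). That is in the same range as the 1488 ms + 592 ms I measured, and the work item never sleeps during that time, which would explain why a task pinned to the same core gets no CPU.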