Re: [PATCH V2] ipq40xx: fix BDF file for pcie wifi chip on the GL.Inet GL-B2200
On Wed, 27 Apr 2022 at 19:31, Enrico Mioso wrote:
>
> When the GL-B2200's PCI device was switched to pre-calibration, the ath10k
> wasn't able to find the pre-calibration data.

This is not really correct, ath10k was unable to find the BDF in the served board-2.bin.

For future reference, post the relevant driver log before and after, it shows the issue really clearly.

Regards,
Robert

> In fact, the BDF file was missing the correct BMI IDs for this device,
> resulting in a failure to start it.
> Repackage the BDF file after renaming relevant fields and files correctly,
> allowing the PCIE Wi-Fi to continue working.
>
> Fixes: 80d34d9d593 ("ipq40xx: document pcie wifi chip on the GL.Inet
> GL-B2200")
> CC: Christian Lamparter
> CC: Robert Marko
> Signed-off-by: Enrico Mioso
> ---
> .../ipq-wifi/board-glinet_gl-b2200.qca9888| Bin 12200 -> 12164 bytes
> 1 file changed, 0 insertions(+), 0 deletions(-)
>
> diff --git a/package/firmware/ipq-wifi/board-glinet_gl-b2200.qca9888
> b/package/firmware/ipq-wifi/board-glinet_gl-b2200.qca9888
> index
> 4f0a521f35935d4be1fb01b66dd9954a48eade52..f0a493ace340de0fe84d89d46cc07209df1b6883
> 100644
> GIT binary patch
> delta 79
> zcmZ1x-x5DT!X-nW0SwH5WKwCdZ9#ITPEu~BZgNIufo^7stpS*ql%H6X0^%B)>69fF
> YWhUm8*t+}ZIvE)m806+|Y|zjL0EmDWzyJUM
>
> delta 115
> zcmZpPUlBh+BBMv20Ssb*WKwCdZ9#ITPFZSRN`8^8p_xUpPD*N7W^$^nfq|)+PH|~c
> qab
> --
> 2.36.0
>
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel
Re: [PATCH v2] realtek: do not reset SerDes on link change
Hi, there are presently no working 1GBit SFP modules in master for RTL9300 (this patch only affects RTL93xx SoCs). On the Ubiquiti USW switch only the 10GBit modules are set up by u-boot and they continue to work. The setup really only does a setup of the link not the entire serdes. The initial reset was done on initialization of the internal PHY associated with the SerDes via rtl9300_configure_serdes() calling rtl9300_sds_rst() during the PHY probe. So calling rtl9300_sds_rst() for every link change was anyway too much. Complete control over SFP+ ports to allow 10G, 1G, Copper modules, and DAC cables will only be available with the latest developments which were posted in the forum recently and should lead to a PR soon. For this to work the SerDes-MAC link needs to be switched and then this link re-calibrated which I only figured out recently. Cheers, Birger On 27.04.22 18:06, Sander Vanheule wrote: > Hi Birger, > > On Sun, 2022-04-24 at 22:01 +0200, Birger Koblitz wrote: >> Do not reset the RTL930x SerDes on link changes, instead set up >> the SDS with internal PHYs for the SFP+ ports only. >> This fixes the 8 1GBit ports on the Zyxel XGS1250 which >> do not work without this patch. >> >> Tested-by: Stijn Segers >> Signed-off-by: Birger Koblitz >> --- >> v2: A different patch was previously sent with this subject. >> This is the correct patch. 
>> target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c | 3 ++- >> .../linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/rtl83xx.h | 1 + >> 2 files changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c >> b/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c >> index 858b692640..5f19a1f590 100644 >> --- a/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c >> +++ b/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c >> @@ -814,7 +814,8 @@ static void rtl93xx_phylink_mac_config(struct dsa_switch >> *ds, int >> port, >> __func__, phy_modes(state->interface)); >> return; >> } >> - rtl9300_sds_rst(sds_num, sds_mode); >> + if (state->interface == PHY_INTERFACE_MODE_10GBASER) >> + rtl9300_serdes_setup(sds_num, state->interface); > > > Resetting the SerDes(-es?) makes it end up in a state where the 1Gb (copper) > ports don't > work. So with fixed phy-s, I can see how skipping a reset could help. > > Instead of a _reset_, you now only do a mode change on 10GBASER ports, using > a _setup_ > call. The reset and setup also are not entirely equivalent, so why change to > rtl9300_serdes_setup()? Do 1G SFP modules still work if you only change modes > for > 10GBASER? > > Best, > Sander > ___ openwrt-devel mailing list openwrt-devel@lists.openwrt.org https://lists.openwrt.org/mailman/listinfo/openwrt-devel
[PATCH] ltq-vdsl-app: disconnect when service is stopped
Stop the connection when the control daemon is terminated. The code is a
modified version of the termination routine in version 4.23.1 of the
daemon (which doesn't support VR9 modems anymore).

This could also be implemented by calling the acos and acs commands via
dsl_cpe_pipe.sh in the init script. However, doing it in the daemon
itself has the advantage of also working if it is terminated in another
way (for example during sysupgrade).

Signed-off-by: Jan Hoffmann
---
 .../ltq-vdsl-app/patches/200-autoboot.patch | 75 +++
 .../ltq-vdsl-app/patches/201-sigterm.patch  |  2 +-
 .../ltq-vdsl-app/patches/300-ubus.patch     |  4 +-
 3 files changed, 78 insertions(+), 3 deletions(-)

diff --git a/package/network/config/ltq-vdsl-app/patches/200-autoboot.patch b/package/network/config/ltq-vdsl-app/patches/200-autoboot.patch
index 5b882bf30ff4..cc6feb94aa9f 100644
--- a/package/network/config/ltq-vdsl-app/patches/200-autoboot.patch
+++ b/package/network/config/ltq-vdsl-app/patches/200-autoboot.patch
@@ -1,3 +1,10 @@
+This enables automatic connection after the control daemon is started,
+and also stops the connection on termination.
+
+Using the autoboot restart command is necessary because the stop command
+doesn't actually stop the connection, and would also leave the driver in
+a state where an explicit start command is necessary to connect again.
+
 --- a/src/dsl_cpe_init_cfg.c
 +++ b/src/dsl_cpe_init_cfg.c
 @@ -27,7 +27,7 @@ DSL_InitData_t gInitCfgData =
@@ -9,3 +16,71 @@
     DSL_CPE_AUTOBOOT_CFG_SET(DSL_FALSE, DSL_FALSE, DSL_FALSE),
     DSL_CPE_TEST_MODE_CTRL_SET(DSL_TESTMODE_DISABLE),
     DSL_CPE_LINE_ACTIVATE_CTRL_SET(DSL_G997_INHIBIT_LDSF, DSL_G997_INHIBIT_ACSF, DSL_G997_NORMAL_STARTUP),
+--- a/src/dsl_cpe_control.c
++++ b/src/dsl_cpe_control.c
+@@ -6515,10 +6515,13 @@ DSL_CPE_STATIC void DSL_CPE_Termination
+ DSL_CPE_STATIC DSL_void_t DSL_CPE_Termination (void)
+ {
+ #ifdef INCLUDE_DSL_CPE_CLI_SUPPORT
+-   DSL_int_t nDevice = 0;
+    DSL_char_t buf[32] = "quit";
+ #endif
+
++   DSL_Error_t nRet = DSL_SUCCESS;
++   DSL_int_t nDevice = 0;
++   DSL_AutobootConfig_t sAutobootCfg;
++   DSL_AutobootControl_t sAutobootCtl;
+    DSL_CPE_Control_Context_t *pCtrlCtx;
+
+    pCtrlCtx = DSL_CPE_GetGlobalContext();
+@@ -6527,6 +6530,50 @@ DSL_CPE_STATIC DSL_void_t DSL_CPE_Termi
+       pCtrlCtx->bRun = DSL_FALSE;
+    }
+
++   for (nDevice = 0; nDevice < DSL_CPE_MAX_DSL_ENTITIES; ++nDevice)
++   {
++      g_bWaitBeforeConfigWrite[nDevice]    = DSL_TRUE;
++      g_bWaitBeforeLinkActivation[nDevice] = DSL_TRUE;
++      g_bWaitBeforeRestart[nDevice]        = DSL_TRUE;
++
++      g_bAutoContinueWaitBeforeConfigWrite[nDevice]    = DSL_FALSE;
++      g_bAutoContinueWaitBeforeLinkActivation[nDevice] = DSL_FALSE;
++      g_bAutoContinueWaitBeforeRestart[nDevice]        = DSL_FALSE;
++
++      memset(&sAutobootCfg, 0x0, sizeof(DSL_AutobootConfig_t));
++      sAutobootCfg.data.nStateMachineOptions.bWaitBeforeConfigWrite    = DSL_TRUE;
++      sAutobootCfg.data.nStateMachineOptions.bWaitBeforeLinkActivation = DSL_TRUE;
++      sAutobootCfg.data.nStateMachineOptions.bWaitBeforeRestart        = DSL_TRUE;
++
++      nRet = (DSL_Error_t)DSL_CPE_Ioctl(
++         DSL_CPE_GetGlobalContext()->fd[nDevice],
++         DSL_FIO_AUTOBOOT_CONFIG_SET, (DSL_int_t)&sAutobootCfg);
++
++      if (nRet < DSL_SUCCESS)
++      {
++         DSL_CCA_DEBUG(DSL_CCA_DBG_ERR, (DSL_CPE_PREFIX
++            "Autoboot configuration for device (%d) failed!, nRet = %d!"
++            DSL_CPE_CRLF, nDevice, sAutobootCtl.accessCtl.nReturn));
++      }
++
++      memset(&sAutobootCtl, 0, sizeof(DSL_AutobootControl_t));
++      sAutobootCtl.data.nCommand = DSL_AUTOBOOT_CTRL_RESTART;
++
++      nRet = (DSL_Error_t)DSL_CPE_Ioctl(
++         DSL_CPE_GetGlobalContext()->fd[nDevice],
++         DSL_FIO_AUTOBOOT_CONTROL_SET, (DSL_int_t)&sAutobootCtl);
++
++      if (nRet < DSL_SUCCESS)
++      {
++         DSL_CCA_DEBUG(DSL_CCA_DBG_ERR, (DSL_CPE_PREFIX
++            "Autoboot restart for device (%d) failed!, nRet = %d!"
++            DSL_CPE_CRLF, nDevice, sAutobootCtl.accessCtl.nReturn));
++      }
++   }
++
++   DSL_CCA_DEBUG(DSL_CCA_DBG_MSG, (DSL_CPE_PREFIX
++      "Autoboot restart executed" DSL_CPE_CRLF));
++
+ #ifdef INCLUDE_DSL_CPE_CLI_SUPPORT
+    for (nDevice = 0; nDevice < DSL_CPE_MAX_DSL_ENTITIES; nDevice++)
+    {
diff --git a/package/network/config/ltq-vdsl-app/patches/201-sigterm.patch b/package/network/config/ltq-vdsl-app/patches/201-sigterm.patch
index 68a416ce24fb..4e978359835e 100644
--- a/package/network/config/ltq-vdsl-app/patches/201-sigterm.patch
+++ b/package/network/config/ltq-vdsl-app/patches/201-sigterm.patch
@@ -9,7 +9,7 @@
  {
     DSL_CCA_DEBUG(DSL_CCA_DBG_MSG, (DSL_CPE_PREFIX "terminated" DSL_CPE_CRLF));
     DSL_CPE_Termination ();
-@@ -6756,6 +6756,7 @@ DSL_int_t dsl_cpe_daemon (
+@@ -6803,6 +6803,7 @@ DSL_int_t dsl_cpe_daemon (
  #ifndef RTEMS
     signal (SIGINT, DSL_CPE_TerminationHandler);
diff --git
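The commit message argues that disconnecting inside the daemon also covers terminations that bypass the init script. A minimal sketch of that general pattern follows. It is not the DSL CPE daemon's code: `on_terminate`, `disconnect_link`, `install_handlers`, `daemon_step` and the `link_up` flag are all hypothetical names, and the handler here only sets a flag, whereas the patched daemon runs its termination routine from its own handler.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names come from the DSL CPE daemon. */
static volatile sig_atomic_t stop_requested = 0;
static bool link_up = true;

/* Async-signal-safe handler: it only records that termination was requested. */
static void on_terminate(int signo)
{
    (void)signo;
    stop_requested = 1;
}

/* Placeholder for the real work (the patch issues autoboot ioctls here). */
static void disconnect_link(void)
{
    link_up = false;
}

/* Install the handler for SIGTERM and SIGINT; returns 0 on success. */
static int install_handlers(void)
{
    if (signal(SIGTERM, on_terminate) == SIG_ERR)
        return -1;
    if (signal(SIGINT, on_terminate) == SIG_ERR)
        return -1;
    return 0;
}

/* One pass of the daemon loop; returns false once shutdown has been done. */
static bool daemon_step(void)
{
    if (stop_requested) {
        disconnect_link();
        return false;
    }
    return true;
}
```

A caller would run `install_handlers()` once and then loop on `daemon_step()` until it returns false. Because the cleanup lives in the process itself, it runs whether the signal comes from the init script or from something else, such as sysupgrade.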
Re: Optimizing kernel compilation / alignments for network performance
On 27.04.2022 14:56, Alexander Lobakin wrote: From: Rafał Miłecki Date: Wed, 27 Apr 2022 14:04:54 +0200 I noticed years ago that kernel changes touching code - that I don't use at all - can affect network performance for me. I work with home routers based on Broadcom Northstar platform. Those are SoCs with not-so-powerful 2 x ARM Cortex-A9 CPU cores. Main task of those devices is NAT masquerade and that is what I test with iperf running on two x86 machines. *** Example of such unused code change: ce5013ff3bec ("mtd: spi-nor: Add support for XM25QH64A and XM25QH128A"). https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce5013ff3bec05cf2a8a05c75fcd520d9914d92b It lowered my NAT speed from 381 Mb/s to 367 Mb/s (-3,5%). I first reported that issue it in the e-mail thread: ARM router NAT performance affected by random/unrelated commits https://lkml.org/lkml/2019/5/21/349 https://www.spinics.net/lists/linux-block/msg40624.html Back then it was commit 5b0890a97204 ("flow_dissector: Parse batman-adv unicast headers") https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9316a9ed6895c4ad2f0cde171d486f80c55d8283 that increased my NAT speed from 741 Mb/s to 773 Mb/s (+4,3%). *** It appears Northstar CPUs have little cache size and so any change in location of kernel symbols can affect NAT performance. That explains why changing unrelated code affects anything & it has been partially proven aligning some of cache-v7.S code. My question is: is there a way to find out & force an optimal symbols locations? Take a look at CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B[0]. I've been fighting with the same issue on some Realtek MIPS boards: random code changes in random kernel core parts were affecting NAT / network performance. This option resolved this I'd say, for the cost of slightly increased vmlinux size (almost no change in vmlinuz size). 
The only thing is that it was recently restricted to a set of architectures, and MIPS and ARM32 are not included now lol. So it's either a matter of expanding the list (since it was restricted only because `-falign-functions=` is not supported on some architectures) or you can just do:

make KCFLAGS=-falign-functions=64 # replace 64 with your I-cache size

The actual alignment is something to play with, I stopped on the cacheline size, 32 in my case.

Also, this does not provide any guarantees that you won't suffer from random data cacheline changes. There were some initiatives to introduce debug alignment of data as well, but since functions are often bigger than 32, while variables are usually much smaller, it was increasing the vmlinux size by a ton (imagine each u32 variable occupying 32-64 bytes instead of 4). But the chance of catching this is much lower than to suffer from I-cache function misplacement.

Thank you Alexander, this appears to be helpful! I decided to ignore CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B for now and just adjust CFLAGS manually.

1. Without ce5013ff3bec and with -falign-functions=32: 387 Mb/s
2. Without ce5013ff3bec and with -falign-functions=64: 377 Mb/s
3. With ce5013ff3bec and with -falign-functions=32: 384 Mb/s
4. With ce5013ff3bec and with -falign-functions=64: 377 Mb/s

So it seems that:
1. -falign-functions=32 = pretty stable high speed
2. -falign-functions=64 = very stable slightly lower speed

I'm going to perform tests on more commits but if it stays as reliable as above that will be a huge success for me.
[PATCH V2] ipq40xx: fix BDF file for pcie wifi chip on the GL.Inet GL-B2200
When the GL-B2200's PCI device was switched to pre-calibration, the ath10k
wasn't able to find the pre-calibration data.
In fact, the BDF file was missing the correct BMI IDs for this device,
resulting in a failure to start it.
Repackage the BDF file after renaming relevant fields and files correctly,
allowing the PCIE Wi-Fi to continue working.

Fixes: 80d34d9d593 ("ipq40xx: document pcie wifi chip on the GL.Inet GL-B2200")
CC: Christian Lamparter
CC: Robert Marko
Signed-off-by: Enrico Mioso
---
 .../ipq-wifi/board-glinet_gl-b2200.qca9888| Bin 12200 -> 12164 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/package/firmware/ipq-wifi/board-glinet_gl-b2200.qca9888 b/package/firmware/ipq-wifi/board-glinet_gl-b2200.qca9888
index 4f0a521f35935d4be1fb01b66dd9954a48eade52..f0a493ace340de0fe84d89d46cc07209df1b6883 100644
GIT binary patch
delta 79
zcmZ1x-x5DT!X-nW0SwH5WKwCdZ9#ITPEu~BZgNIufo^7stpS*ql%H6X0^%B)>69fF
YWhUm8*t+}ZIvE)m806+|Y|zjL0EmDWzyJUM

delta 115
zcmZpPUlBh+BBMv20Ssb*WKwCdZ9#ITPFZSRN`8^8p_xUpPD*N7W^$^nfq|)+PH|~c
qab
Re: [PATCH] ipq40xx: fix BDF file for pcie wifi chip on the GL.Inet GL-B2200
On Wed, 27 Apr 2022 at 19:12, Enrico Mioso wrote: > > Fix the BDF file to include the expected BMI IDs, so that the PCI Wi-Fi > device will work again after switching to pre-calibration. Hi, Can you expand the description a bit more, namely include why is this needed after the blamed commit. Regards, Robert > > Fixes: 80d34d9d593 ("ipq40xx: document pcie wifi chip on the GL.Inet > GL-B2200") > CC: Christian Lamparter > CC: Robert Marko > Signed-off-by: Enrico Mioso > --- > .../ipq-wifi/board-glinet_gl-b2200.qca9888| Bin 12200 -> 12164 bytes > 1 file changed, 0 insertions(+), 0 deletions(-) > > diff --git a/package/firmware/ipq-wifi/board-glinet_gl-b2200.qca9888 > b/package/firmware/ipq-wifi/board-glinet_gl-b2200.qca9888 > index > 4f0a521f35935d4be1fb01b66dd9954a48eade52..f0a493ace340de0fe84d89d46cc07209df1b6883 > 100644 > GIT binary patch > delta 79 > zcmZ1x-x5DT!X-nW0SwH5WKwCdZ9#ITPEu~BZgNIufo^7stpS*ql%H6X0^%B)>69fF > YWhUm8*t+}ZIvE)m806+|Y|zjL0EmDWzyJUM > > delta 115 > zcmZpPUlBh+BBMv20Ssb*WKwCdZ9#ITPFZSRN`8^8p_xUpPD*N7W^$^nfq|)+PH|~c > qab > -- > 2.36.0 > ___ openwrt-devel mailing list openwrt-devel@lists.openwrt.org https://lists.openwrt.org/mailman/listinfo/openwrt-devel
[PATCH] ipq40xx: fix BDF file for pcie wifi chip on the GL.Inet GL-B2200
Fix the BDF file to include the expected BMI IDs, so that the PCI Wi-Fi
device will work again after switching to pre-calibration.

Fixes: 80d34d9d593 ("ipq40xx: document pcie wifi chip on the GL.Inet GL-B2200")
CC: Christian Lamparter
CC: Robert Marko
Signed-off-by: Enrico Mioso
---
 .../ipq-wifi/board-glinet_gl-b2200.qca9888| Bin 12200 -> 12164 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/package/firmware/ipq-wifi/board-glinet_gl-b2200.qca9888 b/package/firmware/ipq-wifi/board-glinet_gl-b2200.qca9888
index 4f0a521f35935d4be1fb01b66dd9954a48eade52..f0a493ace340de0fe84d89d46cc07209df1b6883 100644
GIT binary patch
delta 79
zcmZ1x-x5DT!X-nW0SwH5WKwCdZ9#ITPEu~BZgNIufo^7stpS*ql%H6X0^%B)>69fF
YWhUm8*t+}ZIvE)m806+|Y|zjL0EmDWzyJUM

delta 115
zcmZpPUlBh+BBMv20Ssb*WKwCdZ9#ITPFZSRN`8^8p_xUpPD*N7W^$^nfq|)+PH|~c
qab
Re: [PATCH v2] realtek: do not reset SerDes on link change
Hi Birger, On Sun, 2022-04-24 at 22:01 +0200, Birger Koblitz wrote: > Do not reset the RTL930x SerDes on link changes, instead set up > the SDS with internal PHYs for the SFP+ ports only. > This fixes the 8 1GBit ports on the Zyxel XGS1250 which > do not work without this patch. > > Tested-by: Stijn Segers > Signed-off-by: Birger Koblitz > --- > v2: A different patch was previously sent with this subject. > This is the correct patch. > target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c | 3 ++- > .../linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/rtl83xx.h | 1 + > 2 files changed, 3 insertions(+), 1 deletion(-) > > diff --git a/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c > b/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c > index 858b692640..5f19a1f590 100644 > --- a/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c > +++ b/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c > @@ -814,7 +814,8 @@ static void rtl93xx_phylink_mac_config(struct dsa_switch > *ds, int > port, > __func__, phy_modes(state->interface)); > return; > } > - rtl9300_sds_rst(sds_num, sds_mode); > + if (state->interface == PHY_INTERFACE_MODE_10GBASER) > + rtl9300_serdes_setup(sds_num, state->interface); Resetting the SerDes(-es?) makes it end up in a state where the 1Gb (copper) ports don't work. So with fixed phy-s, I can see how skipping a reset could help. Instead of a _reset_, you now only do a mode change on 10GBASER ports, using a _setup_ call. The reset and setup also are not entirely equivalent, so why change to rtl9300_serdes_setup()? Do 1G SFP modules still work if you only change modes for 10GBASER? Best, Sander ___ openwrt-devel mailing list openwrt-devel@lists.openwrt.org https://lists.openwrt.org/mailman/listinfo/openwrt-devel
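To make the behavioural difference Sander is asking about concrete, here is a small mock of the control flow before and after the patch. It is only a sketch that compiles outside the kernel tree: the `mock_*` functions, the `mock_interface` enum and the call counters are invented for illustration and merely stand in for `rtl9300_sds_rst()`, `rtl9300_serdes_setup()` and `PHY_INTERFACE_MODE_10GBASER` from the quoted diff.

```c
#include <assert.h>

/* Mock stand-ins so the control flow compiles outside the kernel tree. */
enum mock_interface { MOCK_MODE_1000BASEX, MOCK_MODE_10GBASER };

static int reset_calls;  /* how often the mocked reset ran */
static int setup_calls;  /* how often the mocked setup ran */

static void mock_sds_rst(int sds_num, enum mock_interface mode)
{
    (void)sds_num;
    (void)mode;
    reset_calls++;
}

static void mock_serdes_setup(int sds_num, enum mock_interface mode)
{
    (void)sds_num;
    (void)mode;
    setup_calls++;
}

/* Before the patch: every mac_config call resets the SerDes. */
static void mac_config_before(int sds_num, enum mock_interface mode)
{
    mock_sds_rst(sds_num, mode);
}

/* After the patch: only 10GBASE-R links trigger a setup call. */
static void mac_config_after(int sds_num, enum mock_interface mode)
{
    if (mode == MOCK_MODE_10GBASER)
        mock_serdes_setup(sds_num, mode);
}
```

In this mock, a 1000BASE-X link change goes through `mac_config_after()` without touching the SerDes at all, which is the case the question about 1G SFP modules is about.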
Re: Optimizing kernel compilation / alignments for network performance
From: Rafał Miłecki Date: Wed, 27 Apr 2022 14:04:54 +0200 > Hi, Hej, > > I noticed years ago that kernel changes touching code - that I don't use > at all - can affect network performance for me. > > I work with home routers based on Broadcom Northstar platform. Those > are SoCs with not-so-powerful 2 x ARM Cortex-A9 CPU cores. Main task of > those devices is NAT masquerade and that is what I test with iperf > running on two x86 machines. > > *** > > Example of such unused code change: > ce5013ff3bec ("mtd: spi-nor: Add support for XM25QH64A and XM25QH128A"). > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce5013ff3bec05cf2a8a05c75fcd520d9914d92b > It lowered my NAT speed from 381 Mb/s to 367 Mb/s (-3,5%). > > I first reported that issue it in the e-mail thread: > ARM router NAT performance affected by random/unrelated commits > https://lkml.org/lkml/2019/5/21/349 > https://www.spinics.net/lists/linux-block/msg40624.html > > Back then it was commit 5b0890a97204 ("flow_dissector: Parse batman-adv > unicast headers") > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9316a9ed6895c4ad2f0cde171d486f80c55d8283 > that increased my NAT speed from 741 Mb/s to 773 Mb/s (+4,3%). > > *** > > It appears Northstar CPUs have little cache size and so any change in > location of kernel symbols can affect NAT performance. That explains why > changing unrelated code affects anything & it has been partially proven > aligning some of cache-v7.S code. > > My question is: is there a way to find out & force an optimal symbols > locations? Take a look at CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B[0]. I've been fighting with the same issue on some Realtek MIPS boards: random code changes in random kernel core parts were affecting NAT / network performance. This option resolved this I'd say, for the cost of slightly increased vmlinux size (almost no change in vmlinuz size). 
The only thing is that it was recently restricted to a set of architectures and MIPS and ARM32 are not included now lol. So it's either a matter of expanding the list (since it was restricted only because `-falign-functions=` is not supported on some architectures) or you can just do: make KCFLAGS=-falign-functions=64 # replace 64 with your I-cache size The actual alignment is something to play with, I stopped on the cacheline size, 32 in my case. Also, this does not provide any guarantees that you won't suffer from random data cacheline changes. There were some initiatives to introduce debug alignment of data as well, but since function are often bigger than 32, while variables are usually much smaller, it was increasing the vmlinux size by a ton (imagine each u32 variable occupying 32-64 bytes instead of 4). But the chance of catching this is much lower than to suffer from I-cache function misplacement. > > Adding .align 5 to the cache-v7.S is a partial success. I'd like to find > out what other functions are worth optimizing (aligning) and force that > (I guess __attribute__((aligned(32))) could be used). > > I can't really draw any conclusions from comparing System.map before and > after above commits as they relocate thousands of symbols in one go. > > Optimizing is pretty important for me for two reasons: > 1. I want to reach maximum possible NAT masquerade performance > 2. I need stable performance across random commits to detect regressions [0] https://elixir.bootlin.com/linux/v5.18-rc4/K/ident/CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B Thanks, Al ___ openwrt-devel mailing list openwrt-devel@lists.openwrt.org https://lists.openwrt.org/mailman/listinfo/openwrt-devel
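The size cost of aligning data that Alexander mentions can be seen with a tiny experiment. This is a sketch, not kernel code: `plain_u32` and `padded_u32` are made-up types, and the `aligned` attribute is the GCC/Clang extension rather than anything the kernel's debug options are confirmed to use for data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative types only; not taken from the kernel. */
struct plain_u32 {
    uint32_t value;
};

/* Same payload, but forced onto a 32-byte boundary (GCC/Clang extension). */
struct padded_u32 {
    uint32_t value;
} __attribute__((aligned(32)));

/* Storage used by one object of each type. */
static size_t plain_size(void)   { return sizeof(struct plain_u32); }
static size_t padded_size(void)  { return sizeof(struct padded_u32); }

/* Alignment requirement of the padded type. */
static size_t padded_align(void) { return _Alignof(struct padded_u32); }
```

On common targets `plain_size()` is 4 while `padded_size()` is 32, an eightfold increase per variable, which matches the "32-64 bytes instead of 4" estimate in the message.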
Optimizing kernel compilation / alignments for network performance
Hi,

I noticed years ago that kernel changes touching code - that I don't use at all - can affect network performance for me.

I work with home routers based on the Broadcom Northstar platform. Those are SoCs with not-so-powerful 2 x ARM Cortex-A9 CPU cores. The main task of those devices is NAT masquerade and that is what I test with iperf running on two x86 machines.

***

Example of such an unused code change:
ce5013ff3bec ("mtd: spi-nor: Add support for XM25QH64A and XM25QH128A").
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce5013ff3bec05cf2a8a05c75fcd520d9914d92b
It lowered my NAT speed from 381 Mb/s to 367 Mb/s (-3,5%).

I first reported that issue in the e-mail thread:
ARM router NAT performance affected by random/unrelated commits
https://lkml.org/lkml/2019/5/21/349
https://www.spinics.net/lists/linux-block/msg40624.html

Back then it was commit 5b0890a97204 ("flow_dissector: Parse batman-adv unicast headers")
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9316a9ed6895c4ad2f0cde171d486f80c55d8283
that increased my NAT speed from 741 Mb/s to 773 Mb/s (+4,3%).

***

It appears Northstar CPUs have a small cache, so any change in the location of kernel symbols can affect NAT performance. That explains why changing unrelated code affects anything & it has been partially proven by aligning some of the cache-v7.S code.

My question is: is there a way to find out & force optimal symbol locations?

Adding .align 5 to the cache-v7.S is a partial success. I'd like to find out what other functions are worth optimizing (aligning) and force that (I guess __attribute__((aligned(32))) could be used).

I can't really draw any conclusions from comparing System.map before and after above commits as they relocate thousands of symbols in one go.

Optimizing is pretty important for me for two reasons:
1. I want to reach maximum possible NAT masquerade performance
2. I need stable performance across random commits to detect regressions
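The per-function approach guessed at above, `__attribute__((aligned(32)))`, can be tried in user space first. The following is a sketch under assumptions: `forward_packet` is a hypothetical stand-in for a hot-path function, `is_aligned_32` is a helper invented here, and the attribute is the GCC/Clang extension. It shows how to request the alignment and check it, not which kernel functions deserve it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical hot-path function; the name is illustrative only. */
__attribute__((aligned(32)))
int forward_packet(int len)
{
    return len > 0 ? len : 0;
}

/* Returns nonzero when fn starts on a 32-byte boundary. */
static int is_aligned_32(int (*fn)(int))
{
    /* Clear bit 0: on ARM it flags Thumb code and is not part of the address. */
    uintptr_t addr = (uintptr_t)fn & ~(uintptr_t)1;
    return (addr % 32) == 0;
}
```

Calling `is_aligned_32(forward_packet)` should report a 32-byte boundary with GCC or Clang. The attribute only raises the minimum alignment of that one function, so it is a more targeted (and more manual) tool than a global `-falign-functions=` setting.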
Re: [PATCH] Revert "ipq40xx: document pcie wifi chip on the GL.Inet GL-B2200"
Dear Rafał,

first of all, thanks a lot for taking the time and the patience to review this patch.
I explained the situation in more detail here:
http://lists.openwrt.org/pipermail/openwrt-devel/2022-April/038451.html
... admittedly, the subject was too generic.

Thanks!
Enrico
Re: [PATCH] Revert "ipq40xx: document pcie wifi chip on the GL.Inet GL-B2200"
On 17.04.2022 02:42, Enrico Mioso wrote:
> This reverts commit 80d34d9d593865248bf5a23794e9163895140de7.
>
> This brings back the PCI Wi-Fi interface on the GL-B2200.
>
> CC: Christian Lamparter
> Signed-off-by: Enrico Mioso

This description doesn't tell anything really. Please explain what happens and why do we need this patch. How does it fix things.