Re: [PATCH V2] ipq40xx: fix BDF file for pcie wifi chip on the GL.Inet GL-B2200

2022-04-27 Thread Robert Marko
On Wed, 27 Apr 2022 at 19:31, Enrico Mioso  wrote:
>
> When the GL-B2200's PCI device was switched to pre-calibration, ath10k
> wasn't able to find the pre-calibration data.
This is not really correct: ath10k was unable to find the BDF in the
served board-2.bin.

For future reference, please post the relevant driver log from before and
after the change; it shows the issue really clearly.
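
To make the lookup failure concrete, here is a minimal Python sketch of how a BMI-based board name could be matched against the entries packed into board-2.bin. The name format is written from memory of how ath10k builds its board name and should be checked against the driver source; the IDs, the variant string and the dictionary standing in for board-2.bin are purely hypothetical.

```python
def bmi_board_name(bus, chip_id, board_id, variant=None):
    """Build a board name of the (assumed) form used for BMI-based lookups."""
    name = f"bus={bus},bmi-chip-id={chip_id},bmi-board-id={board_id}"
    if variant:
        name += f",variant={variant}"
    return name


def find_bdf(board_entries, wanted):
    """Return the BDF stored under the wanted name, or None if it is absent."""
    return board_entries.get(wanted)


# Hypothetical stand-in for the names packed into board-2.bin.
board_entries = {
    bmi_board_name("pci", 0, 255, "Example-Board"): b"old-bdf",
}

# Hypothetical IDs reported by the device at probe time.
wanted = bmi_board_name("pci", 0, 16, "Example-Board")

if find_bdf(board_entries, wanted) is None:
    print(f"no BDF found for: {wanted}")
```

If the packed name and the name requested by the driver differ in any field, the lookup misses, which is the kind of failure a before/after driver log would show.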

Regards,
Robert

> In fact, the BDF file was missing the correct BMI IDs for this device,
> resulting in a failure to start it.
> Repackage the BDF file after renaming the relevant fields and files correctly,
> allowing the PCIe Wi-Fi to continue working.
>
> Fixes: 80d34d9d593 ("ipq40xx: document pcie wifi chip on the GL.Inet GL-B2200")
> CC: Christian Lamparter 
> CC: Robert Marko 
> Signed-off-by: Enrico Mioso 
> ---
>  .../ipq-wifi/board-glinet_gl-b2200.qca9888| Bin 12200 -> 12164 bytes
>  1 file changed, 0 insertions(+), 0 deletions(-)
>
> diff --git a/package/firmware/ipq-wifi/board-glinet_gl-b2200.qca9888 b/package/firmware/ipq-wifi/board-glinet_gl-b2200.qca9888
> index 4f0a521f35935d4be1fb01b66dd9954a48eade52..f0a493ace340de0fe84d89d46cc07209df1b6883 100644
> GIT binary patch
> delta 79
> zcmZ1x-x5DT!X-nW0SwH5WKwCdZ9#ITPEu~BZgNIufo^7stpS*ql%H6X0^%B)>69fF
> YWhUm8*t+}ZIvE)m806+|Y|zjL0EmDWzyJUM
>
> delta 115
> zcmZpPUlBh+BBMv20Ssb*WKwCdZ9#ITPFZSRN`8^8p_xUpPD*N7W^$^nfq|)+PH|~c
> qab
> --
> 2.36.0
>

___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel


Re: [PATCH v2] realtek: do not reset SerDes on link change

2022-04-27 Thread Birger Koblitz
Hi,

there are presently no working 1GBit SFP modules in master for RTL9300
(this patch only affects RTL93xx SoCs). On the Ubiquiti USW switch
only the 10GBit modules are set up by u-boot and they continue to work.
The setup really only sets up the link, not the entire SerDes. The
initial reset was done on initialization of the internal PHY associated with
the SerDes, via rtl9300_configure_serdes() calling rtl9300_sds_rst() during
the PHY probe. So calling rtl9300_sds_rst() for every link change was too
much anyway.

Complete control over SFP+ ports to allow 10G, 1G, Copper modules, and
DAC cables will only be available with the latest developments which were
posted in the forum recently and should lead to a PR soon. For this to work,
the SerDes-MAC link needs to be switched and then re-calibrated, which I
only figured out recently.

Cheers,
  Birger

On 27.04.22 18:06, Sander Vanheule wrote:
> Hi Birger,
> 
> On Sun, 2022-04-24 at 22:01 +0200, Birger Koblitz wrote:
>> Do not reset the RTL930x SerDes on link changes, instead set up
>> the SDS with internal PHYs for the SFP+ ports only.
>> This fixes the 8 1GBit ports on the Zyxel XGS1250 which
>> do not work without this patch.
>>
>> Tested-by: Stijn Segers 
>> Signed-off-by: Birger Koblitz 
>> ---
>> v2: A different patch was previously sent with this subject.
>>     This is the correct patch.
>>  target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c  | 3 ++-
>>  .../linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/rtl83xx.h | 1 +
>>  2 files changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c b/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c
>> index 858b692640..5f19a1f590 100644
>> --- a/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c
>> +++ b/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c
>> @@ -814,7 +814,8 @@ static void rtl93xx_phylink_mac_config(struct dsa_switch *ds, int port,
>>    __func__, phy_modes(state->interface));
>> return;
>> }
>> -   rtl9300_sds_rst(sds_num, sds_mode);
>> +   if (state->interface == PHY_INTERFACE_MODE_10GBASER)
>> +           rtl9300_serdes_setup(sds_num, state->interface);
> 
> 
> Resetting the SerDes(-es?) makes it end up in a state where the 1Gb (copper)
> ports don't work. So with fixed PHYs, I can see how skipping a reset could help.
> 
> Instead of a _reset_, you now only do a mode change on 10GBASER ports, using
> a _setup_ call. The reset and setup are also not entirely equivalent, so why
> change to rtl9300_serdes_setup()? Do 1G SFP modules still work if you only
> change modes for 10GBASER?
> 
> Best,
> Sander
> 



[PATCH] ltq-vdsl-app: disconnect when service is stopped

2022-04-27 Thread Jan Hoffmann
Stop the connection when the control daemon is terminated. The code is
a modified version of the termination routine in version 4.23.1 of the
daemon (which doesn't support VR9 modems anymore).

This could also be implemented by calling the acos and acs commands via
dsl_cpe_pipe.sh in the init script. However, doing it in the daemon
itself has the advantage of also working if it is terminated in another
way (for example during sysupgrade).

Signed-off-by: Jan Hoffmann 
---
 .../ltq-vdsl-app/patches/200-autoboot.patch   | 75 +++
 .../ltq-vdsl-app/patches/201-sigterm.patch|  2 +-
 .../ltq-vdsl-app/patches/300-ubus.patch   |  4 +-
 3 files changed, 78 insertions(+), 3 deletions(-)

diff --git a/package/network/config/ltq-vdsl-app/patches/200-autoboot.patch b/package/network/config/ltq-vdsl-app/patches/200-autoboot.patch
index 5b882bf30ff4..cc6feb94aa9f 100644
--- a/package/network/config/ltq-vdsl-app/patches/200-autoboot.patch
+++ b/package/network/config/ltq-vdsl-app/patches/200-autoboot.patch
@@ -1,3 +1,10 @@
+This enables automatic connection after the control daemon is started,
+and also stops the connection on termination.
+
+Using the autoboot restart command is necessary because the stop command
+doesn't actually stop the connection, and would also leave the driver in
+a state where an explicit start command is necessary to connect again.
+
 --- a/src/dsl_cpe_init_cfg.c
 +++ b/src/dsl_cpe_init_cfg.c
 @@ -27,7 +27,7 @@ DSL_InitData_t gInitCfgData =
@@ -9,3 +16,71 @@
 DSL_CPE_AUTOBOOT_CFG_SET(DSL_FALSE, DSL_FALSE, DSL_FALSE),
 DSL_CPE_TEST_MODE_CTRL_SET(DSL_TESTMODE_DISABLE),
 DSL_CPE_LINE_ACTIVATE_CTRL_SET(DSL_G997_INHIBIT_LDSF, DSL_G997_INHIBIT_ACSF, DSL_G997_NORMAL_STARTUP),
+--- a/src/dsl_cpe_control.c
++++ b/src/dsl_cpe_control.c
+@@ -6515,10 +6515,13 @@ DSL_CPE_STATIC  void DSL_CPE_Termination
+ DSL_CPE_STATIC  DSL_void_t DSL_CPE_Termination (void)
+ {
+ #ifdef INCLUDE_DSL_CPE_CLI_SUPPORT
+-   DSL_int_t nDevice = 0;
+DSL_char_t buf[32] = "quit";
+ #endif
+ 
++   DSL_Error_t nRet = DSL_SUCCESS;
++   DSL_int_t nDevice = 0;
++   DSL_AutobootConfig_t sAutobootCfg;
++   DSL_AutobootControl_t sAutobootCtl;
+DSL_CPE_Control_Context_t *pCtrlCtx;
+ 
+pCtrlCtx = DSL_CPE_GetGlobalContext();
+@@ -6527,6 +6530,50 @@ DSL_CPE_STATIC  DSL_void_t DSL_CPE_Termi
+   pCtrlCtx->bRun = DSL_FALSE;
+}
+ 
++   for (nDevice = 0; nDevice < DSL_CPE_MAX_DSL_ENTITIES; ++nDevice)
++   {
++  g_bWaitBeforeConfigWrite[nDevice] = DSL_TRUE;
++  g_bWaitBeforeLinkActivation[nDevice] = DSL_TRUE;
++  g_bWaitBeforeRestart[nDevice] = DSL_TRUE;
++
++  g_bAutoContinueWaitBeforeConfigWrite[nDevice] = DSL_FALSE;
++  g_bAutoContinueWaitBeforeLinkActivation[nDevice] = DSL_FALSE;
++  g_bAutoContinueWaitBeforeRestart[nDevice] = DSL_FALSE;
++
++  memset(&sAutobootCfg, 0x0, sizeof(DSL_AutobootConfig_t));
++  sAutobootCfg.data.nStateMachineOptions.bWaitBeforeConfigWrite = DSL_TRUE;
++  sAutobootCfg.data.nStateMachineOptions.bWaitBeforeLinkActivation = DSL_TRUE;
++  sAutobootCfg.data.nStateMachineOptions.bWaitBeforeRestart = DSL_TRUE;
++
++  nRet = (DSL_Error_t)DSL_CPE_Ioctl(
++ DSL_CPE_GetGlobalContext()->fd[nDevice],
++ DSL_FIO_AUTOBOOT_CONFIG_SET, (DSL_int_t)&sAutobootCfg);
++
++  if (nRet < DSL_SUCCESS)
++  {
++ DSL_CCA_DEBUG(DSL_CCA_DBG_ERR, (DSL_CPE_PREFIX
++"Autoboot configuration for device (%d) failed!, nRet = %d!"
++DSL_CPE_CRLF, nDevice, sAutobootCtl.accessCtl.nReturn));
++  }
++
++  memset(&sAutobootCtl, 0, sizeof(DSL_AutobootControl_t));
++  sAutobootCtl.data.nCommand = DSL_AUTOBOOT_CTRL_RESTART;
++
++  nRet = (DSL_Error_t)DSL_CPE_Ioctl(
++ DSL_CPE_GetGlobalContext()->fd[nDevice],
++ DSL_FIO_AUTOBOOT_CONTROL_SET, (DSL_int_t)&sAutobootCtl);
++
++  if (nRet < DSL_SUCCESS)
++  {
++ DSL_CCA_DEBUG(DSL_CCA_DBG_ERR, (DSL_CPE_PREFIX
++"Autoboot restart for device (%d) failed!, nRet = %d!"
++DSL_CPE_CRLF, nDevice, sAutobootCtl.accessCtl.nReturn));
++  }
++   }
++
++   DSL_CCA_DEBUG(DSL_CCA_DBG_MSG, (DSL_CPE_PREFIX
++  "Autoboot restart executed" DSL_CPE_CRLF));
++
+ #ifdef INCLUDE_DSL_CPE_CLI_SUPPORT
+for (nDevice = 0; nDevice < DSL_CPE_MAX_DSL_ENTITIES; nDevice++)
+{
diff --git a/package/network/config/ltq-vdsl-app/patches/201-sigterm.patch b/package/network/config/ltq-vdsl-app/patches/201-sigterm.patch
index 68a416ce24fb..4e978359835e 100644
--- a/package/network/config/ltq-vdsl-app/patches/201-sigterm.patch
+++ b/package/network/config/ltq-vdsl-app/patches/201-sigterm.patch
@@ -9,7 +9,7 @@
 {
DSL_CCA_DEBUG(DSL_CCA_DBG_MSG, (DSL_CPE_PREFIX "terminated" DSL_CPE_CRLF));
DSL_CPE_Termination ();
-@@ -6756,6 +6756,7 @@ DSL_int_t dsl_cpe_daemon (
+@@ -6803,6 +6803,7 @@ DSL_int_t dsl_cpe_daemon (
  
  #ifndef RTEMS
 signal (SIGINT, DSL_CPE_TerminationHandler);
diff --git 

Re: Optimizing kernel compilation / alignments for network performance

2022-04-27 Thread Rafał Miłecki

On 27.04.2022 14:56, Alexander Lobakin wrote:
> From: Rafał Miłecki
> Date: Wed, 27 Apr 2022 14:04:54 +0200
>
>> I noticed years ago that kernel changes touching code - that I don't use
>> at all - can affect network performance for me.
>>
>> I work with home routers based on Broadcom Northstar platform. Those
>> are SoCs with not-so-powerful 2 x ARM Cortex-A9 CPU cores. Main task of
>> those devices is NAT masquerade and that is what I test with iperf
>> running on two x86 machines.
>>
>> ***
>>
>> Example of such unused code change:
>> ce5013ff3bec ("mtd: spi-nor: Add support for XM25QH64A and XM25QH128A").
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce5013ff3bec05cf2a8a05c75fcd520d9914d92b
>> It lowered my NAT speed from 381 Mb/s to 367 Mb/s (-3,5%).
>>
>> I first reported that issue in the e-mail thread:
>> ARM router NAT performance affected by random/unrelated commits
>> https://lkml.org/lkml/2019/5/21/349
>> https://www.spinics.net/lists/linux-block/msg40624.html
>>
>> Back then it was commit 5b0890a97204 ("flow_dissector: Parse batman-adv
>> unicast headers")
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9316a9ed6895c4ad2f0cde171d486f80c55d8283
>> that increased my NAT speed from 741 Mb/s to 773 Mb/s (+4,3%).
>>
>> ***
>>
>> It appears Northstar CPUs have a small cache, so any change in the
>> location of kernel symbols can affect NAT performance. That explains why
>> changing unrelated code affects anything, and it has been partially proven
>> by aligning some of the cache-v7.S code.
>>
>> My question is: is there a way to find and force optimal symbol
>> locations?


> Take a look at CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B[0]. I've been
> fighting with the same issue on some Realtek MIPS boards: random
> code changes in random kernel core parts were affecting NAT /
> network performance. This option resolved this I'd say, for the cost
> of slightly increased vmlinux size (almost no change in vmlinuz
> size).
> The only thing is that it was recently restricted to a set of
> architectures and MIPS and ARM32 are not included now lol. So it's
> either a matter of expanding the list (since it was restricted only
> because `-falign-functions=` is not supported on some architectures)
> or you can just do:
>
> make KCFLAGS=-falign-functions=64 # replace 64 with your I-cache line size
>
> The actual alignment is something to play with; I settled on the
> cache line size, 32 in my case.
> Also, this does not provide any guarantees that you won't suffer
> from random data cacheline changes. There were some initiatives to
> introduce debug alignment of data as well, but since functions are
> often bigger than 32 bytes, while variables are usually much smaller, it
> was increasing the vmlinux size by a ton (imagine each u32 variable
> occupying 32-64 bytes instead of 4). But the chance of hitting this
> is much lower than that of suffering from I-cache function misplacement.


Thank you Alexander, this appears to be helpful! I decided to ignore
CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B for now and just adjust CFLAGS
manually.


1. Without ce5013ff3bec and with -falign-functions=32
387 Mb/s

2. Without ce5013ff3bec and with -falign-functions=64
377 Mb/s

3. With ce5013ff3bec and with -falign-functions=32
384 Mb/s

4. With ce5013ff3bec and with -falign-functions=64
377 Mb/s


So it seems that:
1. -falign-functions=32 = pretty stable high speed
2. -falign-functions=64 = very stable slightly lower speed


I'm going to perform tests on more commits, but if it stays as reliable
as above, that will be a huge success for me.
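
As a rough illustration of what "stable" means here, the spread between the two kernel variants can be computed per alignment setting. This is only a small sketch over the four numbers above, not part of the measurement setup; the helper name is made up for the example.

```python
# Throughput figures (Mb/s) copied from the four test runs above.
results = {
    ("without ce5013ff3bec", 32): 387,
    ("without ce5013ff3bec", 64): 377,
    ("with ce5013ff3bec", 32): 384,
    ("with ce5013ff3bec", 64): 377,
}


def spread_by_alignment(results):
    """Return {alignment: (min, max, spread in percent of max)}."""
    by_align = {}
    for (_variant, align), mbps in results.items():
        by_align.setdefault(align, []).append(mbps)
    summary = {}
    for align, values in by_align.items():
        lo, hi = min(values), max(values)
        summary[align] = (lo, hi, (hi - lo) / hi * 100.0)
    return summary


summary = spread_by_alignment(results)
for align in sorted(summary):
    lo, hi, pct = summary[align]
    print(f"-falign-functions={align}: {lo}-{hi} Mb/s, spread {pct:.2f}%")
```

With only two samples per setting this says little statistically; it just makes the comparison explicit.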



[PATCH V2] ipq40xx: fix BDF file for pcie wifi chip on the GL.Inet GL-B2200

2022-04-27 Thread Enrico Mioso
When the GL-B2200's PCI device was switched to pre-calibration, ath10k
wasn't able to find the pre-calibration data.
In fact, the BDF file was missing the correct BMI IDs for this device,
resulting in a failure to start it.
Repackage the BDF file after renaming the relevant fields and files correctly,
allowing the PCIe Wi-Fi to continue working.

Fixes: 80d34d9d593 ("ipq40xx: document pcie wifi chip on the GL.Inet GL-B2200")
CC: Christian Lamparter 
CC: Robert Marko 
Signed-off-by: Enrico Mioso 
---
 .../ipq-wifi/board-glinet_gl-b2200.qca9888| Bin 12200 -> 12164 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/package/firmware/ipq-wifi/board-glinet_gl-b2200.qca9888 b/package/firmware/ipq-wifi/board-glinet_gl-b2200.qca9888
index 4f0a521f35935d4be1fb01b66dd9954a48eade52..f0a493ace340de0fe84d89d46cc07209df1b6883 100644
GIT binary patch
delta 79
zcmZ1x-x5DT!X-nW0SwH5WKwCdZ9#ITPEu~BZgNIufo^7stpS*ql%H6X0^%B)>69fF
YWhUm8*t+}ZIvE)m806+|Y|zjL0EmDWzyJUM

delta 115
zcmZpPUlBh+BBMv20Ssb*WKwCdZ9#ITPFZSRN`8^8p_xUpPD*N7W^$^nfq|)+PH|~c
qab


Re: [PATCH] ipq40xx: fix BDF file for pcie wifi chip on the GL.Inet GL-B2200

2022-04-27 Thread Robert Marko
On Wed, 27 Apr 2022 at 19:12, Enrico Mioso  wrote:
>
> Fix the BDF file to include the expected BMI IDs, so that the PCI Wi-Fi 
> device will work again after switching to pre-calibration.

Hi,

Can you expand the description a bit, namely explain why this is
needed after the blamed commit?

Regards,
Robert
>
> Fixes: 80d34d9d593 ("ipq40xx: document pcie wifi chip on the GL.Inet GL-B2200")
> CC: Christian Lamparter 
> CC: Robert Marko 
> Signed-off-by: Enrico Mioso 
> ---
>  .../ipq-wifi/board-glinet_gl-b2200.qca9888| Bin 12200 -> 12164 bytes
>  1 file changed, 0 insertions(+), 0 deletions(-)
>
> diff --git a/package/firmware/ipq-wifi/board-glinet_gl-b2200.qca9888 b/package/firmware/ipq-wifi/board-glinet_gl-b2200.qca9888
> index 4f0a521f35935d4be1fb01b66dd9954a48eade52..f0a493ace340de0fe84d89d46cc07209df1b6883 100644
> GIT binary patch
> delta 79
> zcmZ1x-x5DT!X-nW0SwH5WKwCdZ9#ITPEu~BZgNIufo^7stpS*ql%H6X0^%B)>69fF
> YWhUm8*t+}ZIvE)m806+|Y|zjL0EmDWzyJUM
>
> delta 115
> zcmZpPUlBh+BBMv20Ssb*WKwCdZ9#ITPFZSRN`8^8p_xUpPD*N7W^$^nfq|)+PH|~c
> qab
> --
> 2.36.0
>



[PATCH] ipq40xx: fix BDF file for pcie wifi chip on the GL.Inet GL-B2200

2022-04-27 Thread Enrico Mioso
Fix the BDF file to include the expected BMI IDs, so that the PCI Wi-Fi device 
will work again after switching to pre-calibration.

Fixes: 80d34d9d593 ("ipq40xx: document pcie wifi chip on the GL.Inet GL-B2200")
CC: Christian Lamparter 
CC: Robert Marko 
Signed-off-by: Enrico Mioso 
---
 .../ipq-wifi/board-glinet_gl-b2200.qca9888| Bin 12200 -> 12164 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/package/firmware/ipq-wifi/board-glinet_gl-b2200.qca9888 b/package/firmware/ipq-wifi/board-glinet_gl-b2200.qca9888
index 4f0a521f35935d4be1fb01b66dd9954a48eade52..f0a493ace340de0fe84d89d46cc07209df1b6883 100644
GIT binary patch
delta 79
zcmZ1x-x5DT!X-nW0SwH5WKwCdZ9#ITPEu~BZgNIufo^7stpS*ql%H6X0^%B)>69fF
YWhUm8*t+}ZIvE)m806+|Y|zjL0EmDWzyJUM

delta 115
zcmZpPUlBh+BBMv20Ssb*WKwCdZ9#ITPFZSRN`8^8p_xUpPD*N7W^$^nfq|)+PH|~c
qab


Re: [PATCH v2] realtek: do not reset SerDes on link change

2022-04-27 Thread Sander Vanheule
Hi Birger,

On Sun, 2022-04-24 at 22:01 +0200, Birger Koblitz wrote:
> Do not reset the RTL930x SerDes on link changes, instead set up
> the SDS with internal PHYs for the SFP+ ports only.
> This fixes the 8 1GBit ports on the Zyxel XGS1250 which
> do not work without this patch.
> 
> Tested-by: Stijn Segers 
> Signed-off-by: Birger Koblitz 
> ---
> v2: A different patch was previously sent with this subject.
>     This is the correct patch.
>  target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c  | 3 ++-
>  .../linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/rtl83xx.h | 1 +
>  2 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c b/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c
> index 858b692640..5f19a1f590 100644
> --- a/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c
> +++ b/target/linux/realtek/files-5.10/drivers/net/dsa/rtl83xx/dsa.c
> @@ -814,7 +814,8 @@ static void rtl93xx_phylink_mac_config(struct dsa_switch *ds, int port,
>    __func__, phy_modes(state->interface));
> return;
> }
> -   rtl9300_sds_rst(sds_num, sds_mode);
> +   if (state->interface == PHY_INTERFACE_MODE_10GBASER)
> +           rtl9300_serdes_setup(sds_num, state->interface);


Resetting the SerDes(-es?) makes it end up in a state where the 1Gb (copper)
ports don't work. So with fixed PHYs, I can see how skipping a reset could help.

Instead of a _reset_, you now only do a mode change on 10GBASER ports, using
a _setup_ call. The reset and setup are also not entirely equivalent, so why
change to rtl9300_serdes_setup()? Do 1G SFP modules still work if you only
change modes for 10GBASER?

Best,
Sander



Re: Optimizing kernel compilation / alignments for network performance

2022-04-27 Thread Alexander Lobakin
From: Rafał Miłecki 
Date: Wed, 27 Apr 2022 14:04:54 +0200

> Hi,

Hej,

> 
> I noticed years ago that kernel changes touching code - that I don't use
> at all - can affect network performance for me.
> 
> I work with home routers based on Broadcom Northstar platform. Those
> are SoCs with not-so-powerful 2 x ARM Cortex-A9 CPU cores. Main task of
> those devices is NAT masquerade and that is what I test with iperf
> running on two x86 machines.
> 
> ***
> 
> Example of such unused code change:
> ce5013ff3bec ("mtd: spi-nor: Add support for XM25QH64A and XM25QH128A").
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce5013ff3bec05cf2a8a05c75fcd520d9914d92b
> It lowered my NAT speed from 381 Mb/s to 367 Mb/s (-3,5%).
> 
> I first reported that issue in the e-mail thread:
> ARM router NAT performance affected by random/unrelated commits
> https://lkml.org/lkml/2019/5/21/349
> https://www.spinics.net/lists/linux-block/msg40624.html
> 
> Back then it was commit 5b0890a97204 ("flow_dissector: Parse batman-adv
> unicast headers")
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9316a9ed6895c4ad2f0cde171d486f80c55d8283
> that increased my NAT speed from 741 Mb/s to 773 Mb/s (+4,3%).
> 
> ***
> 
> It appears Northstar CPUs have a small cache, so any change in the
> location of kernel symbols can affect NAT performance. That explains why
> changing unrelated code affects anything, and it has been partially proven
> by aligning some of the cache-v7.S code.
> 
> My question is: is there a way to find and force optimal symbol
> locations?

Take a look at CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B[0]. I've been
fighting with the same issue on some Realtek MIPS boards: random
code changes in random kernel core parts were affecting NAT /
network performance. This option resolved this I'd say, for the cost
of slightly increased vmlinux size (almost no change in vmlinuz
size).
The only thing is that it was recently restricted to a set of
architectures and MIPS and ARM32 are not included now lol. So it's
either a matter of expanding the list (since it was restricted only
because `-falign-functions=` is not supported on some architectures)
or you can just do:

make KCFLAGS=-falign-functions=64 # replace 64 with your I-cache line size

The actual alignment is something to play with; I settled on the
cache line size, 32 in my case.
Also, this does not provide any guarantees that you won't suffer
from random data cacheline changes. There were some initiatives to
introduce debug alignment of data as well, but since functions are
often bigger than 32 bytes, while variables are usually much smaller, it
was increasing the vmlinux size by a ton (imagine each u32 variable
occupying 32-64 bytes instead of 4). But the chance of hitting this
is much lower than that of suffering from I-cache function misplacement.
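
To put numbers on the size cost of aligning data, here is a minimal Python sketch of the round-up arithmetic. The helper names are made up for this example; it only shows the padding factor for a 4-byte variable, not how the kernel or the linker actually lays out data.

```python
def align_up(size, alignment):
    """Round size up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (size + alignment - 1) & ~(alignment - 1)


def padding_factor(size, alignment):
    """How many times larger an object gets once padded to the alignment."""
    return align_up(size, alignment) / size


U32_SIZE = 4  # bytes
for alignment in (32, 64):
    padded = align_up(U32_SIZE, alignment)
    factor = padding_factor(U32_SIZE, alignment)
    print(f"u32 aligned to {alignment}: {padded} bytes ({factor:.0f}x)")
```

A function that is already larger than the alignment loses at most one alignment unit of padding, which is why the same trick is much cheaper for code than for small variables.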

> 
> Adding .align 5 to the cache-v7.S is a partial success. I'd like to find
> out what other functions are worth optimizing (aligning) and force that
> (I guess  __attribute__((aligned(32))) could be used).
> 
> I can't really draw any conclusions from comparing System.map before and
> after above commits as they relocate thousands of symbols in one go.
> 
> Optimizing is pretty important for me for two reasons:
> 1. I want to reach maximum possible NAT masquerade performance
> 2. I need stable performance across random commits to detect regressions

[0] https://elixir.bootlin.com/linux/v5.18-rc4/K/ident/CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B

Thanks,
Al



Optimizing kernel compilation / alignments for network performance

2022-04-27 Thread Rafał Miłecki

Hi,

I noticed years ago that kernel changes touching code - that I don't use
at all - can affect network performance for me.

I work with home routers based on Broadcom Northstar platform. Those
are SoCs with not-so-powerful 2 x ARM Cortex-A9 CPU cores. Main task of
those devices is NAT masquerade and that is what I test with iperf
running on two x86 machines.

***

Example of such unused code change:
ce5013ff3bec ("mtd: spi-nor: Add support for XM25QH64A and XM25QH128A").
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce5013ff3bec05cf2a8a05c75fcd520d9914d92b
It lowered my NAT speed from 381 Mb/s to 367 Mb/s (-3,5%).

I first reported that issue in the e-mail thread:
ARM router NAT performance affected by random/unrelated commits
https://lkml.org/lkml/2019/5/21/349
https://www.spinics.net/lists/linux-block/msg40624.html

Back then it was commit 5b0890a97204 ("flow_dissector: Parse batman-adv
unicast headers")
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9316a9ed6895c4ad2f0cde171d486f80c55d8283
that increased my NAT speed from 741 Mb/s to 773 Mb/s (+4,3%).

***

It appears Northstar CPUs have a small cache, so any change in the
location of kernel symbols can affect NAT performance. That explains why
changing unrelated code affects anything, and it has been partially proven
by aligning some of the cache-v7.S code.

My question is: is there a way to find and force optimal symbol
locations?

Adding .align 5 to the cache-v7.S is a partial success. I'd like to find
out what other functions are worth optimizing (aligning) and force that
(I guess  __attribute__((aligned(32))) could be used).

I can't really draw any conclusions from comparing System.map before and
after above commits as they relocate thousands of symbols in one go.
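
One way to cut the System.map noise down, sketched here in Python under some loud assumptions: instead of diffing every symbol, look only at a hand-picked list of suspected hot functions and check whether their offset within a cache line changed. The symbol list, the addresses and the 32-byte line size below are invented for the example.

```python
def parse_system_map(text):
    """Parse "<hex addr> <type> <name>" lines into {name: address}, code symbols only."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        addr, sym_type, name = parts
        if sym_type in ("t", "T"):  # text (code) symbols
            symbols[name] = int(addr, 16)
    return symbols


def moved_within_cacheline(before, after, names, line_size=32):
    """Return {name: (old_offset, new_offset)} for symbols whose offset changed."""
    changed = {}
    for name in names:
        if name in before and name in after:
            old = before[name] % line_size
            new = after[name] % line_size
            if old != new:
                changed[name] = (old, new)
    return changed


# Invented addresses, for illustration only.
BEFORE = """c0100000 T _text
c0112340 T v7_flush_dcache_all
c0456780 T nf_nat_packet
c0a00000 D some_data
"""
AFTER = """c0100000 T _text
c0112340 T v7_flush_dcache_all
c0456790 T nf_nat_packet
c0a00010 D some_data
"""

hot = ["v7_flush_dcache_all", "nf_nat_packet"]  # hypothetical hot-path list
changed = moved_within_cacheline(parse_system_map(BEFORE), parse_system_map(AFTER), hot)
print(changed)
```

This does not say which functions are actually hot; that list would have to come from profiling.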

Optimizing is pretty important for me for two reasons:
1. I want to reach maximum possible NAT masquerade performance
2. I need stable performance across random commits to detect regressions



Re: [PATCH] Revert "ipq40xx: document pcie wifi chip on the GL.Inet GL-B2200"

2022-04-27 Thread Enrico Mioso

Dear Rafał,
first of all, thanks a lot for taking the time and the patience to review
this patch.

I explained the situation in more detail here:
http://lists.openwrt.org/pipermail/openwrt-devel/2022-April/038451.html

... admittedly, the subject was too generic.

Thanks!

Enrico


Re: [PATCH] Revert "ipq40xx: document pcie wifi chip on the GL.Inet GL-B2200"

2022-04-27 Thread Rafał Miłecki

On 17.04.2022 02:42, Enrico Mioso wrote:
> This reverts commit 80d34d9d593865248bf5a23794e9163895140de7.
> This brings back the PCI Wi-Fi interface on the GL-B2200.
>
> CC: Christian Lamparter
> Signed-off-by: Enrico Mioso

This description doesn't really say anything. Please explain what
happens, why we need this patch, and how it fixes things.
