Re: stmmac: Performance regression after commit aff3d9eff843 "net: stmmac: enable multiple buffers"
At 10:56 AM on 3/23/2017, Joao Pinto wrote:
> At 10:51 AM on 3/23/2017, Giuseppe CAVALLARO wrote:
>> On 3/23/2017 11:48 AM, Giuseppe CAVALLARO wrote:
>>> Hello
>>>
>>> On 3/23/2017 11:20 AM, Corentin Labbe wrote:
>>>>> I have a 4.21 QoS Core with 4 RX + 4 TX and detected no regression.
>>>>> Could you please share the iperf cmds you are using in order for me to
>>>>> reproduce it on my side?
>>>

HW Version: 4.21 QoS Core in HAPS DX7 (FPGA)

The connection between the FPGA and the PC where stmmac is running is PCIe.
My configurations are done in stmmac_pci. Here they are:

@@ -68,10 +70,52 @@ static void stmmac_default_data(struct plat_stmmacenet_data *plat)
 {
 	plat->bus_id = 1;
 	plat->phy_addr = 0;
-	plat->interface = PHY_INTERFACE_MODE_GMII;
-	plat->clk_csr = 2;	/* clk_csr_i = 20-35MHz & MDC = clk_csr_i/16 */
-	plat->has_gmac = 1;
-	plat->force_sf_dma_mode = 1;
+	plat->interface = PHY_INTERFACE_MODE_SGMII;
+	plat->clk_csr = 0x5;
+	plat->has_gmac = 0;
+	plat->has_gmac4 = 1;
+	plat->force_sf_dma_mode = 0;
+
+	plat->rx_queues_to_use = 4;
+	plat->tx_queues_to_use = 4;
+
+	plat->rx_sched_algorithm = MTL_RX_ALGORITHM_SP;
+
+	plat->rx_queues_cfg[0].mode_to_use = MTL_QUEUE_AVB;
+	plat->rx_queues_cfg[1].mode_to_use = MTL_QUEUE_DCB;
+	plat->rx_queues_cfg[2].mode_to_use = MTL_QUEUE_DCB;
+	plat->rx_queues_cfg[3].mode_to_use = MTL_QUEUE_DCB;
+
+	plat->tx_queues_cfg[0].mode_to_use = MTL_QUEUE_DCB;
+	plat->tx_queues_cfg[1].mode_to_use = MTL_QUEUE_AVB;
+	plat->tx_queues_cfg[2].mode_to_use = MTL_QUEUE_DCB;
+	plat->tx_queues_cfg[3].mode_to_use = MTL_QUEUE_DCB;
+
+	plat->tx_queues_cfg[1].send_slope = 0xCCC;
+	plat->tx_queues_cfg[1].idle_slope = 0x1333;
+	plat->tx_queues_cfg[1].high_credit = 0x4B;
+	plat->tx_queues_cfg[1].low_credit = 0xFFB5;
+
+	plat->rx_queues_cfg[0].chan = 0;
+	plat->rx_queues_cfg[1].chan = 1;
+	plat->rx_queues_cfg[2].chan = 2;
+	plat->rx_queues_cfg[3].chan = 3;
+
+	plat->tx_sched_algorithm = MTL_TX_ALGORITHM_WRR;
+	plat->tx_queues_cfg[0].weight = 0x10;
+	plat->tx_queues_cfg[1].weight = 0x11;
+	plat->tx_queues_cfg[2].weight = 0x12;
+	plat->tx_queues_cfg[3].weight = 0x13;
+
+	/* Disable Priority config by default */
+	plat->tx_queues_cfg[0].use_prio = false;
+	plat->rx_queues_cfg[0].use_prio = false;
+
+	/* Disable RX queues routing by default */
+	plat->rx_queues_cfg[0].pkt_route = 0x0;
+	plat->rx_queues_cfg[1].pkt_route = 0x0;
+	plat->rx_queues_cfg[2].pkt_route = 0x0;
+	plat->rx_queues_cfg[3].pkt_route = 0x0;
 
 	plat->mdio_bus_data->phy_reset = NULL;
 	plat->mdio_bus_data->phy_mask = 0;
@@ -83,22 +127,14 @@ static void stmmac_default_data(struct plat_stmmacenet_data *plat)
 	/* Set default value for multicast hash bins */
 	plat->multicast_filter_bins = HASH_TABLE_SIZE;
 
+	plat->dma_cfg->fixed_burst = 0;
+	plat->dma_cfg->aal = 0;
+
 	/* Set default value for unicast filter entries */
 	plat->unicast_filter_entries = 1;
 
 	/* Set the maxmtu to a default of JUMBO_LEN */
 	plat->maxmtu = JUMBO_LEN;
-
-	/* Set default number of RX and TX queues to use */
-	plat->tx_queues_to_use = 1;
-	plat->rx_queues_to_use = 1;
-
-	/* Disable Priority config by default */
-	plat->tx_queues_cfg[0].use_prio = false;
-	plat->rx_queues_cfg[0].use_prio = false;
-
-	/* Disable RX queues routing by default */
-	plat->rx_queues_cfg[0].pkt_route = 0x0;
 }

*** TESTS ***

*TEST 1: File (linux-next tarball) transfer of ~1.4G by scp to the DUT*

scp net-next-20170323.tar.gz x@XXX:/home/synopsys/
The authenticity of host 'X' can't be established.
ECDSA key fingerprint is SHA256:/XX.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'XX' (ECDSA) to the list of known hosts.
XX@X's password:
net-next20170323.tar.gz                100% 1366MB  79.3MB/s   00:17

ifconfig after transfer:

eth1      Link encap:Ethernet  HWaddr
          inet addr:  Bcast:  Mask:
          inet6 addr: X Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1026614 errors:0 dropped:0 overruns:0 frame:0
          TX packets:56804 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1502856063 (1.5 GB)  TX bytes:4224767 (4.2 MB)
          Interrupt:16

*stmmac log after transfer:*

#:~/temp$ dmesg | grep stmmac
[0.278200] stmmac - user ID: 0x10, Synopsys ID: 0x42
[0.278207] stmmaceth :01:00.0: DMA HW capability register supported
[0.278209] stmmaceth :01:00.0: RX Checksum Offload Engine supported
[0.278211] stmmaceth :01:00.0: TX Checksum insertion supported
[0.278224]
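
To compare the scp result of TEST 1 with iperf figures, which are usually reported in Mbit/s, the counters can be converted by hand. The following is only a rough sketch: the helper name `avg_mbit_per_s` is made up for illustration, and the inputs are assumed to be the RX byte counter from ifconfig (1502856063) and the 00:17 elapsed time printed by scp. The counter includes protocol overhead and any other traffic, and the elapsed time is rounded to whole seconds, so the result is approximate.

```c
#include <stdint.h>

/* Hypothetical helper, not part of stmmac or iperf.
 * Average throughput in Mbit/s (10^6 bits per second) for `bytes`
 * transferred in `seconds`. Returns 0.0 for a non-positive duration. */
double avg_mbit_per_s(uint64_t bytes, double seconds)
{
    if (seconds <= 0.0)
        return 0.0;
    return (double)bytes * 8.0 / seconds / 1e6;
}
```

Calling `avg_mbit_per_s(1502856063ULL, 17.0)` gives roughly 707 Mbit/s for this transfer.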
Re: stmmac: Performance regression after commit aff3d9eff843 "net: stmmac: enable multiple buffers"
At 10:51 AM on 3/23/2017, Giuseppe CAVALLARO wrote:
> On 3/23/2017 11:48 AM, Giuseppe CAVALLARO wrote:
>> Hello
>>
>> On 3/23/2017 11:20 AM, Corentin Labbe wrote:
>>>> I have a 4.21 QoS Core with 4 RX + 4 TX and detected no regression.
>>>> Could you please share the iperf cmds you are using in order for me to
>>>> reproduce it on my side?
>>
>> Joao, you have a really powerful HW integration with multiple channels for
>> both RX and TX.
>> Often this is not the same for other setups where usually just DMA0 is
>> present or, sometimes, there is just one extra RX channel.
>>
>> My question is: what happens on this kind of configuration? Are we still
>> guaranteeing the best performance?
>>
>> We also have to guarantee that TSO and SG always work. Another point is
>> the buffer sizes, which can differ among platforms.
>>
>> The problem below, reported by Corentin, pushes me to think that there is
>> a bug, so we should understand when it was introduced and whether it can
>> be fixed by some configuration we are not taking care of right now.
>>
>> "ndesc_get_rx_status: Oversized frame spanned multiple buffers"
>
> I wonder if this could be easily triggered by getting a big file via FTP,
> so it may not be strictly related to performance benchmarks.

I am going to run that test and check it out, and also run iperf a couple of
times. I am counting on doing this today and sending you the results later.
If anyone gets results sooner, please share.

>
> peppe
>
>>
>> Best Regards
>> Peppe
>>
>

Thanks.
Re: stmmac: Performance regression after commit aff3d9eff843 "net: stmmac: enable multiple buffers"
Hi Peppe,

At 10:48 AM on 3/23/2017, Giuseppe CAVALLARO wrote:
> Hello
>
> On 3/23/2017 11:20 AM, Corentin Labbe wrote:
>>> I have a 4.21 QoS Core with 4 RX + 4 TX and detected no regression.
>>> Could you please share the iperf cmds you are using in order for me to
>>> reproduce it on my side?
>
> Joao, you have a really powerful HW integration with multiple channels for
> both RX and TX.
> Often this is not the same for other setups where usually just DMA0 is
> present or, sometimes, there is just one extra RX channel.

My opinion is that we should not have problems, since most of the features
introduced are only used if you configure rx queues > 1 or tx queues > 1. If
you use the default (= 1), those configurations will not take place.

>
> My question is: what happens on this kind of configuration? Are we still
> guaranteeing the best performance?
>
> We also have to guarantee that TSO and SG always work. Another point is
> the buffer sizes, which can differ among platforms.

We have to pay attention to the RX buffer size, since I had problems with DHCP
messages not being received because the buffer size was too small. Currently
the TX buffer size is not configurable; in the future it would be useful to
make it configurable too.

>
> The problem below, reported by Corentin, pushes me to think that there is
> a bug, so we should understand when it was introduced and whether it can
> be fixed by some configuration we are not taking care of right now.

Of course.

>
> "ndesc_get_rx_status: Oversized frame spanned multiple buffers"
>
>
> Best Regards
> Peppe

Thanks,
Joao
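
The point about the default queue counts can be written down as a small predicate. This is only an illustrative sketch of the reasoning in the reply above, not code from the driver: `struct queue_counts` and `multi_queue_setup_needed` are hypothetical names, and only the two field names mirror the `rx_queues_to_use` / `tx_queues_to_use` settings seen in the stmmac_pci diff earlier in the thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two queue counts discussed in the thread;
 * it is NOT the kernel's struct plat_stmmacenet_data. */
struct queue_counts {
    unsigned int rx_queues_to_use;
    unsigned int tx_queues_to_use;
};

/* True when multi-queue specific setup would apply, i.e. when more than
 * one RX queue or more than one TX queue is configured. */
bool multi_queue_setup_needed(const struct queue_counts *q)
{
    return q->rx_queues_to_use > 1 || q->tx_queues_to_use > 1;
}
```

With the default of one RX and one TX queue the predicate is false, which is the case the sunxi boards would be expected to fall into if the argument above holds.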
Re: stmmac: Performance regression after commit aff3d9eff843 "net: stmmac: enable multiple buffers"
On 3/23/2017 11:48 AM, Giuseppe CAVALLARO wrote:
> Hello
>
> On 3/23/2017 11:20 AM, Corentin Labbe wrote:
>>> I have a 4.21 QoS Core with 4 RX + 4 TX and detected no regression.
>>> Could you please share the iperf cmds you are using in order for me to
>>> reproduce it on my side?
>
> Joao, you have a really powerful HW integration with multiple channels for
> both RX and TX.
> Often this is not the same for other setups where usually just DMA0 is
> present or, sometimes, there is just one extra RX channel.
>
> My question is: what happens on this kind of configuration? Are we still
> guaranteeing the best performance?
>
> We also have to guarantee that TSO and SG always work. Another point is
> the buffer sizes, which can differ among platforms.
>
> The problem below, reported by Corentin, pushes me to think that there is
> a bug, so we should understand when it was introduced and whether it can
> be fixed by some configuration we are not taking care of right now.
>
> "ndesc_get_rx_status: Oversized frame spanned multiple buffers"

I wonder if this could be easily triggered by getting a big file via FTP,
so it may not be strictly related to performance benchmarks.

peppe

> Best Regards
> Peppe
Re: stmmac: Performance regression after commit aff3d9eff843 "net: stmmac: enable multiple buffers"
Hello

On 3/23/2017 11:20 AM, Corentin Labbe wrote:
>> I have a 4.21 QoS Core with 4 RX + 4 TX and detected no regression.
>> Could you please share the iperf cmds you are using in order for me to
>> reproduce it on my side?

Joao, you have a really powerful HW integration with multiple channels for
both RX and TX.
Often this is not the same for other setups where usually just DMA0 is
present or, sometimes, there is just one extra RX channel.

My question is: what happens on this kind of configuration? Are we still
guaranteeing the best performance?

We also have to guarantee that TSO and SG always work. Another point is
the buffer sizes, which can differ among platforms.

The problem below, reported by Corentin, pushes me to think that there is
a bug, so we should understand when it was introduced and whether it can
be fixed by some configuration we are not taking care of right now.

"ndesc_get_rx_status: Oversized frame spanned multiple buffers"

Best Regards
Peppe
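
To make the buffer-size concern concrete, here is a minimal sketch of the arithmetic behind a frame "spanning multiple buffers". It is not the driver's check (the message comes from ndesc_get_rx_status, whose logic is not shown in this thread and is not modelled here); the function names are hypothetical and the sizes used in the usage note are illustrative assumptions only.

```c
#include <stddef.h>

/* Hypothetical helpers, for illustration only.
 * Number of RX buffers of `buf_size` bytes needed to hold one frame of
 * `frame_len` bytes (ceiling division). Returns 0 if buf_size is 0. */
size_t rx_buffers_needed(size_t frame_len, size_t buf_size)
{
    if (buf_size == 0)
        return 0;
    return (frame_len + buf_size - 1) / buf_size;
}

/* Non-zero when the frame does not fit in a single RX buffer. */
int frame_spans_multiple_buffers(size_t frame_len, size_t buf_size)
{
    return rx_buffers_needed(frame_len, buf_size) > 1;
}
```

For example, a 1514-byte frame fits in one 1536-byte buffer but needs two 1024-byte buffers, so a platform that ends up with a smaller RX buffer than expected could hit the second case on ordinary full-size frames.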
Re: stmmac: Performance regression after commit aff3d9eff843 "net: stmmac: enable multiple buffers"
At 10:20 AM on 3/23/2017, Corentin Labbe wrote:
> On Thu, Mar 23, 2017 at 10:12:18AM +, Joao Pinto wrote:
>>
>> Hi Corentin,
>>
>> At 10:08 AM on 3/23/2017, Corentin Labbe wrote:
>>> Hello
>>>
>>> Using next-20170323 produces a huge performance regression on my sunxi
>>> boards.
>>> On dwmac-sun8i, iperf goes from 94 Mbit/s to 37 Mbit/s when sending.
>>>
>>> On cubieboard2 (dwmac-sunxi), iperf makes the kernel flood the log with
>>> "ndesc_get_rx_status: Oversized frame spanned multiple buffers"
>>> and the network is lost afterwards.
>>>
>>> Reverting aff3d9eff84399e433c4aca65a9bb236581bc082 fixes the issue.
>>> I am still trying to find which part of this patch lowers the performance.
>>>
>>> Regards
>>> Corentin Labbe
>>>
>>
>> I have a 4.21 QoS Core with 4 RX + 4 TX and detected no regression.
>> Could you please share the iperf cmds you are using in order for me to
>> reproduce it on my side?
>
> A simple iperf -c serverip on both boards
>

Ok, I am going to run my tests with a fresh net-next and come back to you soon.

Thanks,
Joao
Re: stmmac: Performance regression after commit aff3d9eff843 "net: stmmac: enable multiple buffers"
On Thu, Mar 23, 2017 at 10:12:18AM +, Joao Pinto wrote:
>
> Hi Corentin,
>
> At 10:08 AM on 3/23/2017, Corentin Labbe wrote:
> > Hello
> >
> > Using next-20170323 produces a huge performance regression on my sunxi
> > boards.
> > On dwmac-sun8i, iperf goes from 94 Mbit/s to 37 Mbit/s when sending.
> >
> > On cubieboard2 (dwmac-sunxi), iperf makes the kernel flood the log with
> > "ndesc_get_rx_status: Oversized frame spanned multiple buffers"
> > and the network is lost afterwards.
> >
> > Reverting aff3d9eff84399e433c4aca65a9bb236581bc082 fixes the issue.
> > I am still trying to find which part of this patch lowers the performance.
> >
> > Regards
> > Corentin Labbe
> >
>
> I have a 4.21 QoS Core with 4 RX + 4 TX and detected no regression.
> Could you please share the iperf cmds you are using in order for me to
> reproduce it on my side?

A simple iperf -c serverip on both boards
Re: stmmac: Performance regression after commit aff3d9eff843 "net: stmmac: enable multiple buffers"
Hi Corentin,

At 10:08 AM on 3/23/2017, Corentin Labbe wrote:
> Hello
>
> Using next-20170323 produces a huge performance regression on my sunxi boards.
> On dwmac-sun8i, iperf goes from 94 Mbit/s to 37 Mbit/s when sending.
>
> On cubieboard2 (dwmac-sunxi), iperf makes the kernel flood the log with
> "ndesc_get_rx_status: Oversized frame spanned multiple buffers"
> and the network is lost afterwards.
>
> Reverting aff3d9eff84399e433c4aca65a9bb236581bc082 fixes the issue.
> I am still trying to find which part of this patch lowers the performance.
>
> Regards
> Corentin Labbe
>

I have a 4.21 QoS Core with 4 RX + 4 TX and detected no regression.
Could you please share the iperf cmds you are using in order for me to
reproduce it on my side?

@stmmac users: It would be great if people who have a setup could also run the
same iperf test, in order to clear this up for everyone.

Thanks,
Joao
stmmac: Performance regression after commit aff3d9eff843 "net: stmmac: enable multiple buffers"
Hello

Using next-20170323 produces a huge performance regression on my sunxi boards.
On dwmac-sun8i, iperf goes from 94 Mbit/s to 37 Mbit/s when sending.

On cubieboard2 (dwmac-sunxi), iperf makes the kernel flood the log with
"ndesc_get_rx_status: Oversized frame spanned multiple buffers"
and the network is lost afterwards.

Reverting aff3d9eff84399e433c4aca65a9bb236581bc082 fixes the issue.
I am still trying to find which part of this patch lowers the performance.

Regards
Corentin Labbe
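
For reference when comparing results from other boards, the size of the dwmac-sun8i regression reported above can be expressed as a relative drop. This is a minimal sketch with a hypothetical helper name, `throughput_drop_percent`; it assumes both figures are in the same unit, as in the report (94 before, 37 after).

```c
/* Hypothetical helper, for illustration only.
 * Relative throughput drop, in percent, going from `before` to `after`.
 * Both values must use the same unit. Returns 0.0 if `before` is not
 * positive, since the ratio is undefined in that case. */
double throughput_drop_percent(double before, double after)
{
    if (before <= 0.0)
        return 0.0;
    return (before - after) / before * 100.0;
}
```

For the reported figures, `throughput_drop_percent(94.0, 37.0)` is a little over 60, so the sending throughput lost about three fifths of its previous value.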