Re: [PATCH for-7.2] rtl8139: honor large send MSS value

2022-11-17 Thread Jason Wang



在 2022/11/17 19:26, Tobias Fiebig 写道:

Heho,
Ok, that explains a lot. I was also thinking that the vlan bit seem to overlap 
with the MTU field, and wanted to look at that later today.

Re the 12b: IIRC, the standard 1500 MTU for ethernet is already without the 
ethernet header; That can have up to 26b (18b basis, 4b 802.1q, 4b 802.1ad), 
but leads to a total frame length of 1526 (with other additions (MPLS) also 
just making the frame bigger, without touching the MTU/MSS). MSS than as usual 
-40 for v4 and ~ -60 for v6.

So I doubt that those 12b are subtracted for the ethernet header.

Does somebody still have an RTL8139 around, to test how the real hardware 
behaved?



This would be very hard, I can think that the Qemu rtl8139 emulation is 
probably the only user for kernel 8139cp drivers for years.


Thanks




With best regards,
Tobias

-Original Message-
From: Stefan Hajnoczi 
Sent: Thursday, 17 November 2022 12:16
To: Tobias Fiebig 
Cc: Jason Wang ; Stefan Hajnoczi ; 
qemu-devel@nongnu.org; qemu-sta...@nongnu.org; Russell King - ARM Linux 

Subject: Re: [PATCH for-7.2] rtl8139: honor large send MSS value

After looking more closely at txdw0 it seems that the code mixes "Tx command mode 0", "Tx 
command mode 1", and "Tx status mode". The bits have different meanings in each mode, so this 
leads to confusion :).

Stefan






Re: [PATCH for-7.2] rtl8139: honor large send MSS value

2022-11-17 Thread Stefan Hajnoczi
On Thu, Nov 17, 2022, 15:42 Tobias Fiebig  wrote:

> Heho,
> I gave v3 a shot and it performs as expected; For a requested MSS of 1320,
> TSO consistently uses a 1308 MSS. So for me, this patch works. Thanks for
> fixing this. :-)
>
> Sadly, I do not have boxes to test with .1q around; If none of you has
> either, and that should be tested as well, I can give it a shot in the
> coming days, but it might take some time.
>

Awesome, thanks for your help! I don't have a .1q setup on hand.

Stefan


> Side note: I found the 'missing' 12b in 12b of TCP options being added.
> :-| So sorry for that noise.
>
> With best regards,
> Tobias
>
> MTU1500
> RTL8139: +++ C+ mode offloaded task TSO IP data 7272 frame data 7292
> specified MSS=1448
> RTL8139: +++ C+ mode offloaded task TSO IP data 8720 frame data 8740
> specified MSS=1448
> RTL8139: +++ C+ mode offloaded task TSO IP data 8720 frame data 8740
> specified MSS=1448
> RTL8139: +++ C+ mode offloaded task TSO IP data 10168 frame data 10188
> specified MSS=1448
>
> MTU1320
> RTL8139: +++ C+ mode offloaded task TSO IP data 2648 frame data 2668
> specified MSS=1308
> RTL8139: +++ C+ mode offloaded task TSO IP data 10496 frame data 10516
> specified MSS=1308
> RTL8139: +++ C+ mode offloaded task TSO IP data 6572 frame data 6592
> specified MSS=1308
> RTL8139: +++ C+ mode offloaded task TSO IP data 6572 frame data 6592
> specified MSS=1308
>
>


RE: [PATCH for-7.2] rtl8139: honor large send MSS value

2022-11-17 Thread Tobias Fiebig
Heho,
I gave v3 a shot and it performs as expected; For a requested MSS of 1320, TSO 
consistently uses a 1308 MSS. So for me, this patch works. Thanks for fixing 
this. :-)

Sadly, I do not have boxes to test with .1q around; If none of you has either, 
and that should be tested as well, I can give it a shot in the coming days, but 
it might take some time.

Side note: I found the 'missing' 12b in 12b of TCP options being added. :-| So 
sorry for that noise.

With best regards,
Tobias

MTU1500
RTL8139: +++ C+ mode offloaded task TSO IP data 7272 frame data 7292 specified 
MSS=1448
RTL8139: +++ C+ mode offloaded task TSO IP data 8720 frame data 8740 specified 
MSS=1448
RTL8139: +++ C+ mode offloaded task TSO IP data 8720 frame data 8740 specified 
MSS=1448
RTL8139: +++ C+ mode offloaded task TSO IP data 10168 frame data 10188 
specified MSS=1448

MTU1320
RTL8139: +++ C+ mode offloaded task TSO IP data 2648 frame data 2668 specified 
MSS=1308
RTL8139: +++ C+ mode offloaded task TSO IP data 10496 frame data 10516 
specified MSS=1308
RTL8139: +++ C+ mode offloaded task TSO IP data 6572 frame data 6592 specified 
MSS=1308
RTL8139: +++ C+ mode offloaded task TSO IP data 6572 frame data 6592 specified 
MSS=1308




RE: [PATCH for-7.2] rtl8139: honor large send MSS value

2022-11-17 Thread Tobias Fiebig
Heho,
Thanks, will test the three patches later.

With best regards,
Tobias

-Original Message-
From: Stefan Hajnoczi  
Sent: Thursday, 17 November 2022 17:57
To: Tobias Fiebig 
Cc: Jason Wang ; Stefan Hajnoczi ; 
qemu-devel@nongnu.org; qemu-sta...@nongnu.org; Russell King - ARM Linux 

Subject: Re: [PATCH for-7.2] rtl8139: honor large send MSS value

Hi Tobias,
My initial patch was broken. I did some cleanup and sent a v3.

Stefan




Re: [PATCH for-7.2] rtl8139: honor large send MSS value

2022-11-17 Thread Stefan Hajnoczi
Hi Tobias,
My initial patch was broken. I did some cleanup and sent a v3.

Stefan



RE: [PATCH for-7.2] rtl8139: honor large send MSS value

2022-11-17 Thread Tobias Fiebig
Heho,
Ok, that explains a lot. I was also thinking that the vlan bit seem to overlap 
with the MTU field, and wanted to look at that later today.

Re the 12b: IIRC, the standard 1500 MTU for ethernet is already without the 
ethernet header; That can have up to 26b (18b basis, 4b 802.1q, 4b 802.1ad), 
but leads to a total frame length of 1526 (with other additions (MPLS) also 
just making the frame bigger, without touching the MTU/MSS). MSS than as usual 
-40 for v4 and ~ -60 for v6.

So I doubt that those 12b are subtracted for the ethernet header.

Does somebody still have an RTL8139 around, to test how the real hardware 
behaved?

With best regards,
Tobias

-Original Message-
From: Stefan Hajnoczi  
Sent: Thursday, 17 November 2022 12:16
To: Tobias Fiebig 
Cc: Jason Wang ; Stefan Hajnoczi ; 
qemu-devel@nongnu.org; qemu-sta...@nongnu.org; Russell King - ARM Linux 

Subject: Re: [PATCH for-7.2] rtl8139: honor large send MSS value

After looking more closely at txdw0 it seems that the code mixes "Tx command 
mode 0", "Tx command mode 1", and "Tx status mode". The bits have different 
meanings in each mode, so this leads to confusion :).

Stefan




Re: [PATCH for-7.2] rtl8139: honor large send MSS value

2022-11-17 Thread Stefan Hajnoczi
After looking more closely at txdw0 it seems that the code mixes "Tx
command mode 0", "Tx command mode 1", and "Tx status mode". The bits
have different meanings in each mode, so this leads to confusion :).

Stefan



Re: [PATCH for-7.2] rtl8139: honor large send MSS value

2022-11-17 Thread Stefan Hajnoczi
On Wed, 16 Nov 2022 at 21:49, Tobias Fiebig  wrote:
>
> Heho,
> Ok, I just learned more C than I ever wanted to. There is a bit more amiss 
> here (ll from 7d7238c72b983cff5064734349d2d45be9c6282c):
>
> In line 1916 of rtl8139.c we set txdw0; If we calculate the MSS at this 
> point, it is consistently 12 below requested, but generally accurate. The 
> bits that flip re: -12 must happen somewhere in the Linux kernel driver (ll 
> 764 in drivers/net/ethernet/realtek/8139cp.c?); Didn't look there in-depth 
> yet (and do not plan to, maybe one of you has more experience with this?) 
> Given the consistency of this deviation, maybe just doing a +12 might be more 
> straight forward.
>
> However, in ll2030ff we reset a couple of status indicators. These overlap 
> with the fields for the MSS, leading to inaccurate values being calculated 
> later on; For example, requesting an MSS of 767 leads to an MSS of 3 being 
> calculated by your patch; Similarly, requesting 1000 leads to 268. At least 
> for the latter I see packets of that size being generated on the wire (which 
> should also not happen, as the MSS should never be below 536; maybe a check 
> could help here to make sure we are not trusting arbitrary values from the 
> driver, esp. given the bobble of sec issues around PMTUD/MSS; Technically, 
> now that MSS is defined earlier, we could also move this closer to the start 
> of TSO large frame handling).
>
> Below is also a draft patch following my suggestions (save txdw0, +12, check 
> for <536) and some examples for what I described above, which I can on your 
> last patch. Please note again that this is essentially the first time I do 
> anything in C; Also, I wasn't sure what has less perf impact (save the whole 
> 32bit of txdw0 even though it might not be needed vs. also doing the shift/& 
> even though it might not be needed).
>
> Apart from that, my patch seems to work, and the MSS gets set correctly; 
> Someone else testing would be nice, though:
>
> # MSS_requested=1320
> RTL8139: +++ C+ mode offloaded task TSO IP data 2648 frame data 2668 
> specified MSS=1320
>
> # MSS_requested=1000
> RTL8139: +++ C+ mode offloaded task TSO IP data 2008 frame data 2028 
> specified MSS=1000
>
> # MSS_requested=600
> RTL8139: +++ C+ mode offloaded task TSO IP data 1796 frame data 1816 
> specified MSS=600
>
> With best regards,
> Tobias
>
> diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
> index e6643e3c9d..59321460b9 100644
> --- a/hw/net/rtl8139.c
> +++ b/hw/net/rtl8139.c
> @@ -77,7 +77,6 @@
>  ( ( input ) & ( size - 1 )  )
>
>  #define ETHER_TYPE_LEN 2
> -#define ETH_MTU 1500
>
>  #define VLAN_TCI_LEN 2
>  #define VLAN_HLEN (ETHER_TYPE_LEN + VLAN_TCI_LEN)
> @@ -1934,8 +1933,9 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s)
>  #define CP_TX_LS (1<<28)
>  /* large send packet flag */
>  #define CP_TX_LGSEN (1<<27)
> -/* large send MSS mask, bits 16...25 */
> -#define CP_TC_LGSEN_MSS_MASK ((1 << 12) - 1)
> +/* large send MSS mask, bits 16...26 */
> +#define CP_TC_LGSEN_MSS_SHIFT 16
> +#define CP_TC_LGSEN_MSS_MASK ((1 << 11) - 1)
>
>  /* IP checksum offload flag */
>  #define CP_TX_IPCS (1<<18)
> @@ -2027,6 +2027,9 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s)
>  s->currCPlusTxDesc = 0;
>  }
>
> +/* store unaltered txdw0 for later use in MSS calculation*/
> +uint32_t txdw0_save = txdw0;
> +
>  /* transfer ownership to target */
>  txdw0 &= ~CP_TX_OWN;
>
> @@ -2149,10 +2152,12 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s)
>  goto skip_offload;
>  }
>
> -int large_send_mss = (txdw0 >> 16) & CP_TC_LGSEN_MSS_MASK;
> +/* set large_send_mss from txdw0 before overlapping mss 
> fields were cleared */
> +int large_send_mss = ((txdw0_save >> CP_TC_LGSEN_MSS_SHIFT) &
> +CP_TC_LGSEN_MSS_MASK) + 12;

Hi Tobias,
Thanks for posting this information.

12 bytes hapens to be the size of the Ethernet header:
https://en.wikipedia.org/wiki/Ethernet_header#Structure

That could be a clue. I'll try to investigate some more.

Stefan

>
> -DPRINTF("+++ C+ mode offloaded task TSO MTU=%d IP data %d "
> -"frame data %d specified MSS=%d\n", ETH_MTU,
> +DPRINTF("+++ C+ mode offloaded task TSO IP data %d "
> +"frame data %d specified MSS=%d\n",
>  ip_data_len, saved_size - ETH_HLEN, large_send_mss);
>
>  int tcp_send_offset = 0;
> @@ -2177,9 +2182,13 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s)
>  goto skip_offload;
>  }
>
> -/* ETH_MTU = ip header len + tcp header len + payload */
> +/* MSS too small? Min MSS = 536 */
> +if (tcp_hlen + hlen >= large_send_mss || 535 >= 
> large_send_mss) {
> +goto skip_offload;
> +}
> +
>  

RE: [PATCH for-7.2] rtl8139: honor large send MSS value

2022-11-16 Thread Tobias Fiebig
Heho,
Ok, I just learned more C than I ever wanted to. There is a bit more amiss here 
(ll from 7d7238c72b983cff5064734349d2d45be9c6282c):

In line 1916 of rtl8139.c we set txdw0; If we calculate the MSS at this point, 
it is consistently 12 below requested, but generally accurate. The bits that 
flip re: -12 must happen somewhere in the Linux kernel driver (ll 764 in 
drivers/net/ethernet/realtek/8139cp.c?); Didn't look there in-depth yet (and do 
not plan to, maybe one of you has more experience with this?) Given the 
consistency of this deviation, maybe just doing a +12 might be more straight 
forward.

However, in ll2030ff we reset a couple of status indicators. These overlap with 
the fields for the MSS, leading to inaccurate values being calculated later on; 
For example, requesting an MSS of 767 leads to an MSS of 3 being calculated by 
your patch; Similarly, requesting 1000 leads to 268. At least for the latter I 
see packets of that size being generated on the wire (which should also not 
happen, as the MSS should never be below 536; maybe a check could help here to 
make sure we are not trusting arbitrary values from the driver, esp. given the 
bobble of sec issues around PMTUD/MSS; Technically, now that MSS is defined 
earlier, we could also move this closer to the start of TSO large frame 
handling).

Below is also a draft patch following my suggestions (save txdw0, +12, check 
for <536) and some examples for what I described above, which I can on your 
last patch. Please note again that this is essentially the first time I do 
anything in C; Also, I wasn't sure what has less perf impact (save the whole 
32bit of txdw0 even though it might not be needed vs. also doing the shift/& 
even though it might not be needed).

Apart from that, my patch seems to work, and the MSS gets set correctly; 
Someone else testing would be nice, though:

# MSS_requested=1320
RTL8139: +++ C+ mode offloaded task TSO IP data 2648 frame data 2668 specified 
MSS=1320

# MSS_requested=1000
RTL8139: +++ C+ mode offloaded task TSO IP data 2008 frame data 2028 specified 
MSS=1000

# MSS_requested=600
RTL8139: +++ C+ mode offloaded task TSO IP data 1796 frame data 1816 specified 
MSS=600

With best regards,
Tobias

diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
index e6643e3c9d..59321460b9 100644
--- a/hw/net/rtl8139.c
+++ b/hw/net/rtl8139.c
@@ -77,7 +77,6 @@
 ( ( input ) & ( size - 1 )  )
 
 #define ETHER_TYPE_LEN 2
-#define ETH_MTU 1500
 
 #define VLAN_TCI_LEN 2
 #define VLAN_HLEN (ETHER_TYPE_LEN + VLAN_TCI_LEN)
@@ -1934,8 +1933,9 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s)
 #define CP_TX_LS (1<<28)
 /* large send packet flag */
 #define CP_TX_LGSEN (1<<27)
-/* large send MSS mask, bits 16...25 */
-#define CP_TC_LGSEN_MSS_MASK ((1 << 12) - 1)
+/* large send MSS mask, bits 16...26 */
+#define CP_TC_LGSEN_MSS_SHIFT 16
+#define CP_TC_LGSEN_MSS_MASK ((1 << 11) - 1)
 
 /* IP checksum offload flag */
 #define CP_TX_IPCS (1<<18)
@@ -2027,6 +2027,9 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s)
 s->currCPlusTxDesc = 0;
 }
 
+/* store unaltered txdw0 for later use in MSS calculation*/
+uint32_t txdw0_save = txdw0;
+
 /* transfer ownership to target */
 txdw0 &= ~CP_TX_OWN;
 
@@ -2149,10 +2152,12 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s)
 goto skip_offload;
 }
 
-int large_send_mss = (txdw0 >> 16) & CP_TC_LGSEN_MSS_MASK;
+/* set large_send_mss from txdw0 before overlapping mss fields 
were cleared */
+int large_send_mss = ((txdw0_save >> CP_TC_LGSEN_MSS_SHIFT) &
+CP_TC_LGSEN_MSS_MASK) + 12;
 
-DPRINTF("+++ C+ mode offloaded task TSO MTU=%d IP data %d "
-"frame data %d specified MSS=%d\n", ETH_MTU,
+DPRINTF("+++ C+ mode offloaded task TSO IP data %d "
+"frame data %d specified MSS=%d\n",
 ip_data_len, saved_size - ETH_HLEN, large_send_mss);
 
 int tcp_send_offset = 0;
@@ -2177,9 +2182,13 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s)
 goto skip_offload;
 }
 
-/* ETH_MTU = ip header len + tcp header len + payload */
+/* MSS too small? Min MSS = 536 */
+if (tcp_hlen + hlen >= large_send_mss || 535 >= 
large_send_mss) {
+goto skip_offload;
+}
+
 int tcp_data_len = ip_data_len - tcp_hlen;
-int tcp_chunk_size = ETH_MTU - hlen - tcp_hlen;
+int tcp_chunk_size = large_send_mss - hlen - tcp_hlen;
 
 DPRINTF("+++ C+ mode TSO IP data len %d TCP hlen %d TCP "
 "data len %d TCP chunk size %d\n", ip_data_len,



Some examples (with additional DPRINT capturing txdw0/MSS at various places; 
txdw0_0=ll1923, txdw0_4=ll2029, txdw0_5=ll2039, 

Re: [PATCH for-7.2] rtl8139: honor large send MSS value

2022-11-16 Thread Stefan Hajnoczi
I have sent a v2 with a fixed MSS mask constant but haven't tested it.

Thanks,
Stefan



RE: [PATCH for-7.2] rtl8139: honor large send MSS value

2022-11-16 Thread Tobias Fiebig
Heho,
Quick follow-up; Applied the change you suggested, but there are still some 
things to test.

While this now works (mostly), MSS values are still off; Especially the 
behavior below <=1036 is difficult, as for v4 the minimum MTU is 576 and 
minimum MSS is 536:

RequestedDPRINT
1320  1292
1319  1291
1100  1024
1036  1024
1035271
1000268

So, I guess there is something else amiss; Will test more in-depth later 
tonight and shift the bits a bit; Also, I will look into behavior when forcing 
>1500 MTUs (in case that works with the rtl8139).

Sidenote: This all seems to be a non-issue for v6, as the RTL8139 does not 
support TSO for v6, so at least one thing less to worry about. ;-)

With best regards,
Tobias




RE: [PATCH for-7.2] rtl8139: honor large send MSS value

2022-11-16 Thread Tobias Fiebig
Heho,
> Ok, I think I found at least one issue:
> 
> /* large send MSS mask, bits 16...25 */
> #define CP_TC_LGSEN_MSS_MASK ((1 << 12) - 1)
> 
> First, MSS occupies 11 bits from 16 to 26 Second, the mask is wrong it should 
> be ((1 << 11) - 1)
Awesome, thanks, will give this a shot later on and let you know.

With best regards,
Tobias




Re: [PATCH for-7.2] rtl8139: honor large send MSS value

2022-11-15 Thread Jason Wang
On Wed, Nov 16, 2022 at 10:59 AM Tobias Fiebig  wrote:
>
> Heho,
> I just tested around with the patch;
> Good news: Certainly my builds are being executed. Also, if I patch the old 
> code to have a MAX_MTU <= the max MTU on my path, throughput is ok.
>
> Bad news: Something is wrong with getting the MSS in the patch you shared. 
> When enabling DPRINT, values are off (sent MSS vs. printed MSS):
> 600 2060
> 800 2308
> 1000 2316
> 1023 2307
> 1200 3076
> 1400 3340 (most likely clamped to 1320)
>
> Fiddling around a bit more, I found txdw0 printed earlier in the stack as hex 
> (sent MSS, txdw0):
> 769 900502f5
> 1000 900503dc
> 1280 900504f4
> 1281 900504f5
> 1301 90050509
> 1317 90050519
> 1320 9005051c
>
> This maps rather well to:
> MSS = txdw0 - 2416246772
> MSS = txdw0 - 9004FFF4
>
> Sadly, my C is 'non-existent' and it is kind-of 4AM, so also not in the 
> brainspace to fill those gaps. But if one of you could look at the patch 
> again, that would be nice. Otherwise, I should have some brainspace for this 
> tomorrow night (UTC) again.
>

Ok, I think I found at least one issue:

/* large send MSS mask, bits 16...25 */
#define CP_TC_LGSEN_MSS_MASK ((1 << 12) - 1)

First, MSS occupies 11 bits from 16 to 26
Second, the mask is wrong it should be ((1 << 11) - 1)

Thanks

> With best regards,
> Tobias
>




Re: [PATCH for-7.2] rtl8139: honor large send MSS value

2022-11-15 Thread Jason Wang
On Wed, Nov 16, 2022 at 7:43 AM Tobias Fiebig  wrote:
>
> Heho,
> Just to keep you in the loop; Just applied the patch, but things didn't 
> really get better; I am currently doing a 'make clean; make' for good measure 
> (had built head first), and will also double-check that there is no 
> accidental use of system-qemu libs.
>
> If that still doesn't show an effect, I'll hold tcpdump to the wire again.
>
> With best regards,
> Tobias

It might be also helpful to dump mss saw by Qemu to see if it really
changes or differs a lot from 1500.

Thanks

>
> -Original Message-
> From: Stefan Hajnoczi 
> Sent: Tuesday, 15 November 2022 17:37
> To: qemu-devel@nongnu.org
> Cc: jasow...@redhat.com; qemu-sta...@nongnu.org; Stefan Hajnoczi 
> ; Russell King - ARM Linux ; 
> Tobias Fiebig 
> Subject: [PATCH for-7.2] rtl8139: honor large send MSS value
>
> The Large-Send Task Offload Tx Descriptor (9.2.1 Transmit) has a Large-Send 
> MSS value where the driver specifies the MSS. See the datasheet here:
> http://realtek.info/pdf/rtl8139cp.pdf
>
> The code ignores this value and uses a hardcoded MSS of 1500 bytes instead. 
> When the MTU is less than 1500 bytes the hardcoded value results in IP 
> fragmentation and poor performance.
>
> Use the Large-Send MSS value to correctly size Large-Send packets.
>
> This issue was discussed in the past here:
> https://lore.kernel.org/all/20161114162505.GD26664@stefanha-x1.localdomain/
>
> Reported-by: Russell King - ARM Linux 
> Reported-by: Tobias Fiebig 
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1312
> Signed-off-by: Stefan Hajnoczi 
> ---
>  hw/net/rtl8139.c | 13 -
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> Tobias: Please test this fix. Thanks!
>
> diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c index e6643e3c9d..ecc4dcb07f 
> 100644
> --- a/hw/net/rtl8139.c
> +++ b/hw/net/rtl8139.c
> @@ -77,7 +77,6 @@
>  ( ( input ) & ( size - 1 )  )
>
>  #define ETHER_TYPE_LEN 2
> -#define ETH_MTU 1500
>
>  #define VLAN_TCI_LEN 2
>  #define VLAN_HLEN (ETHER_TYPE_LEN + VLAN_TCI_LEN) @@ -2151,8 +2150,8 @@ 
> static int rtl8139_cplus_transmit_one(RTL8139State *s)
>
>  int large_send_mss = (txdw0 >> 16) & CP_TC_LGSEN_MSS_MASK;
>
> -DPRINTF("+++ C+ mode offloaded task TSO MTU=%d IP data %d "
> -"frame data %d specified MSS=%d\n", ETH_MTU,
> +DPRINTF("+++ C+ mode offloaded task TSO IP data %d "
> +"frame data %d specified MSS=%d\n",
>  ip_data_len, saved_size - ETH_HLEN, large_send_mss);
>
>  int tcp_send_offset = 0; @@ -2177,9 +2176,13 @@ static int 
> rtl8139_cplus_transmit_one(RTL8139State *s)
>  goto skip_offload;
>  }
>
> -/* ETH_MTU = ip header len + tcp header len + payload */
> +/* MSS too small? */
> +if (tcp_hlen + hlen >= large_send_mss) {
> +goto skip_offload;
> +}
> +
>  int tcp_data_len = ip_data_len - tcp_hlen;
> -int tcp_chunk_size = ETH_MTU - hlen - tcp_hlen;
> +int tcp_chunk_size = large_send_mss - hlen - tcp_hlen;
>
>  DPRINTF("+++ C+ mode TSO IP data len %d TCP hlen %d TCP "
>  "data len %d TCP chunk size %d\n", ip_data_len,
> --
> 2.38.1
>
>




Re: [PATCH for-7.2] rtl8139: honor large send MSS value

2022-11-15 Thread Jason Wang
On Wed, Nov 16, 2022 at 12:37 AM Stefan Hajnoczi  wrote:
>
> The Large-Send Task Offload Tx Descriptor (9.2.1 Transmit) has a
> Large-Send MSS value where the driver specifies the MSS. See the
> datasheet here:
> http://realtek.info/pdf/rtl8139cp.pdf
>
> The code ignores this value and uses a hardcoded MSS of 1500 bytes
> instead. When the MTU is less than 1500 bytes the hardcoded value
> results in IP fragmentation and poor performance.
>
> Use the Large-Send MSS value to correctly size Large-Send packets.
>
> This issue was discussed in the past here:
> https://lore.kernel.org/all/20161114162505.GD26664@stefanha-x1.localdomain/
>
> Reported-by: Russell King - ARM Linux 
> Reported-by: Tobias Fiebig 
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1312
> Signed-off-by: Stefan Hajnoczi 

Acked-by: Jason Wang 

Thanks

> ---
>  hw/net/rtl8139.c | 13 -
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> Tobias: Please test this fix. Thanks!
>
> diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
> index e6643e3c9d..ecc4dcb07f 100644
> --- a/hw/net/rtl8139.c
> +++ b/hw/net/rtl8139.c
> @@ -77,7 +77,6 @@
>  ( ( input ) & ( size - 1 )  )
>
>  #define ETHER_TYPE_LEN 2
> -#define ETH_MTU 1500
>
>  #define VLAN_TCI_LEN 2
>  #define VLAN_HLEN (ETHER_TYPE_LEN + VLAN_TCI_LEN)
> @@ -2151,8 +2150,8 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s)
>
>  int large_send_mss = (txdw0 >> 16) & CP_TC_LGSEN_MSS_MASK;
>
> -DPRINTF("+++ C+ mode offloaded task TSO MTU=%d IP data %d "
> -"frame data %d specified MSS=%d\n", ETH_MTU,
> +DPRINTF("+++ C+ mode offloaded task TSO IP data %d "
> +"frame data %d specified MSS=%d\n",
>  ip_data_len, saved_size - ETH_HLEN, large_send_mss);
>
>  int tcp_send_offset = 0;
> @@ -2177,9 +2176,13 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s)
>  goto skip_offload;
>  }
>
> -/* ETH_MTU = ip header len + tcp header len + payload */
> +/* MSS too small? */
> +if (tcp_hlen + hlen >= large_send_mss) {
> +goto skip_offload;
> +}
> +
>  int tcp_data_len = ip_data_len - tcp_hlen;
> -int tcp_chunk_size = ETH_MTU - hlen - tcp_hlen;
> +int tcp_chunk_size = large_send_mss - hlen - tcp_hlen;
>
>  DPRINTF("+++ C+ mode TSO IP data len %d TCP hlen %d TCP "
>  "data len %d TCP chunk size %d\n", ip_data_len,
> --
> 2.38.1
>




RE: [PATCH for-7.2] rtl8139: honor large send MSS value

2022-11-15 Thread Tobias Fiebig
Heho,
I just tested around with the patch;
Good news: Certainly my builds are being executed. Also, if I patch the old 
code to have a MAX_MTU <= the max MTU on my path, throughput is ok.

Bad news: Something is wrong with getting the MSS in the patch you shared. When 
enabling DPRINT, values are off (sent MSS vs. printed MSS):
600 2060
800 2308
1000 2316
1023 2307
1200 3076
1400 3340 (most likely clamped to 1320)

Fiddling around a bit more, I found txdw0 printed earlier in the stack as hex 
(sent MSS, txdw0):
769 900502f5
1000 900503dc
1280 900504f4
1281 900504f5
1301 90050509
1317 90050519
1320 9005051c

This maps rather well to:
MSS = txdw0 - 2416246772 
MSS = txdw0 - 9004FFF4

Sadly, my C is 'non-existent' and it is kind-of 4AM, so also not in the 
brainspace to fill those gaps. But if one of you could look at the patch again, 
that would be nice. Otherwise, I should have some brainspace for this tomorrow 
night (UTC) again.

With best regards,
Tobias




RE: [PATCH for-7.2] rtl8139: honor large send MSS value

2022-11-15 Thread Tobias Fiebig
Heho,
Just to keep you in the loop; Just applied the patch, but things didn't really 
get better; I am currently doing a 'make clean; make' for good measure (had 
built head first), and will also double-check that there is no accidental use 
of system-qemu libs. 

If that still doesn't show an effect, I'll hold tcpdump to the wire again.

With best regards,
Tobias 

-Original Message-
From: Stefan Hajnoczi  
Sent: Tuesday, 15 November 2022 17:37
To: qemu-devel@nongnu.org
Cc: jasow...@redhat.com; qemu-sta...@nongnu.org; Stefan Hajnoczi 
; Russell King - ARM Linux ; Tobias 
Fiebig 
Subject: [PATCH for-7.2] rtl8139: honor large send MSS value

The Large-Send Task Offload Tx Descriptor (9.2.1 Transmit) has a Large-Send MSS 
value where the driver specifies the MSS. See the datasheet here:
http://realtek.info/pdf/rtl8139cp.pdf

The code ignores this value and uses a hardcoded MSS of 1500 bytes instead. 
When the MTU is less than 1500 bytes the hardcoded value results in IP 
fragmentation and poor performance.

Use the Large-Send MSS value to correctly size Large-Send packets.

This issue was discussed in the past here:
https://lore.kernel.org/all/20161114162505.GD26664@stefanha-x1.localdomain/

Reported-by: Russell King - ARM Linux 
Reported-by: Tobias Fiebig 
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1312
Signed-off-by: Stefan Hajnoczi 
---
 hw/net/rtl8139.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

Tobias: Please test this fix. Thanks!

diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c index e6643e3c9d..ecc4dcb07f 
100644
--- a/hw/net/rtl8139.c
+++ b/hw/net/rtl8139.c
@@ -77,7 +77,6 @@
 ( ( input ) & ( size - 1 )  )
 
 #define ETHER_TYPE_LEN 2
-#define ETH_MTU 1500
 
 #define VLAN_TCI_LEN 2
 #define VLAN_HLEN (ETHER_TYPE_LEN + VLAN_TCI_LEN) @@ -2151,8 +2150,8 @@ static 
int rtl8139_cplus_transmit_one(RTL8139State *s)
 
 int large_send_mss = (txdw0 >> 16) & CP_TC_LGSEN_MSS_MASK;
 
-DPRINTF("+++ C+ mode offloaded task TSO MTU=%d IP data %d "
-"frame data %d specified MSS=%d\n", ETH_MTU,
+DPRINTF("+++ C+ mode offloaded task TSO IP data %d "
+"frame data %d specified MSS=%d\n",
 ip_data_len, saved_size - ETH_HLEN, large_send_mss);
 
 int tcp_send_offset = 0; @@ -2177,9 +2176,13 @@ static int 
rtl8139_cplus_transmit_one(RTL8139State *s)
 goto skip_offload;
 }
 
-/* ETH_MTU = ip header len + tcp header len + payload */
+/* MSS too small? */
+if (tcp_hlen + hlen >= large_send_mss) {
+goto skip_offload;
+}
+
 int tcp_data_len = ip_data_len - tcp_hlen;
-int tcp_chunk_size = ETH_MTU - hlen - tcp_hlen;
+int tcp_chunk_size = large_send_mss - hlen - tcp_hlen;
 
 DPRINTF("+++ C+ mode TSO IP data len %d TCP hlen %d TCP "
 "data len %d TCP chunk size %d\n", ip_data_len,
--
2.38.1