Re: CVS commit: src/share/man/man4

2020-04-09 Thread Jason Thorpe



> On Apr 9, 2020, at 7:19 PM, SAITOH Masanobu  wrote:
> 
> You're welcome.
> Some drivers still have no m_defrag() code, so we should add it
> to them.

Others do something different from m_defrag() to essentially the same effect.
Personally, I am not a huge fan of the m_defrag() API.

-- thorpej



Re: CVS commit: src/share/man/man4

2020-04-09 Thread SAITOH Masanobu

On 2020/04/10 2:42, David Young wrote:

On Thu, Apr 09, 2020 at 03:25:32PM +0900, SAITOH Masanobu wrote:

On 2020/04/09 11:08, David Young wrote:

On Wed, Apr 08, 2020 at 11:01:52PM +0000, Jaromir Dolecek wrote:

on I219 I observe about 35% transmit performance drop when tso4 enabled


This sounds familiar.  There was a bug affecting TCP segmentation
offload (I think) that we found at CoyotePoint.  ISTR
bus_dmamap_load_mbuf(9) failed with EFBIG because under some
circumstances the number of segments in the DMA map was too small
for the mbuf chain.  The driver would drop the whole mbuf chain
on the floor.  This showed up as terrible performance under some
circumstances---possibly when the TCP window grew long?  The solution
was to increase the number of DMA segments, *I think*.


m_defrag() was added to -current in September 2018, and 9.0,
8.1, post 7.2 have this code.


Thank you, that's just the change I was thinking of.


You're welcome.
Some drivers still have no m_defrag() code, so we should add it
to them.


Dave




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: CVS commit: src/share/man/man4

2020-04-09 Thread David Young
On Thu, Apr 09, 2020 at 03:25:32PM +0900, SAITOH Masanobu wrote:
> On 2020/04/09 11:08, David Young wrote:
> > On Wed, Apr 08, 2020 at 11:01:52PM +0000, Jaromir Dolecek wrote:
> > > on I219 I observe about 35% transmit performance drop when tso4 enabled
> > 
> > This sounds familiar.  There was a bug affecting TCP segmentation
> > offload (I think) that we found at CoyotePoint.  ISTR
> > bus_dmamap_load_mbuf(9) failed with EFBIG because under some
> > circumstances the number of segments in the DMA map was too small
> > for the mbuf chain.  The driver would drop the whole mbuf chain
> > on the floor.  This showed up as terrible performance under some
> > circumstances---possibly when the TCP window grew long?  The solution
> > was to increase the number of DMA segments, *I think*.
> 
> m_defrag() was added to -current in September 2018, and 9.0,
> 8.1, post 7.2 have this code.

Thank you, that's just the change I was thinking of.

Dave

-- 
David Young
dyo...@pobox.com    Urbana, IL    (217) 721-9981


Re: CVS commit: src/sys

2020-04-09 Thread Joerg Sonnenberger
On Thu, Apr 09, 2020 at 09:52:37PM +0900, Tetsuya Isaki wrote:
> At Fri, 3 Apr 2020 17:51:21 +0200,
> Joerg Sonnenberger wrote:
> > It seems perfectly
> > sensible to me that the final output device can provide a lower limit as
> > well as having one derived from HZ and using whatever is higher.
> 
> Sorry, I could not translate well and I didn't understand.
> Could you write that in another way?

There are two possible reasons for a lower limit for the buffer size:
(1) The device requires a certain amount.
(2) The system wants to ensure the interrupt rate doesn't go over a
certain rate.

The real value shouldn't be a constant, but the maximum of the two?
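
To make the "maximum of the two" idea concrete, here is a minimal sketch in
C. It is not the audio(4) code: the function names, the millisecond units and
the max_intr_hz parameter are all assumptions chosen for illustration.

```c
/*
 * Hypothetical sketch: the names and units here are assumptions,
 * not taken from the NetBSD audio code.
 */

/* Smallest block length (ms) that keeps interrupts at or below max_intr_hz. */
static unsigned int
sys_min_blk_ms(unsigned int max_intr_hz)
{
	return (1000 + max_intr_hz - 1) / max_intr_hz;	/* round up */
}

/* Effective lower limit: whichever of the two constraints is stricter. */
static unsigned int
blk_lower_limit_ms(unsigned int dev_min_ms, unsigned int max_intr_hz)
{
	unsigned int sys_min = sys_min_blk_ms(max_intr_hz);

	return dev_min_ms > sys_min ? dev_min_ms : sys_min;
}
```

With an interrupt cap of 100 per second, a device that needs at least 4 ms
ends up with a 10 ms limit, while a device that needs 40 ms keeps its own
40 ms limit.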

Joerg


Re: CVS commit: src/sys/arch/xen/xen

2020-04-09 Thread Frank Kardel

Hi,

I am not sure whether it is the commit below, but 2 out of 4 times my
Xen DomU from today (20200409/9.99.55) panics with the following
locking botch:

[  29.9301379] panic: kernel diagnostic assertion "IFNET_LOCKED(ifp)" failed: file "/usr/src/sys/arch/xen/xen/if_xennet_xenbus.c", line 1120

[  29.9301379] cpu2: Begin traceback...
[  29.9301379] vpanic() at netbsd:vpanic+0x146
[  29.9301379] kern_assert() at netbsd:kern_assert+0x48
[  29.9301379] xennet_ioctl() at netbsd:xennet_ioctl+0x6d
[  29.9301379] if_mcast_op() at netbsd:if_mcast_op+0x6a
[  29.9301379] in6_addmulti() at netbsd:in6_addmulti+0x153
[  29.9301379] in6_joingroup() at netbsd:in6_joingroup+0x45
[  29.9301379] ip6_ctloutput() at netbsd:ip6_ctloutput+0x141c
[  29.9301379] udp6_ctloutput() at netbsd:udp6_ctloutput+0xa2
[  29.9301379] udp6_ctloutput_wrapper() at netbsd:udp6_ctloutput_wrapper+0x2c
[  29.9301379] sosetopt() at netbsd:sosetopt+0x5c
[  29.9301379] sys_setsockopt() at netbsd:sys_setsockopt+0x8e
[  29.9301379] syscall() at netbsd:syscall+0x9c
[  29.9301379] --- syscall (number 105) ---
[  29.9301379] 75d934d3469a:
[  29.9301379] cpu2: End traceback...
[  29.9301379] rebooting...

Best regards,

  Frank


On 04/06/20 20:23, Jaromir Dolecek wrote:

Module Name:    src
Committed By:   jdolecek
Date:           Mon Apr  6 18:23:21 UTC 2020

Modified Files:
        src/sys/arch/xen/xen: if_xennet_xenbus.c

Log Message:
convert to IFEF_MPSAFE, also enable interrupt handler without biglock

no performance difference observed compared to version before change,
for neither UP nor MP DomU


To generate a diff of this commit:
cvs rdiff -u -r1.105 -r1.106 src/sys/arch/xen/xen/if_xennet_xenbus.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.





Re: CVS commit: src/sys

2020-04-09 Thread Tetsuya Isaki
At Fri, 3 Apr 2020 17:51:21 +0200,
Joerg Sonnenberger wrote:
> It seems perfectly
> sensible to me that the final output device can provide a lower limit as
> well as having one derived from HZ and using whatever is higher.

Sorry, I could not translate well and I didn't understand.
Could you write that in another way?

Thanks,
---
Tetsuya Isaki 


Re: CVS commit: src/share/man/man4

2020-04-09 Thread SAITOH Masanobu

Hi.

On 2020/04/09 11:08, David Young wrote:

On Wed, Apr 08, 2020 at 11:01:52PM +0000, Jaromir Dolecek wrote:

Module Name:    src
Committed By:   jdolecek
Date:           Wed Apr  8 23:01:52 UTC 2020

Modified Files:
        src/share/man/man4: wm.4

Log Message:
add a warning in checksum offload that hardware TCP segmentation might be
slow

on I219 I observe about 35% transmit performance drop when tso4 enabled


This sounds familiar.  There was a bug affecting TCP segmentation
offload (I think) that we found at CoyotePoint.  ISTR
bus_dmamap_load_mbuf(9) failed with EFBIG because under some
circumstances the number of segments in the DMA map was too small
for the mbuf chain.  The driver would drop the whole mbuf chain
on the floor.  This showed up as terrible performance under some
circumstances---possibly when the TCP window grew long?  The solution
was to increase the number of DMA segments, *I think*.

I don't think CoyotePoint ever fed its change back to NetBSD,
unfortunately.  On the other hand, some other NetBSDer may have
independently fixed the bug.

Do any stats increase (vmstat -e, ifconfig -v wm0) when the poor
performance occurs? You may have to enable WM_DEBUG or something to see
all of the relevant stats.


m_defrag() was added to -current in September 2018, and 9.0,
8.1, post 7.2 have this code.


http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/dev/pci/if_wm.c.diff?r1=1.586=1.587=date=h

The driver has a wmX txqYYtoomanyseg event counter, so we can check
whether this happens by building the kernel with
"options WM_EVENT_COUNTERS". The counter is disabled by default.
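
For readers following along, the failure and the retry discussed above can be
modelled in a few lines of userland C. This is only a sketch under stated
assumptions, not the if_wm.c code: model_load() and model_defrag() are
hypothetical stand-ins for bus_dmamap_load_mbuf(9) and m_defrag(9), and the
segment limit and cluster size are made-up values.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_MAXSEGS	4	/* assumed DMA map segment limit */
#define MODEL_CLUSTER	2048	/* assumed bytes per buffer after compaction */

struct model_chain {
	size_t nfrags;		/* buffers in the chain */
	size_t len;		/* total payload bytes */
};

struct model_stats {
	unsigned long toomanyseg;	/* first load attempt hit EFBIG */
	unsigned long dropped;		/* chain dropped after the retry failed */
};

/* Stand-in for bus_dmamap_load_mbuf(9): one DMA segment per buffer. */
static int
model_load(const struct model_chain *c)
{
	return c->nfrags > MODEL_MAXSEGS ? EFBIG : 0;
}

/* Stand-in for m_defrag(9): repack the payload into as few buffers as possible. */
static void
model_defrag(struct model_chain *c)
{
	c->nfrags = (c->len + MODEL_CLUSTER - 1) / MODEL_CLUSTER;
}

/* Returns 0 if the chain can be transmitted, or an errno value if it is dropped. */
static int
model_tx(struct model_chain *c, struct model_stats *st)
{
	int error = model_load(c);

	if (error == EFBIG) {
		st->toomanyseg++;	/* count it, then compact and retry once */
		model_defrag(c);
		error = model_load(c);
	}
	if (error != 0)
		st->dropped++;
	return error;
}
```

In this model a 6000-byte chain spread over 9 buffers fails the first load,
is compacted to 3 buffers and then succeeds. A driver without the retry step
would simply drop it, which is the behaviour described earlier in the thread.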


Dave




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)