Re: ixg wierdness

2021-12-21 Thread SAITOH Masanobu

On 2021/12/22 9:38, SAITOH Masanobu wrote:

Hi.

On 2021/12/21 19:53, Patrick Welche wrote:

On a box with 4 bnx and 4 ixg interfaces, I just hit PR kern/53155
when trying to use bnx1. (Built LOCKDEBUG etc kernel, with serial
console. Hang such that ~# doesn't drop into ddb) No problems
running as an NFS server for a year or two just using bnx0.
(I didn't try "up"ing bnx2)

So I tried swapping to use ixg0 and ixg1 instead.

I see a strange bursty pattern with what looks like a 1s count down, e.g.:

64 bytes from 10.0.0.236: icmp_seq=642 ttl=255 time=37004.721972 ms
64 bytes from 10.0.0.236: icmp_seq=643 ttl=255 time=36004.533428 ms
64 bytes from 10.0.0.236: icmp_seq=644 ttl=255 time=35004.224479 ms
64 bytes from 10.0.0.236: icmp_seq=645 ttl=255 time=34003.925027 ms
64 bytes from 10.0.0.236: icmp_seq=646 ttl=255 time=33003.615239 ms
64 bytes from 10.0.0.236: icmp_seq=647 ttl=255 time=32003.313832 ms
64 bytes from 10.0.0.236: icmp_seq=648 ttl=255 time=31003.008233 ms
64 bytes from 10.0.0.236: icmp_seq=649 ttl=255 time=30002.702356 ms
64 bytes from 10.0.0.236: icmp_seq=650 ttl=255 time=29002.396480 ms
64 bytes from 10.0.0.236: icmp_seq=651 ttl=255 time=28002.090882 ms
64 bytes from 10.0.0.236: icmp_seq=652 ttl=255 time=27001.772992 ms
64 bytes from 10.0.0.236: icmp_seq=653 ttl=255 time=26001.477731 ms
64 bytes from 10.0.0.236: icmp_seq=654 ttl=255 time=25001.291421 ms
64 bytes from 10.0.0.236: icmp_seq=655 ttl=255 time=24000.965150 ms
64 bytes from 10.0.0.236: icmp_seq=656 ttl=255 time=23000.622398 ms
64 bytes from 10.0.0.236: icmp_seq=657 ttl=255 time=22000.278807 ms
64 bytes from 10.0.0.236: icmp_seq=658 ttl=255 time=20999.931305 ms
64 bytes from 10.0.0.236: icmp_seq=659 ttl=255 time=1.592463 ms
64 bytes from 10.0.0.236: icmp_seq=660 ttl=255 time=19009.253137 ms
64 bytes from 10.0.0.236: icmp_seq=661 ttl=255 time=18008.910105 ms
64 bytes from 10.0.0.236: icmp_seq=662 ttl=255 time=17008.551987 ms
64 bytes from 10.0.0.236: icmp_seq=663 ttl=255 time=16008.224040 ms
64 bytes from 10.0.0.236: icmp_seq=664 ttl=255 time=15007.874862 ms
64 bytes from 10.0.0.236: icmp_seq=665 ttl=255 time=14007.533506 ms
64 bytes from 10.0.0.236: icmp_seq=666 ttl=255 time=13007.194943 ms
64 bytes from 10.0.0.236: icmp_seq=667 ttl=255 time=12006.852469 ms
64 bytes from 10.0.0.236: icmp_seq=668 ttl=255 time=11006.509437 ms
64 bytes from 10.0.0.236: icmp_seq=669 ttl=255 time=10006.193223 ms
64 bytes from 10.0.0.236: icmp_seq=670 ttl=255 time=9005.846559 ms
64 bytes from 10.0.0.236: icmp_seq=671 ttl=255 time=8005.508556 ms
64 bytes from 10.0.0.236: icmp_seq=672 ttl=255 time=7005.165803 ms
64 bytes from 10.0.0.236: icmp_seq=673 ttl=255 time=6004.818579 ms
64 bytes from 10.0.0.236: icmp_seq=674 ttl=255 time=5004.479458 ms
64 bytes from 10.0.0.236: icmp_seq=675 ttl=255 time=4004.132514 ms
64 bytes from 10.0.0.236: icmp_seq=676 ttl=255 time=3003.794232 ms
64 bytes from 10.0.0.236: icmp_seq=677 ttl=255 time=2003.431084 ms
64 bytes from 10.0.0.236: icmp_seq=678 ttl=255 time=1003.103697 ms
64 bytes from 10.0.0.236: icmp_seq=679 ttl=255 time=2.761223 ms
64 bytes from 10.0.0.236: icmp_seq=717 ttl=255 time=6373.442427 ms
64 bytes from 10.0.0.236: icmp_seq=718 ttl=255 time=5373.238237 ms
64 bytes from 10.0.0.236: icmp_seq=719 ttl=255 time=4372.937388 ms
64 bytes from 10.0.0.236: icmp_seq=720 ttl=255 time=3372.631791 ms
64 bytes from 10.0.0.236: icmp_seq=721 ttl=255 time=2372.325913 ms
64 bytes from 10.0.0.236: icmp_seq=722 ttl=255 time=1372.006627 ms
64 bytes from 10.0.0.236: icmp_seq=723 ttl=255 time=371.714159 ms
ping: sendto: Host is down
ping: sendto: Host is down
ping: sendto: Host is down
ping: sendto: Host is down
ping: sendto: Host is down
...
then eventually wakes up again

when pinging to its ixg0 interface.

You see the bursts while running tcpdump -ni ixg0.
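
A note on reading the trace above: the reported times drop by almost exactly 1000 ms per sequence number, which is what ping shows when a batch of queued replies is released at one instant. The following is a minimal sketch in C, assuming ping's default one-second send interval (the interval is not stated in the message) and using a hypothetical helper name. It adds each probe's send time to its reported round-trip time and measures how far apart the resulting arrival times are:

```c
#include <assert.h>
#include <stddef.h>

/* One line of ping output: sequence number and reported RTT in ms. */
struct ping_sample {
	unsigned seq;
	double rtt_ms;
};

/* Samples copied from the trace above. */
static const struct ping_sample samples[] = {
	{ 642, 37004.721972 },
	{ 650, 29002.396480 },
	{ 658, 20999.931305 },
	{ 670,  9005.846559 },
	{ 678,  1003.103697 },
	{ 679,     2.761223 },
};

/*
 * Hypothetical helper: return the spread (max - min) of the estimated
 * arrival times, assuming probe N was sent at N * interval_ms.
 */
static double
arrival_spread_ms(const struct ping_sample *s, size_t n, double interval_ms)
{
	double min, max;
	size_t i;

	assert(n > 0);
	min = max = s[0].seq * interval_ms + s[0].rtt_ms;
	for (i = 1; i < n; i++) {
		double t = s[i].seq * interval_ms + s[i].rtt_ms;

		if (t < min)
			min = t;
		if (t > max)
			max = t;
	}
	return max - min;
}
```

For these six samples the spread is under 6 ms although the probes were sent over 37 seconds, which suggests that the replies for sequence numbers 642 to 679 reached ping at about the same moment.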

ixg0 at pci8 dev 0 function 0: Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 4.0.1-k
ixg0: device 82599EB
ixg0: ETrackID 81a5
ixg0: autoconfiguration error: failed to allocate MSI-X interrupt
ixg0: interrupting at ioapic1 pin 22
ixg0: Ethernet address 00:1b:21:9a:d4:84
ixg0: PHY: OUI 0x0014a6 model 0x0001, rev. 0
ixg0: PCI Express Bus: Speed 5.0GT/s Width x8
ixg0: feature cap 0x1780
ixg0: feature ena 0x1000

ixg0: flags=0x8843 mtu 1500
 capabilities=0x7ff80
 enabled=0
 ec_capabilities=0xf
 ec_enabled=0x7
 address: 00:1b:21:9a:d4:84
 media: Ethernet autoselect (1000baseT full-duplex)
 status: active
 inet6 fe80::21b:21ff:fe9a:d484%ixg0/64 flags 0 scopeid 0x5
 inet 10.0.0.236/24 broadcast 10.0.0.255 flags 0

This is with 15 December 2021 -current/amd64.

Any ideas on what might be going on?


Cheers,

Patrick


One possible cause of the problem is a shortage of mbuf clusters.
Could you show me the output of netstat -m?


% netstat -m
9222 mbufs in use:
    9217 mbufs allocated to data
    3 mbufs allocated [...] to socket names and addresses
0 calls to protocol drain routines


If the last line's number is not 0, increase kern.mbuf.nmbclusters.


% vmstat -ev | grep "Rx no mbuf"
ixg0 q0 Rx no mbuf        0    0 misc
ixg0 q1 Rx no mbuf        0    0 misc
ixg0 q2 Rx no mbuf        0    0 misc
ixg0 q3 Rx no mbuf        0    0 misc


When MCLGET() fails, the above event counter is also incremented.


ixg0: autoconfiguration error: failed to allocate MSI-X interrupt
ixg0: interrupting at ioapic1 pin 22
ixg0: Ethernet address 00:1b:21:9a:d4:84
ixg0: PHY: OUI 0x0014a6 model 0x0001, rev. 0
ixg0: PCI Express Bus: Speed 5.0GT/s Width x8
ixg0: feature cap 0x1780
ixg0: feature ena 0x1000


The MSI-X vector allocation failed, so the driver is using INTx.
Could you show me the full dmesg?

And, please show me the output of "vmstat -ev | grep ixg".
I will take a look at it.


--
-------
    SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: Unable to send packets with ixg(4) driver and NetBSD-9_stable

2020-09-06 Thread SAITOH Masanobu

Hi.

On 2020/09/05 4:54, Brian Buhrow wrote:

hello.  I'm trying to get a 10G interface working on a
NetBSD-9.0_stable/amd64 machine.  I'm able to receive packets on this
interface, but appear to be unable to send packets, though the driver
doesn't report any errors.  Is this a known issue?  Version and driver
details below.  This is with Netbsd-9 CVS sources as of 09/03/2020.

Ideas on how to go about troubleshooting this would be greatly
appreciated.
-thanks
-Brian


NetBSD 9.0_STABLE (GENERIC) #0: Fri Sep  4 09:06:40 PDT 2020

buh...@nat-1.via.net:/usr/local/netbsd/obj-64/sys/arch/amd64/compile/GENERIC
total memory = 32759 MB
avail memory = 31784 MB

. . .

ixg0 at pci4 dev 0 function 0: Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 4.0.1-k
ixg0: device 82599EB
ixg0: ETrackID 86c5
ixg0: for TX/RX, interrupting at msix2 vec 0, bound queue 0 to cpu 0
ixg0: for TX/RX, interrupting at msix2 vec 1, bound queue 1 to cpu 1
ixg0: for TX/RX, interrupting at msix2 vec 2, bound queue 2 to cpu 2
ixg0: for TX/RX, interrupting at msix2 vec 3, bound queue 3 to cpu 3
ixg0: for TX/RX, interrupting at msix2 vec 4, bound queue 4 to cpu 4
ixg0: for TX/RX, interrupting at msix2 vec 5, bound queue 5 to cpu 5
ixg0: for TX/RX, interrupting at msix2 vec 6, bound queue 6 to cpu 6
ixg0: for TX/RX, interrupting at msix2 vec 7, bound queue 7 to cpu 7
ixg0: for TX/RX, interrupting at msix2 vec 8, bound queue 8 to cpu 8
ixg0: for TX/RX, interrupting at msix2 vec 9, bound queue 9 to cpu 9
ixg0: for TX/RX, interrupting at msix2 vec 10, bound queue 10 to cpu 10
ixg0: for TX/RX, interrupting at msix2 vec 11, bound queue 11 to cpu 11
ixg0: for TX/RX, interrupting at msix2 vec 12, bound queue 12 to cpu 12
ixg0: for TX/RX, interrupting at msix2 vec 13, bound queue 13 to cpu 13
ixg0: for TX/RX, interrupting at msix2 vec 14, bound queue 14 to cpu 14
ixg0: for TX/RX, interrupting at msix2 vec 15, bound queue 15 to cpu 15
ixg0: for link, interrupting at msix2 vec 16, affinity to cpu 0
ixg0: Using MSI-X interrupts with 17 vectors
ixg0: Ethernet address a0:36:9f:66:47:24
ixg0: PCI Express Bus: Speed 5.0GT/s Width x8
ixg0: feature cap 0x1780
ixg0: feature ena 0x400
ixg1 at pci4 dev 0 function 1: Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 4.0.1-k
ixg1: device 82599EB
WARNING: Intel (R) Network Connections are quality tested using Intel (R) 
Ethernet Optics. Using untested modules is not supported and may cause unstable 
operation or damage to the module or the adapter. Intel Corporation is not 
responsible for any harm caused by using untested modules.
ixg1: ETrackID 86c5
ixg1: for TX/RX, interrupting at msix3 vec 0, bound queue 0 to cpu 0
ixg1: for TX/RX, interrupting at msix3 vec 1, bound queue 1 to cpu 1
ixg1: for TX/RX, interrupting at msix3 vec 2, bound queue 2 to cpu 2
ixg1: for TX/RX, interrupting at msix3 vec 3, bound queue 3 to cpu 3
ixg1: for TX/RX, interrupting at msix3 vec 4, bound queue 4 to cpu 4
ixg1: for TX/RX, interrupting at msix3 vec 5, bound queue 5 to cpu 5
ixg1: for TX/RX, interrupting at msix3 vec 6, bound queue 6 to cpu 6
ixg1: for TX/RX, interrupting at msix3 vec 7, bound queue 7 to cpu 7
ixg1: for TX/RX, interrupting at msix3 vec 8, bound queue 8 to cpu 8
ixg1: for TX/RX, interrupting at msix3 vec 9, bound queue 9 to cpu 9
ixg1: for TX/RX, interrupting at msix3 vec 10, bound queue 10 to cpu 10
ixg1: for TX/RX, interrupting at msix3 vec 11, bound queue 11 to cpu 11
ixg1: for TX/RX, interrupting at msix3 vec 12, bound queue 12 to cpu 12
ixg1: for TX/RX, interrupting at msix3 vec 13, bound queue 13 to cpu 13
ixg1: for TX/RX, interrupting at msix3 vec 14, bound queue 14 to cpu 14
ixg1: for TX/RX, interrupting at msix3 vec 15, bound queue 15 to cpu 15
ixg1: for link, interrupting at msix3 vec 16, affinity to cpu 0
ixg1: Using MSI-X interrupts with 17 vectors
ixg1: Ethernet address a0:36:9f:66:47:26
WARNING: Intel (R) Network Connections are quality tested using Intel (R) 
Ethernet Optics. Using untested modules is not supported and may cause unstable 
operation or damage to the module or the adapter. Intel Corporation is not 
responsible for any harm caused by using untested modules.


Have you tried any other SFP+ modules?


ixg1: PCI Express Bus: Speed 5.0GT/s Width x8
ixg1: feature cap 0x1780
ixg1: feature ena 0x400




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


MII PHY autonego restart change

2020-08-23 Thread SAITOH Masanobu
 Hi.

I've committed the following change:

--
Module Name:    src
Committed By:   msaitoh
Date:   Mon Aug 24 04:23:41 UTC 2020

Modified Files:
src/sys/dev/mii: ciphy.c mii_physubr.c miivar.h urlphy.c

Log Message:
 Don't do full initialization for autonego when just restarting autonego
because it's not required.

 This change reduce extra initialization which include PHY_RESET() which
caused long delay(max 500ms).


To generate a diff of this commit:
cvs rdiff -u -r1.40 -r1.41 src/sys/dev/mii/ciphy.c
cvs rdiff -u -r1.91 -r1.92 src/sys/dev/mii/mii_physubr.c
cvs rdiff -u -r1.72 -r1.73 src/sys/dev/mii/miivar.h
cvs rdiff -u -r1.36 -r1.37 src/sys/dev/mii/urlphy.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
--

 Linux also doesn't do full initialization when just restarting autonego.
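
 The difference between a full autonegotiation setup and a mere restart can be illustrated with a small simulation. This is only a sketch: `struct fake_phy` and all three functions are hypothetical and are not the code in mii_physubr.c; the BMCR bit values are the standard IEEE 802.3 clause 22 ones. The restart path only sets the enable and restart bits, so it never pays for a PHY reset:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard IEEE 802.3 clause 22 Basic Mode Control Register bits. */
#define BMCR_RESET	0x8000	/* software reset (self-clearing) */
#define BMCR_AUTOEN	0x1000	/* autonegotiation enable */
#define BMCR_STARTNEG	0x0200	/* restart autonegotiation */

/* Hypothetical stand-in for a PHY: one register plus a reset counter. */
struct fake_phy {
	uint16_t bmcr;
	unsigned resets;	/* how many (slow) resets were issued */
};

/* Simulated reset: on real hardware this is where the long delay is. */
static void
fake_phy_reset(struct fake_phy *phy)
{
	phy->resets++;
	phy->bmcr = 0;		/* simplified: BMCR_RESET has cleared itself */
}

/* Full initialization: reset first, then start autonegotiation. */
static void
fake_phy_autoneg_full(struct fake_phy *phy)
{
	assert(phy != NULL);
	fake_phy_reset(phy);
	phy->bmcr |= BMCR_AUTOEN | BMCR_STARTNEG;
}

/* Restart only: keep the current setup and just kick autonegotiation. */
static void
fake_phy_autoneg_restart(struct fake_phy *phy)
{
	assert(phy != NULL);
	phy->bmcr |= BMCR_AUTOEN | BMCR_STARTNEG;
}
```

 In this model the restart path leaves the reset counter untouched, while the full path increments it once per call.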

 If you see any regression, please let me know.

Thanks.

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: timecounter changes break netbsd-9/i386 for NET4501

2020-07-27 Thread SAITOH Masanobu

Hi, John.

On 2020/07/27 4:33, John D. Baker wrote:

The following patch fixes the problem:

+Index: sys/arch/x86/x86/cpu.c
+===
+RCS file: /cvsroot/src/sys/arch/x86/x86/cpu.c,v
+retrieving revision 1.171.2.2
+diff -u -p -r1.171.2.2 cpu.c
+--- sys/arch/x86/x86/cpu.c 15 Jul 2020 17:25:08 -  1.171.2.2
++++ sys/arch/x86/x86/cpu.c 26 Jul 2020 17:30:27 -
+@@ -1267,7 +1267,7 @@ cpu_get_tsc_freq(struct cpu_info *ci)
+ {
+   uint64_t freq = 0, last_tsc;
+
+-  if (cpu_hascounter())
++  if (cpu_hascounter()) {
+   freq = cpu_tsc_freq_cpuid(ci);
+
+   if (freq != 0) {
+@@ -1280,6 +1280,7 @@ cpu_get_tsc_freq(struct cpu_info *ci)
+   ci->ci_data.cpu_cc_freq =
+   (cpu_counter_serializing() - last_tsc) * 10;
+   }
++  }
+ }
+
+ void



Yes, your patch is correct.
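
The bug class that the patch fixes can be reproduced in isolation. The sketch below is not the cpu.c code; the function names, the `cpuid_freq` parameter and the `counter_reads` bookkeeping are hypothetical. It assumes, as the patch suggests, that the calibration path reads the cycle counter and that, without the braces, this path was reachable even when cpu_hascounter() was false:

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how often the (simulated) cycle counter gets read. */
static unsigned counter_reads;

/* Simulated calibration: the only place that touches the counter. */
static unsigned long
calibrate(void)
{
	counter_reads++;
	return 1000000UL;	/* arbitrary placeholder frequency */
}

/* Before: without braces the `if` guards only the first statement. */
static unsigned long
freq_unbraced(bool has_counter, unsigned long cpuid_freq)
{
	unsigned long freq = 0;

	if (has_counter)
		freq = cpuid_freq;
	if (freq == 0)
		freq = calibrate();	/* runs even without a counter */
	return freq;
}

/* After: the braces put the whole measurement under the check. */
static unsigned long
freq_braced(bool has_counter, unsigned long cpuid_freq)
{
	unsigned long freq = 0;
	unsigned before = counter_reads;

	if (has_counter) {
		freq = cpuid_freq;
		if (freq == 0)
			freq = calibrate();
	}
	/* Invariant the fix restores: no counter, no counter read. */
	assert(has_counter || counter_reads == before);
	return freq;
}
```

Calling freq_unbraced(false, 0) bumps counter_reads to 1, whereas freq_braced(false, 0) leaves it unchanged and returns 0.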

I'll send a new pullup request to fix this problem soon.

I'm sorry for the regression and thank you very much.

--
---
    SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: wm0 panic

2020-07-13 Thread SAITOH Masanobu
ks held on cpu0:

* Lock 0 (initialized at main)
lock address : 0x818a9600 type :   spin
initialized  : 0x80e7e7e9
shared holds :  0 exclusive:  1
shares wanted:  0 exclusive:  2
relevant cpu :  0 last held:  0
relevant lwp : 0x85078a413b40 last held: 0x85078a413b40
last locked* : 0x80db867b unlocked : 0x80db866c
curcpu holds :  0 wanted by: 0x850788f52140
trace: pid 507 lid 507 at 0x96026e1d4990
lockdebug_unlocked() at netbsd:lockdebug_unlocked+0x10b
comcnputc() at netbsd:comcnputc+0xaa
comcnputc() at netbsd:comcnputc+0xaa

* Lock 1 (initialized at ifmedia_init_with_lock)
lock address : 0x850789377380 type :   spin
initialized  : 0x80dd14c2
shared holds :  0 exclusive:  1
shares wanted:  0 exclusive:  0
relevant cpu :  0 last held:  0
relevant lwp : 0x85078a413b40 last held: 0x85078a413b40
last locked* : 0x80dd124f unlocked : 0x803325fb
owner field  : 0x00010600 wait/spin:0/1
trace: pid 507 lid 507 at 0x96026e1d4990
lockdebug_unlocked() at netbsd:lockdebug_unlocked+0x10b
comcnputc() at netbsd:comcnputc+0xaa
comcnputc() at netbsd:comcnputc+0xaa

--

--
-------
    SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: Failure to build -current - lapic_reset

2020-05-20 Thread SAITOH Masanobu

On 2020/05/20 17:33, Chavdar Ivanov wrote:

Hi,

Even after 'make cleandir' and removing the obj directory, I get three
times in a row:

#   compile  GENERIC/if_media_80.o
/home/sysbuild/amd64/tools/bin/x86_64--netbsd-gcc -mcmodel=kernel
-mno-red-zone -mno-mmx -mno-sse -mno-avx -msoft-float
-mindirect-branch=thunk -mindirect-branch-registe
r -ffreestanding -fno-zero-initialized-in-bss
-fno-delete-null-pointer-checks -g -O2 -fno-omit-frame-pointer
-fstack-protector -Wstack-protector --param ssp-buffer-size=
1 -fno-strict-aliasing -fno-common -std=gnu99 -Werror -Wall -Wno-main
-Wno-format-zero-length -Wpointer-arith -Wmissing-prototypes
-Wstrict-prototypes -Wold-style-defini
tion -Wswitch -Wshadow -Wcast-qual -Wwrite-strings
-Wno-unreachable-code -Wno-pointer-sign -Wno-attributes -Wextra
-Wno-unused-parameter -Wold-style-definition -Wno-sign
-compare --sysroot=/home/sysbuild/amd64/destdir -Damd64 -Dx86_64 -I.
-I/home/sysbuild/src/sys/external/mit/xen-include-public/dist/
-I/home/sysbuild/src/sys/external/bsd
/acpica/dist -I/home/sysbuild/src/sys/external/bsd/libnv/dist
-I/home/sysbuild/src/sys/../common/lib/libx86emu
-I/home/sysbuild/src/sys/../common/lib/libc/misc -I/home/s
ysbuild/src/sys/../common/include -I/home/sysbuild/src/sys/arch
-I/home/sysbuild/src/sys -nostdinc -DCOMPAT_UTILS
-D__XEN_INTERFACE_VERSION__=0x3020a -DDIAGNOSTIC -DCOMP
AT_44 -DDISKLABEL_EI -D_KERNEL -D_KERNEL_OPT -std=gnu99
-I/home/sysbuild/src/sys/lib/libkern/../../../common/lib/libc/quad
-I/home/sysbuild/src/sys/lib/libkern/../../../
common/lib/libc/string
-I/home/sysbuild/src/sys/lib/libkern/../../../common/lib/libc/arch/x86_64/string
-I/home/sysbuild/src/sys/lib/libkern/../../../common/lib/libc/has
h/sha3 -D_FORTIFY_SOURCE=2
-I/home/sysbuild/src/sys/external/isc/atheros_hal/dist
-I/home/sysbuild/src/sys/external/isc/atheros_hal/ic
-I/home/sysbuild/src/sys/external/
bsd/common/include
-I/home/sysbuild/src/sys/external/bsd/common/include
-I/home/sysbuild/src/sys/external/bsd/drm2/include
-I/home/sysbuild/src/sys/external/bsd/drm2/inc
lude -I/home/sysbuild/src/sys/external/bsd/drm2/include/drm
-I/home/sysbuild/src/sys/external/bsd/common/include
-I/home/sysbuild/src/sys/external/bsd/drm2/dist/include
-I/home/sysbuild/src/sys/external/bsd/drm2/dist/include/drm
-I/home/sysbuild/src/sys/external/bsd/drm2/dist/uapi
-I/home/sysbuild/src/sys/external/bsd/drm2/dist -D__KERN
EL__ -DCONFIG_BACKLIGHT_CLASS_DEVICE=0
-DCONFIG_BACKLIGHT_CLASS_DEVICE_MODULE=0
-DCONFIG_DRM_FBDEV_EMULATION=1 -DCONFIG_FB=0
-I/home/sysbuild/src/sys/../common/include -
I/home/sysbuild/src/sys/external/bsd/libnv/dist
-I/home/sysbuild/src/sys/external/bsd/drm2/i915drm
-I/home/sysbuild/src/sys/external/bsd/drm2/dist/drm/i915 -DCONFIG_DRM_
I915_FBDEV=1 -DCONFIG_DRM_I915_PRELIMINARY_HW_SUPPORT=0
-DCONFIG_DRM_FBDEV_EMULATION=1
-I/home/sysbuild/src/sys/external/bsd/drm2/include/radeon
-I/home/sysbuild/src/sys
/external/bsd/drm2/radeon
-I/home/sysbuild/src/sys/external/bsd/drm2/dist/drm/amd/include
-I/home/sysbuild/src/sys/external/bsd/drm2/dist/drm/radeon
-I/home/sysbuild/src
/sys/external/bsd/drm2/dist/drm/nouveau
-I/home/sysbuild/src/sys/external/bsd/drm2/dist/drm/nouveau/include
-I/home/sysbuild/src/sys/external/bsd/drm2/dist/drm/nouveau/i
nclude/nvkm -I/home/sysbuild/src/sys/external/bsd/drm2/dist/drm/nouveau/nvkm
-I/home/sysbuild/src/sys/external/bsd/drm2/nouveau
-DCONFIG_NOUVEAU_DEBUG=5 -DCONFIG_NOUVEAU
_DEBUG_DEFAULT=3
-I/home/sysbuild/src/sys/external/bsd/acpica/dist/include -c
/home/sysbuild/src/sys/compat/common/if_media_80.c -o if_media_80.o
--- kern-INSTALL ---
/home/sysbuild/src/sys/arch/x86/x86/lapic.c: In function 'lapic_delay':
/home/sysbuild/src/sys/arch/x86/x86/lapic.c:749:4: error: implicit
declaration of function 'lapic_reset'; did you mean 'acpi_reset'?
[-Werror=implicit-function-declaration]
 lapic_reset();
 ^~~


Please update to the latest x86/lapic.c.

Thanks.


 acpi_reset

..





--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: Anyone currently using fxp(4)?

2020-03-12 Thread SAITOH Masanobu

On 2020/03/13 10:23, Jason Thorpe wrote:

Is anyone currently using fxp(4) successfully?  I am having trouble with link 
stability on a new-in-box i82559 card, but on a non-x86 platform.

Thx.

-- thorpej


I have some cards. They are very old, and some of them no longer work (broken).

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Is KUBSAN broken?

2019-10-30 Thread SAITOH Masanobu
 Hi.

 Today, I updated three amd64 machines to the latest -current and
none of them booted. All of them use "options KUBSAN". Two of
them got stuck after "loading /var/db/entropy-file", and the third
machine reset after loading the kernel. Without KUBSAN, all of the
machines boot.

 OK: 2019/10/28 06:31:39
 NG: 2019/10/30 02:44:29

-- 
---
    SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: Panic on a -current from 13/12/2018

2018-12-16 Thread SAITOH Masanobu
Hi.

On 2018/12/16 18:09, Chavdar Ivanov wrote:
> Repeated this morning. Happens when the host hibernates when the
> machine is running. The initial trace is slightly different, but the
> lines with wm_gmii are the same, so for now I will switch to a
> different NIC emulator.
> 

In your .png:
>vpanic()
>lapic_delay()
>wm_gmii_mdic_readreg()
>.
>.
>.

There is no panic message itself, but I suspect it's:
> static void
> lapic_delay(unsigned int usec)
> {
> 	int32_t xtick, otick;
> 	int64_t deltat; /* XXX may want to be 64bit */
> 
> 	otick = lapic_gettick();
> 
> 	if (usec <= 0)
> 		return;
> 	if (usec <= 25)
> 		deltat = lapic_delaytab[usec];
> 	else
> 		deltat = (lapic_frac_cycle_per_usec * usec) >> 32;
> 
> 	while (deltat > 0) {
> 		xtick = lapic_gettick();
> 		if (lapic_broken_periodic && xtick == 0 && otick == 0) {
> 			lapic_initclocks();
> 			xtick = lapic_gettick();
> 			if (xtick == 0)
> 				panic("lapic timer stopped ticking");	<=== here!
> 		}
> 		if (xtick > otick)
> 			deltat -= lapic_tval - (xtick - otick);
> 		else
> 			deltat -= otick - xtick;
> 		otick = xtick;
> 
> 		x86_pause();
> 	}
> }
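
The tick arithmetic in the loop above relies on the local APIC timer counting down and reloading from lapic_tval when it reaches zero. Here is a minimal sketch of just that step; the helper name `elapsed_ticks` is hypothetical, and the two branches mirror the quoted code:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: ticks elapsed between two reads of a counter
 * that counts down and reloads to `tval` after reaching zero.
 */
static int32_t
elapsed_ticks(int32_t tval, int32_t otick, int32_t xtick)
{
	assert(tval > 0);
	if (xtick > otick) {
		/* The counter wrapped: it reloaded since the last read. */
		return tval - (xtick - otick);
	}
	/* No wrap: the counter simply decreased (or did not move). */
	return otick - xtick;
}
```

If both reads return 0, the helper reports 0 elapsed ticks, so deltat would never shrink and the loop would spin forever. That is the case the lapic_broken_periodic check guards against, panicking if the timer still reads 0 after lapic_initclocks().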

What causes it?


> And yes, it used to survive many hibernations of the hosts before. I
> only had to adjust the time after waking the host up.
> On Sat, 15 Dec 2018 at 10:59, Chavdar Ivanov  wrote:
>>
>> Hi,
>>
>> On 8.99.27 AMD64 running under VirtualBox I got this morning the panic
>> in http://ci4ic4.tx0.org/ci4ic4-panic-01.png
>>
>> I have the  coredump, if it is of interest. I thought it might be
>> useful, as it is apparently in the wm driver.
>>
>> Chavdar
>> --
>> 
> 
> 
> 


-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: ThinkPad - suspend-to-RAM intel-x86 issues and tests

2018-11-30 Thread SAITOH Masanobu
Committed.

 Thank you all!


On 2018/12/01 7:17, David Brownlee wrote:
> On Thu, 29 Nov 2018 at 06:15, Masanobu SAITOH  wrote:
>>
>> On 2018/11/28 22:12, SAITOH Masanobu wrote:
>>>>> http://ftp.netbsd.org/pub/NetBSD/misc/abs/acpi-suspend-resume/pcidump.pre
>>>>> http://ftp.netbsd.org/pub/NetBSD/misc/abs/acpi-suspend-resume/pcidump.post
>>>>
>>>> The diff says we should save/restore MSI table.
>>>> We also should save/restore some other registers.
>>>>
>>>>   Give me one or two days to resolve the problem.
>>>
>>>   Please try the following diff:
>>>
>>>   http://www.netbsd.org/~msaitoh/pci-resume-20181118-0.dif
>>>
>>> Even if I use this change with Thinkpad X220, it doesn't recover from
>>> suspend...
>>
>>   But, my X61 survived from suspend with this patch!
> 
> I can confirm a T420s, T430 and T530 all suspend and resume single
> user or multiuser without X11 including disk and network fine with
> this patch (excellent stuff!).
> 
> X11 on T420s
> - Suspend and resumes fine while in X or on console with X running in
> another virtual console
> - The display seems to revert to a blank console on resume into which
> you can type
> - Switching vtys fixes the display
> - The ThinkPad touchpoint stops working on resume (but an external USB
> mouse is fine)
> 
> X11 on T530
> - Panics on resume if X is running (even if I then switch to the console)
> drm/i915: Resetting chip after gpu hang
> ufm_fault(0x8a77a0c0, 0x0, 1) -> w
> fatal page fault in supervisor mode
> ...
> at netbsd:fini_hash_table+0x88: movq 1
> 
> but this is awesome progress!
> 
> Thanks
> 
> David
> 
> 


-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: ThinkPad - suspend-to-RAM intel-x86 issues and tests

2018-11-28 Thread SAITOH Masanobu
On 2018/11/28 14:18, Masanobu SAITOH wrote:
> Hi, David.
> 
> On 2018/11/28 6:09, David Brownlee wrote:
>> On Tue, 27 Nov 2018 at 18:10, David Brownlee  wrote:
>>>
>>> On Tue, 27 Nov 2018 at 08:27, Masanobu SAITOH  wrote:
>>>>
>>>>    Hi, David.
>>>>
>>>> On 2018/11/26 6:11, David Brownlee wrote:
>>>>> I've bisected the changes against the github src copy, and it looks like 
>>>>> the suspend/resume issue is related to the following commit:
>>>>>
>>>>> commit 0fe469276f49bf0dc003300e0b8a35a80b7b246d (HEAD)
>>>>> Author: jdolecek 
>>>>> Date:   Mon Oct 22 20:57:07 2018 +
>>>>>
>>>>>   enable MSI support where available, blatantly copied from 
>>>>> jmcneill's msk(4)
>>>>>
>>>>> I tried building from HEAD with just that one commit reverted, and my 
>>>>> T420s suspends and resumes again!
>>>>>
>>>>> iwn0 is still non responsive after resume and wm0 will not pick up an IP 
>>>>> via dhcpcd, but the disk responds :-p
>>>>
>>>>    (Note that I'm not familiar with suspend/resume though...)
>>>>
>>>>    Our pci_suspend()/pci_resume() copy only first 16 bytes of each PCI
>>>> config space. Other OSes copy some other control registers and
>>>> MSI/MSI-X capability area.
>>>>
>>>>    Could you dump all PCI config space both before and after suspend with:
>>>>
>>>>  http://www.netbsd.org/~msaitoh/pcidump
>>>>
>>>> and put the two output somewhere? Diffing the two output will teach
>>>> us what we have to do.
>>>>
>>>>    Thanks in advance.
>>>
>>> Let me just install to a USB stick to give me a working filesystem
>>> from which to run pcidump after resume :-p
>>
>> Collecting a pre-suspend dump was easy, but getting post-resume turned
>> out to be a little more involved :)
>> - root on wd0 on ahcisata - times out on resume
>> - root on sd0 on usb on xhci - times out on resume
>> - root on sd0 on usb on uhci - loses the root filesystem mount point on 
>> resume
>> - install image - doesn't have the libs to run pcictl
>> - install image, then chroot to mfs with extracted base - suspends but
>> video does not come back (no drm)
>> - root on wd0, then chroot to mfs with extracted base, suspend &
>> resume, then mount sd0 on usb on uhci to save data - \o/
>>
>> After all that it occurred to me I could have probably run the
>> suspend/resume with an older NetBSD version where MSI was not being
>> used. Still, interesting puzzle to try, and useful technique to stash.
>>
>> Files for the ThinkPad T420s:
>>
>> http://ftp.netbsd.org/pub/NetBSD/misc/abs/acpi-suspend-resume/pcidump.pre
>> http://ftp.netbsd.org/pub/NetBSD/misc/abs/acpi-suspend-resume/pcidump.post
> 
> The diff says we should save/restore MSI table.
> We also should save/restore some other registers.
> 
>  Give me one or two days to resolve the problem.

 Please try the following diff:

http://www.netbsd.org/~msaitoh/pci-resume-20181118-0.dif

Even with this change, my ThinkPad X220 doesn't recover from
suspend...
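
To illustrate why saving only the start of config space is not enough: the MSI capability sits in the capability list, which begins after the standard header. The sketch below walks that list in a byte-array copy of config space. The helper name and the macro names are hypothetical; the offsets and IDs (status bit 4, pointer at 0x34, MSI = 0x05) are the standard PCI ones, and this is not the code in the diff linked above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_STATUS_OFF		0x06	/* status register (low byte) */
#define PCI_STATUS_CAPLIST	0x10	/* capability list present */
#define PCI_CAPPTR_OFF		0x34	/* offset of first capability */
#define PCI_CAP_ID_MSI		0x05	/* MSI capability ID */

/*
 * Hypothetical helper: return the config-space offset of the
 * capability with the given ID, or 0 if it is not present.
 */
static uint8_t
find_capability(const uint8_t cfg[256], uint8_t id)
{
	uint8_t off;
	int guard;

	assert(cfg != NULL);
	if ((cfg[PCI_STATUS_OFF] & PCI_STATUS_CAPLIST) == 0)
		return 0;

	off = cfg[PCI_CAPPTR_OFF] & 0xfc;	/* low two bits are reserved */
	/* At most 48 dword-aligned capabilities fit in 0x40..0xff. */
	for (guard = 0; off >= 0x40 && guard < 48; guard++) {
		if (cfg[off] == id)
			return off;
		off = cfg[off + 1] & 0xfc;	/* follow the next pointer */
	}
	return 0;
}
```

Once the offset is known, a suspend path could save the registers found there alongside the header and write them back on resume.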


> 
>  Thanks.
> 
> 
>> Thanks for looking at this!
>>
>> David
> 
> 
> 


-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: if_addrflags6: Can't assign requested address

2018-08-17 Thread SAITOH Masanobu
On 2018/08/18 2:45, Roy Marples wrote:
> On 17/08/2018 10:08, Roy Marples wrote:
>> On 17/08/2018 09:04, Masanobu SAITOH wrote:
>>>> wm2: carrier lost
>>>> wm2: executing `/libexec/dhcpcd-run-hooks' NOCARRIER
>>>> wm2: deleting address fe80::1392:4012:56d8:a7a2
>>>> wm2: if_addrflags6: Can't assign requested address
>>>> wm2: if_addrflags6: Can't assign requested address
>>>> wm2: if_addrflags6: Can't assign requested address
>>>> wm2: if_addrflags6: Can't assign requested address
>>>> wm2: carrier acquired
>>>> wm2: executing `/libexec/dhcpcd-run-hooks' CARRIER
>>
>> This helps.
>> I never saw this because on NetBSD-8, we have addrflags available in 
>> ifa_msghdr when sent over route(4). This does not exist on NetBSD-7 so we 
>> need to make an ioctl per address to work out the flags. Sadly, this is racy 
>> and this is what happens:
>>
>> Something adds an address.
>> Kernel announces new address to route(4).
>> Something deletes this address.
>> Kernel announces the address deleted to route(4).
>>
>> dhcpcd reads the address added message from route(4) *after* the address has 
>> been deleted from the kernel. Because dhcpcd needs the address flags at this 
>> point, an ioctl is made to the deleted address and boom, error.
>>
>> Luckily dhcpcd handles it correctly and it's just noise.
>> Please test the attached patch to silence it.
>> If you can verify it works, let me know and I'll push a new version out.
> 
> Since then I've discovered two more critical issues with dhcpcd-7 on NetBSD-7.
> 1) Broken IP_PKTINFO implementation
> 2) Invalid RTA_BRD in RTM_NEWADDR messages for new addresses
> Both of these have already been fixed in -8 and -current and neither looks 
> suitable for a pullup and dhcpcd needs a workaround for both anyway.
> 
> A better patch attached and I'll hopefully get this pushed out over the 
> weekend.
> 
> Roy

This patch worked. if_addrflags6's error messages disappeared.

Before this patch,

> Aug 18 01:00:58 amd64-n7 dhcpcd[250]: wm1: dhcp_sendudp: Invalid argument
> Aug 18 01:30:59 amd64-n7 dhcpcd[250]: wm1: dhcp_sendudp: Invalid argument
> Aug 18 02:01:01 amd64-n7 dhcpcd[250]: wm1: dhcp_sendudp: Invalid argument
> Aug 18 02:31:03 amd64-n7 dhcpcd[250]: wm1: dhcp_sendudp: Invalid argument
> Aug 18 03:01:04 amd64-n7 dhcpcd[250]: wm1: dhcp_sendudp: Invalid argument
> Aug 18 03:31:05 amd64-n7 dhcpcd[250]: wm1: dhcp_sendudp: Invalid argument
> Aug 18 04:01:06 amd64-n7 dhcpcd[250]: wm1: dhcp_sendudp: Invalid argument
> Aug 18 04:31:08 amd64-n7 dhcpcd[250]: wm1: dhcp_sendudp: Invalid argument
> Aug 18 05:01:09 amd64-n7 dhcpcd[250]: wm1: dhcp_sendudp: Invalid argument
> Aug 18 05:31:11 amd64-n7 dhcpcd[250]: wm1: dhcp_sendudp: Invalid argument
> Aug 18 06:01:11 amd64-n7 dhcpcd[250]: wm1: dhcp_sendudp: Invalid argument
> Aug 18 06:31:12 amd64-n7 dhcpcd[250]: wm1: dhcp_sendudp: Invalid argument
> Aug 18 07:01:14 amd64-n7 dhcpcd[250]: wm1: dhcp_sendudp: Invalid argument
> Aug 18 07:31:15 amd64-n7 dhcpcd[250]: wm1: dhcp_sendudp: Invalid argument
> Aug 18 08:01:16 amd64-n7 dhcpcd[250]: wm1: dhcp_sendudp: Invalid argument
> Aug 18 08:31:16 amd64-n7 dhcpcd[250]: wm1: dhcp_sendudp: Invalid argument

This error message appeared every 30 minutes, but it also disappeared
with this patch.
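
For reference, the race Roy describes in the quoted message can be handled quietly along the following lines. This is only a sketch and not Roy's patch, which is not shown here: the enum, the callback type and both functions are hypothetical. The one grounded detail is that "Can't assign requested address" is the message for EADDRNOTAVAIL:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Possible outcomes when processing an "address added" message. */
enum addr_action { ADDR_USE, ADDR_IGNORE, ADDR_ERROR };

/* Hypothetical lookup callback: returns flags, or -1 and sets errno. */
typedef int (*flags_lookup_fn)(const char *addr);

static enum addr_action
handle_new_address(const char *addr, flags_lookup_fn lookup)
{
	assert(lookup != NULL);
	if (lookup(addr) == -1) {
		/* Address already deleted again: expected, stay quiet. */
		if (errno == EADDRNOTAVAIL)
			return ADDR_IGNORE;
		return ADDR_ERROR;	/* anything else is worth logging */
	}
	return ADDR_USE;
}

/* Stub simulating the race: the address vanished before the lookup. */
static int
lookup_deleted(const char *addr)
{
	(void)addr;
	errno = EADDRNOTAVAIL;
	return -1;
}
```

With the stub, handle_new_address() reports ADDR_IGNORE, so a stale "address added" message produces no log noise.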

Thanks.


-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: netbsd-8 crash in ixg driver during booting

2017-11-11 Thread SAITOH Masanobu
 Hi, Uwe.

On 2017/11/12 8:53, SAITOH Masanobu wrote:
>  Hi, all.
> 
> 
> On 2017/11/10 0:02, Chavdar Ivanov wrote:
>> My (very uneducated) guess would be that for some reason adapter->num_queues 
>> does not get initialised. There have been quite a few commits recently here 
>> (although I am looking at -current, I suppose they have been done to 8 as 
>> well). 
> 
>  It might be easy to fix this panic but I can't because I'm now
> attending BSDTW17 in Taipei. I'll be back to Tokyo Monday night.
> 
>  A lot of changes and fixes have done in -current but the pullup
> request for netbsd-8 have not been sent yet. It would take a few
> weeks or more.
> 
>  Thanks.
> 
> 
>> Chavdar Ivanov
>>
>> On Thu, 9 Nov 2017 at 14:18 <6b...@6bone.informatik.uni-leipzig.de 
>> <mailto:6b...@6bone.informatik.uni-leipzig.de>> wrote:
>>
>> the current version of netbsd-8 crashes while booting during the
>> initialization of the network driver.
>>
>> https://suse.uni-leipzig.de/crash/crash1.jpg

 Does your machine boot with the latest -current?
If it boots, could you show the dmesg output with the
following patch?

http://www.netbsd.org/~msaitoh/ixgbe-current-20171112-0.dif

And, if you can, please test netbsd-8 with
the following patch and show the dmesg output:

http://www.netbsd.org/~msaitoh/ixgbe-n8-20171112-0.dif


 Thanks in advance.




>> https://suse.uni-leipzig.de/crash/crash2.jpg
>> https://suse.uni-leipzig.de/crash/crash3.jpg
>>
>> My old kernel from August 2017 did not have the problem yet.
>>
>> Can someone take a look at the problem?
>>
>> Thank you for your Efforts
>>
>> Regards
>> Uwe
>>
> 
> 


-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: netbsd-8 crash in ixg driver during booting

2017-11-11 Thread SAITOH Masanobu
 Hi, all.


On 2017/11/10 0:02, Chavdar Ivanov wrote:
> My (very uneducated) guess would be that for some reason adapter->num_queues 
> does not get initialised. There have been quite a few commits recently here 
> (although I am looking at -current, I suppose they have been done to 8 as 
> well). 

 It might be easy to fix this panic, but I can't do it now because I'm
attending BSDTW17 in Taipei. I'll be back in Tokyo on Monday night.

 A lot of changes and fixes have been made in -current, but the pullup
request for netbsd-8 has not been sent yet. It will take a few
weeks or more.

 Thanks.


> Chavdar Ivanov
> 
> On Thu, 9 Nov 2017 at 14:18 <6b...@6bone.informatik.uni-leipzig.de 
> <mailto:6b...@6bone.informatik.uni-leipzig.de>> wrote:
> 
> the current version of netbsd-8 crashes while booting during the
> initialization of the network driver.
> 
> https://suse.uni-leipzig.de/crash/crash1.jpg
> https://suse.uni-leipzig.de/crash/crash2.jpg
> https://suse.uni-leipzig.de/crash/crash3.jpg
> 
> My old kernel from August 2017 did not have the problem yet.
> 
> Can someone take a look at the problem?
> 
> Thank you for your Efforts
> 
> Regards
>     Uwe
> 


-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: wm devices don't work under current amd64

2016-03-09 Thread SAITOH Masanobu
Hi, Tom.

On 2016/03/10 4:12, Tom Ivar Helbekkmo wrote:
> SAITOH Masanobu <msai...@execsw.org> writes:
> 
>>  You mean your machine works with INTx but it doesn't work on MSI, right?
> 
> That is correct.
> 
>> If so, could you show the full dmesg of the machine?
> 
> Appended below.
> 
>>  And, did you test if your machine's problem does occur "without" vlan?
> 
> This is the laptop, which doesn't use vlans.  The other machine, the
> Poweredge 2650, is my main server, and does all its networking over a
> vlan trunk on its wm0 interface.  I suspect that its problem is
> different, since it works with a -current from October 10th, whereas the
> laptop doesn't.
> 
> dmesg output from the laptop after making its wm0 use INTx instead of MSI:

 Thank you for your quick reply. I had two ICH9 motherboards, but I discarded
them because both of them were broken... Now I don't have any ICH9 machine.
I have some ICH8s and one ICH10. All of them worked, so I had assumed that
ICH9 worked too.

 I'm sorry, but I'm busy because AsiaBSDCon starts today and I'll be away
from Tokyo for the next week. I would be happy if someone(TM) could debug
and test with a variety of ICH9 machines.

 Regards.


> Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
> 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016
> The NetBSD Foundation, Inc.  All rights reserved.
> Copyright (c) 1982, 1986, 1989, 1991, 1993
> The Regents of the University of California.  All rights reserved.
> 
> NetBSD 7.99.26 (DEJAH) #5: Wed Mar  9 17:36:37 CET 2016
>   r...@barsoom.hamartun.priv.no:/usr/obj/sys/arch/amd64/compile.amd64/DEJAH
> total memory = 4083 MB
> avail memory = 3945 MB
> rnd: seeded with 128 bits
> timecounter: Timecounters tick every 10.000 msec
> Kernelized RAIDframe activated
> timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
> Dell Inc. Latitude E6400  
> mainbus0 (root)
> cpu0 at mainbus0
> cpu0: Intel(R) Core(TM)2 Duo CPU T9600  @ 2.80GHz, id 0x1067a
> pci0 at mainbus0 bus 0: configuration mode 1
> pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
> pchb0 at pci0 dev 0 function 0: vendor 8086 product 2a40 (rev. 0x07)
> agp0 at pchb0: can't find internal VGA config space
> ppb0 at pci0 dev 1 function 0: vendor 8086 product 2a41 (rev. 0x07)
> ppb0: PCI Express capability version 1  x16 @ 2.5GT/s
> pci1 at ppb0 bus 1
> pci1: i/o space, memory space enabled, rd/line, wr/inv ok
> vga0 at pci1 dev 0 function 0: vendor 10de product 06eb (rev. 0xa1)
> wsdisplay0 at vga0 kbdmux 1: console (80x25, vt100 emulation)
> wsmux1: connecting to wsdisplay0
> drm at vga0 not configured
> wm0 at pci0 dev 25 function 0: 82801I mobile (AMT) LAN Controller (rev. 0x03)
> wm0: interrupting at irq 11
> wm0: PCI-Express bus
> wm0: 2048 words FLASH
> wm0: Ethernet address 00:26:b9:cd:21:c2
> makphy0 at wm0 phy 2: Marvell 88E1149 Gigabit PHY, rev. 1
> makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
> uhci0 at pci0 dev 26 function 0: vendor 8086 product 2937 (rev. 0x03)
> uhci0: interrupting at irq 10
> usb0 at uhci0: USB revision 1.0
> uhci1 at pci0 dev 26 function 1: vendor 8086 product 2938 (rev. 0x03)
> uhci1: interrupting at irq 3
> usb1 at uhci1: USB revision 1.0
> uhci2 at pci0 dev 26 function 2: vendor 8086 product 2939 (rev. 0x03)
> uhci2: interrupting at irq 11
> usb2 at uhci2: USB revision 1.0
> ehci0 at pci0 dev 26 function 7: vendor 8086 product 293c (rev. 0x03)
> ehci0: interrupting at irq 11
> ehci0: BIOS has given up ownership
> ehci0: EHCI version 1.0
> ehci0: companion controllers, 2 ports each: uhci0 uhci1 uhci2
> usb3 at ehci0: USB revision 2.0
> hdaudio0 at pci0 dev 27 function 0: HD Audio Controller
> hdaudio0: interrupting at irq 3
> hdafg0 at hdaudio0: vendor 111d product 76b2
> hdafg0: DAC00 2ch: Speaker [Built-In], HP Out [Jack]
> hdafg0: DAC01 2ch: Speaker [Jack]
> hdafg0: DIG02 2ch: SPDIF Out [Jack]
> hdafg0: 2ch/0ch 44100Hz 48000Hz 88200Hz 96000Hz 192000Hz PCM16 PCM20 PCM24 AC3
> audio0 at hdafg0: full duplex, playback, capture, mmap, independent
> ppb1 at pci0 dev 28 function 0: vendor 8086 product 2940 (rev. 0x03)
> ppb1: PCI Express capability version 1  x1 @ 2.5GT/s
> pci2 at ppb1 bus 11
> pci2: i/o space, memory space enabled, rd/line, wr/inv ok
> ppb2 at pci0 dev 28 function 1: vendor 8086 product 2942 (rev. 0x03)
> ppb2: PCI Express capability version 1  x1 @ 2.5GT/s
> pci3 at ppb2 bus 12
> pci3: i/o space, memory space enabled, rd/line, wr/inv ok
> iwn0 at pci3 dev 0 function 0: vendor 8086 product 4235 (rev. 0x00)
> iwn0: interrupting at irq 10
> iwn0: MIMO 3T3R, Mo

Re: wm devices don't work under current amd64

2016-03-09 Thread SAITOH Masanobu
Hi.

On 2016/03/10 2:40, Tom Ivar Helbekkmo wrote:
> Masanobu SAITOH <msai...@execsw.org> writes:
> 
>>  A bug must exist. sborrill@ reported a vlan-related problem before. One of
>> the problems is that I can't reproduce it with my machines...
>> If I could reproduce the problem with my machine, I could fix it...
> 
> Well, I know a bit more about the problem with the laptop, now.  It's
> something to do with MSI.  I observed that the (non-working) kernel had
> logged exactly one MSI interrupt, which I figured was the one from the
> autoconfiguration.  There's a place in if_wm.c where MSI is masked out
> for a couple of versions of the Intel chip.  Adding the version in the
> Dell Latitude E6400 laptop to that list, so it fell back to traditional
> IRQ handling, made the interface start working properly:
> 
> 	if ((sc->sc_type <= WM_T_82541_2) || (sc->sc_type == WM_T_82571)
> 	    || (sc->sc_type == WM_T_82572) || (sc->sc_type == WM_T_ICH9))
> 		pa->pa_flags &= ~PCI_FLAGS_MSI_OKAY;

 You mean your machine works with INTx but it doesn't work on MSI, right?
If so, could you show the full dmesg of the machine?

 And, did you test if your machine's problem does occur "without" vlan?


> (The WM_T_ICH9 is the one in the laptop.)  No idea yet about the Dell
> Poweredge 2650.  I'll see if I can take a closer look at that tomorrow.
> 
> -tih
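
The workaround quoted above amounts to a per-chip decision to stop
advertising MSI so that attach falls back to INTx. A minimal, standalone
sketch of that decision follows. It is not the if_wm.c code: the enum
values, the flag FLAG_MSI_OKAY and both function names are hypothetical
stand-ins, and the chip ordering is only assumed so that the "<=" test on
the oldest parts makes sense.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the driver's chip types. They are ordered
 * oldest first because the quoted check uses "<=" for the oldest parts.
 */
enum chip_type {
	CHIP_82542,
	CHIP_82541_2,
	CHIP_82571,
	CHIP_82572,
	CHIP_ICH8,
	CHIP_ICH9,
	CHIP_ICH10
};

#define FLAG_MSI_OKAY	0x01u	/* hypothetical: the bus allows MSI */

/* Return true when MSI should be avoided for this chip. */
static bool
chip_needs_intx(enum chip_type t)
{
	return t <= CHIP_82541_2 || t == CHIP_82571 ||
	    t == CHIP_82572 || t == CHIP_ICH9;
}

/* Clear the MSI-okay flag for affected chips; leave other flags alone. */
static uint32_t
adjust_intr_flags(enum chip_type t, uint32_t flags)
{
	if (chip_needs_intx(t))
		flags &= ~FLAG_MSI_OKAY;
	return flags;
}
```

With this shape, adding one more chip to the list only changes
chip_needs_intx(); all other flag bits pass through untouched.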



-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


iop(4)/iopsp(4) hang

2015-08-16 Thread SAITOH Masanobu
 Hi.

 Today, I fixed two bugs in iop.c and iopsp.c,
but it still hangs while booting:

 pci3 at ppb2 bus 3
 pci3: i/o space, memory space enabled, rd/line, wr/inv ok
 ppb3 at pci2 dev 0 function 2: vendor 8086 product 032a (rev. 0x09)
 ppb3: PCI Express capability version 1 PCI-E to PCI/PCI-X Bridge
 ppb3: disabling notification events
 pci4 at ppb3 bus 4
 pci4: i/o space, memory space enabled, rd/line, wr/inv ok
 iop0 at pci4 dev 1 function 0: allocated pic ioapic0 type level pin 18 level 6 to cpu0 slot 1 idt entry 97
 I2O adapter ADAPTEC 2110S
 iop0: interrupting at ioapic0 pin 18
 iop0: WARNING: power management not supported
(snip)
 wskbd0 at ukbd0 mux 1
 wskbd0: connecting to wsdisplay0
 uhidev1 at uhub4 port 1 configuration 1 interface 1
 uhidev1: vendor 0557 product 2419, rev 1.10/1.00, addr 4, iclass 3/1
 ums0 at uhidev1: 3 buttons and Z dir
 wsmouse0 at ums0 mux 0
 ipmi0: version 2.0 interface KCS iobase 0xca2/2 spacing 1
 iopsp0 at iop0 tid 8: SCSI port ADAPTEC, AIC-7899, 0001
(hang)

It hangs both with and without LOCKDEBUG, and I couldn't enter DDB.
I can't debug this for the next week, so I'd be glad if someone could debug it.

 Thanks.

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: current status of ixg(4)

2015-04-07 Thread SAITOH Masanobu
On 2015/03/27 16:03, Masanobu SAITOH wrote:
 
 New patch:

 http://www.netbsd.org/~msaitoh/ixg-20150321-0.dif
 
  This change has been committed now.
 
 New patch:
 
 http://www.netbsd.org/~msaitoh/ixg-20150327-0.dif

New patch:

http://www.netbsd.org/~msaitoh/ixg-20150407-0.dif

--
Sync ixg(4) up to FreeBSD r243716:
 - A lot of bugfixes. Some of them are related to multiqueue and have
   not affected NetBSD because we do not use it yet.
 - Show 1000Base-SX correctly.
 - Fix if_baudrate from 1G to 10G.
 - Improve performance.
--

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: current status of ixg(4)

2015-04-07 Thread SAITOH Masanobu
On 2015/04/07 21:02, SAITOH Masanobu wrote:
 On 2015/03/27 16:03, Masanobu SAITOH wrote:

 New patch:

 http://www.netbsd.org/~msaitoh/ixg-20150321-0.dif

  This change has been committed now.

 New patch:

 http://www.netbsd.org/~msaitoh/ixg-20150327-0.dif
 
 New patch:
 
   

Sorry, I overwrote ixgbe.c with a broken version before making the diff.
Use the new one:

http://www.netbsd.org/~msaitoh/ixg-20150407-1.dif




 --
 Sync ixg(4) up to FreeBSD r243716:
  - A lot of bugfixes. Some of them are related to multiqueue and have
not affected NetBSD because we do not use it yet.
  - Show 1000Base-SX correctly.
  - Fix if_baudrate from 1G to 10G.
  - Improve performance.
 --
 


-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: current status of ixg(4)

2015-03-25 Thread SAITOH Masanobu

On 2015/03/25 21:18, 6b...@6bone.informatik.uni-leipzig.de wrote:
(snip)

So you are right. The patch was not applied. If I add the code manually
it works perfectly!

Thank you.


You're welcome. I'll send pullup requests to netbsd-7 and netbsd-6.


Regards
Uwe



--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: current status of ixg(4)

2015-03-24 Thread SAITOH Masanobu
Hi.

On 2015/03/24 7:05, 6b...@6bone.informatik.uni-leipzig.de wrote:
 On Mon, 23 Mar 2015, Masanobu SAITOH wrote:
 
 Has this problem been filed as a PR? If not, could you file a PR?

 Could you test with this patch?

 
 The patch doesn't solve the problem.

Did you really apply this patch?

Index: ixgbe.c
===
RCS file: /cvsroot/src/sys/dev/pci/ixgbe/ixgbe.c,v
retrieving revision 1.14.2.2
diff -u -p -r1.14.2.2 ixgbe.c
--- ixgbe.c24 Feb 2015 10:41:09 -1.14.2.2
+++ ixgbe.c23 Mar 2015 07:32:50 -
@@ -1064,6 +1064,9 @@ ixgbe_ifflags_cb(struct ethercom *ec)
 	else if ((change & (IFF_PROMISC | IFF_ALLMULTI)) != 0)
 		ixgbe_set_promisc(adapter);

+	/* Set up VLAN support and filter */
+	ixgbe_setup_vlan_hw_support(adapter);
+
 	IXGBE_CORE_UNLOCK(adapter);

 	return rc;
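
The idea of the hunk above is that the VLAN hardware setup is re-applied
every time the interface-flags callback runs, not only when the promiscuous
bits change. A simplified, standalone sketch of that control flow follows.
It is not the ixgbe.c code: the struct, the flag values and the counters are
hypothetical, locking and the ENETRESET path are left out, and the counters
stand in for the real ixgbe_set_promisc() and ixgbe_setup_vlan_hw_support()
calls.

```c
#include <stdint.h>

#define XIFF_PROMISC	0x0100u	/* hypothetical flag values */
#define XIFF_ALLMULTI	0x0200u

struct adapter {
	uint32_t if_flags;	/* flags seen on the previous call */
	int promisc_updates;	/* how often promisc mode was reprogrammed */
	int vlan_updates;	/* how often VLAN setup was reprogrammed */
};

static int
ifflags_cb(struct adapter *a, uint32_t new_flags)
{
	uint32_t change = new_flags ^ a->if_flags;
	int rc = 0;

	if (change != 0)
		a->if_flags = new_flags;

	/* Only touch the promiscuous filter when those bits changed. */
	if ((change & (XIFF_PROMISC | XIFF_ALLMULTI)) != 0)
		a->promisc_updates++;

	/* Set up VLAN support and filter on every call. */
	a->vlan_updates++;

	return rc;
}
```

Calling ifflags_cb() twice with the same flags bumps vlan_updates twice but
promisc_updates only once, which is the behavioural difference the patch
introduces.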




 Here the requested information:
 
 HW:
 023:00:0: Intel 82599 (SFP+) 10 GbE Controller (ethernet network, revision 0x01)
 023:00:1: Intel 82599 (SFP+) 10 GbE Controller (ethernet network, revision 0x01)
 
 Driver:
 ixg0 at pci14 dev 0 function 0: Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 2.4.5
 ixg0: interrupting at ioapic0 pin 19
 ixg0: PCI Express Bus: Speed 2.5Gb/s Width x8
 ixg1 at pci14 dev 0 function 1: Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 2.4.5
 ixg1: interrupting at ioapic0 pin 16
 ifmedia_match: multiple match for 0x20/0xfff, selected instance 0
 ixg1: PCI Express Bus: Speed 2.5Gb/s Width x8

One of my cards is:

011:00:0: Intel 82599 (SFI/SFP+) 10 GbE Controller (ethernet network, revision 0x01)

It's the same as yours.


 ifconfig:
 
 ixg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
 capabilities=bff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
 capabilities=bff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
 capabilities=bff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,LRO>
 enabled=0
 ec_capabilities=7<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU>
 ec_enabled=7<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU>
 address: a0:36:9f:26:95:04
 media: Ethernet autoselect (10GbaseSR full-duplex)
 status: active
 input: 64405 packets, 5192777 bytes, 700 multicasts, 2459 unknown protocol
 output: 7 packets, 1138 bytes, 3 multicasts
 inet 0.0.0.0 netmask 0xff00 broadcast 255.255.255.255
 inet6 fe80::a236:9fff:fe26:9504%ixg0 prefixlen 64 scopeid 0x1
 
 vlan8: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
 capabilities=3ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
 capabilities=3ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
 capabilities=3ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx>
 enabled=0
 vlan: 8 parent: ixg0
 address: a0:36:9f:26:95:04
 input: 0 packets, 0 bytes
 output: 3 packets, 250 bytes, 3 multicasts
 inet6 fe80::a236:9fff:fe26:9504%vlan8 prefixlen 64 scopeid 0x4
 
 You can see, the input counter is 0. tcpdump -i vlan8 shows no packets. But 
 tcpdump -evi ixg0 shows tagged packets for vlan 8:
 
 e.g.:
 23:26:13.880538 a2:de:48:00:00:0e > ff:ff:ff:ff:ff:ff, ethertype 802.1Q 
 (0x8100), length 64: vlan 8, p 0, ethertype ARP, Request who-has 
 139.18.13.212 tell 139.18.13.254, length 46
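
For reference, the "vlan 8, p 0" part of the tcpdump line above comes from
the 802.1Q tag that follows the two MAC addresses in the frame. A minimal
sketch of decoding that tag is shown below. The struct and function names
are hypothetical and this is not driver code; it only illustrates the
standard tag layout (ethertype 0x8100, then a 16-bit field whose low 12 bits
are the VLAN ID and whose top 3 bits are the priority).

```c
#include <stddef.h>
#include <stdint.h>

#define ETHERTYPE_8021Q	0x8100u

struct vlan_info {
	int tagged;		/* 1 if an 802.1Q tag was found */
	uint16_t vid;		/* VLAN ID, low 12 bits of the tag */
	uint8_t pcp;		/* priority, top 3 bits of the tag */
	uint16_t inner_type;	/* ethertype of the encapsulated payload */
};

/* Decode the 802.1Q tag of a raw Ethernet frame, if there is one. */
static struct vlan_info
parse_vlan(const uint8_t *frame, size_t len)
{
	struct vlan_info vi = { 0, 0, 0, 0 };
	uint16_t type, tci;

	/* 6 + 6 bytes of MAC addresses, 4 bytes of tag, 2 bytes of type. */
	if (len < 18)
		return vi;

	type = (uint16_t)((frame[12] << 8) | frame[13]);
	if (type != ETHERTYPE_8021Q)
		return vi;

	tci = (uint16_t)((frame[14] << 8) | frame[15]);
	vi.tagged = 1;
	vi.vid = tci & 0x0fff;
	vi.pcp = (uint8_t)(tci >> 13);
	vi.inner_type = (uint16_t)((frame[16] << 8) | frame[17]);
	return vi;
}
```

For the ARP request shown above, such a decoder would report a tagged frame
with VLAN ID 8, priority 0 and inner ethertype 0x0806 (ARP).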

On my machine:

ixg1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
capabilities=fff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=fff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=fff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6,LRO>
enabled=0
ec_capabilities=7<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU>
ec_enabled=7<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU>
address: 00:1b:21:9b:4e:d6
media: Ethernet autoselect (1000baseT full-duplex)
status: active
input: 3 packets, 284 bytes
output: 14 packets, 1244 bytes, 5 multicasts
inet6 fe80::21b:21ff:fe9b:4ed6%ixg1 prefixlen 64 scopeid 0x4

vlan8: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
enabled=0
vlan: 8 parent: ixg1
address: 00:1b:21:9b:4e:d6
input: 3 packets, 260 bytes
output: 9 packets, 702 bytes, 5 multicasts
inet 10.0.0.2 netmask 0xff00 broadcast 10.255.255.255
inet6 fe80::21b:21ff:fe9b:4ed6%vlan8 prefixlen 64 scopeid 0x6

and pinging 10.0.0.2 from another machine makes vlan8's input counter
increment.

 Could you test again?

 Thank you for your efforts
 
 
 Regards
 Uwe
 


-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: current status of ixg(4)

2015-03-21 Thread SAITOH Masanobu
Hi.

On 2015/03/21 18:55, 6b...@6bone.informatik.uni-leipzig.de wrote:
 On Fri, 20 Mar 2015, Masanobu SAITOH wrote:
 
 Date: Fri, 20 Mar 2015 17:38:03 +0900
 From: Masanobu SAITOH <msai...@execsw.org>
 To: current-users@NetBSD.org
 Cc: msai...@execsw.org
 Subject: current status of ixg(4)

 Hello.

 Yesterday, I committed some changes to ixg(4) on -current.

 http://mail-index.netbsd.org/source-changes/2015/03/19/msg064110.html

 I'll wait a few days for feedback on this change, and then
 I'll send a pullup request to pullup-7@.

 
 I have applied the patch on -current. The build fails with:
 
 ixgbe_api.o: In function `ixgbe_init_shared_code':
 ixgbe_api.c:(.text+0x16d): undefined reference to `ixgbe_init_ops_X540'
 
 
 Regards
 Uwe

Sorry, I forgot to add a diff of files.pci.

New patch:

http://www.netbsd.org/~msaitoh/ixg-20150321-0.dif

Could you try with this patch again?

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


Re: interface send-q stall in 6.99.40?

2014-05-12 Thread SAITOH Masanobu
Hi, Frank.

(2014/05/13 4:56), Frank Kardel wrote:
 Hi,
 
 I have two observations on 6.99.44 (amd64/evbarm) where a wm-interface 
 send-queue is filled to the max. sendto()-calls terminate with ENOBUFS.
 
 net.interfaces.wm3.sndq.len = 256
 net.interfaces.wm3.sndq.maxlen = 256
 net.interfaces.wm3.sndq.drops = 20007
 
 the interface status is:
 wm3: flags=8a43<UP,BROADCAST,RUNNING,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
 capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
 capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
 capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
 enabled=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
 enabled=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
 enabled=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
 ec_capabilities=7<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU>
 ec_enabled=0
 address: 00:00:xx:xx:xx:xx
 media: Ethernet autoselect (1000baseT full-duplex,flowcontrol,master,rxpause,txpause)
 status: active
 status: active
 [addresses skipped]
 
 traceroute packets from outside look like they are answered, but ICMP ECHO is 
 not answered.
 
 The interface recovers with an ifconfig wmX down/up.

 Could you show me the dmesg?

 While I do not know how to provoke this on wm (just happens once a week). I 
 found the same phenomenon occurring on a Raspberry Pi when detaching the 
 cable. The same symptoms occur there and ifconfig usmsc0 down/up will recover.
 
 Is anybody else seeing this?

 No, I'm not.

 One way to diagnose wm(4) is the WM_EVENT_COUNTERS option.
Could you enable the option and check with vmstat -ev?


 Best regards,
   Frank


-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


bge jumbo patch for netbsd-6

2013-08-29 Thread SAITOH Masanobu
 rev. 1.86.
- Document says 5717 and newer chips have no BGE_PCISTATE_INTR_NOT_ACTIVE bit,
  so don't use the bit on those chips. Same as OpenBSD.
- Fix a bug that the PHY address bits in the MI_MODE register were wrongly
  cleared. Set the PHY address correctly.
- Use BGE_SETBIT() instead of CSR_WRITE_4() for the BGE_MISC_LOCAL_CTL register
  so as not to modify some GPIO bits.
- Call bge_poll_fw() before writing the BGE_MODE_CTL register, like the
  latest Linux tg3 driver.
- Set the DMA watermark depending on the PCI max payload size.
- Add the BGE_JUMBO_CAPABLE flag to some chips. With this commit, 5714, 5780,
  5717, 5718, 5719 (excluding rev. A0), 5720, 57765 and 57766 gain
  jumbo frame support.
- Fix the setting of sc->bge_flags for 5717 and newer devices.
- Fix a link detect bug on non-autopoll systems. Same as OpenBSD
  (rev.1.329 and 1.336) and FreeBSD (r213710).
- 57765 series is not based on 5717 series. 5717 series is based on 57765
  series.
- Set the TX DMA segment size based on the MTU size.
- Change the TX ring size for 5717 series and 57764 series.
- For 57766, set BGE_RDMAMODE_JMB_2K_MMRR for non-jumbo frame.
  Same as Linux tg3.
- For 57765 and newer devices, set BGE_MAX_RX_FRAME_LOWAT to 1.
  This value is recommended by the document.
- Change sysctl related functions for consistency.
- Style change.
- Use macro. Remove duplicated macro. Remove unused variable.
- Fix comments. Add comments.
- Remove extra semicolon. Remove unused code.
===
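
The BGE_SETBIT() item in the list above is about read-modify-write: a plain
register write replaces every bit, while a set-bit helper ORs new bits into
the current value and so leaves unrelated bits (the GPIO bits in that case)
alone. A minimal sketch with a simulated register follows. The helper names
and the variable fake_reg are hypothetical, and this is not the if_bge.c
macro definition.

```c
#include <stdint.h>

static uint32_t fake_reg;	/* simulated 32-bit device register */

static uint32_t
reg_read(void)
{
	return fake_reg;
}

/* Plain write: every bit of the register is replaced. */
static void
reg_write(uint32_t v)
{
	fake_reg = v;
}

/* Read-modify-write: only the requested bits are turned on. */
static void
reg_setbit(uint32_t bits)
{
	reg_write(reg_read() | bits);
}
```

Starting from a register value of 0x00f0, reg_write(0x0001) leaves 0x0001
and loses the upper bits, while reg_setbit(0x0001) leaves 0x00f1.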


-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)