date:20200611

Re: Interfaces errors and latency spikes with Intel 82583V

2020-06-11 Thread Gabri Tofano


Yes, this is today without resetting the interface:

#netstat -ie
Name    Mtu   Network     Address              Ipkts Ierrs    Opkts Oerrs Colls
em0     1500  <Link>      XX:XX:XX:XX:XX:XX  5351463  1868  3016695     0     0
em0     1500  XX:XX:XX:XX XX:XX:XX:XX:XX:XX  5351463  1868  3016695     0     0
em1     1500  <Link>      XX:XX:XX:XX:XX:XX  2839738     0  5147702     0     0
em1     1500  172.16.200. XX:XX:XX:XX:XX:XX  2839738     0  5147702     0     0
em2     1500  <Link>      XX:XX:XX:XX:XX:XX    46977     0    44135     0     0
em2     1500  172.16.103/ XX:XX:XX:XX:XX:XX    46977     0    44135     0     0
em3*    1500  <Link>      00:e0:67:10:9d:97        0     0        0     0     0
enc0*   0     <Link>                               0     0        0     0     0
pflog0  33136 <Link>                               0     0   128982     0     0



On 2020-06-11 20:29, David Gwynne wrote:

Is it consistently Ierrs?

dlg


On 11 Jun 2020, at 10:14 pm, Gabri Tofano  wrote:

#netstat -id
Name    Mtu   Network     Address              Ipkts Idrop    Opkts Odrop Colls
em0     1500  <Link>      XX:XX:XX:XX:XX:XX   266894     0   202813     0     0
em0     1500  XX.XX.XX.XX XX:XX:XX:XX:XX:XX   266894     0   202813     0     0
em1     1500  <Link>      XX:XX:XX:XX:XX:XX   170280     0   230226     1     0
em1     1500  172.16.200. XX:XX:XX:XX:XX:XX   170280     0   230226     1     0
em2     1500  <Link>      XX:XX:XX:XX:XX:XX    15788     0    13249     2     0
em2     1500  172.16.103/ XX:XX:XX:XX:XX:XX    15788     0    13249     2     0
em3*    1500  <Link>      XX:XX:XX:XX:XX:XX        0     0        0     0     0
enc0*   0     <Link>                               0     0        0     0     0
pflog0  33136 <Link>                               0     0    29771     0     0


#netstat -ie
Name    Mtu   Network     Address              Ipkts Ierrs    Opkts Oerrs Colls
em0     1500  <Link>      XX:XX:XX:XX:XX:XX   269713    72   205469     0     0
em0     1500  XX.XX.XX.XX XX:XX:XX:XX:XX:XX   269713    72   205469     0     0
em1     1500  <Link>      XX:XX:XX:XX:XX:XX   172137     0   232148     0     0
em1     1500  172.16.200. XX:XX:XX:XX:XX:XX   172137     0   232148     0     0
em2     1500  <Link>      XX:XX:XX:XX:XX:XX    15892     0    13316     0     0
em2     1500  172.16.103/ XX:XX:XX:XX:XX:XX    15892     0    13316     0     0
em3*    1500  <Link>      XX:XX:XX:XX:XX:XX        0     0        0     0     0
enc0*   0     <Link>                               0     0        0     0     0
pflog0  33136 <Link>                               0     0    30174     0     0



#systat queues
QUEUE            BW/FL SCH     PKTS     BYTES  DROP_P  DROP_B QLEN BORROW SUSPEN P/S B/S
main on em0       120M fifo       0         0       0       0    0
 defq             100M fifo  139394  21574411       0       0    0
 voip              10M fifo   34699   4949635       0       0    0
 games             10M fifo   32277   2460807       0       0    0


Thank you!
Gabri

Re: Interfaces errors and latency spikes with Intel 82583V

2020-06-11 Thread David Gwynne

are there any config options on the switch side relating to flow control you
can try turning off? are there any counters for pause frames on the switch side 
too?

dlg

> On 12 Jun 2020, at 12:16 pm, Gabri Tofano  wrote:
> 
> Apparently it is not:
> 
> #ifconfig em0 hwfeatures
> em0: flags=808843 mtu 1500
>         hwfeatures=36 hardmtu 9216
>         lladdr XX:XX:XX:XX:XX:XX
>         index 1 priority 0 llprio 3
>         groups: egress
>         media: Ethernet autoselect (1000baseT full-duplex)
>         status: active
>         inet XX:XX:XX:XX netmask 0xff00 broadcast XX:XX:XX:XX
> 
> 
> On 2020-06-11 21:57, David Gwynne wrote:
>> Is flow control enabled? Can you try disabling rxpause and txpause?
>>> On 12 Jun 2020, at 10:36 am, Gabri Tofano  wrote:
>>> Yes, this is today without resetting the interface:
>>> #netstat -ie
>>> Name    Mtu   Network     Address              Ipkts Ierrs    Opkts Oerrs Colls
>>> em0     1500  <Link>      XX:XX:XX:XX:XX:XX  5351463  1868  3016695     0     0
>>> em0     1500  XX:XX:XX:XX XX:XX:XX:XX:XX:XX  5351463  1868  3016695     0     0
>>> em1     1500  <Link>      XX:XX:XX:XX:XX:XX  2839738     0  5147702     0     0
>>> em1     1500  172.16.200. XX:XX:XX:XX:XX:XX  2839738     0  5147702     0     0
>>> em2     1500  <Link>      XX:XX:XX:XX:XX:XX    46977     0    44135     0     0
>>> em2     1500  172.16.103/ XX:XX:XX:XX:XX:XX    46977     0    44135     0     0
>>> em3*    1500  <Link>      00:e0:67:10:9d:97        0     0        0     0     0
>>> enc0*   0     <Link>                               0     0        0     0     0
>>> pflog0  33136 <Link>                               0     0   128982     0     0
>>> On 2020-06-11 20:29, David Gwynne wrote:
>>>> Is it consistently Ierrs?
>>>> dlg
>>>>> On 11 Jun 2020, at 10:14 pm, Gabri Tofano  wrote:
>>>>> #netstat -id
>>>>> Name    Mtu   Network     Address              Ipkts Idrop    Opkts Odrop Colls
>>>>> em0     1500  <Link>      XX:XX:XX:XX:XX:XX   266894     0   202813     0     0
>>>>> em0     1500  XX.XX.XX.XX XX:XX:XX:XX:XX:XX   266894     0   202813     0     0
>>>>> em1     1500  <Link>      XX:XX:XX:XX:XX:XX   170280     0   230226     1     0
>>>>> em1     1500  172.16.200. XX:XX:XX:XX:XX:XX   170280     0   230226     1     0
>>>>> em2     1500  <Link>      XX:XX:XX:XX:XX:XX    15788     0    13249     2     0
>>>>> em2     1500  172.16.103/ XX:XX:XX:XX:XX:XX    15788     0    13249     2     0
>>>>> em3*    1500  <Link>      XX:XX:XX:XX:XX:XX        0     0        0     0     0
>>>>> enc0*   0     <Link>                               0     0        0     0     0
>>>>> pflog0  33136 <Link>                               0     0    29771     0     0
>>>>> #netstat -ie
>>>>> Name    Mtu   Network     Address              Ipkts Ierrs    Opkts Oerrs Colls
>>>>> em0     1500  <Link>      XX:XX:XX:XX:XX:XX   269713    72   205469     0     0
>>>>> em0     1500  XX.XX.XX.XX XX:XX:XX:XX:XX:XX   269713    72   205469     0     0
>>>>> em1     1500  <Link>      XX:XX:XX:XX:XX:XX   172137     0   232148     0     0
>>>>> em1     1500  172.16.200. XX:XX:XX:XX:XX:XX   172137     0   232148     0     0
>>>>> em2     1500  <Link>      XX:XX:XX:XX:XX:XX    15892     0    13316     0     0
>>>>> em2     1500  172.16.103/ XX:XX:XX:XX:XX:XX    15892     0    13316     0     0
>>>>> em3*    1500  <Link>      XX:XX:XX:XX:XX:XX        0     0        0     0     0
>>>>> enc0*   0     <Link>                               0     0        0     0     0
>>>>> pflog0  33136 <Link>                               0     0    30174     0     0
>>>>> #systat queues
>>>>> QUEUE            BW/FL SCH     PKTS     BYTES  DROP_P  DROP_B QLEN BORROW SUSPEN P/S B/S
>>>>> main on em0       120M fifo       0         0       0       0    0
>>>>>  defq             100M fifo  139394  21574411       0       0    0
>>>>>  voip              10M fifo   34699   4949635       0       0    0
>>>>>  games             10M fifo   32277   2460807       0       0    0
>>>>> Thank you!
>>>>> Gabri

Re: Not correctly supported on OpenBSD 6.7: HPE 10/25Gb 2p 640FLR-SFP28 network adapter on HPE DL380 Gen10 servers

2020-06-11 Thread Jonathan Matthew

On Fri, Jun 12, 2020 at 12:13:42AM +0200, Mark Schneider wrote:
> Hello
> 
> 
> Even though the 640FLR-SFP28 network adapter is listed in the "pcidump -v"
> output on OpenBSD 6.7, there are no entries for its interfaces in the output
> of "ifconfig -a".
> 
> # -
> obsd67a1# grep "^ [0-9]" OBSD67-pcidump-v.txt | grep -v Intel
>  1:0:0: Hewlett-Packard iLO3 Slave
>  1:0:1: Matrox unknown
>  1:0:2: Hewlett-Packard iLO3 Management
>  1:0:4: Hewlett-Packard unknown
>  2:0:0: Broadcom BCM5719
>  2:0:1: Broadcom BCM5719
>  2:0:2: Broadcom BCM5719
>  2:0:3: Broadcom BCM5719
>  18:0:0: Adaptec unknown
>  177:0:0: Adaptec unknown
>  178:0:0: Mellanox ConnectX-4 Lx
>  178:0:1: Mellanox ConnectX-4 Lx
> # -

Can you try -current please?  This system will likely work better with
support for acpi pci host bridges, which was not enabled in 6.7.

Re: Interfaces errors and latency spikes with Intel 82583V

2020-06-11 Thread David Gwynne

Is flow control enabled? Can you try disabling rxpause and txpause?

> On 12 Jun 2020, at 10:36 am, Gabri Tofano  wrote:
> 
> Yes, this is today without resetting the interface:
> 
> #netstat -ie
> Name    Mtu   Network     Address              Ipkts Ierrs    Opkts Oerrs Colls
> em0     1500  <Link>      XX:XX:XX:XX:XX:XX  5351463  1868  3016695     0     0
> em0     1500  XX:XX:XX:XX XX:XX:XX:XX:XX:XX  5351463  1868  3016695     0     0
> em1     1500  <Link>      XX:XX:XX:XX:XX:XX  2839738     0  5147702     0     0
> em1     1500  172.16.200. XX:XX:XX:XX:XX:XX  2839738     0  5147702     0     0
> em2     1500  <Link>      XX:XX:XX:XX:XX:XX    46977     0    44135     0     0
> em2     1500  172.16.103/ XX:XX:XX:XX:XX:XX    46977     0    44135     0     0
> em3*    1500  <Link>      00:e0:67:10:9d:97        0     0        0     0     0
> enc0*   0     <Link>                               0     0        0     0     0
> pflog0  33136 <Link>                               0     0   128982     0     0
> 
> 
> On 2020-06-11 20:29, David Gwynne wrote:
>> Is it consistently Ierrs?
>> dlg
>>> On 11 Jun 2020, at 10:14 pm, Gabri Tofano  wrote:
>>> #netstat -id
>>> Name    Mtu   Network     Address              Ipkts Idrop    Opkts Odrop Colls
>>> em0     1500  <Link>      XX:XX:XX:XX:XX:XX   266894     0   202813     0     0
>>> em0     1500  XX.XX.XX.XX XX:XX:XX:XX:XX:XX   266894     0   202813     0     0
>>> em1     1500  <Link>      XX:XX:XX:XX:XX:XX   170280     0   230226     1     0
>>> em1     1500  172.16.200. XX:XX:XX:XX:XX:XX   170280     0   230226     1     0
>>> em2     1500  <Link>      XX:XX:XX:XX:XX:XX    15788     0    13249     2     0
>>> em2     1500  172.16.103/ XX:XX:XX:XX:XX:XX    15788     0    13249     2     0
>>> em3*    1500  <Link>      XX:XX:XX:XX:XX:XX        0     0        0     0     0
>>> enc0*   0     <Link>                               0     0        0     0     0
>>> pflog0  33136 <Link>                               0     0    29771     0     0
>>> #netstat -ie
>>> Name    Mtu   Network     Address              Ipkts Ierrs    Opkts Oerrs Colls
>>> em0     1500  <Link>      XX:XX:XX:XX:XX:XX   269713    72   205469     0     0
>>> em0     1500  XX.XX.XX.XX XX:XX:XX:XX:XX:XX   269713    72   205469     0     0
>>> em1     1500  <Link>      XX:XX:XX:XX:XX:XX   172137     0   232148     0     0
>>> em1     1500  172.16.200. XX:XX:XX:XX:XX:XX   172137     0   232148     0     0
>>> em2     1500  <Link>      XX:XX:XX:XX:XX:XX    15892     0    13316     0     0
>>> em2     1500  172.16.103/ XX:XX:XX:XX:XX:XX    15892     0    13316     0     0
>>> em3*    1500  <Link>      XX:XX:XX:XX:XX:XX        0     0        0     0     0
>>> enc0*   0     <Link>                               0     0        0     0     0
>>> pflog0  33136 <Link>                               0     0    30174     0     0
>>> #systat queues
>>> QUEUE            BW/FL SCH     PKTS     BYTES  DROP_P  DROP_B QLEN BORROW SUSPEN P/S B/S
>>> main on em0       120M fifo       0         0       0       0    0
>>>  defq             100M fifo  139394  21574411       0       0    0
>>>  voip              10M fifo   34699   4949635       0       0    0
>>>  games             10M fifo   32277   2460807       0       0    0
>>> Thank you!
>>> Gabri

Re: Interfaces errors and latency spikes with Intel 82583V

2020-06-11 Thread David Gwynne

Is it consistently Ierrs?

dlg

> On 11 Jun 2020, at 10:14 pm, Gabri Tofano  wrote:
> 
> #netstat -id
> Name    Mtu   Network     Address              Ipkts Idrop    Opkts Odrop Colls
> em0     1500  <Link>      XX:XX:XX:XX:XX:XX   266894     0   202813     0     0
> em0     1500  XX.XX.XX.XX XX:XX:XX:XX:XX:XX   266894     0   202813     0     0
> em1     1500  <Link>      XX:XX:XX:XX:XX:XX   170280     0   230226     1     0
> em1     1500  172.16.200. XX:XX:XX:XX:XX:XX   170280     0   230226     1     0
> em2     1500  <Link>      XX:XX:XX:XX:XX:XX    15788     0    13249     2     0
> em2     1500  172.16.103/ XX:XX:XX:XX:XX:XX    15788     0    13249     2     0
> em3*    1500  <Link>      XX:XX:XX:XX:XX:XX        0     0        0     0     0
> enc0*   0     <Link>                               0     0        0     0     0
> pflog0  33136 <Link>                               0     0    29771     0     0
> 
> #netstat -ie
> Name    Mtu   Network     Address              Ipkts Ierrs    Opkts Oerrs Colls
> em0     1500  <Link>      XX:XX:XX:XX:XX:XX   269713    72   205469     0     0
> em0     1500  XX.XX.XX.XX XX:XX:XX:XX:XX:XX   269713    72   205469     0     0
> em1     1500  <Link>      XX:XX:XX:XX:XX:XX   172137     0   232148     0     0
> em1     1500  172.16.200. XX:XX:XX:XX:XX:XX   172137     0   232148     0     0
> em2     1500  <Link>      XX:XX:XX:XX:XX:XX    15892     0    13316     0     0
> em2     1500  172.16.103/ XX:XX:XX:XX:XX:XX    15892     0    13316     0     0
> em3*    1500  <Link>      XX:XX:XX:XX:XX:XX        0     0        0     0     0
> enc0*   0     <Link>                               0     0        0     0     0
> pflog0  33136 <Link>                               0     0    30174     0     0
> 
> 
> #systat queues
> QUEUE            BW/FL SCH     PKTS     BYTES  DROP_P  DROP_B QLEN BORROW SUSPEN P/S B/S
> main on em0       120M fifo       0         0       0       0    0
>  defq             100M fifo  139394  21574411       0       0    0
>  voip              10M fifo   34699   4949635       0       0    0
>  games             10M fifo   32277   2460807       0       0    0
> 
> Thank you!
> Gabri

Re: i915_request_create+0x4b: uvm_fault

2020-06-11 Thread Mark Kettenis

> Date: Thu, 11 Jun 2020 19:35:48 +0100
> From: Stuart Henderson 
> 
> And this, I was resizing a mupdf window at the time.

Should already be fixed.  See jsg's commit earlier today.

> uvm_fault(0xfd87039e5890, 0x58, 0, 1) -> e
> kernel: page fault trap, code=0
> Stopped at  intel_partial_pages+0xf4:   movq0x58,%rsi
> ddb{0}> ps /o
> TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
>   98313  29118   10000x12  0x4001  firefox
>  513012  74234   10000x12  03  i3
>  256402  98134   1000 0x2  02  xcompmgr
> *495496  15497 350x12  00K Xorg
> ddb{0}> tr
> intel_partial_pages(81077028,80f32300) at intel_partial_pages+0xf4
> ggtt_set_pages(81076e70) at ggtt_set_pages+0x5f
> i915_vma_pin(81076e70,0,0,40a) at i915_vma_pin+0x406
> i915_gem_object_ggtt_pin(80f32300,80003428f9e0,0,0,a) at i915_gem_object_ggtt_pin+0xd8
> i915_gem_fault(80f32300,80003428fcf8,101925000,b5feaf63000,80003428fc00,1) at i915_gem_fault+0x3ee
> drm_fault(80003428fcf8,b5feaf63000,80003428fc00,1,0,0) at drm_fault+0x155
> uvm_fault(fd87039e5890,b5feaf63000,0,2) at uvm_fault+0x6db
> pageflttrap(80003428fe70,1) at pageflttrap+0xfb
> usertrap(80003428fe70) at usertrap+0x1b0
> recall_trap() at recall_trap+0x8
> end of kernel
> end trace frame: 0x7f7cf710, count: -10
> ddb{0}> sh reg
> rdi   0x1000__ALIGN_SIZE
> rsi   0x1000__ALIGN_SIZE
> rbp   0x80003428f870
> rbx0
> rdx0
> rcx   0x81102000
> rax   0x81877120
> r80x
> r9  0x91
> r100x26fd86352fccbe7
> r11   0x9a94d7a63a7ccb0b
> r12   0x80f32300
> r13   0x81077028
> r14   0x816e1c60
> r15 0xf5
> rip   0x818b4734intel_partial_pages+0xf4
> cs   0x8
> rflags   0x10287__ALIGN_SIZE+0xf287
> rsp   0x80003428f820
> ss 0
> intel_partial_pages+0xf4:   movq0x58,%rsi
> ddb{0}> mach ddbcpu 1
> Stopped at  x86_ipi_db+0x12:leave
> ddb{1}> tr
> x86_ipi_db(800022411ff0) at x86_ipi_db+0x12
> x86_ipi_handler() at x86_ipi_handler+0x80
> Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
> _kernel_lock() at _kernel_lock+0xa2
> dofilewritev(8000349aeb18,4f,800034a8dc98,0,800034a8dd70) at dofilewritev+0xf9
> sys_write(8000349aeb18,800034a8dd10,800034a8dd70) at sys_write+0x51
> syscall(800034a8dde0) at syscall+0x3b9
> Xsyscall() at Xsyscall+0x128
> end of kernel
> end trace frame: 0x1e495ef0a440, count: -8
> ddb{1}> sh reg
> rdi   0x800022411ff0
> rsi0
> rbp   0x800034a8da80
> rbx   0x820db068ipifunc+0x38
> rdx0
> rcx  0x7
> rax   0xff7f
> r8 0
> r9 0
> r100
> r11   0x3233edf10ac64205
> r12  0x7
> r130
> r14   0x800022411ff0
> r150
> rip   0x81a79542x86_ipi_db+0x12
> cs   0x8
> rflags 0x206
> rsp   0x800034a8da70
> ss  0x10
> x86_ipi_db+0x12:leave
> ddb{1}> mach ddbcpu 2
> Stopped at  x86_ipi_db+0x12:leave
> ddb{2}> tr
> x86_ipi_db(800022422ff0) at x86_ipi_db+0x12
> x86_ipi_handler() at x86_ipi_handler+0x80
> Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
> _kernel_lock() at _kernel_lock+0xa9
> Xsyscall() at Xsyscall+0x128
> end of kernel
> end trace frame: 0x7f7c3750, count: -5
> ddb{2}> sh reg
> rdi   0x800022422ff0
> rsi0
> rbp   0x8000344374a0
> rbx   0x820db068ipifunc+0x38
> rdx0
> rcx  0x7
> rax   0xff7f
> r8 0
> r9 0
> r100
> r11   0x3233edf10ac64205
> r12  0x7
> r130
> r14   0x800022422ff0
> r150
> rip   0x81a79542x86_ipi_db+0x12
> cs   0x8
> rflags 0x206
> rsp   0x800034437490
> ss

Re: i915_request_create+0x4b: uvm_fault

2020-06-11 Thread Stuart Henderson

And this, I was resizing a mupdf window at the time.


uvm_fault(0xfd87039e5890, 0x58, 0, 1) -> e
kernel: page fault trap, code=0
Stopped at  intel_partial_pages+0xf4:   movq0x58,%rsi
ddb{0}> ps /o
TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
  98313  29118   10000x12  0x4001  firefox
 513012  74234   10000x12  03  i3
 256402  98134   1000 0x2  02  xcompmgr
*495496  15497 350x12  00K Xorg
ddb{0}> tr
intel_partial_pages(81077028,80f32300) at intel_partial_pages+0xf4
ggtt_set_pages(81076e70) at ggtt_set_pages+0x5f
i915_vma_pin(81076e70,0,0,40a) at i915_vma_pin+0x406
i915_gem_object_ggtt_pin(80f32300,80003428f9e0,0,0,a) at i915_gem_object_ggtt_pin+0xd8
i915_gem_fault(80f32300,80003428fcf8,101925000,b5feaf63000,80003428fc00,1) at i915_gem_fault+0x3ee
drm_fault(80003428fcf8,b5feaf63000,80003428fc00,1,0,0) at drm_fault+0x155
uvm_fault(fd87039e5890,b5feaf63000,0,2) at uvm_fault+0x6db
pageflttrap(80003428fe70,1) at pageflttrap+0xfb
usertrap(80003428fe70) at usertrap+0x1b0
recall_trap() at recall_trap+0x8
end of kernel
end trace frame: 0x7f7cf710, count: -10
ddb{0}> sh reg
rdi   0x1000__ALIGN_SIZE
rsi   0x1000__ALIGN_SIZE
rbp   0x80003428f870
rbx0
rdx0
rcx   0x81102000
rax   0x81877120
r80x
r9  0x91
r100x26fd86352fccbe7
r11   0x9a94d7a63a7ccb0b
r12   0x80f32300
r13   0x81077028
r14   0x816e1c60
r15 0xf5
rip   0x818b4734intel_partial_pages+0xf4
cs   0x8
rflags   0x10287__ALIGN_SIZE+0xf287
rsp   0x80003428f820
ss 0
intel_partial_pages+0xf4:   movq0x58,%rsi
ddb{0}> mach ddbcpu 1
Stopped at  x86_ipi_db+0x12:leave
ddb{1}> tr
x86_ipi_db(800022411ff0) at x86_ipi_db+0x12
x86_ipi_handler() at x86_ipi_handler+0x80
Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
_kernel_lock() at _kernel_lock+0xa2
dofilewritev(8000349aeb18,4f,800034a8dc98,0,800034a8dd70) at dofilewritev+0xf9
sys_write(8000349aeb18,800034a8dd10,800034a8dd70) at sys_write+0x51
syscall(800034a8dde0) at syscall+0x3b9
Xsyscall() at Xsyscall+0x128
end of kernel
end trace frame: 0x1e495ef0a440, count: -8
ddb{1}> sh reg
rdi   0x800022411ff0
rsi0
rbp   0x800034a8da80
rbx   0x820db068ipifunc+0x38
rdx0
rcx  0x7
rax   0xff7f
r8 0
r9 0
r100
r11   0x3233edf10ac64205
r12  0x7
r130
r14   0x800022411ff0
r150
rip   0x81a79542x86_ipi_db+0x12
cs   0x8
rflags 0x206
rsp   0x800034a8da70
ss  0x10
x86_ipi_db+0x12:leave
ddb{1}> mach ddbcpu 2
Stopped at  x86_ipi_db+0x12:leave
ddb{2}> tr
x86_ipi_db(800022422ff0) at x86_ipi_db+0x12
x86_ipi_handler() at x86_ipi_handler+0x80
Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
_kernel_lock() at _kernel_lock+0xa9
Xsyscall() at Xsyscall+0x128
end of kernel
end trace frame: 0x7f7c3750, count: -5
ddb{2}> sh reg
rdi   0x800022422ff0
rsi0
rbp   0x8000344374a0
rbx   0x820db068ipifunc+0x38
rdx0
rcx  0x7
rax   0xff7f
r8 0
r9 0
r100
r11   0x3233edf10ac64205
r12  0x7
r130
r14   0x800022422ff0
r150
rip   0x81a79542x86_ipi_db+0x12
cs   0x8
rflags 0x206
rsp   0x800034437490
ss  0x10
x86_ipi_db+0x12:leave
ddb{2}> mach ddbcpu 3
Stopped at  x86_ipi_db+0x12:leave
ddb{3}> tr
x86_ipi_db(80002242bff0) at x86_ipi_db+0x12
x86_ipi_handler() at x86_ipi_handler+0x80
Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
_kernel_lock() at _kernel_lock+0xa2
sosend(fd866e5b9b28,0,8000344af6a0,0,0,80) at

Re: man double free

2020-06-11 Thread Ingo Schwarze

Hi Klemens,

Klemens Nanni wrote on Thu, Jun 11, 2020 at 06:32:46PM +0200:
> On Thu, Jun 11, 2020 at 06:03:20PM +0200, Ingo Schwarze wrote:

>> Indeed, i came up with exactly the same diff and had already
>> written this commit message, but was still testing:
>> 
>>   Fix a regression in rev. 1.238 (2019/07/26):
>>   Pass the right object to html_reset() or it will crash 
>>   when rendering more than one manual page to HTML in a row.
>>   Bug reported by Abel Romero Perez .

> Wasn't this just reshuffling the already existing html_reset() which
> was introduced in r1.222?

Oh right, i only looked far enough to see that the argument
of html_reset() changed in that commit, but you are right that
the previous argument (curp) wasn't correct either.  So it seems
it was even broken a few months longer.  No matter, now at least
it has the right argument.

Thanks for having another look,
  Ingo

Re: man double free

2020-06-11 Thread Theo de Raadt

Ingo Schwarze  wrote:

> Hi,
> 
> Theo de Raadt wrote on Thu, Jun 11, 2020 at 10:12:47AM -0600:
> > Romero Perez, Abel  wrote:
> 
> >> I suggest only to have a look into better measures of security by
> >> researching optimization flags, to find an equilibrium of optimization
> >> and security.
> 
> > Romero, that is bullshit.
> 
> However, there is something i ought to do to make such bugs less
> likely: Remove the last vestigial type-unsafe pointer handling.
> That was designed a decade ago with an excessive focus on flexibility
> when the scope of the program was not yet clear.  A typical example
> of over-abstraction.  When you don't know yet how general your code
> might need to be, write specific code first.  If it turns out
> additional situations need to be handled, consider generalizing it
> (and again, don't go overboard).  Never invent abstractions "because
> just in case".
> 
> If we would need many dozens of different output formats, and people
> would want to plug in new ones at run time or something crazy like
> that, the abstraction implemented with these void pointers might
> have a point.  But now that we know that less than a dozen output
> formats are really needed, and that they are all very stable, there
> are very likely ways to improve this code, making it more robust
> and less error-prone.

No way Ingo, you should carefully use the compiler -O option!
It is the way to security, expert Romero has spoken!

Re: man double free

2020-06-11 Thread Ingo Schwarze

Hi,

Theo de Raadt wrote on Thu, Jun 11, 2020 at 10:12:47AM -0600:
> Romero Perez, Abel  wrote:

>> I suggest only to have a look into better measures of security by
>> researching optimization flags, to find an equilibrium of optimization
>> and security.

> Romero, that is bullshit.

However, there is something i ought to do to make such bugs less
likely: Remove the last vestigial type-unsafe pointer handling.
That was designed a decade ago with an excessive focus on flexibility
when the scope of the program was not yet clear.  A typical example
of over-abstraction.  When you don't know yet how general your code
might need to be, write specific code first.  If it turns out
additional situations need to be handled, consider generalizing it
(and again, don't go overboard).  Never invent abstractions "because
just in case".

If we would need many dozens of different output formats, and people
would want to plug in new ones at run time or something crazy like
that, the abstraction implemented with these void pointers might
have a point.  But now that we know that less than a dozen output
formats are really needed, and that they are all very stable, there
are very likely ways to improve this code, making it more robust
and less error-prone.

Yours,
  Ingo
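
As an illustration of the direction described above (specific types in
place of void pointers), here is a minimal sketch in C. It is not taken
from mandoc: only the names outstate, OUTT_HTML and OUTT_UTF8 appear in
this thread, while the struct members, the union layout and both function
names are hypothetical and chosen for the example only.

```c
#include <stddef.h>

/* Names from the thread; the full set of output types is assumed. */
enum outtype { OUTT_UTF8, OUTT_HTML };

struct html { int id_count; };		/* hypothetical formatter state */
struct termp { int col; };		/* hypothetical terminal state */

struct outstate {
	enum outtype	 outtype;
	union {
		struct html	*html;	/* valid when outtype == OUTT_HTML */
		struct termp	*term;	/* valid when outtype == OUTT_UTF8 */
	} out;
};

/* Typed parameter: passing a struct outstate * draws a compiler diagnostic. */
void
html_reset_typed(struct html *h)
{
	h->id_count = 0;
}

/* Returns 1 if HTML state was reset, 0 if there was nothing to do. */
int
outstate_reset(struct outstate *st)
{
	if (st->outtype != OUTT_HTML || st->out.html == NULL)
		return 0;
	html_reset_typed(st->out.html);
	return 1;
}
```

With a layout like this, a call such as html_reset_typed(st) no longer
compiles silently, because the argument type does not match. The cost is
that every new output format needs its own member, which is acceptable
when, as noted above, fewer than a dozen stable formats exist.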

Re: man double free

2020-06-11 Thread Klemens Nanni

On Thu, Jun 11, 2020 at 06:03:20PM +0200, Ingo Schwarze wrote:
> Indeed, i came up with exactly the same diff and had already
> written this commit message, but was still testing:
> 
>   Fix a regression in rev. 1.238 (2019/07/26):
>   Pass the right object to html_reset() or it will crash 
>   when rendering more than one manual page to HTML in a row.
>   Bug reported by Abel Romero Perez .
Wasn't this just reshuffling the already existing html_reset() which
was introduced in r1.222?

Re: man double free

2020-06-11 Thread Klemens Nanni

On Thu, Jun 11, 2020 at 05:07:17PM +0200, Otto Moerbeek wrote:
> This fixes it for me,
This looks like a simple mistake introduced back in main.c r1.222:

date: 2019/03/03 13:01:47;  author: schwarze;  state: Exp;  lines: +3 -1;
Reset HTML formatter state, in particular the id_unique hash,
after processing each manual page, such that the next page
starts from a clean state and doesn't continue suffix numbering.

Issue found while looking at https://github.com/Debian/debiman/issues/48
which was brought up by Orestis Ioannou .

outst is on the stack and html_reset_internal() expects a struct html
pointer, but this obviously mismatches and eventually free()s stack
memory.

820         if (outst->had_output && outst->outtype <= OUTT_UTF8) {
821                 if (outst->outdata == NULL)
822                         outdata_alloc(outst, >output);
823                 terminal_sepline(outst->outdata);
824         }
826         if (resp->form == FORM_SRC)
827                 parse(mp, fd, resp->file, outst, >output);
828         else {

outdata_alloc() properly allocates a struct html with html_alloc() in
our case which must be reset later in parse() through html_reset().

Pretty sure your diff is correct, but won't hurt to hear from Ingo
before committing.

OK kn
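
The mismatch described above can be sketched in a few lines of C. This is
a simplified model and not mandoc's actual code: the struct layouts, the
tag_push() and reset_between_pages() helpers, the return values, and the
void * signature of html_reset() are assumptions made for illustration.
Only the names outstate, outdata, html_reset() and the shape of the
tag-freeing loop come from this thread.

```c
#include <stdlib.h>

/* Simplified stand-ins for mandoc's types; the real layouts differ. */
struct tag {
	struct tag	*next;
};

struct html {
	struct tag	*tag;		/* stack of open tags */
};

struct outstate {
	void		*outdata;	/* a struct html * for HTML output */
	int		 had_output;
};

/* Hypothetical helper: push one heap-allocated tag onto the stack. */
int
tag_push(struct html *h)
{
	struct tag *t;

	if ((t = calloc(1, sizeof(*t))) == NULL)
		return -1;
	t->next = h->tag;
	h->tag = t;
	return 0;
}

/*
 * Assumed to take void *, which is why the compiler accepts any
 * object pointer here.  Returns the number of tags freed.
 */
int
html_reset(void *arg)
{
	struct html *h = arg;
	struct tag *tag;
	int n = 0;

	while ((tag = h->tag) != NULL) {
		h->tag = tag->next;
		free(tag);
		n++;
	}
	return n;
}

/*
 * Reset between two manual pages.  Passing outst itself would also
 * compile, but html_reset() would then interpret the outstate's
 * memory as a struct html and hand unrelated values to free().
 */
int
reset_between_pages(struct outstate *outst)
{
	return html_reset(outst->outdata);
}
```

Calling reset_between_pages() after two tag_push() calls frees both tags
and leaves the tag pointer NULL, so a second reset is a no-op. The broken
variant is deliberately not shown as running code, since its behaviour is
undefined.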

Re: Interfaces errors and latency spikes with Intel 82583V

2020-06-11 Thread Gabri Tofano


#netstat -id
Name    Mtu   Network     Address              Ipkts Idrop    Opkts Odrop Colls
em0     1500  <Link>      XX:XX:XX:XX:XX:XX   266894     0   202813     0     0
em0     1500  XX.XX.XX.XX XX:XX:XX:XX:XX:XX   266894     0   202813     0     0
em1     1500  <Link>      XX:XX:XX:XX:XX:XX   170280     0   230226     1     0
em1     1500  172.16.200. XX:XX:XX:XX:XX:XX   170280     0   230226     1     0
em2     1500  <Link>      XX:XX:XX:XX:XX:XX    15788     0    13249     2     0
em2     1500  172.16.103/ XX:XX:XX:XX:XX:XX    15788     0    13249     2     0
em3*    1500  <Link>      XX:XX:XX:XX:XX:XX        0     0        0     0     0
enc0*   0     <Link>                               0     0        0     0     0
pflog0  33136 <Link>                               0     0    29771     0     0


#netstat -ie
Name    Mtu   Network     Address              Ipkts Ierrs    Opkts Oerrs Colls
em0     1500  <Link>      XX:XX:XX:XX:XX:XX   269713    72   205469     0     0
em0     1500  XX.XX.XX.XX XX:XX:XX:XX:XX:XX   269713    72   205469     0     0
em1     1500  <Link>      XX:XX:XX:XX:XX:XX   172137     0   232148     0     0
em1     1500  172.16.200. XX:XX:XX:XX:XX:XX   172137     0   232148     0     0
em2     1500  <Link>      XX:XX:XX:XX:XX:XX    15892     0    13316     0     0
em2     1500  172.16.103/ XX:XX:XX:XX:XX:XX    15892     0    13316     0     0
em3*    1500  <Link>      XX:XX:XX:XX:XX:XX        0     0        0     0     0
enc0*   0     <Link>                               0     0        0     0     0
pflog0  33136 <Link>                               0     0    30174     0     0



#systat queues
QUEUE            BW/FL SCH     PKTS     BYTES  DROP_P  DROP_B QLEN BORROW SUSPEN P/S B/S
main on em0       120M fifo       0         0       0       0    0
 defq             100M fifo  139394  21574411       0       0    0
 voip              10M fifo   34699   4949635       0       0    0
 games             10M fifo   32277   2460807       0       0    0


Thank you!
Gabri

On 2020-06-11 07:35, David Gwynne wrote:

The Ifail and Ofail columns are a sum of queue drops and errors. Could
you run that netstat command with -d and -e so we can see the drops
and errors separately?

Cheers,
dlg
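
The relationship described above (Ifail and Ofail as the sum of queue
drops and errors) can be written down as a small C sketch. The struct and
function names here are hypothetical and only serve the illustration; they
are not netstat's or the kernel's own data structures.

```c
#include <stdint.h>

/* Per-interface counters as shown by netstat -id and netstat -ie. */
struct if_counters {
	uint64_t	idrop, ierrs;	/* input queue drops, input errors */
	uint64_t	odrop, oerrs;	/* output queue drops, output errors */
};

/* Ifail as described above: input drops plus input errors. */
uint64_t
if_ifail(const struct if_counters *c)
{
	return c->idrop + c->ierrs;
}

/* Ofail as described above: output drops plus output errors. */
uint64_t
if_ofail(const struct if_counters *c)
{
	return c->odrop + c->oerrs;
}
```

For example, the em2 counters in this thread (Odrop 2, Oerrs 0) give an
Ofail of 2, which agrees with the plain netstat -i output, although the
samples were taken at different times.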


On 11 Jun 2020, at 2:21 pm, Gabri Tofano  wrote:

After extensive testing the latency spikes showed up again:

To the inside interface of the firewall:

Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time=132ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254

And to the firewall's next hop (ISP ONT) at the same time:

Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
Reply from 74.215.235.1: bytes=32 time=3ms TTL=62
Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
Reply from 74.215.235.1: bytes=32 time=3ms TTL=62
Reply from 74.215.235.1: bytes=32 time=242ms TTL=62
Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
Reply from 74.215.235.1: bytes=32 time=3ms TTL=62
Reply from 74.215.235.1: bytes=32 time=3ms TTL=62

Interface errors are now showing up just on the output:

#netstat -i
Name    Mtu   Network     Address              Ipkts Ifail    Opkts Ofail Colls
em0     1500  <Link>      XX:XX:XX:XX:XX:XX    22655     0    41589     0     0
em0     1500  XX.XX.XX.XX XX:XX:XX:XX:XX:XX    22655     0    41589     0     0
em1     1500  <Link>      XX:XX:XX:XX:XX:XX    39924     0    20476     1     0
em1     1500  172.16.200. XX:XX:XX:XX:XX:XX    39924     0    20476     1     0
em2     1500  <Link>      XX:XX:XX:XX:XX:XX      427     0      330     2     0
em2     1500  172.16.103/

Re: man double free

2020-06-11 Thread Theo de Raadt

Romero Pérez, Abel  wrote:

> Yes, thank you.
> 
> I suggest only to have a look into better measures of security by
> researching optimization flags, to find an equilibrium of optimization
> and security.

Romero, that is bullshit.

Re: man double free

2020-06-11 Thread Romero Pérez , Abel


Yes, thank you.

I suggest only to have a look into better measures of security by 
researching optimization flags, to find an equilibrium of optimization 
and security.


But I said to forget it because it was hard to explain.

On 2020-06-11 18:06, Theo de Raadt wrote:

Otto Moerbeek  wrote:


On Thu, Jun 11, 2020 at 05:15:28PM +0200, Romero Pérez, Abel wrote:




On 2020-06-11 17:07, Otto Moerbeek wrote:

On Thu, Jun 11, 2020 at 04:53:25PM +0200, Romero Pérez, Abel wrote:




On 2020-06-11 16:45, Klemens Nanni wrote:

On Thu, Jun 11, 2020 at 03:59:09PM +0200, Otto Moerbeek wrote:

This already trips the bug;

man -T html -c pfctl id

No need for a custom man function. No clue yet why.

This is in mandoc's HTML parser, but only happens for multiple manuals
in html.c:html_reset_internal():

164 while ((tag = h->tag) != NULL) {
165 h->tag = tag->next;
166 free(tag);
167 }

Note that it crashes differently depending on the optimization level:

$ cd /usr/src/usr.bin/mandoc
$ make DEBUG=-O0
$ ./obj/mandoc -Thtml `man -w id cat` >/dev/null ; echo $?
0

$ make DEBUG=-O1
$ ./obj/mandoc -Thtml `man -w id cat` >/dev/null
Segmentation fault (core dumped)

$ make DEBUG=-O2
$ ./obj/mandoc -Thtml `man -w id cat` >/dev/null
mandoc(32092) in free(): bogus pointer (double free?) 0x6641bab613b
Abort trap (core dumped)

Need to run now, but wanted to share what seems to be the right direction.


Compile with -O0 to work around the bug temporarily.
But I also want to note that the second man entry does not need to be a
binary; it can be just a file.



This fixes it for me,

-Otto

Index: main.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/main.c,v
retrieving revision 1.247
diff -u -p -r1.247 main.c
--- main.c	24 Feb 2020 21:15:05 -0000	1.247
+++ main.c	11 Jun 2020 15:06:43 -0000
@@ -872,7 +872,7 @@ parse(struct mparse *mp, int fd, const c
 	if (outst->outdata == NULL)
 		outdata_alloc(outst, outconf);
 	else if (outst->outtype == OUTT_HTML)
-		html_reset(outst);
+		html_reset(outst->outdata);
 
 	mandoc_xr_reset();
 	meta = mparse_result(mp);


Only one comment, don't use -O0 flag as optimization (disabled) to hunt more
bugs of this kind.


I have no clue what you mean by above sentence. If code has a bug,
optimization level might cause the bug to be hidden or exposed; it can
work both ways.


The person who didn't fix the bug is giving you advice about fixing the bug.

Re: man double free

2020-06-11 Thread Theo de Raadt

Otto Moerbeek  wrote:

> On Thu, Jun 11, 2020 at 05:15:28PM +0200, Romero Pérez, Abel wrote:
> 
> > 
> > 
> > On 2020-06-11 17:07, Otto Moerbeek wrote:
> > > On Thu, Jun 11, 2020 at 04:53:25PM +0200, Romero Pérez, Abel wrote:
> > > 
> > > > 
> > > > 
> > > > On 2020-06-11 16:45, Klemens Nanni wrote:
> > > > > On Thu, Jun 11, 2020 at 03:59:09PM +0200, Otto Moerbeek wrote:
> > > > > > This already trips the bug;
> > > > > > 
> > > > > > man -T html -c pfctl id
> > > > > > 
> > > > > > No need for a custom man function. No clue yet why.
> > > > > This is in mandoc's HTML parser, but only happens for multiple manuals
> > > > > in html.c:html_reset_internal():
> > > > > 
> > > > > 164 while ((tag = h->tag) != NULL) {
> > > > > 165 h->tag = tag->next;
> > > > > 166 free(tag);
> > > > > 167 }
> > > > > 
> > > > > Note that it crashes differently depending on the optimization level:
> > > > > 
> > > > >   $ cd /usr/src/usr.bin/mandoc
> > > > >   $ make DEBUG=-O0
> > > > >   $ ./obj/mandoc -Thtml `man -w id cat` >/dev/null ; echo $?
> > > > >   0
> > > > > 
> > > > >   $ make DEBUG=-O1
> > > > >   $ ./obj/mandoc -Thtml `man -w id cat` >/dev/null
> > > > >   Segmentation fault (core dumped)
> > > > > 
> > > > >   $ make DEBUG=-O2
> > > > >   $ ./obj/mandoc -Thtml `man -w id cat` >/dev/null
> > > > >   mandoc(32092) in free(): bogus pointer (double free?) 
> > > > > 0x6641bab613b
> > > > >   Abort trap (core dumped)
> > > > > 
> > > > > Need to run now, but wanted to share what seems to be the right 
> > > > > direction.
> > > > > 
> > > > Compile with -O0 to work around the bug temporarily.
> > > > But I also want to note that the second man entry does not need to be a
> > > > binary; it can be just a file.
> > > > 
> > > 
> > > This fixes it for me,
> > > 
> > >   -Otto
> > > 
> > > Index: main.c
> > > ===================================================================
> > > RCS file: /cvs/src/usr.bin/mandoc/main.c,v
> > > retrieving revision 1.247
> > > diff -u -p -r1.247 main.c
> > > --- main.c	24 Feb 2020 21:15:05 -0000	1.247
> > > +++ main.c	11 Jun 2020 15:06:43 -0000
> > > @@ -872,7 +872,7 @@ parse(struct mparse *mp, int fd, const c
> > >  	if (outst->outdata == NULL)
> > >  		outdata_alloc(outst, outconf);
> > >  	else if (outst->outtype == OUTT_HTML)
> > > -		html_reset(outst);
> > > +		html_reset(outst->outdata);
> > >  
> > >  	mandoc_xr_reset();
> > >  	meta = mparse_result(mp);
> > > 
> > Only one comment, don't use -O0 flag as optimization (disabled) to hunt more
> > bugs of this kind.
> 
> I have no clue what you mean by above sentence. If code has a bug,
> optimization level might cause the bug to be hidden or exposed; it can
> work both ways.

The person who didn't fix the bug is giving you advice about fixing the bug.

Re: man double free

2020-06-11 Thread Ingo Schwarze

Hi Otto,

Otto Moerbeek wrote on Thu, Jun 11, 2020 at 05:07:17PM +0200:

> This fixes it for me,

Indeed, i came up with exactly the same diff and had already
written this commit message, but was still testing:

  Fix a regression in rev. 1.238 (2019/07/26):
  Pass the right object to html_reset() or it will crash 
  when rendering more than one manual page to HTML in a row.
  Bug reported by Abel Romero Perez.

Given that you published the diff earlier, please commit, OK schwarze@.


This teaches once again that having type safety is really useful
and artificially circumventing it tends to invite trouble.
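
To make the type-safety point concrete, here is a minimal sketch. It assumes
the reset function takes a void * argument, which the one-line fix in this
thread suggests but does not show; struct html_state, struct out_state,
reset_untyped() and reset_typed() are hypothetical stand-ins, not the mandoc
declarations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two objects involved. */
struct html_state {
	int	 ntags;		/* number of open tags */
};

struct out_state {
	struct html_state	*outdata;	/* formatter-private data */
	int			 outtype;
};

/*
 * Untyped variant: any object pointer converts to void * silently,
 * so a call that passes a struct out_state * compiles without a
 * diagnostic, and the function then writes into the wrong object.
 */
void
reset_untyped(void *arg)
{
	struct html_state *h = arg;

	assert(h != NULL);
	h->ntags = 0;
}

/*
 * Typed variant: passing a struct out_state * here draws an
 * incompatible-pointer diagnostic at compile time.
 */
void
reset_typed(struct html_state *h)
{
	assert(h != NULL);
	h->ntags = 0;
}
```

With the typed prototype, handing over the outer structure is flagged by the
compiler; with the void * prototype the same mistake only shows up at run
time, if at all.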

At first sight, it seems surprising that such a blatant regression
could survive almost a year.  Then again, the reason probably
is that

  ..

isn't really valid HTML in the first place, so i guess few people
are using -T html with more than one input file at a time, and hence
the bug doesn't really bite in practically relevant use cases.

Yours,
  Ingo


> Index: main.c
> ===
> RCS file: /cvs/src/usr.bin/mandoc/main.c,v
> retrieving revision 1.247
> diff -u -p -r1.247 main.c
> --- main.c24 Feb 2020 21:15:05 -  1.247
> +++ main.c11 Jun 2020 15:06:43 -
> @@ -872,7 +872,7 @@ parse(struct mparse *mp, int fd, const c
>   if (outst->outdata == NULL)
>   outdata_alloc(outst, outconf);
>   else if (outst->outtype == OUTT_HTML)
> - html_reset(outst);
> + html_reset(outst->outdata);
>  
>   mandoc_xr_reset();
>   meta = mparse_result(mp);
>

Re: man double free

2020-06-11 Thread Otto Moerbeek

On Thu, Jun 11, 2020 at 05:15:28PM +0200, Romero Pérez, Abel wrote:

> 
> 
> On 2020-06-11 17:07, Otto Moerbeek wrote:
> > On Thu, Jun 11, 2020 at 04:53:25PM +0200, Romero Pérez, Abel wrote:
> > 
> > > 
> > > 
> > > On 2020-06-11 16:45, Klemens Nanni wrote:
> > > > On Thu, Jun 11, 2020 at 03:59:09PM +0200, Otto Moerbeek wrote:
> > > > > This already trips the bug;
> > > > > 
> > > > >   man -T html -c pfctl id
> > > > > 
> > > > > No need for a custom man function. No clue yet why.
> > > > This is in mandoc's HTML parser, but only happens for multiple manuals
> > > > in html.c:html_reset_internal():
> > > > 
> > > > 164 while ((tag = h->tag) != NULL) {
> > > > 165 h->tag = tag->next;
> > > > 166 free(tag);
> > > > 167 }
> > > > 
> > > > Note that it crashes differently depending on the optimization level:
> > > > 
> > > > $ cd /usr/src/usr.bin/mandoc
> > > > $ make DEBUG=-O0
> > > > $ ./obj/mandoc -Thtml `man -w id cat` >/dev/null ; echo $?
> > > > 0
> > > > 
> > > > $ make DEBUG=-O1
> > > > $ ./obj/mandoc -Thtml `man -w id cat` >/dev/null
> > > > Segmentation fault (core dumped)
> > > > 
> > > > $ make DEBUG=-O2
> > > > $ ./obj/mandoc -Thtml `man -w id cat` >/dev/null
> > > > mandoc(32092) in free(): bogus pointer (double free?) 0x6641bab613b
> > > > Abort trap (core dumped)
> > > > 
> > > > Need to run now, but wanted to share what seems to be the right direction.
> > > > 
> > > Compile with -O0 to work around the bug temporarily.
> > > But I also want to note that the second man entry need not be a binary;
> > > it can be just a file.
> > > 
> > 
> > This fixes it for me,
> > 
> > -Otto
> > 
> > Index: main.c
> > ===
> > RCS file: /cvs/src/usr.bin/mandoc/main.c,v
> > retrieving revision 1.247
> > diff -u -p -r1.247 main.c
> > --- main.c  24 Feb 2020 21:15:05 -  1.247
> > +++ main.c  11 Jun 2020 15:06:43 -
> > @@ -872,7 +872,7 @@ parse(struct mparse *mp, int fd, const c
> > if (outst->outdata == NULL)
> > outdata_alloc(outst, outconf);
> > else if (outst->outtype == OUTT_HTML)
> > -   html_reset(outst);
> > +   html_reset(outst->outdata);
> > mandoc_xr_reset();
> > meta = mparse_result(mp);
> > 
> Only one comment, don't use -O0 flag as optimization (disabled) to hunt more
> bugs of this kind.

I have no clue what you mean by the above sentence. If code has a bug,
the optimization level might cause the bug to be hidden or exposed; it can
work both ways.

-Otto

> 
> Thanks for the patch.
>

Re: man double free

2020-06-11 Thread Romero Pérez, Abel





On 2020-06-11 17:07, Otto Moerbeek wrote:

On Thu, Jun 11, 2020 at 04:53:25PM +0200, Romero Pérez, Abel wrote:




On 2020-06-11 16:45, Klemens Nanni wrote:

On Thu, Jun 11, 2020 at 03:59:09PM +0200, Otto Moerbeek wrote:

This already trips the bug;

man -T html -c pfctl id

No need for a custom man function. No clue yet why.

This is in mandoc's HTML parser, but only happens for multiple manuals
in html.c:html_reset_internal():

164 while ((tag = h->tag) != NULL) {
165 h->tag = tag->next;
166 free(tag);
167 }

Note that it crashes differently depending on the optimization level:

$ cd /usr/src/usr.bin/mandoc
$ make DEBUG=-O0
$ ./obj/mandoc -Thtml `man -w id cat` >/dev/null ; echo $?
0

$ make DEBUG=-O1
$ ./obj/mandoc -Thtml `man -w id cat` >/dev/null
Segmentation fault (core dumped)

$ make DEBUG=-O2
$ ./obj/mandoc -Thtml `man -w id cat` >/dev/null
mandoc(32092) in free(): bogus pointer (double free?) 0x6641bab613b
Abort trap (core dumped)

Need to run now, but wanted to share what seems to be the right direction.


Compile with -O0 to work around the bug temporarily.
But I also want to note that the second man entry need not be a binary;
it can be just a file.



This fixes it for me,

-Otto

Index: main.c
===
RCS file: /cvs/src/usr.bin/mandoc/main.c,v
retrieving revision 1.247
diff -u -p -r1.247 main.c
--- main.c  24 Feb 2020 21:15:05 -  1.247
+++ main.c  11 Jun 2020 15:06:43 -
@@ -872,7 +872,7 @@ parse(struct mparse *mp, int fd, const c
if (outst->outdata == NULL)
outdata_alloc(outst, outconf);
else if (outst->outtype == OUTT_HTML)
-   html_reset(outst);
+   html_reset(outst->outdata);
 
mandoc_xr_reset();
meta = mparse_result(mp);

Only one comment, don't use -O0 flag as optimization (disabled) to hunt 
more bugs of this kind.


Thanks for the patch.

Re: man double free

2020-06-11 Thread Otto Moerbeek

On Thu, Jun 11, 2020 at 04:53:25PM +0200, Romero Pérez, Abel wrote:

> 
> 
> On 2020-06-11 16:45, Klemens Nanni wrote:
> > On Thu, Jun 11, 2020 at 03:59:09PM +0200, Otto Moerbeek wrote:
> > > This already trips the bug;
> > > 
> > >   man -T html -c pfctl id
> > > 
> > > No need for a custom man function. No clue yet why.
> > This is in mandoc's HTML parser, but only happens for multiple manuals
> > in html.c:html_reset_internal():
> > 
> > 164 while ((tag = h->tag) != NULL) {
> > 165 h->tag = tag->next;
> > 166 free(tag);
> > 167 }
> > 
> > Note that it crashes differently depending on the optimization level:
> > 
> > $ cd /usr/src/usr.bin/mandoc
> > $ make DEBUG=-O0
> > $ ./obj/mandoc -Thtml `man -w id cat` >/dev/null ; echo $?
> > 0
> > 
> > $ make DEBUG=-O1
> > $ ./obj/mandoc -Thtml `man -w id cat` >/dev/null
> > Segmentation fault (core dumped)
> > 
> > $ make DEBUG=-O2
> > $ ./obj/mandoc -Thtml `man -w id cat` >/dev/null
> > mandoc(32092) in free(): bogus pointer (double free?) 0x6641bab613b
> > Abort trap (core dumped)
> > 
> > Need to run now, but wanted to share what seems to be the right direction.
> > 
> Compile with -O0 to work around the bug temporarily.
> But I also want to note that the second man entry need not be a binary;
> it can be just a file.
> 

This fixes it for me,

-Otto

Index: main.c
===
RCS file: /cvs/src/usr.bin/mandoc/main.c,v
retrieving revision 1.247
diff -u -p -r1.247 main.c
--- main.c  24 Feb 2020 21:15:05 -  1.247
+++ main.c  11 Jun 2020 15:06:43 -
@@ -872,7 +872,7 @@ parse(struct mparse *mp, int fd, const c
if (outst->outdata == NULL)
outdata_alloc(outst, outconf);
else if (outst->outtype == OUTT_HTML)
-   html_reset(outst);
+   html_reset(outst->outdata);
 
mandoc_xr_reset();
meta = mparse_result(mp);

Re: man double free

2020-06-11 Thread Romero Pérez, Abel





On 2020-06-11 16:45, Klemens Nanni wrote:

On Thu, Jun 11, 2020 at 03:59:09PM +0200, Otto Moerbeek wrote:

This already trips the bug;

man -T html -c pfctl id

No need for a custom man function. No clue yet why.

This is in mandoc's HTML parser, but only happens for multiple manuals
in html.c:html_reset_internal():

164 while ((tag = h->tag) != NULL) {
165 h->tag = tag->next;
166 free(tag);
167 }

Note that it crashes differently depending on the optimization level:

$ cd /usr/src/usr.bin/mandoc
$ make DEBUG=-O0
$ ./obj/mandoc -Thtml `man -w id cat` >/dev/null ; echo $?
0

$ make DEBUG=-O1
$ ./obj/mandoc -Thtml `man -w id cat` >/dev/null
Segmentation fault (core dumped)

$ make DEBUG=-O2
$ ./obj/mandoc -Thtml `man -w id cat` >/dev/null
mandoc(32092) in free(): bogus pointer (double free?) 0x6641bab613b
Abort trap (core dumped)

Need to run now, but wanted to share what seems to be the right direction.


Compile with -O0 to work around the bug temporarily.
But I also want to note that the second man entry need not be a binary;
it can be just a file.

Re: man double free

2020-06-11 Thread Klemens Nanni

On Thu, Jun 11, 2020 at 03:59:09PM +0200, Otto Moerbeek wrote:
> This already trips the bug;
> 
>   man -T html -c pfctl id
> 
> No need for a custom man function. No clue yet why.
This is in mandoc's HTML parser, but only happens for multiple manuals
in html.c:html_reset_internal():

164 while ((tag = h->tag) != NULL) {
165 h->tag = tag->next;
166 free(tag);
167 }
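
For readers without the source at hand: the loop pops and frees list nodes one
at a time. A self-contained sketch of that pattern follows, with hypothetical
types (struct tag, struct html) that mirror only the fields visible in the
excerpt, plus a tag_push() helper invented for the demonstration. It behaves
well as long as h really points to the right kind of object.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical minimal types; only the fields the loop touches. */
struct tag {
	struct tag	*next;
};

struct html {
	struct tag	*tag;	/* head of the open-tag list */
};

/* Push a freshly allocated node; returns 0 on success, -1 on failure. */
int
tag_push(struct html *h)
{
	struct tag *t;

	assert(h != NULL);
	if ((t = malloc(sizeof(*t))) == NULL)
		return -1;
	t->next = h->tag;
	h->tag = t;
	return 0;
}

/* Same shape as the quoted loop: unlink the head, then free it. */
void
tag_reset(struct html *h)
{
	struct tag *tag;

	assert(h != NULL);
	while ((tag = h->tag) != NULL) {
		h->tag = tag->next;
		free(tag);
	}
}
```

Because the head is advanced before each free() and ends up NULL, calling
tag_reset() twice on a valid object is harmless. The loop only misbehaves
when the memory behind h is not a list head at all.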

Note that it crashes differently depending on the optimization level:

$ cd /usr/src/usr.bin/mandoc
$ make DEBUG=-O0
$ ./obj/mandoc -Thtml `man -w id cat` >/dev/null ; echo $?
0

$ make DEBUG=-O1
$ ./obj/mandoc -Thtml `man -w id cat` >/dev/null
Segmentation fault (core dumped) 

$ make DEBUG=-O2
$ ./obj/mandoc -Thtml `man -w id cat` >/dev/null
mandoc(32092) in free(): bogus pointer (double free?) 0x6641bab613b
Abort trap (core dumped)

Need to run now, but wanted to share what seems to be the right direction.

Re: man double free

2020-06-11 Thread Romero Pérez, Abel





On 2020-06-11 15:59, Otto Moerbeek wrote:

On Thu, Jun 11, 2020 at 03:15:55PM +0200, Romero Pérez, Abel wrote:


I've got a: man(13835) in free(): bogus pointer (double free?) 0x22c43c2813b

To reproduce, please add the following function to .kshrc and run . ./.kshrc:


function man {
     set -A array "$@"
     tag=${array[$#-1]}
     PAGER="" MANPAGER="" /usr/bin/man -T html -c pfctl $@ > /tmp/man.html |
lynx /tmp/man.html#$tag
     #PAGER="" MANPAGER="" /usr/bin/man -T html -c $@ | lynx -stdin
}

Then run at the prompt: man id

The result when triggered is in the screenshot; on the console it looks as follows:

foo$ man id
Abort trap
foo$



This already trips the bug;

man -T html -c pfctl id

No need for a custom man function. No clue yet why.

-Otto


Confirmed, it also triggers with your command line.

Re: man double free

2020-06-11 Thread Otto Moerbeek

On Thu, Jun 11, 2020 at 03:15:55PM +0200, Romero Pérez, Abel wrote:

> I've got a: man(13835) in free(): bogus pointer (double free?) 0x22c43c2813b
> 
> To reproduce, please add the following function to .kshrc and run . ./.kshrc:
> 
> 
> function man {
>     set -A array "$@"
>     tag=${array[$#-1]}
>     PAGER="" MANPAGER="" /usr/bin/man -T html -c pfctl $@ > /tmp/man.html |
> lynx /tmp/man.html#$tag
>     #PAGER="" MANPAGER="" /usr/bin/man -T html -c $@ | lynx -stdin
> }
> 
> Then run at the prompt: man id
> 
> The result when triggered is in the screenshot; on the console it looks as follows:
> 
> foo$ man id
> Abort trap
> foo$
> 

This already trips the bug;

man -T html -c pfctl id

No need for a custom man function. No clue yet why.

-Otto

i915_request_create+0x4b: uvm_fault

2020-06-11 Thread Stuart Henderson

I came back to a machine which had died with a uvm_fault in i915_request_create.

traces/sh reg/ps/disassembly of the function/dmesg below.

It's not a GENERIC kernel but I'll send the information anyway because
the changes don't seem especially likely to be implicated. Kernel is
GENERIC.MP + options KQUEUE_DEBUG + pseudo-device dt + clock_gettime diff.

uvm_fault(0xfd86dc92f000, 0x51, 0, 1) -> e
kernel: page fault trap, code=0
Stopped at  i915_request_create+0x4b:   movq    0x50(%r14),%rdi
ddb{1}> ps /o
TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
 449802  67311   10000x12  00  jq
 407113  31444755 0x2  02  php-7.4
  48223  655467320x200802  0x40020003  mongod
*517013  31216 350x12  01K Xorg
ddb{1}> tr
i915_request_create(fd880f4e99d8) at i915_request_create+0x4b
i915_gem_do_execbuffer(80122078,80e10800,8000340d7020,81760800,0) at i915_gem_do_execbuffer+0x2bdb
i915_gem_execbuffer2_ioctl(80122078,8000340d7020,80e10800) at i915_gem_execbuffer2_ioctl+0x1ec
drm_do_ioctl(80122078,100,80406469,8000340d7020) at drm_do_ioctl+0x274
drmioctl(15700,80406469,8000340d7020,7,800033cba758) at drmioctl+0xdc
VOP_IOCTL(fd87242a14f0,80406469,8000340d7020,7,fd86dd920ec0,800033cba758) at VOP_IOCTL+0x55
vn_ioctl(fd872253cb58,80406469,8000340d7020,800033cba758) at vn_ioctl+0x75
sys_ioctl(800033cba758,8000340d7130,8000340d7190) at sys_ioctl+0x2df
syscall(8000340d7200) at syscall+0x3b9
Xsyscall() at Xsyscall+0x128
end of kernel
end trace frame: 0x7f7c4310, count: -10
ddb{1}> sh reg
rdi     0x8000340d6790
rsi     0x1
rbp     0x8000340d6870
rbx     0x81732098
rdx     0
rcx     0x1
rax     0x51
r8      0x11
r9      0x81ff85e0      rw_ops+0x10
r10     0x3
r11     0xfa811d8f9f11700e
r12     0x8068d900
r13     0x810cf068
r14     0x1
r15     0xfd880f4e99d8
rip     0x813b70ab      i915_request_create+0x4b
cs      0x8
rflags  0x10207 __ALIGN_SIZE+0xf207
rsp     0x8000340d6840
ss      0x10
i915_request_create+0x4b:   movq    0x50(%r14),%rdi



ddb{1}> mach ddbcpu 2
Stopped at  x86_ipi_db+0x12:    leave
ddb{2}> tr
x86_ipi_db(800022422ff0) at x86_ipi_db+0x12
x86_ipi_handler() at x86_ipi_handler+0x80
Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
_kernel_lock() at _kernel_lock+0xa2
usertrap(8000358ecb00) at usertrap+0x1b0
recall_trap() at recall_trap+0x8
end of kernel
end trace frame: 0x7f7f6370, count: -6
ddb{2}> sh reg
rdi     0x800022422ff0
rsi     0
rbp     0x8000358ec930
rbx     0x821080a8      ipifunc+0x38
rdx     0
rcx     0x7
rax     0xff7f
r8      0
r9      0
r10     0
r11     0x81e3a26a7fed6779
r12     0x7
r13     0
r14     0x800022422ff0
r15     0
rip     0x814dbd02      x86_ipi_db+0x12
cs      0x8
rflags  0x282
rsp     0x8000358ec920
ss      0
x86_ipi_db+0x12:    leave



ddb{2}> mach ddbcpu 3
Stopped at  x86_ipi_db+0x12:    leave
ddb{3}> tr
x86_ipi_db(80002242bff0) at x86_ipi_db+0x12
x86_ipi_handler() at x86_ipi_handler+0x80
Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
__mp_acquire_count(8218b388,1) at __mp_acquire_count+0x8b
mi_switch() at mi_switch+0x2a2
yield() at yield+0x8a
coredump_write(800035cb91b0,0,5e8b95c1000,400) at coredump_write+0x60
coredump_elf(800035549af0,800035cb91b0) at coredump_elf+0x133
coredump(800035549af0) at coredump+0x394
trapsignal(800035549af0,a,4,3,5e84ad5c9c9) at trapsignal+0x123
usertrap(800035cb9380) at usertrap+0x14f
recall_trap() at recall_trap+0x8
end of kernel
end trace frame: 0x5e88b139250, count: -12
ddb{3}> sh reg
rdi     0x80002242bff0
rsi     0
rbp     0x800035cb8d70
rbx     0x821080a8      ipifunc+0x38
rdx     0
rcx     0x7
rax     0xff7f
r8      0
r9      0
r10     0
r11     0x81e3a26a7fed6779
r12     0x7
r13

Re: awk: FS pattern separation issue

2020-06-11 Thread Charlene Wendling

On Thu, 11 Jun 2020 06:09:02 -0600
Todd C. Miller wrote:

> This should be fixed by the commit I just made to awk/lib.c.
> The strlcpy() length parameter was incorrect.
> 
>  - todd

I can confirm that with the neofetch test case, thanks a lot!

Re: awk: i386 broken

2020-06-11 Thread Stuart Henderson

On 2020/06/11 05:59, Todd C. Miller wrote:
> On Thu, 11 Jun 2020 12:36:27 +0100, Stuart Henderson wrote:
> 
> > This "fixes" it ...
> >
> > I think the most sensible approach for now is the backout diff
> > in my previous mail.  Any OKs for that?
> 
> The strlcpy() is wrong now that inputFS is a pointer.
> It should be:
> 
> strlcpy(inputFS, *FS, len_inputFS);
> 
>  - todd
> 

OK

Re: awk: i386 broken

2020-06-11 Thread Todd C. Miller

On Thu, 11 Jun 2020 12:36:27 +0100, Stuart Henderson wrote:

> This "fixes" it ...
>
> I think the most sensible approach for now is the backout diff
> in my previous mail.  Any OKs for that?

The strlcpy() is wrong now that inputFS is a pointer.
It should be:

strlcpy(inputFS, *FS, len_inputFS);

 - todd
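
A minimal sketch of the pitfall, for illustration only. Since strlcpy() is not
available in every libc, it uses a small stand-in, copy_bounded(), with the
same truncation behaviour; input_fs, len_input_fs and the helper functions are
hypothetical names modelled on the line above, not the awk sources.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static char	*input_fs;	/* heap buffer, a pointer like inputFS */
static size_t	 len_input_fs;	/* allocated size, tracked separately */

/* Stand-in for strlcpy(3): copy at most dstsize - 1 bytes and terminate. */
size_t
copy_bounded(char *dst, const char *src, size_t dstsize)
{
	size_t len = strlen(src);

	if (dstsize != 0) {
		size_t n = len < dstsize - 1 ? len : dstsize - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}

/* Allocate the buffer; size must be at least sizeof(char *) for this demo. */
int
fs_init(size_t size)
{
	assert(size >= sizeof(char *));
	free(input_fs);
	if ((input_fs = malloc(size)) == NULL) {
		len_input_fs = 0;
		return -1;
	}
	input_fs[0] = '\0';
	len_input_fs = size;
	return 0;
}

/* Wrong bound: sizeof(input_fs) is the size of the pointer itself. */
void
save_fs_buggy(const char *fs)
{
	copy_bounded(input_fs, fs, sizeof(input_fs));
}

/* Right bound: the size that was actually allocated. */
void
save_fs_fixed(const char *fs)
{
	copy_bounded(input_fs, fs, len_input_fs);
}

/* Read back the stored separator. */
const char *
fs_get(void)
{
	return input_fs;
}
```

With 8-byte pointers the buggy variant keeps only the first 7 bytes of the
separator, and with 4-byte pointers only the first 3. That would fit both the
i386-only failure and the "ed primary ..." output in the FS report, where the
separator appears to have been cut down to "connect", although the thread does
not spell this out.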

Re: Interfaces errors and latency spikes with Intel 82583V

2020-06-11 Thread David Gwynne

The Ifail and Ofail columns are a sum of queue drops and errors. Could you run 
that netstat command with -d and -e so we can see the drops and errors 
separately?

Cheers,
dlg

> On 11 Jun 2020, at 2:21 pm, Gabri Tofano  wrote:
> 
> After extensive testing the latency spikes showed up again:
> 
> To the inside interface of the firewall:
> 
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time=132ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
> 
> And to the firewall's next hop (ISP ONT) at the same time:
> 
> Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=3ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=3ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=242ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=3ms TTL=62
> Reply from 74.215.235.1: bytes=32 time=3ms TTL=62
> 
> Interface errors are now showing up just on the output:
> 
> #netstat -i
> Name    Mtu   Network     Address              Ipkts Ifail    Opkts Ofail Colls
> em0     1500  <Link>      XX:XX:XX:XX:XX:XX    22655     0    41589     0     0
> em0     1500  XX.XX.XX.XX XX:XX:XX:XX:XX:XX    22655     0    41589     0     0
> em1     1500  <Link>      XX:XX:XX:XX:XX:XX    39924     0    20476     1     0
> em1     1500  172.16.200. XX:XX:XX:XX:XX:XX    39924     0    20476     1     0
> em2     1500  <Link>      XX:XX:XX:XX:XX:XX      427     0      330     2     0
> em2     1500  172.16.103/ XX:XX:XX:XX:XX:XX      427     0      330     2     0
> em3*    1500  <Link>      XX:XX:XX:XX:XX:XX        0     0        0     0     0
> enc0*   0     <Link>                               0     0        0     0     0
> pflog0  33136 <Link>                               0     0     1294     0     0
> 
> UDP real-time traffic is the most affected, as it is very sensitive, and I
> keep having spikes while playing online.
> 
> Thank you!
> Gabri
> 
> On 2020-06-10 22:50, Gabri Tofano wrote:
>> Another user pointed out to me that the OpenBSD 6.7 release notes
>> contain a statement regarding the em(4) driver: "Improvements in
>> the em(4) driver." So I gave it a try and reinstalled with
>> OpenBSD 6.6. It looks like the system is now stable and latency
>> spikes/interface errors are not present at all, even under heavy
>> traffic loads. I am not sure what introduced the issue, but maybe one
>> of the devs can take a look?
>> Thank you!
>> Gabri
>> On 2020-06-09 13:01, Gabri Tofano wrote:
>>> Hi all,
>>> I'm using a "Protectli FW1" with FreeBSD 12.1 amd64 as a firewall,
>>> which is serving me with great performance and no issues at all. The
>>> appliance has 4 Intel Gigabit 82583V Ethernet NIC ports which are
>>> working very well. I used pfSense as well prior to FreeBSD and it
>>> worked without issues too.
>>> I took the decision to move to OpenBSD 6.7 amd64 in order to benefit
>>> from the latest pf (and other) features, but unfortunately the OS is
>>> giving me an issue which I guess is related to the NIC drivers. When I
>>> was connected via ssh I felt some glitches while I was
>>> typing/moving around with the editor, so I started to ping the inside
>>> interface from a wired PC and found out that from time to time
>>> the appliance responds with a 100+/200+ ms delay (I have cut
>>> some 1ms replies to make it shorter):
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
>>> Reply from 172.16.200.1: bytes=32 time=1ms TTL=254

Re: awk: i386 broken

2020-06-11 Thread Stuart Henderson

On 2020/06/11 12:23, Stuart Henderson wrote:
> On 2020/06/11 12:07, Stuart Henderson wrote:
> > On 2020/06/11 12:05, Stuart Henderson wrote:
> > > I am going to start a ports build with the January 5 version, i.e. the
> > > following backout:
> > 
> > This backout fixes neofetch (the problem found by cwen) as well.
> > 
> 
> The upstream version from https://github.com/onetrueawk/awk does not
> show the problem with neofetch. It doesn't like the awk script from
> libgpg-error so I haven't confirmed whether it fixes libgpg-error too.
> 

This "fixes" it ...

I think the most sensible approach for now is the backout diff
in my previous mail.  Any OKs for that?


Index: lib.c
===
RCS file: /cvs/src/usr.bin/awk/lib.c,v
retrieving revision 1.35
diff -u -p -r1.35 lib.c
--- lib.c   10 Jun 2020 21:06:09 -  1.35
+++ lib.c   11 Jun 2020 11:31:38 -
@@ -124,7 +124,7 @@ void savefs(void)
 {
size_t len;
if ((len = strlen(getsval(fsloc))) < len_inputFS) {
-   strlcpy(inputFS, *FS, sizeof(inputFS)); /* for subsequent field splitting */
+   strcpy(inputFS, *FS);   /* for subsequent field splitting */
return;
}

Re: awk: i386 broken

2020-06-11 Thread Stuart Henderson

On 2020/06/11 12:07, Stuart Henderson wrote:
> On 2020/06/11 12:05, Stuart Henderson wrote:
> > I am going to start a ports build with the January 5 version, i.e. the
> > following backout:
> 
> This backout fixes neofetch (the problem found by cwen) as well.
> 

The upstream version from https://github.com/onetrueawk/awk does not
show the problem with neofetch. It doesn't like the awk script from
libgpg-error so I haven't confirmed whether it fixes libgpg-error too.

Re: awk: FS pattern separation issue

2020-06-11 Thread Charlene Wendling

On Thu, 11 Jun 2020 11:50:48 +0100
Stuart Henderson wrote:

> On 2020/06/11 11:48, Charlene Wendling wrote:
> > >Synopsis:  FS pattern separation issue 
> > >Category:  awk
> > >Environment:
> > System  : OpenBSD 6.7
> > Details : OpenBSD 6.7-current (GENERIC.MP) #258: Wed
> > Jun 10 20:46:20 MDT 2020
> > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> > 
> > Architecture: OpenBSD.amd64
> > Machine : amd64
> > >Description:
> > 
> > In the latest awk we're proposing, FS pattern separation
> > does not work properly and makes neofetch fail to properly get
> > the screen resolution:
> > 
> > $ xrandr --nograb --current | awk -F 'connected |\+|\(' '/ connected/ && $2 {printf $2 ", "}'
> > ed primary 1920x1080+0+0 (normal left inverted right x axis y axis) 521mm x 293mm,
> 
> It would be better to show the exact text fed into awk, xrandr depends
> on the machine you run it on.
> 
> For all the old versions I get this with your command:
> 
> illegal primary in regular expression connected |+|( at |(
> 

You're right, and i don't know why but some escapes in my command
vanished. I'm attaching a shell test case that will compare with mawk
and gawk if they're available.
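
For comparison, here is a small sketch of what splitting on that separator is
expected to yield, using POSIX regcomp()/regexec() with the extended regular
expression connected |\+|\( as shown in the report. field_by_regex() is a
hypothetical helper written for this illustration, and it is not how awk
itself splits fields.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy field number n (1-based, like awk's $n) of line into out,
 * using the extended regular expression fs as the separator.
 * Returns 0 on success, -1 if fs does not compile or the field
 * does not exist.
 */
int
field_by_regex(const char *line, const char *fs, int n, char *out,
    size_t outsize)
{
	regex_t re;
	regmatch_t m;
	const char *p = line;
	int field = 1, rc = -1;

	assert(out != NULL && outsize > 0);
	if (regcomp(&re, fs, REG_EXTENDED) != 0)
		return -1;
	for (;;) {
		/* Ignore empty matches so the scan always advances. */
		int hit = regexec(&re, p, 1, &m,
		    p == line ? 0 : REG_NOTBOL) == 0 && m.rm_eo > m.rm_so;
		size_t len = hit ? (size_t)m.rm_so : strlen(p);

		if (field == n) {
			/* The field ends where the next separator starts. */
			snprintf(out, outsize, "%.*s", (int)len, p);
			rc = 0;
			break;
		}
		if (!hit)
			break;
		p += m.rm_eo;	/* skip past the separator */
		field++;
	}
	regfree(&re);
	return rc;
}
```

Given a made-up input line in the shape of xrandr output, such as
"HDMI-1 connected primary 1920x1080+0+0 (normal) 521mm x 293mm", field 2 comes
out as "primary 1920x1080", which is the kind of value the neofetch command is
after.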


awk_testcase
Description: Binary data

Re: awk: i386 broken

2020-06-11 Thread Stuart Henderson

On 2020/06/11 12:05, Stuart Henderson wrote:
> I am going to start a ports build with the January 5 version, i.e. the
> following backout:

This backout fixes neofetch (the problem found by cwen) as well.

awk: i386 broken

2020-06-11 Thread Stuart Henderson

See attached test case extracted from libgpg-error, it fails on i386
but works on amd64. (ports/lang/gcc failed too, I forgot to save the build
directory).

TZ=GMT cvs up -D '2020/06/10 21:05:00'
January 5, 2020
works as expected

TZ=GMT cvs up -D '2020/06/10 21:05:10'
January 31, 2020
building awk fails

TZ=GMT cvs up -D '2020/06/10 21:05:50'
February 28, 2020
building awk works, test case fails

$ cc -E -P _mkerrcodes.h | grep GPG_ERR_ | awk -f ./mkerrcodes.awk > /dev/null
awk: nonterminated character class [ 
 input record number 2, file 
 source line number 93
*** Error 2 in /tmp/awk-problem (Makefile:4 'all')

I am going to start a ports build with the January 5 version, i.e. the
following backout:

Index: FIXES
===
RCS file: /cvs/src/usr.bin/awk/FIXES,v
retrieving revision 1.33
diff -u -p -r1.33 FIXES
--- FIXES   10 Jun 2020 21:06:09 -  1.33
+++ FIXES   11 Jun 2020 11:02:48 -
@@ -1,4 +1,4 @@
-/* $OpenBSD: FIXES,v 1.33 2020/06/10 21:06:09 millert Exp $*/
+/* $OpenBSD: FIXES,v 1.30 2020/06/10 21:04:40 millert Exp $*/
 /
 Copyright (C) Lucent Technologies 1997
 All Rights Reserved
@@ -26,60 +26,6 @@ THIS SOFTWARE.
 This file lists all bug fixes, changes, etc., made since the AWK book
 was sent to the printers in August, 1987.
 
-June 5, 2020:
-   In fldbld(), make sure that inputFS is set before trying to
-   use it. Thanks to Steffen Nurpmeso
-   for the report.
-
-May 5, 2020:
-   Fix checks for compilers that can handle noreturn. Thanks to
-   GitHub user enh-google for pointing it out. Closes Issue #79.
-
-April 16, 2020:
-   Handle old compilers that don't support C11 (for noreturn).
-   Thanks to Arnold Robbins.
-
-April 5, 2020:
-   Use <stdnoreturn.h> and noreturn instead of GCC attributes.
-   Thanks to GitHub user awkfan77. Closes PR #77.
-
-February 28, 2020:
-   More cleanups from Christos Zoulas: notably backslash continuation
-   inside strings removes the newline and a fix for RS = "^a".
-   Fix for address sanitizer-found problem. Thanks to GitHub user
-   enh-google.
-
-February 19, 2020:
-   More small cleanups from Christos Zoulas.
-
-February 18, 2020:
-   Additional cleanups from Christos Zoulas. It's no longer necessary
-   to use the -y flag to bison.
-
-February 6, 2020:
-   Additional small cleanups from Christos Zoulas. awk is now
-   a little more robust about reporting I/O errors upon exit.
-
-January 31, 2020:
-   Merge PR #70, which avoids use of variable length arrays. Thanks
-   to GitHub user michaelforney.  Fix issue #60 ({0} in interval
-   expressions doesn't work).  Also get all tests working again.
-   Thanks to Arnold Robbins.
-
-January 24, 2020:
-   A number of small cleanups from Christos Zoulas.  Add the close
-   on exec flag to files/pipes opened for redirection; courtesy of
-   Arnold Robbins.
-
-January 19, 2020:
-   If POSIXLY_CORRECT is set in the environment, then sub and gsub
-   use POSIX rules for multiple backslashes.  This fixes Issue #66,
-   while maintaining backwards compatibility.
-
-January 9, 2020:
-   Input/output errors on closing files are now fatal instead of
-   mere warnings. Thanks to Martijn Dekker.
-
 January 5, 2020:
Fix a bug in the concatentation of two string constants into
one done in the grammar.  Fixes GitHub issue #61.  Thanks
@@ -118,13 +64,13 @@ October 25, 2019:
 
 October 24, 2019:
Import second round of code cleanups from NetBSD. Much thanks
-   to Christos Zoulas (GitHub user zoulasc). Merges PR 53.
+   to Christos Zoulas (Github user zoulasc). Merges PR 53.
Add an optimization for string concatenation, also from
Christos.
 
 October 17, 2019:
Import code cleanups from NetBSD. Much thanks to Christos
-   Zoulas (GitHub user zoulasc). Merges PR 51.
+   Zoulas (Github user zoulasc). Merges PR 51.
 
 October 6, 2019:
Import code from NetBSD awk that implements RS as a regular
@@ -132,7 +78,7 @@ October 6, 2019:
 
 September 10, 2019:
Fixes for various array / memory overruns found via gcc's
-   -fsanitize=unknown. Thanks to Alexander Richardson (GitHub
+   -fsanitize=unknown. Thanks to Alexander Richardson (Github
user arichardson). Merges PRs 47 and 48.
 
 July 28, 2019:
Index: README.md
===
RCS file: /cvs/src/usr.bin/awk/README.md,v
retrieving revision 1.2
diff -u -p -r1.2 README.md
--- README.md   10 Jun 2020 21:05:50 -  1.2
+++ README.md   11 Jun 2020 11:02:48 -
@@ -1,4 +1,4 @@
-$OpenBSD: README.md,v 1.2 2020/06/10 21:05:50 millert Exp $
+$OpenBSD: README.md,v 1.1 2020/06/10 21:04:40 millert Exp $
 
 # The One True Awk
 
@@ -44,19 +44,7 @@ Thanks.
 
 ## Submitting Pull

Re: drm panic

2020-06-11 Thread Jonathan Gray

On Thu, Jun 11, 2020 at 11:07:11AM +0100, Laurence Tratt wrote:
> The recent DRM update has fixed one long-ish-standing bug for me where Xorg
> sometimes would get stuck while executing Xsession (I could never work out
> why) which is really good!
> 
> However I've now had the following kernel panic several times:
> 
>   kernel: page fault trap, code=0
>   Stopped at intel_partial_pages+0xf4:    movq 0x58,%rsi
> 
> Unfortunately that also seems to take out my keyboard at ddb so I have no
> further information beyond my dmesg :/

Try this, the code in question does

sg_set_page(sg, NULL, I915_GTT_PAGE_SIZE, 0);
sg_dma_address(sg) =
i915_gem_object_get_dma_address(obj, src_idx);
sg_dma_len(sg) = I915_GTT_PAGE_SIZE;

VM_PAGE_TO_PHYS() will attempt to deref NULL.
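
A userland sketch of the guard, assuming (as stated above) that the address
translation macro dereferences its argument. struct page, PAGE_TO_PHYS() and
struct sg_entry are simplified stand-ins invented for the example, not the
kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel types. */
struct page {
	uint64_t	 phys_addr;
};

/* Dereferences pg, so it must never be handed a NULL pointer. */
#define PAGE_TO_PHYS(pg)	((pg)->phys_addr)

struct sg_entry {
	struct page	*page;
	uint64_t	 dma_address;
	unsigned int	 offset;
	unsigned int	 length;
};

void
sg_entry_set_page(struct sg_entry *sg, struct page *page,
    unsigned int length, unsigned int offset)
{
	assert(sg != NULL);
	sg->page = page;
	/*
	 * Translate only when there is a page.  With a NULL page,
	 * dma_address is left alone for the caller to fill in.
	 */
	if (page != NULL)
		sg->dma_address = PAGE_TO_PHYS(page);
	sg->offset = offset;
	sg->length = length;
}
```

A caller that passes NULL and then sets dma_address itself, like the code
quoted above, no longer faults inside the setter.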

Index: sys/dev/pci/drm/include/linux/scatterlist.h
===
RCS file: /cvs/src/sys/dev/pci/drm/include/linux/scatterlist.h,v
retrieving revision 1.2
diff -u -p -r1.2 scatterlist.h
--- sys/dev/pci/drm/include/linux/scatterlist.h 8 Jun 2020 04:48:15 -   1.2
+++ sys/dev/pci/drm/include/linux/scatterlist.h 11 Jun 2020 10:54:05 -
@@ -115,7 +115,8 @@ sg_set_page(struct scatterlist *sgl, str
 unsigned int length, unsigned int offset)
 {
sgl->__page = page;
-   sgl->dma_address = VM_PAGE_TO_PHYS(page);
+   if (page != NULL)
+   sgl->dma_address = VM_PAGE_TO_PHYS(page);
sgl->offset = offset;
sgl->length = length;
sgl->end = false;

Re: drm panic

2020-06-11 Thread Mark Kettenis

> Date: Thu, 11 Jun 2020 11:07:11 +0100
> From: Laurence Tratt 
> 
> The recent DRM update has fixed one long-ish-standing bug for me where Xorg
> sometimes would get stuck while executing Xsession (I could never work out
> why) which is really good!
> 
> However I've now had the following kernel panic several times:
> 
>   kernel: page fault trap, code=0
>   Stopped at intel_partial_pages+0xf4:    movq 0x58,%rsi
> 
> Unfortunately that also seems to take out my keyboard at ddb so I have no
> further information beyond my dmesg :/

That is a bit unfortunate; more information would be useful...

Can you put the following line in /etc/sysctl.conf:

  machdep.forceukbd=1

and reboot?  That might help the next time you hit this.

> OpenBSD 6.7-current (GENERIC.MP) #258: Wed Jun 10 20:46:20 MDT 2020
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 17028239360 (16239MB)
> avail mem = 16497246208 (15733MB)
> random: good seed from bootblocks
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 3.0 @ 0xc320 (89 entries)
> bios0: vendor American Megatrends Inc. version "3805" date 05/16/2018
> bios0: ASUSTeK COMPUTER INC. Z170M-PLUS
> acpi0 at bios0: ACPI 6.1
> acpi0: sleep states S0 S3 S4 S5
> acpi0: tables DSDT FACP APIC FPDT BGRT MCFG SSDT FIDT SSDT SSDT HPET SSDT 
> SSDT UEFI SSDT LPIT WSMT SSDT SSDT DBGP DBG2
> acpi0: wakeup devices PEG0(S4) PEGP(S4) PEG1(S4) PEGP(S4) PEG2(S4) PEGP(S4) 
> SIO1(S3) UAR1(S4) RP09(S4) PXSX(S4) RP10(S4) PXSX(S4) RP11(S4) PXSX(S4) 
> RP12(S4) PXSX(S4) [...]
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz, 4011.41 MHz, 06-5e-03
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
> cpu0: apic clock running at 24MHz
> cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1, IBE
> cpu1 at mainbus0: apid 2 (application processor)
> cpu1: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz, 4009.91 MHz, 06-5e-03
> cpu1: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
> cpu1: 256KB 64b/line 8-way L2 cache
> cpu1: smt 0, core 1, package 0
> cpu2 at mainbus0: apid 4 (application processor)
> cpu2: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz, 4009.91 MHz, 06-5e-03
> cpu2: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
> cpu2: 256KB 64b/line 8-way L2 cache
> cpu2: smt 0, core 2, package 0
> cpu3 at mainbus0: apid 6 (application processor)
> cpu3: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz, 4009.91 MHz, 06-5e-03
> cpu3: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
> cpu3: 256KB 64b/line 8-way L2 cache
> cpu3: smt 0, core 3, package 0
> ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 120 pins
> acpimcfg0 at acpi0
> acpimcfg0: addr 0xf800, bus 0-63
> acpihpet0 at acpi0: 2399 Hz
> acpiprt0 at acpi0: bus 0 (PCI0)
> acpiprt1 at acpi0: bus -1 (PEG0)
> acpiprt2 at acpi0: bus -1 (PEG1)
> acpiprt3 at acpi0: bus -1

Re: awk: FS pattern separation issue

2020-06-11 Thread Stuart Henderson

On 2020/06/11 11:48, Charlene Wendling wrote:
> >Synopsis:FS pattern separation issue 
> >Category:awk
> >Environment:
>   System  : OpenBSD 6.7
>   Details : OpenBSD 6.7-current (GENERIC.MP) #258: Wed Jun 10
> 20:46:20 MDT 2020
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> 
>   Architecture: OpenBSD.amd64
>   Machine : amd64
> >Description:
> 
>   In the latest awk we're proposing, FS pattern separation does
>   not work properly and makes neofetch fail to properly get
>   the screen resolution:
> 
> $ xrandr --nograb --current | awk -F 'connected |\+|\(' '/ connected/
> && $2 {printf $2 ", "}'
> ed primary 1920x1080+0+0 (normal left inverted right x axis y axis)
> 521mm x 293mm,

It would be better to show the exact text fed into awk; xrandr output
depends on the machine you run it on.

For all the old versions I get this with your command:

illegal primary in regular expression connected |+|( at |(

drm panic

2020-06-11 Thread Laurence Tratt

The recent DRM update has fixed one long-ish-standing bug for me where Xorg
sometimes would get stuck while executing Xsession (I could never work out
why) which is really good!

However I've now had the following kernel panic several times:

  kernel: page fault trap, code=0
  Stopped at intel_partial_pages+0xf4:moq 0x58,%rsi

Unfortunately that also seems to take out my keyboard at ddb so I have no
further information beyond my dmesg :/


Laurie


OpenBSD 6.7-current (GENERIC.MP) #258: Wed Jun 10 20:46:20 MDT 2020
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 17028239360 (16239MB)
avail mem = 16497246208 (15733MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.0 @ 0xc320 (89 entries)
bios0: vendor American Megatrends Inc. version "3805" date 05/16/2018
bios0: ASUSTeK COMPUTER INC. Z170M-PLUS
acpi0 at bios0: ACPI 6.1
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP APIC FPDT BGRT MCFG SSDT FIDT SSDT SSDT HPET SSDT SSDT 
UEFI SSDT LPIT WSMT SSDT SSDT DBGP DBG2
acpi0: wakeup devices PEG0(S4) PEGP(S4) PEG1(S4) PEGP(S4) PEG2(S4) PEGP(S4) 
SIO1(S3) UAR1(S4) RP09(S4) PXSX(S4) RP10(S4) PXSX(S4) RP11(S4) PXSX(S4) 
RP12(S4) PXSX(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz, 4011.41 MHz, 06-5e-03
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 24MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz, 4009.91 MHz, 06-5e-03
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz, 4009.91 MHz, 06-5e-03
cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 6 (application processor)
cpu3: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz, 4009.91 MHz, 06-5e-03
cpu3: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 0, core 3, package 0
ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 120 pins
acpimcfg0 at acpi0
acpimcfg0: addr 0xf800, bus 0-63
acpihpet0 at acpi0: 2399 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (PEG0)
acpiprt2 at acpi0: bus -1 (PEG1)
acpiprt3 at acpi0: bus -1 (PEG2)
acpiprt4 at acpi0: bus 4 (RP09)
acpiprt5 at acpi0: bus -1 (RP10)
acpiprt6 at acpi0: bus -1 (RP11)
acpiprt7 at acpi0: bus -1 (RP12)
acpiprt8 at acpi0: bus -1 (RP13)
acpiprt9 at acpi0: bus 3 (RP01)
acpiprt10 at acpi0: bus -1 (RP02)
acpiprt11 at acpi0: bus -1 (RP03)
acpiprt12 at acpi0: bus -1 (RP04)
acpiprt13 at acpi0: bus -1 (RP05)
acpiprt14 at acpi0: bus -1 (RP06)
acpiprt15 at

Re: OpenBSD 6.7 crashes on APU2C4 with LTE modem Huawei E3372s-153 HiLink

2020-06-11 Thread Łukasz Lejtkowski

Hi Gerhard,

Today I applied your patches to 6.7-stable and moved the LTE modem back to USB 3.0.
So now I'm just waiting for… nothing, or a kernel panic. I’ll let you know.

> On 8 Jun 2020, at 19:13, Patrick Wildt  wrote:
> 
> On Mon, Jun 08, 2020 at 05:31:44PM +0200, Gerhard Roth wrote:
>> On 2020-05-25 13:19, Martin Pieuchot wrote:
>>> On 25/05/20(Mon) 12:56, Gerhard Roth wrote:
>>>> On 5/22/20 9:05 PM, Mark Kettenis wrote:
>>>>>> From: Łukasz Lejtkowski
>>>>>> Date: Fri, 22 May 2020 20:51:57 +0200
>>>>>> 
>>>>>> Probably power supply 12 V is broken. Showing 16,87 V(Fluke 179) -
>>>>>> too high. Should be 12,25-12,50 V. I replaced to the new one.
>>>>> 
>>>>> That might be why the device stops responding.  The fact that cleaning
>>>>> up from a failed USB transaction leads to this panic is a bug though.
>>>>> 
>>>>> And somebody just posted a very similar panic with ure(4).  Something
>>>>> in the network stack is holding a mutex when it shouldn't.
>>>> 
>>>> I think that holding the mutex is ok. The bug is calling the stop
>>>> routine in case of errors.
>>>> 
>>>> This is what common foo_start() does:
>>>> 
>>>> 	m_head = ifq_deq_begin(&ifp->if_snd);
>>>> 	if (foo_encap(sc, m_head, 0)) {
>>>> 		ifq_deq_rollback(&ifp->if_snd, m_head);
>>>> 		...
>>>> 		return;
>>>> 	}
>>>> 	ifq_deq_commit(&ifp->if_snd, m_head);
>>>> 
>>>> Here, ifq_deq_begin() grabs a mutex and it is held while
>>>> calling foo_encap().
>>>> 
>>>> For USB network interfaces foo_encap() mostly does this:
>>>> 
>>>> 	err = usbd_transfer(sc->sc_xfer);
>>>> 	if (err != USBD_IN_PROGRESS) {
>>>> 		foo_stop(sc);
>>>> 		return EIO;
>>>> 	}
>>>> 
>>>> And foo_stop() calls usbd_abort_pipe() -> xhci_command_submit(),
>>>> which might sleep.
>>>> 
>>>> How to fix? We could do the foo_encap() after the ifq_deq_commit(),
>>>> possibly dropping the current mbuf if encap fails (who cares
>>>> for the packets after foo_stop() anyway).
>>> 
>>> That's the approach taken by drivers using ifq_dequeue(9) instead of
>>> ifq_deq_begin/commit().
>>> 
>>>> Or change all the drivers to follow the path that if_aue.c takes:
>>>> 
>>>> 	err = usbd_transfer(c->aue_xfer);
>>>> 	if (err != USBD_IN_PROGRESS) {
>>>> 		...
>>>> 		/* Stop the interface from process context. */
>>>> 		usb_add_task(sc->aue_udev, &sc->aue_stop_task);
>>>> 		return (EIO);
>>>> 	}
>>> 
>>> That's just trading the current problem for another one with higher
>>> complexity.
>>> 
>>>> Any ideas, what's better? Or alternative proposals?
>>> 
>>> Using ifq_dequeue(9) would have the advantage of unifying the code base.
>>> It introduces a behavior change.  A simpler fix would be to call
>>> foo_stop() in the error path after ifq_deq_rollback().
>>> 
>> 
>> Hi,
>> 
>> two weeks passed and nobody objected to Martin's proposal. So I thought
>> we could try to move on this way.
>> 
>> Gerhard
>> 
> 
> From what I remember from various discussions, the goal should be to
> check if there's a buffer free in the ring, then dequeue and send, and
> if it can't be sent out, then drop it.  With USB apparently those
> drivers "always" have an open buffer, so we can just dequeue and send,
> like you do in this diff.  And if it gets dropped, that's fine.
> 
> That said, I think IFQ_DEQUEUE() is old compat code, and we actually
> nowadays prefer:
> 
> 	m_head = ifq_dequeue(&ifp->if_snd);
> 
> If you look at the define for IFQ_DEQUEUE() you'll see it's marked
> as compat code.  If you look at a new driver, like ixl(4), you'll
> see that it also uses ifq_dequeue().
> 
> Sorry to give you some work, but with that fixed: ok patrick@
> 
> Patrick
> 
>> 
>> Index: sys/dev/usb/if_axe.c
>> ===
>> RCS file: /cvs/src/sys/dev/usb/if_axe.c,v
>> retrieving revision 1.139
>> diff -u -p -u -p -r1.139 if_axe.c
>> --- sys/dev/usb/if_axe.c 7 Jul 2019 06:40:10 -   1.139
>> +++ sys/dev/usb/if_axe.c 8 Jun 2020 15:13:25 -
>> @@ -1223,6 +1223,7 @@ axe_encap(struct axe_softc *sc, struct m
>>  	/* Transmit */
>>  	err = usbd_transfer(c->axe_xfer);
>>  	if (err != USBD_IN_PROGRESS) {
>> +		c->axe_mbuf = NULL;
>>  		axe_stop(sc);
>>  		return(EIO);
>>  	}
>> @@ -1246,16 +1247,15 @@ axe_start(struct ifnet *ifp)
>>  	if (ifq_is_oactive(&ifp->if_snd))
>>  		return;
>> 
>> -	m_head = ifq_deq_begin(&ifp->if_snd);
>> +	IFQ_DEQUEUE(&ifp->if_snd, m_head);
>>  	if (m_head == NULL)
>>  		return;
>> 
>>  	if (axe_encap(sc, m_head, 0)) {
>> -		ifq_deq_rollback(&ifp->if_snd, m_head);
>> +		m_freem(m_head);
>>  		ifq_set_oactive(&ifp->if_snd);
>>  		return;
>>  	}
>> -	ifq_deq_commit(&ifp->if_snd, m_head);
>> 
>>  	/*
>>  	 * If there's a BPF listener, bounce a copy of this frame
>> Index: sys/dev/usb/if_axen.c
>> ===
>> RCS file:
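
The two transmit-path shapes discussed in the thread above can be compared in a small userspace model. This is only a sketch under stated assumptions, not kernel code: `struct pkt`, `struct txq`, `struct softc`, `dev_stop()`, `dev_encap()`, `start_begin_commit()` and `start_dequeue()` are hypothetical stand-ins, and the `locked` flag merely models the mutex that, according to the thread, ifq_deq_begin() holds while foo_encap() runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's real types. */
struct pkt {
	int id;
	struct pkt *next;
};

struct txq {
	struct pkt *head;
	bool locked;		/* models the mutex held after ifq_deq_begin() */
};

struct softc {
	struct txq q;
	bool hw_ok;		/* false makes the modelled transfer fail */
	bool slept_locked;	/* set if the stop routine ran with the lock held */
	bool stopped;
	int dropped;
};

/* Models foo_stop(): it may sleep, so running it with the lock held is the bug. */
static void
dev_stop(struct softc *sc)
{
	if (sc->q.locked)
		sc->slept_locked = true;
	sc->stopped = true;
}

/* Models foo_encap(): on a failed transfer it stops the interface. */
static int
dev_encap(struct softc *sc, struct pkt *p)
{
	(void)p;
	if (!sc->hw_ok) {
		dev_stop(sc);
		return -1;
	}
	return 0;
}

/* Old shape: begin, encap while locked, then rollback or commit. */
static void
start_begin_commit(struct softc *sc)
{
	struct pkt *p = sc->q.head;

	if (p == NULL)
		return;
	sc->q.locked = true;		/* ifq_deq_begin() */
	if (dev_encap(sc, p) != 0) {
		sc->q.locked = false;	/* ifq_deq_rollback(): packet stays queued */
		return;
	}
	sc->q.head = p->next;		/* ifq_deq_commit() */
	sc->q.locked = false;
}

/* New shape: dequeue first, then encap; drop the packet if encap fails. */
static void
start_dequeue(struct softc *sc)
{
	struct pkt *p = sc->q.head;

	if (p == NULL)
		return;
	sc->q.head = p->next;		/* ifq_dequeue(): lock released before return */
	p->next = NULL;
	assert(!sc->q.locked);		/* nothing is held across dev_encap() */
	if (dev_encap(sc, p) != 0)
		sc->dropped++;		/* corresponds to m_freem() in the diff */
}
```

In this model a failed transfer through start_begin_commit() leaves the packet queued but records that the stop routine ran with the lock held, while start_dequeue() drops the packet and never calls the stop routine under the lock. That is the trade-off the thread settles on.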

Re: Interfaces errors and latency spikes with Intel 82583V

2020-06-11 Thread Gabri Tofano


After extensive testing, the latency spikes showed up again:

To the inside interface of the firewall:

Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time=132ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254

And to the firewall's next hop (ISP ONT) at the same time:

Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
Reply from 74.215.235.1: bytes=32 time=3ms TTL=62
Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
Reply from 74.215.235.1: bytes=32 time=3ms TTL=62
Reply from 74.215.235.1: bytes=32 time=242ms TTL=62
Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
Reply from 74.215.235.1: bytes=32 time=2ms TTL=62
Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
Reply from 74.215.235.1: bytes=32 time=1ms TTL=62
Reply from 74.215.235.1: bytes=32 time=3ms TTL=62
Reply from 74.215.235.1: bytes=32 time=3ms TTL=62
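
Spikes like the ones in the two ping logs above can be picked out of a long capture mechanically. The following is a minimal sketch, assuming the "Reply from ... time=NNms" / "time<1ms" line format shown above; `parse_rtt_ms()` and `is_spike()` are hypothetical helper names, and any threshold passed to `is_spike()` (for example 100 ms) is an arbitrary choice, not a value from this thread.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Extract the round-trip time in milliseconds from one ping reply line.
 * "time<1ms" is reported as 1 (an upper bound). Returns -1 if the line
 * has no parsable time field, e.g. a timeout message.
 */
static long
parse_rtt_ms(const char *line)
{
	const char *p = strstr(line, "time");
	char *end;
	long ms;

	if (p == NULL)
		return -1;
	p += 4;				/* skip "time" */
	if (*p != '=' && *p != '<')
		return -1;
	p++;
	ms = strtol(p, &end, 10);
	if (end == p || strncmp(end, "ms", 2) != 0)
		return -1;
	return ms;
}

/* Nonzero if the line reports a round-trip time at or above threshold_ms. */
static int
is_spike(const char *line, long threshold_ms)
{
	return parse_rtt_ms(line) >= threshold_ms;
}
```

Feeding each captured line to is_spike() with a threshold well above the usual 1-3 ms would flag only the 132 ms and 242 ms replies in the logs above.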

Interface errors are now showing up just on the output:

#netstat -i
Name    Mtu   Network     Address              Ipkts Ifail    Opkts Ofail Colls
em0     1500  <Link>      XX:XX:XX:XX:XX:XX    22655     0    41589     0     0
em0     1500  XX.XX.XX.XX XX:XX:XX:XX:XX:XX    22655     0    41589     0     0
em1     1500  <Link>      XX:XX:XX:XX:XX:XX    39924     0    20476     1     0
em1     1500  172.16.200. XX:XX:XX:XX:XX:XX    39924     0    20476     1     0
em2     1500  <Link>      XX:XX:XX:XX:XX:XX      427     0      330     2     0
em2     1500  172.16.103/ XX:XX:XX:XX:XX:XX      427     0      330     2     0
em3*    1500  <Link>      XX:XX:XX:XX:XX:XX        0     0        0     0     0
enc0*   0     <Link>                               0     0        0     0     0
pflog0  33136 <Link>                               0     0     1294     0     0


Real-time UDP traffic is the most affected, since it is very sensitive to
latency, and I keep seeing spikes while playing online.

Thank you!
Gabri

On 2020-06-10 22:50, Gabri Tofano wrote:

Another user pointed out to me that the OpenBSD 6.7 release notes
include a statement regarding the em(4) driver: "Improvements in
the em(4) driver." So I gave it a try and reinstalled with
OpenBSD 6.6. It looks like the system is now stable, and latency
spikes/interface errors are not present at all, even under heavy
traffic loads. I am not sure what introduced the issue, but maybe one
of the devs can take a look?

Thank you!
Gabri

On 2020-06-09 13:01, Gabri Tofano wrote:

Hi all,

I'm using a "Protectli FW1" with FreeBSD 12.1 amd64 as a firewall,
which has been serving me with great performance and no issues at all. The
appliance has 4 Intel Gigabit 82583V Ethernet NIC ports which are
working very well. I used pfSense as well prior to FreeBSD and it
worked without issues too.

I decided to move to OpenBSD 6.7 amd64 in order to benefit
from the latest pf (and other) features, but unfortunately the OS is
giving me an issue which I guess is related to the NIC drivers. When I
was connected via ssh I noticed some glitches while I was
typing/moving around in the editor, so I started pinging the inside
interface from a wired PC and found that from time to time
the appliance responds with a 100+/200+ ms delay (I have cut
some 1ms replies to make it shorter):

Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
Reply from 172.16.200.1: bytes=32 time=163ms TTL=254
Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
Reply from 172.16.200.1: bytes=32 time=2ms TTL=254
Reply from 172.16.200.1: bytes=32 time=1ms TTL=254
Reply from 172.16.200.1: bytes=32 time<1ms TTL=254
