Chrome crashes on U2F prompt

2020-06-17 Thread openbsd
>Synopsis:  Chrome crashes on U2F prompt
>Category:
>Environment:
System  : OpenBSD 6.7
Details : OpenBSD 6.7-current (GENERIC.MP) #272: Mon Jun 15 01:54:58 MDT 2020
          dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

Architecture: OpenBSD.amd64
Machine : amd64
>Description:
Chrome crashes immediately when trying to add or use a YubiKey U2F
security key on Stripe and GitHub.

The crash occurs with or without a security key inserted into the
computer.

Nothing appears in dmesg indicating a pledge issue.
>How-To-Repeat:
1. Go to the Stripe dashboard. Navigate to your profile
2. Click "Add authentication step"
3. Click "Add security key...". The browser will immediately crash.
>Fix:
Unknown


dmesg:
OpenBSD 6.7-current (GENERIC.MP) #272: Mon Jun 15 01:54:58 MDT 2020
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 17023057920 (16234MB)
avail mem = 16492204032 (15728MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.0 @ 0x8b1a8000 (94 entries)
bios0: vendor American Megatrends Inc. version "3801" date 03/14/2018
bios0: ASUSTeK COMPUTER INC. Z170-AR
acpi0 at bios0: ACPI 6.0
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP APIC FPDT DBG2 MCFG SSDT FIDT SSDT SSDT HPET SSDT SSDT UEFI SSDT LPIT WSMT SSDT SSDT DBGP
acpi0: wakeup devices PEG0(S4) PEGP(S4) PEG1(S4) PEGP(S4) PEG2(S4) PEGP(S4) SIO1(S3) PS2K(S4) PS2M(S4) UAR1(S4) RP09(S4) PXSX(S4) RP10(S4) PXSX(S4) RP11(S4) PXSX(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz, 3510.92 MHz, 06-5e-03
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 24MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz, 3509.51 MHz, 06-5e-03
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz, 3509.51 MHz, 06-5e-03
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 6 (application processor)
cpu3: Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz, 3509.50 MHz, 06-5e-03
cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 0, core 3, package 0
ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 120 pins
acpimcfg0 at acpi0
acpimcfg0: addr 0xe000, bus 0-255
acpihpet0 at acpi0: 2399 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (PEG0)
acpiprt2 at acpi0: bus -1 (PEG1)

Re: NSD sendto issue

2020-06-17 Thread Joerg Jung


> On 17. Feb 2020, at 15:16, Martin Pieuchot  wrote:
> On 17/02/20(Mon) 14:55, Joerg Jung wrote:
>> 
>>> On 26. Sep 2019, at 15:02, Stuart Henderson  wrote:
>>> On 2019/09/26 13:45, Stuart Henderson wrote:
>>>> On 2019/09/26 11:16, Joerg Jung wrote:
>>>>> 
>>>>> 
>>>>> I run a few busy (~800 req/s) NSD servers which I upgraded 
>>>>> to 6.5, all stock/default OpenBSD, e.g. I’ve not tweaked any 
>>>>> sysctl values and nsd.conf matches the default as well, just 
>>>>> added a few hundred zones.
>>>>> 
>>>>> Now, when I increase servers from default 1 to 2 in nsd.conf: 
>>>>>   server-count: 2
>>>>> it starts spamming my log with:
>>>>>   nsd[62723]: sendto 1.2.3.4 failed: Resource temporarily unavailable
>>>>> 
>>>>> checking the source, server.c seems not to handle EAGAIN 
>>>>> after sendto() and does not recover or retry, it just increases
>>>>> txerr statistic count - so answer seems really lost :(
>>>>> 
>>>>> I tried higher debug level, as well as increasing socket buffers to: 
>>>>>   net.inet.udp.recvspace= 65536
>>>>>   net.inet.udp.sendspace=65636
>>>>> but both didn’t help and netstat -s -p udp does show 
>>>>>   0 dropped due to full socket buffers  
>>>>> anyways. So, I don’t believe this is a socket buffer issue.
>>>>> 
>>>>> The same server-count: 2 setting worked fine with 6.3.
>>>>> 
>>>>> Any hints, insights, or pointers?
>>>>> Does anyone else experience the same?
>>>>> 
>>>> 
>>>> Maybe it's worth trying to track down further whether this is due to an
>>>> NSD change or something else in the OS - cvs up -r OPENBSD_6_3 .. (be sure
>>>> to use "make -f Makefile.bsd-wrapper [..]" when building).
>>>> 
>>> 
>>> Or, following a comment from claudio@, try a kernel built with this:
>> 
>> FYI, I tried that diff and a few other things but neither did help. 

FYI, after upgrading to 6.7 the issue is still the same. Only the
format of the syslog message differs: it no longer mentions
sendto() explicitly. The new error, repeated countless times in
the log, looks like this:
nsd[55919]: sendmmsg [0]=1.2.3.4 count=1 failed: Resource temporarily unavailable

> Did you ktrace(1) the problem?

Yes, please see below.

> How is sendto(2) called, in particular
> is there any MSG_DONTWAIT or FNONBLOCK set on the file descriptor?  

From the ktrace dump, the culprit seems to be recvfrom() rather than sendto().
Neither sendto() nor recvfrom() seems to have any flags set.

> Does
> that mean the kernel returns EWOULDBLOCK even if the userland said it is
> fine to block?

I don’t think so. 


From my limited understanding, the two processes concurrently 
try to recvfrom() after BOTH received a kevent() notification, 
with only one being the “winner” and the other one 
logging the failure?

Shouldn’t only one of the two try to call recvfrom()?

Not sure if this is a bug in OpenBSD or NSD, but
it worked fine in earlier releases.

However, this one seems related:
https://www.nlnetlabs.nl/bugs-script/show_bug.cgi?id=385 
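
The race described above can be sketched in isolation. The following is a
minimal sketch, not NSD code: recv_after_wakeup() and enum recv_outcome are
hypothetical names, and MSG_DONTWAIT stands in for whatever makes the real
socket non-blocking. It assumes that several workers are woken for the same
datagram and that the worker which finds the socket already drained should
treat EAGAIN as "lost the race" rather than as an error worth logging.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical outcome codes for one receive attempt after a wakeup. */
enum recv_outcome { RECV_GOT_DATA, RECV_LOST_RACE, RECV_FAILED };

/*
 * Try one non-blocking receive after a readiness notification.
 * EAGAIN/EWOULDBLOCK means another worker drained the socket first,
 * so it is reported as a benign lost race instead of a failure.
 */
static enum recv_outcome
recv_after_wakeup(int fd, void *buf, size_t len, ssize_t *nread)
{
	ssize_t n = recv(fd, buf, len, MSG_DONTWAIT);

	if (n >= 0) {
		*nread = n;
		return RECV_GOT_DATA;
	}
	if (errno == EAGAIN || errno == EWOULDBLOCK)
		return RECV_LOST_RACE;	/* nothing left to read */
	return RECV_FAILED;		/* a real error, worth logging */
}
```

With one datagram queued, the first call returns RECV_GOT_DATA and a second
call on the same socket returns RECV_LOST_RACE, which mirrors the two
recvfrom() results in the dump below.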




 52438 nsd  STRU  struct kevent { ident=5, filter=EVFILT_READ, flags=0x1, fflags=0<>, data=62, udata=0x30562fe4198 }
 55919 nsd  STRU  struct kevent { ident=5, filter=EVFILT_READ, flags=0x1, fflags=0<>, data=62, udata=0x30562fe4198 }
 52438 nsd  RET   kevent 1
 55919 nsd  RET   kevent 1
 52438 nsd  CALL  clock_gettime(CLOCK_MONOTONIC,0x7f7bf1a0)
 55919 nsd  CALL  clock_gettime(CLOCK_MONOTONIC,0x7f7bf1a0)
 52438 nsd  STRU  struct timespec { 1224.307942259 }
 55919 nsd  STRU  struct timespec { 1224.307946954 }
 52438 nsd  RET   clock_gettime 0
 55919 nsd  RET   clock_gettime 0
 52438 nsd  CALL  recvfrom(5,0x30467ecd000,0x20109,0,0x304e553dc18,0x302669c2aa8)
 55919 nsd  CALL  recvfrom(5,0x30497a9d000,0x20109,0,0x304e553dc18,0x302669c2aa8)
 52438 nsd  GIO   fd 5 read 46 bytes
       "\aY\0\^P\0\^A\0\0\0\0\0\^A\^Dwpad\^Efoo\^Fbar\0\0\^\\0\^A\0\0)\^P\0\0\0\M^@\0\0\0"
 55919 nsd  RET   recvfrom -1 errno 35 Resource temporarily unavailable
 52438 nsd  STRU  struct sockaddr { AF_INET, 1.2.3.4:40276 }
 55919 nsd  CALL  clock_gettime(CLOCK_MONOTONIC,0x7f7bf1a0)
 52438 nsd  RET   recvfrom 46/0x2e
 55919 nsd  STRU  struct timespec { 1224.307967366 }
 55919 nsd  RET   clock_gettime 0
 52438 nsd  CALL  gettimeofday(0x7f7beed0,0)
 55919 nsd  CALL  clock_gettime(CLOCK_MONOTONIC,0x7f7bf1a0)
 52438 nsd  STRU  struct timeval { 1592402542<"Jun 17 10:02:22 2020">.287233 }
 55919 nsd  STRU  struct timespec { 1224.307977172 }
 52438 nsd  RET   gettimeofday 0
 55919 nsd  RET   clock_gettime 0
 52438 nsd  CALL  sendto(5,0x30467ecd000,0x2e,0,0x304e553dc18,0x10)
 55919 nsd  CALL  kevent(11,0,0,0x30549983000,64,0x7f7bf0f8)
 52438 nsd  STRU  struct sockaddr { AF_INET, 1.2.3.4:40276 }
 55919 nsd  STRU  struct timespec { 109.487449000 }
 52438 nsd  GIO   fd 5 wrote 46 bytes
   

Re: OpenBSD 6.7 crashes on APU2C4 with LTE modem Huawei E3372s-153 HiLink

2020-06-17 Thread Łukasz Lejtkowski
> Does it recover after doing
> 
>   # ifconfig cdcef0 down
>   # ifconfig cdcef0 up

root@master[~]ifconfig cdce0
cdce0: flags=8c03 mtu 1500
lladdr xx:xx:xx:xx:xx:xx
index 30 priority 0 llprio 3
inet 192.168.8.100 netmask 0xff00 broadcast 192.168.8.255

root@master[~]ping 192.168.8.1
PING 192.168.8.1 (192.168.8.1): 56 data bytes
ping: sendmsg: Network is down
ping: wrote 192.168.8.1 64 chars, ret=-1
ping: sendmsg: Network is down
ping: wrote 192.168.8.1 64 chars, ret=-1
ping: sendmsg: Network is down

root@master[~]ifconfig cdce0 down
root@master[~]ifconfig cdce0 up

root@master[~]ping 192.168.8.1
PING 192.168.8.1 (192.168.8.1): 56 data bytes
64 bytes from 192.168.8.1: icmp_seq=0 ttl=64 time=20.804 ms
64 bytes from 192.168.8.1: icmp_seq=1 ttl=64 time=13.976 ms
64 bytes from 192.168.8.1: icmp_seq=2 ttl=64 time=14.468 ms

Maybe this is a useful hint for your next patch?


> On 15 Jun 2020, at 20:09, Gerhard Roth  wrote:
> 
> On 2020-06-13 01:24, Łukasz Lejtkowski wrote:
>> Good news - no more kernel panics on USB 3.0(xHCI), it’s fixed.
>> Bad news - after 2-3h LTE modem lost local network connection via USB 
>> 3.0(cdce0). I have to remove modem and put it back to usb port - then local 
>> network connection between OpenBSD and modem back for 2-3h, sometimes 30-40 
>> min. It looks like the same problem as kernel panic, but this time there is 
>> lost network connection via usb 3.0(xhci).
>> root@master[~]ping 192.168.8.1
>> PING 192.168.8.1 (192.168.8.1): 56 data bytes
>> ping: sendmsg: Network is down
>> ping: wrote 192.168.8.1 64 chars, ret=-1
>> ping: sendmsg: Network is down
>> ping: wrote 192.168.8.1 64 chars, ret=-1
>> 192.168.8.1 is default static IP on lte modem.
>> Your changes in if_cdce.c 1.77 not completely fix the problem.
> 
> Hi,
> 
> yes, my patch just targeted to fix the panic as a reaction to USB problems; 
> not the USB problems themself.
> 
> Does it recover after doing
> 
>   # ifconfig cdcef0 down
>   # ifconfig cdcef0 up
> 
> Gerhard
> 
>>> On 11 Jun 2020, at 11:13, Łukasz Lejtkowski wrote:
>>> 
>>> Hi Gerhard,
>>> 
>>> Today I added Your patches to 6.7-stable and moved back LTE modem to USB 
>>> 3.0. So, just waiting for… nothing or kernel panic. I’ll let you know.
>>> 
>>>> On 8 Jun 2020, at 19:13, Patrick Wildt wrote:
>>>> 
>>>> On Mon, Jun 08, 2020 at 05:31:44PM +0200, Gerhard Roth wrote:
>>>>> On 2020-05-25 13:19, Martin Pieuchot wrote:
>>>>>> On 25/05/20(Mon) 12:56, Gerhard Roth wrote:
>>>>>>> On 5/22/20 9:05 PM, Mark Kettenis wrote:
>>>>>>>>> From: Łukasz Lejtkowski <emig...@gmail.com>
>>>>>>>>> Date: Fri, 22 May 2020 20:51:57 +0200
>>>>>>>>> 
>>>>>>>>> Probably power supply 12 V is broken. Showing 16,87 V(Fluke 179) -
>>>>>>>>> too high. Should be 12,25-12,50 V. I replaced to the new one.
>>>>>>>> 
>>>>>>>> That might be why the device stops responding.  The fact that cleaning
>>>>>>>> up from a failed USB transaction leads to this panic is a bug though.
>>>>>>>> 
>>>>>>>> And somebody just posted a very similar panic with ure(4).  Something
>>>>>>>> in the network stack is holding a mutex when it shouldn't.
>>>>>>> 
>>>>>>> I think that holding the mutex is ok. The bug is calling the stop
>>>>>>> routine in case of errors.
>>>>>>> 
>>>>>>> This is what common foo_start() does:
>>>>>>> 
>>>>>>>     m_head = ifq_deq_begin(&ifp->if_snd);
>>>>>>>     if (foo_encap(sc, m_head, 0)) {
>>>>>>>         ifq_deq_rollback(&ifp->if_snd, m_head);
>>>>>>>         ...
>>>>>>>         return;
>>>>>>>     }
>>>>>>>     ifq_deq_commit(&ifp->if_snd, m_head);
>>>>>>> 
>>>>>>> Here, ifq_deq_begin() grabs a mutex and it is held while
>>>>>>> calling foo_encap().
>>>>>>> 
>>>>>>> For USB network interfaces foo_encap() mostly does this:
>>>>>>> 
>>>>>>>     err = usbd_transfer(sc->sc_xfer);
>>>>>>>     if (err != USBD_IN_PROGRESS) {
>>>>>>>         foo_stop(sc);
>>>>>>>         return EIO;
>>>>>>>     }
>>>>>>> 
>>>>>>> And foo_stop() calls usbd_abort_pipe() -> xhci_command_submit(),
>>>>>>> which might sleep.
>>>>>>> 
>>>>>>> How to fix? We could do the foo_encap() after the ifq_deq_commit(),
>>>>>>> possibly dropping the current mbuf if encap fails (who cares
>>>>>>> for the packets after foo_stop() anyway).
>>>>>> 
>>>>>> That's the approach taken by drivers using ifq_dequeue(9) instead of
>>>>>> ifq_deq_begin/commit().
>>>>>> 
>>>>>>> Or change all the drivers to follow the path that if_aue.c takes:
>>>>>>> 
>>>>>>>     err = usbd_transfer(c->aue_xfer);
>>>>>>>     if (err != USBD_IN_PROGRESS) {
>>>>>>>         ...
>>>>>>>         /* Stop the interface from process context. */
>>>>>>>         usb_add_task(sc->aue_udev, &sc->aue_stop_task);
>>>>>>>         return (EIO);
>>>>>>>     }
>>>>>> 
>>>>>> That's just trading the current problem for another one with higher
>>>>>> complexity.
>>>>>> 
>>>>>>> Any ideas, what's better? Or alternative proposals?
>>>>>> 
>>>>>> Using ifq_dequeue(9) would have the advantage of unifying the code base.
>>>>>> It introduces a behavior change.  A simpler fix would be to call
>>>>>> 
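
The difference between the two dequeue styles discussed in the quoted text
can be illustrated with a userland toy model. This is only a sketch under
stated assumptions: struct toy_ifq, toy_stop(), toy_encap() and both start
routines are hypothetical stand-ins, not the real ifq(9) or usbd API, and the
mutex is modelled as a plain flag. A negative packet value marks a packet
whose transfer fails, and slept_locked counts how often the (possibly
sleeping) stop routine ran while the queue mutex was held.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_QLEN 8

/* Toy stand-ins only: none of this is the real ifq(9) or usbd API. */
struct toy_ifq {
	int	pkts[TOY_QLEN];	/* negative value = transfer will fail */
	size_t	head, len;
	bool	mtx_held;	/* models the mutex taken by ifq_deq_begin() */
	int	slept_locked;	/* sleeping calls made with the mutex held */
};

static void
toy_enqueue(struct toy_ifq *q, int pkt)
{
	if (q->len == TOY_QLEN)
		return;		/* queue full, drop */
	q->pkts[(q->head + q->len) % TOY_QLEN] = pkt;
	q->len++;
}

/* Models foo_stop(): may sleep, so it must not run with the mutex held. */
static void
toy_stop(struct toy_ifq *q)
{
	if (q->mtx_held)
		q->slept_locked++;
}

/* Models foo_encap(): on a failed transfer it stops the interface. */
static int
toy_encap(struct toy_ifq *q, int pkt)
{
	if (pkt < 0) {
		toy_stop(q);
		return -1;
	}
	return 0;
}

/* begin/rollback/commit style: encap runs while the mutex is held. */
static int
start_begin_commit(struct toy_ifq *q)
{
	if (q->len == 0)
		return 0;
	q->mtx_held = true;			/* "deq_begin" */
	if (toy_encap(q, q->pkts[q->head]) != 0) {
		q->mtx_held = false;		/* "deq_rollback" */
		return -1;
	}
	q->head = (q->head + 1) % TOY_QLEN;	/* "deq_commit" */
	q->len--;
	q->mtx_held = false;
	return 1;
}

/* dequeue style: take the packet first, encap without the mutex. */
static int
start_dequeue(struct toy_ifq *q)
{
	int pkt;

	if (q->len == 0)
		return 0;
	q->mtx_held = true;
	pkt = q->pkts[q->head];
	q->head = (q->head + 1) % TOY_QLEN;
	q->len--;
	q->mtx_held = false;
	if (toy_encap(q, pkt) != 0)
		return -1;			/* packet is dropped */
	return 1;
}
```

In this model a failing packet makes start_begin_commit() call the stop
routine with the flag set and leaves the packet queued, while
start_dequeue() calls it with the flag clear and drops the packet, which is
the behavior change mentioned in the quoted reply.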

Re: i915_request_create+0x4b: uvm_fault

2020-06-17 Thread Jonathan Gray
On Tue, Jun 16, 2020 at 06:46:41PM +0100, Stuart Henderson wrote:
> On 2020/06/15 21:50, Stuart Henderson wrote:
> > On 2020/06/14 15:45, Jonathan Gray wrote:
> > > On Sat, Jun 13, 2020 at 12:15:13PM +0100, Stuart Henderson wrote:
> > > > Same with a newer kernel.
> > > > 
> > > > OpenBSD 6.7-current (GENERIC.MP) #3: Thu Jun 11 19:47:48 BST 2020
> > > > st...@symphytum.spacehopper.org:/sys/arch/amd64/compile/GENERIC.MP
> > > > 
> > > > uvm_fault(0xfd86e2f6c120, 0x51, 0, 1) -> e
> > > > kernel: page fault trap, code=0
> > > > Stopped at  i915_request_create+0x4b:   movq    0x50(%r14),%rdi
> > > > ddb{1}> tr
> > > 
> > > 0x50 is the offset in the struct of requests
> > > r14 is 1 in both traces and appears to be tl
> > > 
> > > I don't yet see how that is possible, can you try this diff and tell me
> > > if the printf triggers?
> > 
> > I'm running with it, hasn't triggered yet (3h uptime).
> 
> After some various reboots (tcp nfs related..) I have now seen
> it a couple of times in my current boot. The timing of the second
> one, pretty much exactly 2h after the first one, seems interesting,
> I wonder if there will be another one in 55 mins..
> 
> 2020-06-16T13:36:45.550Z symphytum /bsd: OpenBSD 6.7-current (GENERIC.MP) #3: Tue Jun 16 13:35:25 BST 2020
> 2020-06-16T14:40:04.649Z symphytum /bsd: i915_request_create tl == 1
> 2020-06-16T16:40:04.413Z symphytum /bsd: i915_request_create tl == 1

The way we implement mutex_lock_interruptible() is
#define mutex_lock_interruptible(rwl)   -rw_enter(rwl, RW_WRITE | RW_INTR)

If something returned -1 we'd return 1 but all the error paths in the
functions involved should be returning positive errno values.
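
To make the arithmetic concrete, here is a small userland sketch. It is an
assumption-laden illustration, not the kernel code: ERR_PTR() and IS_ERR()
follow the usual Linux convention (the top 4095 addresses encode errors),
and toy_lock_interruptible() and toy_fault_address() are hypothetical
helpers. It shows that a stray -1 from the lock primitive becomes 1 after
negation, that IS_ERR() does not recognise the pointer value 1 as an error,
and that a load at offset 0x50 from it touches address 0x51, the address in
the uvm_fault line quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095	/* Linux convention: top 4095 addresses are errors */

/* Encode a (normally negative) error code in a pointer. */
static void *
ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* True only for pointer values in the top MAX_ERRNO addresses. */
static bool
IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Toy stand-in for the macro: negate what the lock primitive returned. */
static int
toy_lock_interruptible(int rw_enter_ret)
{
	return -rw_enter_ret;
}

/*
 * Mimic the caller: wrap the result in ERR_PTR(), report whether IS_ERR()
 * would catch it, and return the address a load at field_offset would
 * touch if it did not.
 */
static uintptr_t
toy_fault_address(int rw_enter_ret, uintptr_t field_offset, bool *caught)
{
	void *tl = ERR_PTR(toy_lock_interruptible(rw_enter_ret));

	*caught = IS_ERR(tl);
	return (uintptr_t)tl + field_offset;
}
```

A positive errno such as EINTR (4) is negated to -4, which IS_ERR() does
catch; only a negative return slips through as a small "valid" pointer.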

Can you try running with this as well to know which path is involved?

Index: dev/pci/drm/i915/gt/intel_context.h
===================================================================
RCS file: /cvs/src/sys/dev/pci/drm/i915/gt/intel_context.h,v
retrieving revision 1.1
diff -u -p -r1.1 intel_context.h
--- dev/pci/drm/i915/gt/intel_context.h 8 Jun 2020 04:48:13 -0000       1.1
+++ dev/pci/drm/i915/gt/intel_context.h 17 Jun 2020 05:53:14 -0000
@@ -145,9 +145,14 @@ intel_context_timeline_lock(struct intel
        struct intel_timeline *tl = ce->timeline;
        int err;
 
+       if ((vaddr_t)ce->timeline == 1)
+               printf("%s ce->timeline == 1\n", __func__);
+
        err = mutex_lock_interruptible(&tl->mutex);
-       if (err)
+       if (err) {
+               printf("%s mutex_lock_interruptible() ret %d\n", __func__, err);
                return ERR_PTR(err);
+       }
 
        return tl;
 }