from:"Patrick Wildt"

Re: xhci zero length transfers 'leak' one transfer buffer count

2020-12-23 Thread Patrick Wildt

On Wed, Dec 23, 2020 at 10:44:21AM +0100, Marcus Glocker wrote:
> On Wed, 23 Dec 2020 09:47:44 +0100
> Marcus Glocker  wrote:
> 
> > On Tue, 22 Dec 2020 20:55:41 +0100
> > Marcus Glocker  wrote:
> > 
> > > > > Did you consider incrementing xx->ntrb instead?
> > >   
> > > >That doesn't work either, because the status completion code needs
> > > >xx->ntrb to be correct for the data TD to be handled correctly.
> > > >Incrementing xx->ntrb means the number of TRBs for the data TD is
> > > >incorrect, since it includes the (optional) zero TD's TRB.
> > > >
> > > >In this case the zero TD allocates a TRB but doesn't do proper
> > > >accounting, and currently there's no place where this could be
> > > >accounted properly.
> > > >
> > > >In the end it's all software, so I guess the diff will simply have
> > > >to be bigger than just a one-liner.
> > >   
> > > > > > With the diff below the produced TRB isn't accounted which might
> > > > > > lead to an off-by-one.
> > > 
> > > Sorry, I missed this thread and had to re-grab the last mail from
> > > MARC.
> > > 
> > > Can't we just take account of the zero trb separately?  
> > 
> > We also want to reset the zerotrb.
> 
> Re-thinking this again I think we should only increase the zerotrb to
> avoid again a possible mismatch for free_trbs, and leave the
> responsibility to the caller of xhci_xfer_get_trb() to request the
> right amount of zerotrb.
> 
> 
> Index: dev/usb/xhci.c
> ===
> RCS file: /cvs/src/sys/dev/usb/xhci.c,v
> retrieving revision 1.119
> diff -u -p -u -p -r1.119 xhci.c
> --- dev/usb/xhci.c31 Jul 2020 19:27:57 -  1.119
> +++ dev/usb/xhci.c23 Dec 2020 09:38:58 -
> @@ -1135,8 +1135,10 @@ xhci_xfer_done(struct usbd_xfer *xfer)
>   i = (xp->ring.ntrb - 1);
>   }
>   xp->free_trbs += xx->ntrb;
> + xp->free_trbs += xx->zerotrb;
>   xx->index = -1;
>   xx->ntrb = 0;
> + xx->zerotrb = 0;
>  
>   timeout_del(&xfer->timeout_handle);
>   usb_rem_task(xfer->device, &xfer->abort_task);
> @@ -1842,6 +1844,7 @@ xhci_xfer_get_trb(struct xhci_softc *sc,
>   switch (last) {
>   case -1:/* This will be a zero-length TD. */
>   xp->pending_xfers[xp->ring.index] = NULL;
> + xx->zerotrb += 1;
>   break;
>   case 0: /* This will be in a chain. */
>   xp->pending_xfers[xp->ring.index] = xfer;
> Index: dev/usb/xhcivar.h
> ===
> RCS file: /cvs/src/sys/dev/usb/xhcivar.h,v
> retrieving revision 1.11
> diff -u -p -u -p -r1.11 xhcivar.h
> --- dev/usb/xhcivar.h 6 Oct 2019 17:30:00 -   1.11
> +++ dev/usb/xhcivar.h 23 Dec 2020 09:38:58 -
> @@ -40,6 +40,7 @@ struct xhci_xfer {
>   struct usbd_xfer xfer;
>   int  index; /* Index of the last TRB */
>   size_t   ntrb;  /* Number of associated TRBs */
> + size_t   zerotrb;   /* Is zero len TRB required? */

It's a zero-length TD, not TRB.  I mean, it indeed is a zero-length TRB,
but the important thing is that it's part of an extra TD.  So at least
update the comment, maybe even the variable name.

The difference is that a TD means that it's a separate transfer.  It
also completes separately from the TD before.  In theory xfer done will
be called on the initial TD, not on the zero TD, which means that we
could have a race where our accounting "frees" the zero TD, even though
the controller isn't there yet.  In practice I think this is not an
issue, the ring's hopefully long enough that we don't immediately reuse
the TRB that we just freed.

So, I think the approach taken in this diff is fine, the code looks
good, only the naming I think can be improved.  Maybe really just call
it zerotd, then it also fits with the comment.
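
To make the accounting concrete, here is a minimal user-space sketch of
the bookkeeping discussed above.  The demo_* names are made up for
illustration and are not the xhci(4) code; only the fields free_trbs,
ntrb and zerotd mirror the diff, and the ring handling is reduced to a
single counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct xhci_pipe and struct xhci_xfer. */
struct demo_pipe {
	size_t free_trbs;	/* TRBs still available on the ring */
};

struct demo_xfer {
	size_t ntrb;		/* TRBs belonging to the data TD */
	size_t zerotd;		/* TRBs belonging to the extra zero-length TD */
};

/*
 * Take one TRB from the ring.  last == -1 marks the zero-length TD,
 * which is counted separately so ntrb stays correct for the data TD.
 */
void
demo_get_trb(struct demo_pipe *xp, struct demo_xfer *xx, int last)
{
	assert(xp->free_trbs > 0);
	xp->free_trbs--;
	if (last == -1)
		xx->zerotd++;
	else
		xx->ntrb++;
}

/* Give back everything the transfer took, including the zero-length TD. */
void
demo_xfer_done(struct demo_pipe *xp, struct demo_xfer *xx)
{
	xp->free_trbs += xx->ntrb + xx->zerotd;
	xx->ntrb = 0;
	xx->zerotd = 0;
}

/* Run one transfer (two data TRBs plus a zero-length TD); return the leak. */
size_t
demo_leak_after_one_xfer(size_t ring_size)
{
	struct demo_pipe xp = { ring_size };
	struct demo_xfer xx = { 0, 0 };

	demo_get_trb(&xp, &xx, 0);
	demo_get_trb(&xp, &xx, 1);
	demo_get_trb(&xp, &xx, -1);
	demo_xfer_done(&xp, &xx);
	return ring_size - xp.free_trbs;
}
```

Without the zerotd term in demo_xfer_done() the helper would report a
leak of one TRB per zero-length transfer, which is the bug in the
subject line.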

>  };
>  
>  struct xhci_ring {
>

Re: tsleep(9): add global "nowake" channel

2020-12-23 Thread Patrick Wildt

On Wed, Dec 23, 2020 at 05:04:23PM -0600, Scott Cheloha wrote:
> On Wed, Dec 23, 2020 at 02:42:18PM -0700, Theo de Raadt wrote:
> > I agree.  This chunk below is really gross and does not follow the
> > special wakeup channel metaphor.
> > 
> > It is *entirely clear* that a &channel called "nowake" has no wakeup.
> > Like duh.
> > 
> > > +/*
> > > + * nowake is a global sleep channel for threads that do not want
> > > + * to receive wakeup(9) broadcasts.
> > > + */
> > > +int __nowake;
> > > +void *nowake = &__nowake;
> 
> So we'll go with this?
> 
> Index: kern/kern_synch.c
> ===
> RCS file: /cvs/src/sys/kern/kern_synch.c,v
> retrieving revision 1.172
> diff -u -p -r1.172 kern_synch.c
> --- kern/kern_synch.c 7 Dec 2020 16:55:29 -   1.172
> +++ kern/kern_synch.c 23 Dec 2020 23:03:31 -
> @@ -87,6 +87,11 @@ sleep_queue_init(void)
>   TAILQ_INIT(&slpque[i]);
>  }
>  
> +/*
> + * Global sleep channel for threads that do not want to
> + * receive wakeup(9) broadcasts.
> + */
> +int nowake;
>  
>  /*
>   * During autoconfiguration or after a panic, a sleep will simply
> @@ -119,6 +124,7 @@ tsleep(const volatile void *ident, int p
>  #endif
>  
>   KASSERT((priority & ~(PRIMASK | PCATCH)) == 0);
> + KASSERT(ident != nowake || ISSET(priority, PCATCH) || timo != 0);

Sure you compiled this? ident is void *, nowake is int.  Should be ident
!= &nowake?  Same for the other code in the diff.
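
A minimal user-space sketch of the check with the address-of operator in
place.  DEMO_PCATCH and the demo_* functions are assumptions for
illustration only; the real tsleep(9) family and PCATCH live in the
kernel and are not reproduced here.

```c
#include <assert.h>

#define DEMO_PCATCH	0x100	/* hypothetical flag value for this sketch */

/* Global sleep channel; only its address matters, never its value. */
int nowake;

/*
 * Returns 1 if the sleep request is acceptable.  Sleeping on &nowake
 * with neither a timeout nor PCATCH could never be woken up again.
 */
int
demo_sleep_args_ok(const volatile void *ident, int priority, int timo)
{
	return (ident != &nowake || (priority & DEMO_PCATCH) != 0 ||
	    timo != 0);
}

/* Stand-in for a sleep function that asserts its arguments. */
int
demo_tsleep(const volatile void *ident, int priority, int timo)
{
	assert(demo_sleep_args_ok(ident, priority, timo));
	return 0;	/* a real tsleep(9) would block here */
}
```

Callers would then pass &nowake as the ident, for example
demo_tsleep(&nowake, 0, 10).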

>  
>  #ifdef MULTIPROCESSOR
>   KASSERT(timo || _kernel_lock_held());
> @@ -213,6 +219,7 @@ msleep(const volatile void *ident, struc
>  #endif
>  
>   KASSERT((priority & ~(PRIMASK | PCATCH | PNORELOCK)) == 0);
> + KASSERT(ident != nowake || ISSET(priority, PCATCH) || timo != 0);
>   KASSERT(mtx != NULL);
>  
>   if (priority & PCATCH)
> @@ -301,6 +308,7 @@ rwsleep(const volatile void *ident, stru
>   int error, status;
>  
>   KASSERT((priority & ~(PRIMASK | PCATCH | PNORELOCK)) == 0);
> + KASSERT(ident != nowake || ISSET(priority, PCATCH) || timo != 0);
>   rw_assert_anylock(rwl);
>   status = rw_status(rwl);
>  
> Index: sys/systm.h
> ===
> RCS file: /cvs/src/sys/sys/systm.h,v
> retrieving revision 1.148
> diff -u -p -r1.148 systm.h
> --- sys/systm.h   26 Aug 2020 03:29:07 -  1.148
> +++ sys/systm.h   23 Dec 2020 23:03:31 -
> @@ -107,6 +107,8 @@ extern struct vnode *rootvp;  /* vnode eq
>  extern dev_t swapdev;/* swapping device */
>  extern struct vnode *swapdev_vp;/* vnode equivalent to above */
>  
> +extern int nowake;   /* dead wakeup(9) channel */
> +
>  struct proc;
>  struct process;
>  #define curproc curcpu()->ci_curproc
>

Re: rasops1

2020-12-23 Thread Patrick Wildt

On Wed, Dec 23, 2020 at 11:32:58PM +0100, Frederic Cambus wrote:
> Hi Mark,
> 
> On Fri, Dec 18, 2020 at 10:33:52PM +0100, Mark Kettenis wrote:
> 
> > The diff below disables the optimized functions on little-endian
> > architectures such that we always use rasops1_putchar().  This makes
> > ssdfb(4) work with the default 8x16 font on arm64.
> 
> I noticed it was committed already, but it seems the following
> directives:
> 
> #if defined(RASOPS_SMALL) && BYTE_ORDER == BIG_ENDIAN
> 
> Should have been:
> 
> #if !defined(RASOPS_SMALL) && BYTE_ORDER == BIG_ENDIAN
> 
> We want to include the optimized putchar functions only if RASOPS_SMALL
> is not defined.
> 

True that.  In one #endif comment he actually kept the !, but the actual
ifs lost it.

Re: Swapped arguments on stoeplitz_ip{4,6}port

2021-02-11 Thread Patrick Wildt

Already committed as of 7 minutes ago, heh!

On Thu, Feb 11, 2021 at 10:48:16AM + Ricardo Mestre wrote:
> Hi,
> 
> It seems this got the args swapped as described in CID 1501717.
> 
> stoeplitz_ip6port being a macro for stoeplitz_hash_ip6port and the arguments
> placed as shown below. Similarly stoeplitz_ip4port also has the same problem
> most likely due to copypasto.
> 
> Comments? OK?
> 
> 173 uint16_t
> 174 stoeplitz_hash_ip6port(const struct stoeplitz_cache *scache,
> 175 const struct in6_addr *faddr6, const struct in6_addr *laddr6,
> 176 in_port_t fport, in_port_t lport)
> 
> Index: netinet/in_pcb.c
> ===
> RCS file: /cvs/src/sys/netinet/in_pcb.c,v
> retrieving revision 1.253
> diff -u -p -u -r1.253 in_pcb.c
> --- netinet/in_pcb.c  25 Jan 2021 03:40:46 -  1.253
> +++ netinet/in_pcb.c  11 Feb 2021 10:34:50 -
> @@ -522,8 +522,8 @@ in_pcbconnect(struct inpcb *inp, struct 
>   inp->inp_fport = sin->sin_port;
>   in_pcbrehash(inp);
>  #if NSTOEPLITZ > 0
> - inp->inp_flowid = stoeplitz_ip4port(inp->inp_laddr.s_addr,
> - inp->inp_faddr.s_addr, inp->inp_lport, inp->inp_fport);
> + inp->inp_flowid = stoeplitz_ip4port(inp->inp_faddr.s_addr,
> + inp->inp_laddr.s_addr, inp->inp_fport, inp->inp_lport);
>  #endif
>  #ifdef IPSEC
>   {
> Index: netinet6/in6_pcb.c
> ===
> RCS file: /cvs/src/sys/netinet6/in6_pcb.c,v
> retrieving revision 1.111
> diff -u -p -u -r1.111 in6_pcb.c
> --- netinet6/in6_pcb.c25 Jan 2021 03:40:47 -  1.111
> +++ netinet6/in6_pcb.c11 Feb 2021 10:34:50 -
> @@ -303,8 +303,8 @@ in6_pcbconnect(struct inpcb *inp, struct
>   inp->inp_flowinfo |=
>   (htonl(ip6_randomflowlabel()) & IPV6_FLOWLABEL_MASK);
>  #if NSTOEPLITZ > 0
> - inp->inp_flowid = stoeplitz_ip6port(&inp->inp_laddr6,
> - &inp->inp_faddr6, inp->inp_lport, inp->inp_fport);
> + inp->inp_flowid = stoeplitz_ip6port(&inp->inp_faddr6,
> + &inp->inp_laddr6, inp->inp_fport, inp->inp_lport);
>  #endif
>   in_pcbrehash(inp);
>   return (0);
>

Re: isakmpd link dynamically

2021-02-11 Thread Patrick Wildt

On Thu, Feb 11, 2021 at 11:29:58AM +0100, Alexander Bluhm wrote:
> - recommit in /usr/src/usr.sbin -> we lose history

I know no one cares about git, but if the move was committed in a
"single cvs commit", git would understand it's simply a move of files.
So yeah, cvs wouldn't cope, but git would.

Re: Add missing break statement on if_rge.c

2021-02-11 Thread Patrick Wildt

On Thu, Feb 11, 2021 at 12:24:37PM + Ricardo Mestre wrote:
> Hi,
> 
> Add missing break statement. OK?

makes sense to me, ok patrick@

> CID 1501710
> 
> Index: if_rge.c
> ===
> RCS file: /cvs/src/sys/dev/pci/if_rge.c,v
> retrieving revision 1.11
> diff -u -p -c -u -r1.11 if_rge.c
> --- if_rge.c  24 Dec 2020 06:34:03 -  1.11
> +++ if_rge.c  11 Feb 2021 12:21:33 -
> @@ -311,6 +311,7 @@ rge_activate(struct device *self, int ac
>  #ifndef SMALL_KERNEL
>   rge_wol_power(sc);
>  #endif
> + break;
>   default:
>   rv = config_activate_children(self, act);
>   break;
>

Re: Posted vs. non-posted device access

2021-02-14 Thread Patrick Wildt

On Mon, Feb 15, 2021 at 09:55:56AM +1000, David Gwynne wrote:
> 
> 
> > On 15 Feb 2021, at 07:54, Mark Kettenis  wrote:
> > 
> > One of the aspects of device access is whether CPU writes to a device
> > are posted or non-posted.  For non-posted writes, the CPU will wait
> > for the device to acknowledge that the write has been performed.  If the
> > device sits on a bus far away, this can take a while and slow things
> > down.  The alternative is so-called posted writes.  The CPU will
> > "post" the write to the bus without waiting for an acknowledgement.
> > The CPU may receive an asynchronous notification at a later time that
> > the write didn't succeed or a failing write may be dropped without
> > further notification.  On most architectures whether writes are posted
> > or not is a property of the bus between the CPU and the device.  For
> > example, memory mapped I/O on the PCI bus is always posted and there
> > is nothing the CPU can do about it.
> > 
> > On the ARM architecture though we can indicate to the CPU whether
> > writes to a certain address range should be posted or not.  This is
> > done by specifying certain memory attributes in the mappings used by
> > the MMU.  The OpenBSD kernel always specifies device access as
> > non-posted.  On all ARM implementations we have seen so far this seems
> > to work even for writes to devices connected to a PCIe bus.  There
> > might be a penalty though, so I need to investigate this a bit
> > further.
> > 
> > However, on Apple's M1 SoC, this isn't the case.  Non-posted writes to
> > a bus that uses posted writes fail and vice-versa.  So in order to use
> > the PCIe bus on these SoCs we need to specify the right memory
> > attributes.  The diff below implements this by introducing a new
> > BUS_SPACE_MAP_POSTED flag.  At this point I don't expect generic
> > drivers to use this flag yet.  So there is no need to add it for other
> > architectures.  But I don't rule out we may have to use this flag in
> > sys/dev/fdt sometime in the future.  That is why I posted this to a
> > wider audience.
> 
> You don't want to (ab)use one of the existing flags? If I squint and read 
> kind of quickly I could imagine this is kind of like write combining, like 
> what BUS_SPACE_MAP_PREFETCHABLE can do on pci busses.

BUS_SPACE_MAP_PREFETCHABLE should be "normal uncached" memory on arm64,
which is different to device memory.  That said I have a device where
amdgpu(4) doesn't behave if it's "normal uncached", and I'm not sure if
it's the HW's fault or if there's some barrier missing.  Still, I would
not use BUS_SPACE_MAP_PREFETCHABLE for nGnRnE vs nGnRE.

More info on device vs normal is here:

https://developer.arm.com/documentation/102376/0100/Normal-memory
https://developer.arm.com/documentation/102376/0100/Device-memory
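
For reference, a small sketch of how the attribute bytes from the diff
below end up in the MAIR register value.  DEMO_MAIR_ATTR and the other
demo_* names are assumptions for illustration; the encodings themselves
(0x00, 0x04, 0x44, 0xff, 0x88) are the ones used in the quoted diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: place an 8-bit attribute into MAIR slot idx (0..7). */
#define DEMO_MAIR_ATTR(attr, idx)	((uint64_t)(attr) << ((idx) * 8))

#define DEMO_ATTR_DEVICE_nGnRnE	0x00	/* device, no early write ack */
#define DEMO_ATTR_DEVICE_nGnRE	0x04	/* device, early write ack */
#define DEMO_ATTR_NORMAL_NC	0x44	/* normal memory, non-cacheable */
#define DEMO_ATTR_NORMAL_WB	0xff	/* normal memory, write-back */
#define DEMO_ATTR_NORMAL_WT	0x88	/* normal memory, write-through */

/* Build the register value with the slot layout proposed in the diff. */
uint64_t
demo_mair(void)
{
	return DEMO_MAIR_ATTR(DEMO_ATTR_DEVICE_nGnRnE, 0) |
	    DEMO_MAIR_ATTR(DEMO_ATTR_DEVICE_nGnRE, 1) |
	    DEMO_MAIR_ATTR(DEMO_ATTR_NORMAL_NC, 2) |
	    DEMO_MAIR_ATTR(DEMO_ATTR_NORMAL_WB, 3) |
	    DEMO_MAIR_ATTR(DEMO_ATTR_NORMAL_WT, 4);
}

/* Extract the attribute byte stored in a given slot. */
unsigned int
demo_mair_slot(uint64_t mair, unsigned int idx)
{
	assert(idx < 8);
	return (unsigned int)((mair >> (idx * 8)) & 0xff);
}
```

Inserting the nGnRE attribute at slot 1 is why every later index in the
diff moves up by one.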

> If this does leak into fdt, would it just be a nop on other archs that use 
> those drivers?
> 
> dlg
> 
> > 
> > ok?
> > 
> > 
> > Index: arch/arm64/arm64/locore.S
> > ===
> > RCS file: /cvs/src/sys/arch/arm64/arm64/locore.S,v
> > retrieving revision 1.32
> > diff -u -p -r1.32 locore.S
> > --- arch/arm64/arm64/locore.S   19 Oct 2020 17:57:40 -  1.32
> > +++ arch/arm64/arm64/locore.S   14 Feb 2021 21:28:26 -
> > @@ -233,9 +233,10 @@ switch_mmu_kernel:
> > mair:
> > /* Device | Normal (no cache, write-back, write-through) */
> > .quad   MAIR_ATTR(0x00, 0) |\
> > -   MAIR_ATTR(0x44, 1) |\
> > -   MAIR_ATTR(0xff, 2) |\
> > -   MAIR_ATTR(0x88, 3)
> > +   MAIR_ATTR(0x04, 1) |\
> > +   MAIR_ATTR(0x44, 2) |\
> > +   MAIR_ATTR(0xff, 3) |\
> > +   MAIR_ATTR(0x88, 4)
> > tcr:
> > .quad (TCR_T1SZ(64 - VIRT_BITS) | TCR_T0SZ(64 - 48) | \
> > TCR_AS | TCR_TG1_4K | TCR_CACHE_ATTRS | TCR_SMP_ATTRS)
> > Index: arch/arm64/arm64/locore0.S
> > ===
> > RCS file: /cvs/src/sys/arch/arm64/arm64/locore0.S,v
> > retrieving revision 1.5
> > diff -u -p -r1.5 locore0.S
> > --- arch/arm64/arm64/locore0.S  28 May 2019 20:32:30 -  1.5
> > +++ arch/arm64/arm64/locore0.S  14 Feb 2021 21:28:26 -
> > @@ -34,8 +34,8 @@
> > #include 
> > 
> > #define DEVICE_MEM  0
> > > -#define NORMAL_UNCACHED 1
> > > -#define NORMAL_MEM  2
> > > +#define NORMAL_UNCACHED 2
> > > +#define NORMAL_MEM  3
> > 
> > /*
> >  * We assume:
> > Index: arch/arm64/arm64/machdep.c
> > ===
> > RCS file: /cvs/src/sys/arch/arm64/arm64/machdep.c,v
> > retrieving revision 1.57
> > diff -u -p -r1.57 machdep.c
> > --- arch/arm64/arm64/machdep.c  11 Feb 2021 23:55:48 -  1.57
> > +++ arch/arm64/arm64/machdep.c  14 Feb 2021 21:28:27 -
> > @@ -1188,7 +1188,7 @@ pmap_bootstrap_bs_map(bus_space_tag_t t,
> > 
> > for (pa = startpa; pa < endpa; pa += PAGE_SIZE, v

Re: add simplepmbus(4)

2021-02-17 Thread Patrick Wildt

On Wed, Feb 17, 2021 at 11:56:16AM +1100, Jonathan Gray wrote:
> Add a driver for "simple-pm-bus" a way to enable a clock and/or
> power domain for a group of devices.
> 
> https://www.kernel.org/doc/Documentation/devicetree/bindings/bus/simple-pm-bus.txt
> 
> This is needed to use am335x/omap4 dtbs from linux 5.11.

"A Simple Power-Managed Bus is a transparent bus that doesn't need a real
driver, as it's typically initialized by the boot loader."

That's a bit funny though. :-)  But I think they meant "it doesn't need
a real driver, apart from the generic simple-pm-bus driver".


> Index: sys/dev/fdt/files.fdt
> ===
> RCS file: /cvs/src/sys/dev/fdt/files.fdt,v
> retrieving revision 1.146
> diff -u -p -r1.146 files.fdt
> --- sys/dev/fdt/files.fdt 5 Feb 2021 00:05:20 -   1.146
> +++ sys/dev/fdt/files.fdt 17 Feb 2021 00:49:02 -
> @@ -23,6 +23,10 @@ device simplepanel
>  attach   simplepanel at fdt
>  file dev/fdt/simplepanel.c   simplepanel
>  
> +device   simplepmbus: fdt
> +attach   simplepmbus at fdt
> +file dev/fdt/simplepmbus.c   simplepmbus
> +
>  device   sxiccmu
>  attach   sxiccmu at fdt
>  file dev/fdt/sxiccmu.c   sxiccmu
> Index: share/man/man4/Makefile
> ===
> RCS file: /cvs/src/share/man/man4/Makefile,v
> retrieving revision 1.792
> diff -u -p -r1.792 Makefile
> --- share/man/man4/Makefile   4 Feb 2021 16:25:38 -   1.792
> +++ share/man/man4/Makefile   17 Feb 2021 00:49:02 -
> @@ -72,7 +72,8 @@ MAN=aac.4 abcrtc.4 abl.4 ac97.4 acphy.4
>   rl.4 rlphy.4 route.4 rsu.4 rtsx.4 rum.4 run.4 rtw.4 rtwn.4 \
>   safe.4 safte.4 sbus.4 schsio.4 scsi.4 sd.4 \
>   sdmmc.4 sdhc.4 se.4 ses.4 sf.4 sili.4 \
> - simpleamp.4 simpleaudio.4 simplefb.4 simplepanel.4 siop.4 sis.4 sk.4 \
> + simpleamp.4 simpleaudio.4 simplefb.4 simplepanel.4 simplepmbus.4 \
> + siop.4 sis.4 sk.4 \
>   sm.4 smsc.4 softraid.4 spdmem.4 sdtemp.4 speaker.4 sppp.4 sqphy.4 \
>   ssdfb.4 st.4 ste.4 stge.4 sti.4 stp.4 sv.4 switch.4 sxiccmu.4 \
>   sxidog.4 sximmc.4 sxipio.4 sxipwm.4 sxirsb.4 sxirtc.4 sxisid.4 \
> Index: sys/arch/armv7/conf/GENERIC
> ===
> RCS file: /cvs/src/sys/arch/armv7/conf/GENERIC,v
> retrieving revision 1.134
> diff -u -p -r1.134 GENERIC
> --- sys/arch/armv7/conf/GENERIC   4 Feb 2021 16:25:39 -   1.134
> +++ sys/arch/armv7/conf/GENERIC   17 Feb 2021 00:49:02 -
> @@ -31,6 +31,7 @@ config  bsd swap generic
>  # The main bus device
>  mainbus0 at root
>  simplebus*   at fdt?
> +simplepmbus* at fdt?
>  cpu0 at mainbus?
>  
>  # Cortex-A9
> Index: sys/arch/armv7/conf/RAMDISK
> ===
> RCS file: /cvs/src/sys/arch/armv7/conf/RAMDISK,v
> retrieving revision 1.119
> diff -u -p -r1.119 RAMDISK
> --- sys/arch/armv7/conf/RAMDISK   18 Jun 2020 08:48:04 -  1.119
> +++ sys/arch/armv7/conf/RAMDISK   17 Feb 2021 00:49:02 -
> @@ -30,6 +30,7 @@ config  bsd root on rd0a swap on rd0b
>  mainbus0 at root
>  softraid0at root
>  simplebus*   at fdt?
> +simplepmbus* at fdt?
>  cpu0 at mainbus?
>  
>  # Cortex-A9
> --- /dev/null Wed Feb 17 11:49:05 2021
> +++ sys/dev/fdt/simplepmbus.c Tue Feb 16 17:24:55 2021
> @@ -0,0 +1,62 @@
> +/*   $OpenBSD$   */
> +/*
> + * Copyright (c) 2021 Jonathan Gray 
> + *
> + * Permission to use, copy, modify, and distribute this software for any
> + * purpose with or without fee is hereby granted, provided that the above
> + * copyright notice and this permission notice appear in all copies.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
> + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
> + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
> + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
> + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
> + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
> + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
> + */
> +
> +#include 
> +#include 
> +
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +
> +#include 

Maybe like this, as we have in rkpinctrl?

#ifdef __armv7__
#include 
#else
#include 
#endif

Generally I was wondering if we shouldn't just add this to simplebus.c
itself?  You'd need to add a compatible to the match function, and in
the attach function you'd need to check for the compatible and then
do

power_domain_enable(faa->fa_node);
clock_enable_all(faa->fa_node);

in the if-block.
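
Roughly what I have in mind, as a self-contained mock rather than kernel
code.  struct demo_node and all demo_* functions are hypothetical
stand-ins; the two flag assignments only mark where
power_domain_enable() and clock_enable_all() would be called.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a device tree node; not the kernel's FDT API. */
struct demo_node {
	const char *compatible;
	int power_enabled;
	int clocks_enabled;
};

int
demo_is_compatible(const struct demo_node *n, const char *name)
{
	return (strcmp(n->compatible, name) == 0);
}

/* Match both the plain bus and the power-managed variant. */
int
demo_simplebus_match(const struct demo_node *n)
{
	return (demo_is_compatible(n, "simple-bus") ||
	    demo_is_compatible(n, "simple-pm-bus"));
}

/* On attach, only the power-managed variant needs the extra enables. */
void
demo_simplebus_attach(struct demo_node *n)
{
	assert(demo_simplebus_match(n));
	if (demo_is_compatible(n, "simple-pm-bus")) {
		n->power_enabled = 1;	/* stands in for power_domain_enable() */
		n->clocks_enabled = 1;	/* stands in for clock_enable_all() */
	}
}
```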

Haven't made my mind up yet what I like better, maybe kettenis@ has an
opinion.

Patrick

> +
> +int  simplepmbus_matc

sdmmc(4): check and retry bus width change

2021-02-22 Thread Patrick Wildt

Hi,

it seems like some eMMCs are not capable of doing 8-bit operation,
even if the controller supports it.  I was questioning our drivers
first, but it looks like it's the same on Linux.  In the case that
8-bit doesn't work, they seem to fall back to lower values to make
that HW work.

This diff implements a mechanism that tries 8-bit, if available,
then 4-bit and in the end falls back to 1-bit.  This makes my HW
work, but I would like to have this tested by a broader audience.

Apparently there's a "bus test" command, but it isn't implemented
on all host controllers.  Hence I simply try to read the EXT_CSD
to make sure the transfer works.
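
The fallback order can be sketched in isolation like this.  It is a
simplified illustration, not the diff itself: the demo_* names and
capability flags are made up, and the callback stands in for "switch the
width, then read the EXT_CSD to verify".

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CAP_4BIT	0x1	/* hypothetical host capability flags */
#define DEMO_CAP_8BIT	0x2

/* Callback that switches to a width and verifies it; returns 0 on success. */
typedef int (*demo_try_width_fn)(int width, void *arg);

/*
 * Try the widest width the host supports first, then fall back.
 * Returns the selected width (8, 4 or 1), or -1 if even 1-bit fails.
 */
int
demo_select_bus_width(unsigned int caps, demo_try_width_fn try_width,
    void *arg)
{
	assert(try_width != NULL);
	if ((caps & DEMO_CAP_8BIT) && try_width(8, arg) == 0)
		return 8;
	if ((caps & DEMO_CAP_4BIT) && try_width(4, arg) == 0)
		return 4;
	if (try_width(1, arg) == 0)
		return 1;
	return -1;
}

/* Example probe: pretend the card only works up to the width in *arg. */
int
demo_card_max_width(int width, void *arg)
{
	int max = *(int *)arg;

	return (width <= max) ? 0 : -1;
}
```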

For testing, a print like

printf("%s: using %u-bit width\n", DEVNAME(sc), width);

could be added at line 928.

What could possible regressions be?  The width could become smaller
than previously.  This would reduce the read/write transfer speed.
Also it's possible that eMMCs are not recognized/initialized anymore.

What could possible improvements be?  eMMCs that previously didn't
work now work, with at least 1-bit or 4-bit wide transfers.

Please note that this only works for eMMCs.  SD cards are *not* using
this code path.  SD cards have a different initialization code path.

Please report any changes or non-changes.  If nothing changes, that's
perfect.

Patrick

diff --git a/sys/dev/sdmmc/sdmmc_mem.c b/sys/dev/sdmmc/sdmmc_mem.c
index 59bcb1b4a11..5856b9bb1b3 100644
--- a/sys/dev/sdmmc/sdmmc_mem.c
+++ b/sys/dev/sdmmc/sdmmc_mem.c
@@ -56,6 +56,8 @@ int   sdmmc_mem_signal_voltage(struct sdmmc_softc *, int);
 
 intsdmmc_mem_sd_init(struct sdmmc_softc *, struct sdmmc_function *);
 intsdmmc_mem_mmc_init(struct sdmmc_softc *, struct sdmmc_function *);
+intsdmmc_mem_mmc_select_bus_width(struct sdmmc_softc *,
+   struct sdmmc_function *, int);
 intsdmmc_mem_single_read_block(struct sdmmc_function *, int, u_char *,
size_t);
 intsdmmc_mem_read_block_subr(struct sdmmc_function *, bus_dmamap_t,
@@ -908,31 +910,20 @@ sdmmc_mem_mmc_init(struct sdmmc_softc *sc, struct sdmmc_function *sf)
ext_csd[EXT_CSD_CARD_TYPE]);
}
 
-   if (ISSET(sc->sc_caps, SMC_CAPS_8BIT_MODE)) {
+   if (ISSET(sc->sc_caps, SMC_CAPS_8BIT_MODE) &&
+   sdmmc_mem_mmc_select_bus_width(sc, sf, 8) == 0)
width = 8;
-   value = EXT_CSD_BUS_WIDTH_8;
-   } else if (ISSET(sc->sc_caps, SMC_CAPS_4BIT_MODE)) {
+   else if (ISSET(sc->sc_caps, SMC_CAPS_4BIT_MODE) &&
+   sdmmc_mem_mmc_select_bus_width(sc, sf, 4) == 0)
width = 4;
-   value = EXT_CSD_BUS_WIDTH_4;
-   } else {
-   width = 1;
-   value = EXT_CSD_BUS_WIDTH_1;
-   }
-
-   if (width != 1) {
-   error = sdmmc_mem_mmc_switch(sf, EXT_CSD_CMD_SET_NORMAL,
-   EXT_CSD_BUS_WIDTH, value);
-   if (error == 0)
-   error = sdmmc_chip_bus_width(sc->sct,
-   sc->sch, width);
-   else {
+   else {
+   error = sdmmc_mem_mmc_select_bus_width(sc, sf, 1);
+   if (error != 0) {
DPRINTF(("%s: can't change bus width"
" (%d bit)\n", DEVNAME(sc), width));
return error;
}
-
-   /* : need bus test? (using by CMD14 & CMD19) */
-   sdmmc_delay(1);
+   width = 1;
}
 
if (timing != SDMMC_TIMING_LEGACY) {
@@ -1041,6 +1032,59 @@ sdmmc_mem_mmc_init(struct sdmmc_softc *sc, struct sdmmc_function *sf)
return error;
 }
 
+int
+sdmmc_mem_mmc_select_bus_width(struct sdmmc_softc *sc, struct sdmmc_function *sf,
+int width)
+{
+   u_int8_t ext_csd[512];
+   int error, value;
+
+   switch (width) {
+   case 8:
+   value = EXT_CSD_BUS_WIDTH_8;
+   break;
+   case 4:
+   value = EXT_CSD_BUS_WIDTH_4;
+   break;
+   case 1:
+   value = EXT_CSD_BUS_WIDTH_1;
+   break;
+   default:
+   printf("%s: invalid bus width\n", DEVNAME(sc));
+   return EINVAL;
+   }
+
+   error = sdmmc_mem_mmc_switch(sf, EXT_CSD_CMD_SET_NORMAL,
+   EXT_CSD_BUS_WIDTH, value);
+   if (error != 0) {
+   DPRINTF(("%s: can't change card bus width"
+   " (%d bit)\n", DEVNAME(sc), width));
+   return error;
+   }
+
+   error = sdmmc_chip_bus_width(sc->sct,
+   sc->sch, width);
+   if (error != 0) {
+   DPRINTF(("%s: can't change host bus width"
+   " (%d bit)\n", DEVNAM

Re: fix nvme(4): NULL deref. and empty device attachments

2021-02-24 Thread Patrick Wildt

On Wed, Feb 24, 2021 at 05:34:48PM +0100, Jan Klemkow wrote:
> Hi,
> 
> While attaching the following disks, the nvme driver runs into a NULL
> dereference in nvme_scsi_capacity16() and nvme_scsi_capacity().
> 
> nvme0 at pci1 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 0x00: msix, NVMe 1.2
> nvme0: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ0413002P4P0DGN
> scsibus1 at nvme0: 129 targets, initiator 0
> sd0 at scsibus1 targ 1 lun 0: 
> sd0: 3815447MB, 512 bytes/sector, 7814037168 sectors
> sd1 at scsibus1 targ 2 lun 0: 
> uvm_fault(0x821d00e8, 0x0, 0, 1) -> e
> kernel: page fault trap, code=0
> Stopped at  nvme_scsi_capacity16+0x39:  movq0(%rax),%rcx
> ddb{0}>
> 
> "ns" in both functions will be NULL, if "identify" is not allocated in
> nvme_scsi_probe().  Thus, it's better to just not attach these empty
> disks/LUNs.
> 
> nvme0 at pci1 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 0x00: msix, NVMe 1.2
> nvme0: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ0413002P4P0DGN
> scsibus1 at nvme0: 129 targets, initiator 0
> sd0 at scsibus1 targ 1 lun 0: 
> sd0: 3815447MB, 512 bytes/sector, 7814037168 sectors
> ppb1 at pci0 dev 3 function 2 "AMD 17h PCIE" rev 0x00: msi
> pci2 at ppb1 bus 98
> nvme1 at pci2 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 0x00: msix, NVMe 1.2
> nvme1: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ041500C34P0DGN
> scsibus2 at nvme1: 129 targets, initiator 0
> sd1 at scsibus2 targ 1 lun 0: 
> sd1: 3815447MB, 512 bytes/sector, 7814037168 sectors
> ppb2 at pci0 dev 3 function 3 "AMD 17h PCIE" rev 0x00: msi
> pci3 at ppb2 bus 99
> nvme2 at pci3 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 0x00: msix, NVMe 1.2
> nvme2: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ041402Z64P0DGN
> scsibus3 at nvme2: 129 targets, initiator 0
> sd2 at scsibus3 targ 1 lun 0: 
> sd2: 3815447MB, 512 bytes/sector, 7814037168 sectors
> ppb3 at pci0 dev 3 function 4 "AMD 17h PCIE" rev 0x00: msi
> pci4 at ppb3 bus 100
> nvme3 at pci4 dev 0 function 0 vendor "Intel", unknown product 0x0a54 rev 0x00: msix, NVMe 1.2
> nvme3: INTEL SSDPE2KX040T8, firmware VDV10170, serial PHLJ041403134P0DGN
> scsibus4 at nvme3: 129 targets, initiator 0
> sd3 at scsibus4 targ 1 lun 0: 
> sd3: 3815447MB, 512 bytes/sector, 7814037168 sectors
> 
> The following diff signals an error for the upper probing function in
> the SCSI layer to prevent further function calls in nvme(4) which would
> just lead to the error described above and hundreds of not configured
> devices.
> 
> OK?

I think this is the correct way to fix it.  The issue essentially is
that we still return "ok" even though the size is zero.  And we should
probably fail similarly as if it didn't exist.  FreeBSD has a similar
check here, which explains it a little:

/*
 * If the size of is zero, chances are this isn't a valid
 * namespace (eg one that's not been configured yet). The
 * standard says the entire id will be zeros, so this is a
 * cheap way to test for that.
 */
if (ns->data.nsze == 0)
	return (ENXIO);

I wondered if this block of code could be written a bit differently, but
I couldn't make it look any better.

ok patrick@ fwiw

> bye,
> Jan
> 
> Index: dev/ic/nvme.c
> ===
> RCS file: /cvs//src/sys/dev/ic/nvme.c,v
> retrieving revision 1.90
> diff -u -p -r1.90 nvme.c
> --- dev/ic/nvme.c 9 Feb 2021 01:50:10 -   1.90
> +++ dev/ic/nvme.c 24 Feb 2021 16:01:48 -
> @@ -463,11 +463,16 @@ nvme_scsi_probe(struct scsi_link *link)
>   scsi_io_put(&sc->sc_iopool, ccb);
>  
>   identify = NVME_DMA_KVA(mem);
> - if (rv == 0 && lemtoh64(&identify->nsze) > 0) {
> - /* Commit namespace if it has a size greater than zero. */
> - identify = malloc(sizeof(*identify), M_DEVBUF, M_WAITOK);
> - memcpy(identify, NVME_DMA_KVA(mem), sizeof(*identify));
> - sc->sc_namespaces[link->target].ident = identify;
> + if (rv == 0) {
> + if (lemtoh64(&identify->nsze) > 0) {
> + /* Commit namespace if it has a size greater than zero. */
> + identify = malloc(sizeof(*identify), M_DEVBUF, M_WAITOK);
> + memcpy(identify, NVME_DMA_KVA(mem), sizeof(*identify));
> + sc->sc_namespaces[link->target].ident = identify;
> + } else {
> + /* Don't attach a namespace if its size is zero. */
> + rv = ENXIO;
> + }
>   }
>  
>   nvme_dmamem_free(sc, mem);
>

Re: Read `ps_single' once

2021-03-04 Thread Patrick Wildt

On Thu, Mar 04, 2021 at 10:42:24AM +0100, Mark Kettenis wrote:
> > Date: Thu, 4 Mar 2021 10:34:24 +0100
> > From: Martin Pieuchot 
> > 
> > Running t/rw/msleep(9) w/o KERNEL_LOCK() implies that a thread can
> > change the value of `ps_single' while one of its siblings might be
> > dereferencing it.  
> > 
> > To prevent inconsistencies in the code executed by sibling thread, the
> > diff below makes sure `ps_single' is dereferenced only once in various
> > parts of the kernel.
> > 
> > ok?
> 
> I think that means that ps_single has to be declared "volatile".

Isn't there the READ_ONCE(x) macro, that does exactly that?
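
A minimal sketch of what I mean, assuming a READ_ONCE-style macro built
on a volatile cast.  DEMO_READ_ONCE and the demo_* types are
hypothetical; the kernel's own macro may be defined differently, and
__typeof__ is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical macro in the spirit of READ_ONCE(): the volatile cast
 * makes the compiler emit exactly one load for this access.
 */
#define DEMO_READ_ONCE(x)	(*(volatile __typeof__(x) *)&(x))

struct demo_proc {
	int p_stat;
};

struct demo_process {
	struct demo_proc *ps_single;
};

/*
 * Pick the thread to signal: the single-threading thread if set,
 * otherwise the fallback.  ps_single is loaded once into a local, so a
 * concurrent store cannot change it between the test and the use.
 */
struct demo_proc *
demo_pick_target(struct demo_process *pr, struct demo_proc *fallback)
{
	struct demo_proc *st = DEMO_READ_ONCE(pr->ps_single);

	assert(fallback != NULL);
	return (st != NULL) ? st : fallback;
}
```

That keeps the field itself non-volatile and marks only the accesses
that need the single load.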

> > Index: kern/kern_exit.c
> > ===
> > RCS file: /cvs/src/sys/kern/kern_exit.c,v
> > retrieving revision 1.196
> > diff -u -p -r1.196 kern_exit.c
> > --- kern/kern_exit.c15 Feb 2021 09:35:59 -  1.196
> > +++ kern/kern_exit.c4 Mar 2021 09:29:27 -
> > @@ -274,6 +274,8 @@ exit1(struct proc *p, int xexit, int xsi
> >  */
> > if (qr->ps_flags & PS_TRACED &&
> > !(qr->ps_flags & PS_EXITING)) {
> > +   struct proc *st;
> > +
> > process_untrace(qr);
> >  
> > /*
> > @@ -281,9 +283,9 @@ exit1(struct proc *p, int xexit, int xsi
> >  * direct the signal to the active
> >  * thread to avoid deadlock.
> >  */
> > -   if (qr->ps_single)
> > -   ptsignal(qr->ps_single, SIGKILL,
> > -   STHREAD);
> > +   st = qr->ps_single;
> > +   if (st != NULL)
> > +   ptsignal(st, SIGKILL, STHREAD);
> > else
> > prsignal(qr, SIGKILL);
> > } else {
> > @@ -510,7 +512,7 @@ dowait4(struct proc *q, pid_t pid, int *
> >  {
> > int nfound;
> > struct process *pr;
> > -   struct proc *p;
> > +   struct proc *p, *st;
> > int error;
> >  
> > if (pid == 0)
> > @@ -541,10 +543,11 @@ loop:
> > proc_finish_wait(q, p);
> > return (0);
> > }
> > +
> > +   st = pr->ps_single;
> > if (pr->ps_flags & PS_TRACED &&
> > -   (pr->ps_flags & PS_WAITED) == 0 && pr->ps_single &&
> > -   pr->ps_single->p_stat == SSTOP &&
> > -   (pr->ps_single->p_flag & P_SUSPSINGLE) == 0) {
> > +   (pr->ps_flags & PS_WAITED) == 0 && st != NULL &&
> > +   st->p_stat == SSTOP && (st->p_flag & P_SUSPSINGLE) == 0) {
> > if (single_thread_wait(pr, 0))
> > goto loop;
> >  
> > Index: kern/sys_process.c
> > ===
> > RCS file: /cvs/src/sys/kern/sys_process.c,v
> > retrieving revision 1.86
> > diff -u -p -r1.86 sys_process.c
> > --- kern/sys_process.c  8 Feb 2021 10:51:02 -   1.86
> > +++ kern/sys_process.c  4 Mar 2021 09:29:27 -
> > @@ -273,7 +273,7 @@ sys_ptrace(struct proc *p, void *v, regi
> >  int
> >  ptrace_ctrl(struct proc *p, int req, pid_t pid, caddr_t addr, int data)
> >  {
> > -   struct proc *t; /* target thread */
> > +   struct proc *st, *t;/* target thread */
> > struct process *tr; /* target process */
> > int error = 0;
> > int s;
> > @@ -433,8 +433,9 @@ ptrace_ctrl(struct proc *p, int req, pid
> >  * from where it stopped."
> >  */
> >  
> > -   if (pid < THREAD_PID_OFFSET && tr->ps_single)
> > -   t = tr->ps_single;
> > +   st = tr->ps_single;
> > +   if (pid < THREAD_PID_OFFSET && st != NULL)
> > +   t = st;
> >  
> > /* If the address parameter is not (int *)1, set the pc. */
> > if ((int *)addr != (int *)1)
> > @@ -464,8 +465,9 @@ ptrace_ctrl(struct proc *p, int req, pid
> >  * from where it stopped."
> >  */
> >  
> > -   if (pid < THREAD_PID_OFFSET && tr->ps_single)
> > -   t = tr->ps_single;
> > +   st = tr->ps_single;
> > +   if (pid < THREAD_PID_OFFSET && st != NULL)
> > +   t = st;
> >  
> >  #ifdef PT_STEP
> > /*
> > @@ -495,8 +497,9 @@ ptrace_ctrl(struct proc *p, int req, pid
> > break;
> >  
> > case PT_KILL:
> > -   if (pid < THREAD_PID_OFFSET && tr->ps_single)
> > -   t = tr->ps_single;
> > +   st = tr->ps_single;
> > +   if (pid < THREAD_PID_OFFSET && st != NULL)
> > +   t = st;
> >  
> > /* just send the process a KILL signal. */
> >

acpi(4): pass DMA tag to ACPI tables

2021-03-06 Thread Patrick Wildt

Hi,

to be able to have acpiiort(4) pass a DMA tag to smmu(4), acpiiort(4)
needs to be passed a DMA tag.  So far acpi(4) only seems to pass it on
acpi_foundhid(), but the ACPI table drivers don't get it.  So, let's
just pass the default DMA tag.

ok?

Patrick

diff --git a/sys/dev/acpi/acpi.c b/sys/dev/acpi/acpi.c
index 4c824ee6cbb..67dd7f14435 100644
--- a/sys/dev/acpi/acpi.c
+++ b/sys/dev/acpi/acpi.c
@@ -1202,6 +1202,7 @@ acpi_attach_common(struct acpi_softc *sc, paddr_t base)
memset(&aaa, 0, sizeof(aaa));
aaa.aaa_iot = sc->sc_iot;
aaa.aaa_memt = sc->sc_memt;
+   aaa.aaa_dmat = sc->sc_ci_dmat;
aaa.aaa_table = entry->q_table;
config_found_sm(&sc->sc_dev, &aaa, acpi_print, acpi_submatch);
}

Re: ixl(4): add ID for X710 10G SFP+

2021-03-15 Thread Patrick Wildt

On Mon, Mar 15, 2021 at 08:59:05AM +0100, Jan Klemkow wrote:
> On Mon, Mar 15, 2021 at 01:35:28AM -0600, Theo de Raadt wrote:
> > My comments are about the "text name", which goes into every kernel
> > anyone compiles.
> > 
> > It should be as short as possible.
> 
> Sorry, I missed that point.
> 
> > But the reason why 10G is incorrect is because surely the port can
> > accept 1G, or a variety of other SFPs...  It is simply too exact,
> > and wasting kernel bytes.
> 
> OK?
> 
> Thanks,
> Jan
> 
> Index: if_ixl.c
> ===
> RCS file: /cvs/src/sys/dev/pci/if_ixl.c,v
> retrieving revision 1.73
> diff -u -p -r1.73 if_ixl.c
> --- if_ixl.c  26 Feb 2021 10:36:45 -  1.73
> +++ if_ixl.c  15 Mar 2021 07:42:48 -
> @@ -1611,6 +1611,7 @@ struct ixl_device {
>  
>  static const struct ixl_device ixl_devices[] = {
>   { &ixl_710, PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_X710_10G_SFP },
> + { &ixl_710, PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_X710_10G_SFP_2 },

Looks like so far we have ordered this list in the same order as it is
in pcidevs (for X710).  If we want to keep that order, _2 should be
the first entry.  If we don't want to keep that order, then this diff
should be fine.  jsg@, dlg@: any preference or do you not care?

Patrick

>   { &ixl_710, PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_XL710_40G_BP },
>   { &ixl_710, PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_X710_10G_BP, },
>   { &ixl_710, PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_XL710_QSFP_1 },
> Index: pcidevs
> ===
> RCS file: /cvs/src/sys/dev/pci/pcidevs,v
> retrieving revision 1.1960
> diff -u -p -r1.1960 pcidevs
> --- pcidevs   14 Mar 2021 01:09:29 -  1.1960
> +++ pcidevs   15 Mar 2021 07:42:19 -
> @@ -3702,6 +3702,7 @@ product INTEL ICH8_IGP_AMT  0x104a  ICH8 I
>  product INTEL ICH8_IGP_C 0x104b  ICH8 IGP C
>  product INTEL ICH8_IFE   0x104c  ICH8 IFE
>  product INTEL ICH8_IGP_M 0x104d  ICH8 IGP M
> +product INTEL X710_10G_SFP_2 0x104e  X710 SFP+
>  product INTEL PRO_100_VE_4   0x1050  PRO/100 VE
>  product INTEL PRO_100_VE_5   0x1051  PRO/100 VE
>  product INTEL PRO_100_VM_6   0x1052  PRO/100 VM
> Index: pcidevs.h
> ===
> RCS file: /cvs/src/sys/dev/pci/pcidevs.h,v
> retrieving revision 1.1954
> diff -u -p -r1.1954 pcidevs.h
> --- pcidevs.h 14 Mar 2021 01:10:35 -  1.1954
> +++ pcidevs.h 15 Mar 2021 07:42:21 -
> @@ -3707,6 +3707,7 @@
>  #define  PCI_PRODUCT_INTEL_ICH8_IGP_C0x104b  /* ICH8 IGP C */
>  #define  PCI_PRODUCT_INTEL_ICH8_IFE  0x104c  /* ICH8 IFE */
>  #define  PCI_PRODUCT_INTEL_ICH8_IGP_M0x104d  /* ICH8 IGP M */
> +#define  PCI_PRODUCT_INTEL_X710_10G_SFP_2 0x104e  /* X710 SFP+ */
>  #define  PCI_PRODUCT_INTEL_PRO_100_VE_4  0x1050  /* PRO/100 VE */
>  #define  PCI_PRODUCT_INTEL_PRO_100_VE_5  0x1051  /* PRO/100 VE */
>  #define  PCI_PRODUCT_INTEL_PRO_100_VM_6  0x1052  /* PRO/100 VM */
> Index: pcidevs_data.h
> ===
> RCS file: /cvs/src/sys/dev/pci/pcidevs_data.h,v
> retrieving revision 1.1949
> diff -u -p -r1.1949 pcidevs_data.h
> --- pcidevs_data.h14 Mar 2021 01:10:35 -  1.1949
> +++ pcidevs_data.h15 Mar 2021 07:42:21 -
> @@ -12252,6 +12252,10 @@ static const struct pci_known_product pc
>   "ICH8 IGP M",
>   },
>   {
> + PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_X710_10G_SFP_2,
> + "X710 SFP+",
> + },
> + {
>   PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_PRO_100_VE_4,
>   "PRO/100 VE",
>   },
>

Re: Huawei ME906s-158 LTE, cdce(4) vs umb(4)

2021-03-24 Thread Patrick Wildt

On Wed, Mar 24, 2021 at 11:23:11PM +, Stuart Henderson wrote:
> In my ongoing search to find a umb(4) that will actually work that
> isn't the one in my laptop (since my EM7305 has been a failure),
> I picked up a Huawei ME906s-158 off ebay. It attaches to cdce and ugen
> and fails to work:
> 
> cdce0 at uhub2 port 4 configuration 2 interface 0 "Huawei Technologies Co., Ltd. HUAWEI Mobile" rev 2.10/1.02 addr 3
> cdce0: could not find data bulk in
> ugen0 at uhub2 port 4 configuration 2 "Huawei Technologies Co., Ltd. HUAWEI Mobile" rev 2.10/1.02 addr 3
> 
> Any information I can find for it suggests that it does MBIM, and
> indeed if I disable cdce in the kernel it picks up on a different
> config:
> 
> umb0 at uhub2 port 4 configuration 3 interface 0 "Huawei Technologies Co., Ltd. HUAWEI Mobile" rev 2.10/1.02 addr 3
> ugen0 at uhub2 port 4 configuration 3 "Huawei Technologies Co., Ltd. HUAWEI Mobile" rev 2.10/1.02 addr 3
> 
> and after I figured out which of the APU2's mPCIe slots is routed to
> the SIM slot (the middle one) it actually negotiates with the network
> 
> umb0: flags=8855 mtu 1500
> index 11 priority 6 llprio 3
> roaming disabled registration home network
> state up cell-class LTE rssi -83dBm speed 143Mbps up 143Mbps down
> SIM initialized PIN valid (3 attempts left)
> subscriber-id 234 ICC-id 8944xxx provider 3
> device ML1ME906SM IMEI 867 firmware 11.617.04.00.00
> phone# +44xx APN 3internet
> dns 217.171.132.0 217.171.135.0
> groups: egress
> status: active
> inet 94.197.84.2 --> 94.197.84.1 netmask 0xfffc
> 
> It doesn't seem to be working properly yet (packets transmitted over it
> don't arrive at the destination) but I'm not 100% convinced that it's
> not the network yet, I'll find some more SIMs to try.
> 
> In the meantime, any suggestions how to knock it out from attaching
> to cdce? Is there a way to drop out in the attach routine (e.g. if there's
> no data bulk in) to allow another driver to attempt attaching,
> or is it committed to a specific driver once it has matched?
> 
> If the latter, would it make sense for cdce to look for MBIM on
> another configuration and skip matching in that case? Or do (move
> or replicate) some of cdce_attach()'s checks in cdce_match so it
> can skip attaching if cdce isn't going to work?
> 
> There are loads of these showing up (probably from laptops broken for
> parts) for about 20 $currency_units on ebay/similar now so it would
> be quite nice to get them working. A few similar devices showed up
> on the lists before but I haven't noticed anyone trying to disable
> cdce on them yet.
> https://www.google.com/search?q=huawei+%22cdce0:+could+not+find+data+bulk+in%22+site:openbsd-archive.7691.n7.nabble.com
> 
> If anyone else is thinking of getting one to poke at, to use in an APU
> they also need a M.2 -> mPCIe adapter (aka NGFF -> mPCIe) with 'B' key
> (doesn't need a sim carrier on the adapter), plus whatever u.fl pigtails
> and antennas (the proper multiband LTE antennas usually have SMA
> connectors rather than the RP-SMA common with wifi).
> 
> Descriptors from lsusb below (search for "HUAWEI Mobile Broadband Module"
> for the 'right' one).

Without having looked at anything, it might be worth looking at the most
recent mail in this thread:

'Re: [PATCH] umb(4) fix for X20 (DW5821e) in Dell Latitude 7300'

Re: fyi: get HP EliteBook 830 G7/G8 booting

2021-03-26 Thread Patrick Wildt

On Fri, Mar 26, 2021 at 12:12:44PM +0100, Mark Kettenis wrote:
> > Date: Fri, 26 Mar 2021 19:43:23 +0900 (JST)
> > From: YASUOKA Masahiko 
> > 
> > Hi,
> > 
> > On Fri, 26 Mar 2021 09:30:43 +0100
> > Jan Klemkow  wrote:
> > > If you want to boot OpenBSD on an HP EliteBook 830 G7/G8, the bootloader
> > > will hang while loading the kernel.  Because, the UEFI loads the
> > > bootloader on the same place in memory, where the bootloader will copy
> > > the kernel.  We are unable to load the kernel on arbitrary memory.
> > > Thus, the following diff will help you, to get OpenBSD running on these
> > > machines.  It moves the hardcoded Kernel address to a free place.
> > 
> > The openbsd efiboot copies the kernel to that place after
> > ExitBootServices().
> > 
> > sys/arch/amd64/stand/efiboot/exec_i386.c
> > 152 /*
> > 153  * Move the loaded kernel image to the usual place after 
> > calling
> > 154  * ExitBootServices().
> > 155  */
> > 156 #ifdef __amd64__
> > 157 protect_writeable(marks[MARK_START] + delta,
> > 158 marks[MARK_END] - marks[MARK_START]);
> > 159 #endif
> > 160 memmove((void *)marks[MARK_START] + delta, (void 
> > *)marks[MARK_START],
> > 161 marks[MARK_END] - marks[MARK_START]);
> > 162 for (i = 0; i < MARK_MAX; i++)
> > 163 marks[i] += delta;
> > 164 
> > 165 #ifdef __amd64__
> > 166 (*run_i386)((u_long)run_i386, entry, howto, bootdev, 
> > BOOTARG_APIVER,
> > 167 marks[MARK_END], extmem, cnvmem, ac, (intptr_t)av);
> > 
> > 
> > I think it should work without the ld.script change..
> 
> The (likely) problem is that the memmove() on line 160 is overwriting
> the bootloader code itself.

More than just likely.  We have debugged it, looked at the memory table
etc. and it really is the case that efiboot(8) starts at *the same*
address as where the kernel is supposed to be copied to.  Hence *all*
of efiboot(8) is overwritten, even the code that is currently doing
the copying.
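
The condition described above is a plain range overlap between the
memmove() destination and the image that is executing the memmove().  A
minimal sketch of such a check follows; ranges_overlap() is a
hypothetical helper, not something efiboot(8) has, and it assumes the
ranges are small enough that start + length does not wrap.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of efiboot: report whether the
 * half-open ranges [a, a + alen) and [b, b + blen) share any byte.
 * Assumes a + alen and b + blen do not wrap around.
 */
static bool
ranges_overlap(uintptr_t a, size_t alen, uintptr_t b, size_t blen)
{
	/* An empty range cannot overlap anything. */
	if (alen == 0 || blen == 0)
		return false;
	return a < b + blen && b < a + alen;
}
```

If the bootloader image and the copy destination start at the same
address, as observed on these machines, the check is true for any
non-empty sizes.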

> There are essentially two ways to fix this:
> 
> 1. Have the bootloader relocate itself to an address that doesn't
>conflict with the kernel to be loaded.
> 
> 2. Make it possible for the kernel to be loaded at a (somewhat)
>arbitrary physical address.
> 
> In my view #2 is the way forward.  There are other reasons why that
> would be beneficial as it would make it less predictable at which
> physical address the kernel code lives which could prevent some
> attacks that use the direct map.
> 
> #2 is also the approach taken by the EFIBOOT on armv7 and arm64.  On
> arm64 for example, EFIBOOT loads the kernel into a 64MB memory block
> that is aligned on a 2MB boundary.  The kernel then figures out its
> load address based on that and patches things up accordingly.
> 
> mlarkin@ was doing some work to change how we load the amd64 kernel.
> His approach was to let the bootloader build the initial page tables
> and jump into the kernel in 64-bit mode with the MMU enabled.  That
> was more focussed on running the kernel at a randomized virtual
> address.  But it should be fairly easy to make it run at a different
> physical address as well this way.  Unfortunately that effort was
> mostly focussed on the legacy bootloader.

I'm also more in favour of #2, but I'm still working on my thesis (==
no time) and x86 isn't my area of expertise.

Re: cwfg: flag sensor as invalid on bogus reads

2021-03-26 Thread Patrick Wildt

On Fri, Mar 26, 2021 at 12:26:51AM +0100, Klemens Nanni wrote:
> Follow-up to "arm64: make cwfg(4) report battery information to apm(4)".
> 
> This driver continues to report stale hw.sensors values when reading
> them fails, which can easily be observed on a Pinebook Pro after
> plugging in the AC cable.
> 
> Running on battery looks like this (note sensors and apm are in sync):
> 
>   $ sysctl hw.sensors
>   hw.sensors.cwfg0.volt0=3.68 VDC (battery voltage)
>   hw.sensors.cwfg0.raw0=104 (battery remaining minutes)
>   hw.sensors.cwfg0.percent0=27.00% (battery percent)
>   $ apm
>   Battery state: high, 27% remaining, 104 minutes life estimate
>   A/C adapter state: not known
>   Performance adjustment mode: auto (408 MHz)
> 
> When I plug in the AC cable, `raw0' jumps around considerably one or two
> times before stalling on the last value (note how `percent0' stayed the
> same while `raw0' tripled):
> 
>   $ sysctl hw.sensors
>   hw.sensors.cwfg0.volt0=3.98 VDC (battery voltage)
>   hw.sensors.cwfg0.raw0=359 (battery remaining minutes)
>   hw.sensors.cwfg0.percent0=27.00% (battery percent)
>   $ apm
>   Battery state: high, 27% remaining, unknown life estimate
>   A/C adapter state: not known
>   Performance adjustment mode: auto (408 MHz)
> 
> `raw0' aka. `CWFG_SENSOR_RTT' stops yielding valid values as long as AC
> is plugged in (no idea if by design or a bug).
> 
> Hence hw.sensors.cwfg0 values become incoherent and drift away from apm's
> output which --due to the reset logic discussed in the aforementioned
> tech@ thread-- properly picks up the stalled value as "unknown".
> 
> 
> I see two approaches to fix this:
> 
> 1. simple but less user-friendly:  flag sensors invalid upfront in apm's
>fashion and mark them OK iff they yield valid values;   this is what
>other drivers such as rktemp(4) do, but the consequence/intention of
>`SENSOR_FINVALID' is sysctl(8) and systat(8) skipping such sensors,
>i.e. above sysctl output would omit the `raw0' line if AC is plugged
>in (arguably better than printing outdated/misleading values).
> 
> 2. elaborate but informative:  set sensor value/status to 0/unknown like
>acpibat(4) does for example;  the advantage is that sensors no longer
>come and go but could look like this:
>   hw.sensors.cwfg0.raw0=0 (battery remaining minutes), UNKNOWN
>    I'd prefer this but am not yet sure if that's how the sensor
>framework is supposed to be used and/or I'd need to tinker with the
>code (on multiple notebooks/sensors) to see if it works out.
> 
> 
> Either way, diff below implements the first approach such that we avoid
> bogus sysctl/systat lines and hw.sensors gets in sync with apm.
> One could still switch to the second approach afterwards.
> 
> Feedback? Objections? OK?
> 

It's pretty normal for voltage to go up once AC is connected.  In the
end, afaik batteries are charged by applying voltage.  Additionally
if an external supply provides power, there's a smaller voltage drop.

It also makes sense for the "remaining battery time" to become invalid:
with AC connected it's going to be endless and there's no way to measure
the battery change.  The only thing it could maybe try to estimate is
the time until charged.

What happens to battery percentage?  Does it change while it's charging?

As mentioned, connecting the charger will make the voltage go up, but
the battery charge will not have changed, hence I expect the percentage
to stay the same value, even though the voltage changes.  But with time,
percentage should go up.

> 
> Index: cwfg.c
> ===
> RCS file: /cvs/src/sys/dev/fdt/cwfg.c,v
> retrieving revision 1.3
> diff -u -p -r1.3 cwfg.c
> --- cwfg.c25 Mar 2021 12:18:27 -  1.3
> +++ cwfg.c25 Mar 2021 12:25:31 -
> @@ -348,9 +348,12 @@ cwfg_update_sensors(void *arg)
>   uint8_t val;
>   int error, n;
>  
> -#if NAPM > 0
> - /* reset previous reads so apm(4) never gets stale values
> + /* invalidate all previous reads to avoid stale/incoherent values
>* in case of transient cwfg_read() failures below */
> + sc->sc_sensor[CWFG_SENSOR_VCELL].flags |= SENSOR_FINVALID;
> + sc->sc_sensor[CWFG_SENSOR_SOC].flags |= SENSOR_FINVALID;
> + sc->sc_sensor[CWFG_SENSOR_RTT].flags |= SENSOR_FINVALID;

I'd probably put a newline here, but that's just personal nitpicking.

I think it makes sense that outdated information should be marked
invalid.  Doing that upfront makes sense.  Doing it for VCELL is
not strictly necessary, but makes sense for consistency.

ok patrick@

> +#if NAPM > 0
>   cwfg_power.battery_state = APM_BATT_UNKNOWN;
>   cwfg_power.ac_state = APM_AC_UNKNOWN;
>   cwfg_power.battery_life = 0;
>
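
The "invalidate upfront, validate on success" pattern discussed above
can be sketched on its own.  This is a minimal illustration, not the
cwfg(4) code: struct sk_sensor, SK_FINVALID and the two read functions
are hypothetical stand-ins for the sensor framework and cwfg_read().

```c
#include <stdint.h>

#define SK_FINVALID	0x01	/* hypothetical stand-in for SENSOR_FINVALID */

struct sk_sensor {
	int64_t	value;
	int	flags;
};

/* A read function returns 0 on success and stores the reading in *val. */
typedef int (*sk_read_fn)(uint8_t *val);

/* Hypothetical reads: one that succeeds, one that always fails. */
static int
sk_read_ok(uint8_t *val)
{
	*val = 42;
	return 0;
}

static int
sk_read_fail(uint8_t *val)
{
	(void)val;
	return 1;
}

static void
sk_update(struct sk_sensor *s, sk_read_fn read_fn)
{
	uint8_t val;

	/* Assume the worst first so a failed read never looks valid. */
	s->flags |= SK_FINVALID;

	if (read_fn(&val) != 0)
		return;		/* old value stays, but is flagged invalid */

	s->value = val;
	s->flags &= ~SK_FINVALID;
}
```

With this shape every early return leaves the sensor flagged invalid,
so there is no error path that has to remember to do it.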

Re: efiboot/arm64: fix "mach dtb" return code to avoid bogus boot

2021-03-26 Thread Patrick Wildt

On Wed, Mar 24, 2021 at 07:20:29PM +0100, Klemens Nanni wrote:
> Bootloader command functions must return zero in case of failure,
> returning 1 tells the bootloader to boot the file.
> 
> arm64's `machine dtb' command has it the wrong way so using it triggers
> a boot that doesn't make any sense:
> 
>   >> OpenBSD/arm64 BOOTAA64 1.4
>   boot> mach dtb
>   booting sd0a:/etc/boot.conf: open sd0a:/etc/boot.conf: No such file or directory
>   failed(2). will try /bsd
>   boot> mach dtb /foo
>   cannot open sd0a:/foo
>   NOTE: random seed is being reused.
>   booting sd0a:/etc/boot.conf: open sd0a:/etc/boot.conf: No such file or directory
>   failed(2). will try /bsd
> 
> With this diff:
> 
>   >> OpenBSD/arm64 BOOTAA64 1.4
>   boot> mach dtb
>   dtb file
>   boot> mach dtb /foo
>   cannot open sd0a:/foo
> 
> While here, tell users how to use that command (like other commands
> such as `hexdump' do).
> 
> Feedback? OK?
> 
> Index: arch/arm64/stand/efiboot/efiboot.c
> ===
> RCS file: /cvs/src/sys/arch/arm64/stand/efiboot/efiboot.c,v
> retrieving revision 1.31
> diff -u -p -r1.31 efiboot.c
> --- arch/arm64/stand/efiboot/efiboot.c9 Mar 2021 21:11:24 -   1.31
> +++ arch/arm64/stand/efiboot/efiboot.c24 Mar 2021 17:59:52 -
> @@ -980,28 +980,30 @@ Xdtb_efi(void)
>  
>  #define O_RDONLY 0
>  
> - if (cmd.argc != 2)
> - return (1);
> + if (cmd.argc != 2) {
> + printf("dtb file\n");
> + return (0);
> + }
>  
>   snprintf(path, sizeof(path), "%s:%s", cmd.bootdev, cmd.argv[1]);
>  
>   fd = open(path, O_RDONLY);
>   if (fd < 0 || fstat(fd, &sb) == -1) {
>   printf("cannot open %s\n", path);
> - return (1);
> + return (0);
>   }
>   if (efi_memprobe_find(EFI_SIZE_TO_PAGES(sb.st_size),
>   0x1000, &addr) != EFI_SUCCESS) {
>   printf("cannot allocate memory for %s\n", path);
> - return (1);
> + return (0);
>   }
>   if (read(fd, (void *)addr, sb.st_size) != sb.st_size) {
>   printf("cannot read from %s\n", path);
> - return (1);
> + return (0);
>   }
>  
>   fdt = (void *)addr;
> - return (0);
> + return (1);

Wait, you've been saying that return code 1 makes it boot.  So now you
changed it so that "mach dtb" kicks off booting the kernel?  That does
not seem right to me.  This should stay 0, right?

>  }
>  
>  int
>
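
To make the return code convention from this thread concrete: a command
handler returns 0 to stay at the prompt, whether it succeeded or failed,
and returns 1 only to ask the caller to boot.  The sketch below models
that with a tiny command table; the sk_* names are hypothetical and this
is not the actual boot(8) command code.

```c
#include <stddef.h>
#include <string.h>

struct sk_cmd {
	const char	*name;
	int		(*fn)(void);
};

/* Loads something and stays at the prompt, like "mach dtb" should. */
static int
sk_cmd_dtb(void)
{
	return 0;
}

/* Asks the caller to leave the prompt and boot. */
static int
sk_cmd_boot(void)
{
	return 1;
}

static const struct sk_cmd sk_cmds[] = {
	{ "dtb",	sk_cmd_dtb },
	{ "boot",	sk_cmd_boot },
};

/* Returns 1 if the prompt loop should stop and boot, 0 otherwise. */
static int
sk_docmd(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(sk_cmds) / sizeof(sk_cmds[0]); i++) {
		if (strcmp(sk_cmds[i].name, name) == 0)
			return sk_cmds[i].fn();
	}
	return 0;	/* unknown command: keep prompting */
}
```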

Re: apm/arm64: fix errno, merge ioctl cases

2021-03-26 Thread Patrick Wildt

On Sat, Mar 20, 2021 at 08:01:51PM +0100, Klemens Nanni wrote:
> The EBADF error is always overwritten for the standby, suspend and
> hibernate ioctls, only the mode ioctl has it right.
> 
> Merge the now identical cases while here.
> 
> Tested on a Pinebook Pro.
> 
> OK?

ok patrick@

> Index: apm.c
> ===
> RCS file: /cvs/src/sys/arch/arm64/dev/apm.c,v
> retrieving revision 1.6
> diff -u -p -r1.6 apm.c
> --- apm.c 25 Dec 2020 12:59:51 -  1.6
> +++ apm.c 20 Mar 2021 18:54:23 -
> @@ -213,17 +213,15 @@ apmioctl(dev_t dev, u_long cmd, caddr_t 
>   case APM_IOC_STANDBY_REQ:
>   case APM_IOC_SUSPEND:
>   case APM_IOC_SUSPEND_REQ:
> - if ((flag & FWRITE) == 0)
> - error = EBADF;
> - error = EOPNOTSUPP;
> - break;
>  #ifdef HIBERNATE
>   case APM_IOC_HIBERNATE:
> +#endif
> + case APM_IOC_DEV_CTL:
>   if ((flag & FWRITE) == 0)
>   error = EBADF;
> - error = EOPNOTSUPP;
> + else
> + error = EOPNOTSUPP; /* XXX */
>   break;
> -#endif
>   case APM_IOC_PRN_CTL:
>   if ((flag & FWRITE) == 0)
>   error = EBADF;
> @@ -247,12 +245,6 @@ apmioctl(dev_t dev, u_long cmd, caddr_t 
>   break;
>   }
>   }
> - break;
> - case APM_IOC_DEV_CTL:
> - if ((flag & FWRITE) == 0)
> - error = EBADF;
> - else
> - error = EOPNOTSUPP; /* XXX */
>   break;
>   case APM_IOC_GETPOWER:
>   power = (struct apm_power_info *)data;
>
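
The errno bug fixed above is easy to reproduce in isolation.  A minimal
sketch, with SK_FWRITE as a hypothetical stand-in for FWRITE and the
two functions standing in for the old and new ioctl cases:

```c
#include <errno.h>

#define SK_FWRITE	0x0002	/* hypothetical stand-in for FWRITE */

/* Old shape: the second assignment always wins, EBADF is lost. */
static int
sk_ioctl_old(int flag)
{
	int error = 0;

	if ((flag & SK_FWRITE) == 0)
		error = EBADF;
	error = EOPNOTSUPP;
	return error;
}

/* New shape: EBADF survives for descriptors not opened for writing. */
static int
sk_ioctl_new(int flag)
{
	int error = 0;

	if ((flag & SK_FWRITE) == 0)
		error = EBADF;
	else
		error = EOPNOTSUPP;
	return error;
}
```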

Re: efiboot/arm64: fix "mach dtb" return code to avoid bogus boot

2021-03-26 Thread Patrick Wildt

On Sat, Mar 27, 2021 at 12:09:25AM +0100, Klemens Nanni wrote:
> On Fri, Mar 26, 2021 at 11:28:37PM +0100, Patrick Wildt wrote:
> > >   fdt = (void *)addr;
> > > - return (0);
> > > + return (1);
> > 
> > Wait, you've been saying that return code 1 makes it boot.  So now you
> > changed it so that "mach dtb" kicks off booting the kernel?  That does
> > not seem right to me.  This should stay 0, right?
> Absolutely, my bad.
> 
> I've tested all scenarios with and without this fixed diff below to
> double check.
> 
> OK?
> 

Yes.

> 
> Index: efiboot.c
> ===
> RCS file: /cvs/src/sys/arch/arm64/stand/efiboot/efiboot.c,v
> retrieving revision 1.31
> diff -u -p -r1.31 efiboot.c
> --- efiboot.c 9 Mar 2021 21:11:24 -   1.31
> +++ efiboot.c 26 Mar 2021 23:05:46 -
> @@ -980,24 +980,26 @@ Xdtb_efi(void)
>  
>  #define O_RDONLY 0
>  
> - if (cmd.argc != 2)
> - return (1);
> + if (cmd.argc != 2) {
> + printf("dtb file\n");
> + return (0);
> + }
>  
>   snprintf(path, sizeof(path), "%s:%s", cmd.bootdev, cmd.argv[1]);
>  
>   fd = open(path, O_RDONLY);
>   if (fd < 0 || fstat(fd, &sb) == -1) {
>   printf("cannot open %s\n", path);
> - return (1);
> + return (0);
>   }
>   if (efi_memprobe_find(EFI_SIZE_TO_PAGES(sb.st_size),
>   0x1000, &addr) != EFI_SUCCESS) {
>   printf("cannot allocate memory for %s\n", path);
> - return (1);
> + return (0);
>   }
>   if (read(fd, (void *)addr, sb.st_size) != sb.st_size) {
>   printf("cannot read from %s\n", path);
> - return (1);
> + return (0);
>   }
>  
>   fdt = (void *)addr;

Re: cwfg: flag sensor as invalid on bogus reads

2021-03-26 Thread Patrick Wildt

On Sat, Mar 27, 2021 at 12:00:32AM +0100, Klemens Nanni wrote:
> On Fri, Mar 26, 2021 at 11:22:32PM +0100, Patrick Wildt wrote:
> > It's pretty normal for voltage to go up once AC is connected.  In the
> > end, afaik batteries are charged by applying voltage.  Additionally
> > if an external supply provides power, there's a smaller voltage drop.
> I thought as much, thanks.
> 
> > The "remaining battery time" to become invalid makes sense as well,
> > I mean, with an AC it's gonna be endless and there's no way to measure
> > the battery change.  Well, the only thing it could maybe try and
> > estimate is time until charged.
> That's stuff for another diff (perhaps;  I've heard guessing estimates
> is hard).
> 
> > What happens to battery percentage?  Does it change while it's charging?
> >
> > As mentioned, connecting the charger will make the voltage go up, but
> > the battery charge will not have changed, hence I expect the percentage
> > to stay the same value, even though the voltage changes.  But with time,
> > percentage should go up.
> The `percent0' value does increase steadily over time while AC is
> plugged in.

Cool, that's what I'd hope it does.  So voltage goes up as expected,
remaining battery is invalid (because it's on AC), and the percentage
goes up.  Sounds good to me. :)

> > > @@ -348,9 +348,12 @@ cwfg_update_sensors(void *arg)
> > >   uint8_t val;
> > >   int error, n;
> > >  
> > > -#if NAPM > 0
> > > - /* reset previous reads so apm(4) never gets stale values
> > > + /* invalidate all previous reads to avoid stale/incoherent values
> > >* in case of transient cwfg_read() failures below */
> > > + sc->sc_sensor[CWFG_SENSOR_VCELL].flags |= SENSOR_FINVALID;
> > > + sc->sc_sensor[CWFG_SENSOR_SOC].flags |= SENSOR_FINVALID;
> > > + sc->sc_sensor[CWFG_SENSOR_RTT].flags |= SENSOR_FINVALID;
> > 
> > I'd probably put a newline here, but that's just personal nitpicking.
> Sure, committed with it.
> 
> > I think it makes sense that outdated information should be marked
> > invalid.  Doing that upfront makes sense.  Doing it for VCELL is
> > not strictly necessary, but makes sense for consistency.
> I did this for consistency, yes.
>

Re: Huawei ME906s-158 LTE, cdce(4) vs umb(4)

2021-03-28 Thread Patrick Wildt

On Sun, Mar 28, 2021 at 10:53:53AM +0100, Stuart Henderson wrote:
> On 2021/03/25 00:14, Stuart Henderson wrote:
> > On 2021/03/25 00:30, Patrick Wildt wrote:
> > > Without having looked at anything, it might be worth looking at the most
> > > recent mail in this thread:
> > > 
> > > 'Re: [PATCH] umb(4) fix for X20 (DW5821e) in Dell Latitude 7300'
> > > 
> > 
> > oh, usb runs through all drivers looking for a VID/PID match before
> > then running through all looking for a class match? I didn't realise
> > (thought it would try driver-by-driver with first VID/PID then class)
> > but that does make sense.
> > 
> > Updated diff below adding my pid/vid and tweaking some comments. My card
> > attaches to umb with this. I've only added it commented-out to the manual
> > for now until I see it actually pass traffic.
> > 
> > So this fixes Bryan's, at least improves mine (will look for another
> > sim / sim-adapter tomorrow), and seems targeted enough to not risk
> > fallout with existing working devices.
> > 
> > I think this makes sense to commit. OK with me if you'd like to commit
> > it Gerhard (I think it was your diff originally).
> 
> So this isn't enough for the Huawei yet but it doesn't make things
> any worse, and helps for DW5821e.
> 
> If Gerhard isn't around, can I commit this bit so I'm not wrangling
> things which should be separate commits? OK?

Yes, please do.  ok patrick@

> > I added this product to usbdevs as a short string; the other Huawei
> > devices all say "HUAWEI Mobile xyz" rather than just "xyz" in the device
> > string which I think should be trimmed as well, probably worth doing
> > that on top.
> > 
> > 
> > Index: share/man/man4/umb.4
> > ===
> > RCS file: /cvs/src/share/man/man4/umb.4,v
> > retrieving revision 1.11
> > diff -u -p -r1.11 umb.4
> > --- share/man/man4/umb.412 May 2020 13:03:52 -  1.11
> > +++ share/man/man4/umb.425 Mar 2021 00:03:58 -
> > @@ -44,8 +44,10 @@ PIN again even if the system was reboote
> >  The following devices should work:
> >  .Pp
> >  .Bl -tag -width Ds -offset indent -compact
> > +.It Dell DW5821e
> >  .It Ericsson H5321gw and N5321gw
> >  .It Fibocom L831-EAU
> > +.\" .It Huawei ME906s -- attaches but may need more work
> >  .It Medion Mobile S4222 (MediaTek OEM)
> >  .It Sierra Wireless EM7345
> >  .It Sierra Wireless EM7455
> > Index: sys/dev/usb/if_umb.c
> > ===
> > RCS file: /cvs/src/sys/dev/usb/if_umb.c,v
> > retrieving revision 1.37
> > diff -u -p -r1.37 if_umb.c
> > --- sys/dev/usb/if_umb.c29 Jan 2021 17:06:19 -  1.37
> > +++ sys/dev/usb/if_umb.c24 Mar 2021 23:52:13 -
> > @@ -225,6 +225,28 @@ const struct cfattach umb_ca = {
> >  int umb_delay = 4000;
> >  
> >  /*
> > > + * Normally, MBIM devices are detected by their interface class and subclass.
> > + * But for some models that have multiple configurations, it is better to
> > + * match by vendor and product id so that we can select the desired
> > + * configuration ourselves, e.g. to override a class-based match to another
> > + * driver.
> > + *
> > > + * OTOH, some devices identify themselves as an MBIM device but fail to speak
> > + * the MBIM protocol.
> > + */
> > +struct umb_products {
> > +   struct usb_devno dev;
> > +   int  confno;
> > +};
> > +const struct umb_products umb_devs[] = {
> > +   { { USB_VENDOR_DELL, USB_PRODUCT_DELL_DW5821E }, 2 },
> > +   { { USB_VENDOR_HUAWEI, USB_PRODUCT_HUAWEI_ME906S }, 3 },
> > +};
> > +
> > +#define umb_lookup(vid, pid)   \
> > +   ((const struct umb_products *)usb_lookup(umb_devs, vid, pid))
> > +
> > +/*
> >   * These devices require an "FCC Authentication" command.
> >   */
> >  const struct usb_devno umb_fccauth_devs[] = {
> > @@ -263,6 +285,8 @@ umb_match(struct device *parent, void *m
> > struct usb_attach_arg *uaa = aux;
> > usb_interface_descriptor_t *id;
> >  
> > +   if (umb_lookup(uaa->vendor, uaa->product) != NULL)
> > +   return UMATCH_VENDOR_PRODUCT;
> > if (!uaa->iface)
> > return UMATCH_NONE;
> > if ((id = usbd_get_interface_descriptor(uaa->iface)) == NULL)
> > @@ -315,6 +339,43 @@ umb_attach(st
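
As a simplified model of why the vendor/product entry lets umb(4) win
over cdce(4): if every driver is asked for a match score and the best
score gets the device, a vendor/product match outranks a class match.
The sketch below only illustrates that idea.  The sk_* names, the score
values and the IDs 0x1234/0x5678 are made up, and the real USB probing
code is more involved than a single loop.

```c
#include <stddef.h>
#include <string.h>

enum {
	SK_MATCH_NONE = 0,
	SK_MATCH_CLASS = 1,		/* matched by interface class */
	SK_MATCH_VENDOR_PRODUCT = 2	/* matched by explicit IDs */
};

struct sk_dev {
	unsigned	vendor;
	unsigned	product;
	int		is_cdc;	/* advertises a CDC Ethernet class */
};

struct sk_driver {
	const char	*name;
	int		(*match)(const struct sk_dev *);
};

/* Class-based driver: takes anything that looks like CDC Ethernet. */
static int
sk_cdce_match(const struct sk_dev *d)
{
	return d->is_cdc ? SK_MATCH_CLASS : SK_MATCH_NONE;
}

/* ID-based driver: claims one specific (made-up) device. */
static int
sk_umb_match(const struct sk_dev *d)
{
	if (d->vendor == 0x1234 && d->product == 0x5678)
		return SK_MATCH_VENDOR_PRODUCT;
	return SK_MATCH_NONE;
}

static const struct sk_driver sk_drivers[] = {
	{ "cdce",	sk_cdce_match },
	{ "umb",	sk_umb_match },
};

/* Return the name of the best matching driver, or NULL if none match. */
static const char *
sk_pick(const struct sk_dev *d)
{
	const char *best = NULL;
	int score, bestscore = SK_MATCH_NONE;
	size_t i;

	for (i = 0; i < sizeof(sk_drivers) / sizeof(sk_drivers[0]); i++) {
		score = sk_drivers[i].match(d);
		if (score > bestscore) {
			bestscore = score;
			best = sk_drivers[i].name;
		}
	}
	return best;
}
```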

Re: cwfg: Use meaningful alert level, track apm's battery state better

2021-03-31 Thread Patrick Wildt

On Mon, Mar 29, 2021 at 07:16:18AM +0200, Klemens Nanni wrote:
> The datasheet says the hardware's default State-Of-Charge threshold is
> three percent, i.e. the gauge pulls down the pin to logic low at 3%
> remaining battery life.
> 
> My Pinebook Pro's fuel gauge actually shows an alert level of zero
> percent however and the latest device tree (both from our dtb package
> and other sources) no longer provide the "cellwise,alert-level"
> property.

If there's no alert-level property, then maybe we should just remove it.
Then you could hard code a value for "below will be critical", like you
now do with the 50%?

> The current code still looks for that property but falls back to the
> define;  crank it such that apm(8) does not always report "high" battery
> state.
> 
> While here, use all three available states in the same way acpibat(4)
> sys/dev/acpi/acpi.c does.
> 
> Feedback? OK?
> 
> Index: cwfg.c
> ===
> RCS file: /cvs/src/sys/dev/fdt/cwfg.c,v
> retrieving revision 1.4
> diff -u -p -r1.4 cwfg.c
> --- cwfg.c26 Mar 2021 22:54:41 -  1.4
> +++ cwfg.c29 Mar 2021 05:03:58 -
> @@ -101,7 +101,7 @@ struct cwfg_softc {
>  
>  #define  CWFG_MONITOR_INTERVAL_DEFAULT   5000
>  #define  CWFG_DESIGN_CAPACITY_DEFAULT2000
> -#define  CWFG_ALERT_LEVEL_DEFAULT0
> +#define  CWFG_ALERT_LEVEL_DEFAULT25
>  
>  int cwfg_match(struct device *, void *, void *);
>  void cwfg_attach(struct device *, struct device *, void *);
> @@ -387,9 +387,13 @@ cwfg_update_sensors(void *arg)
>   sc->sc_sensor[CWFG_SENSOR_SOC].value = val * 1000;
>   sc->sc_sensor[CWFG_SENSOR_SOC].flags &= ~SENSOR_FINVALID;
>  #if NAPM > 0
> - cwfg_power.battery_state = val > sc->sc_alert_level ?
> - APM_BATT_HIGH : APM_BATT_LOW;
>   cwfg_power.battery_life = val;
> + if (val > 50)
> + cwfg_power.battery_state = APM_BATT_HIGH;
> + else if (val > sc->sc_alert_level)
> + cwfg_power.battery_state = APM_BATT_LOW;
> + else
> + cwfg_power.battery_state = APM_BATT_CRITICAL;
>  #endif
>   }
>  
>
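
For reference, the hard-coded variant suggested above could look
roughly like this.  It is only a sketch: the sk_* names are hypothetical
and the thresholds are simply the 50 and 25 from the diff, not values
anyone has agreed on.

```c
enum sk_batt_state {
	SK_BATT_HIGH,
	SK_BATT_LOW,
	SK_BATT_CRITICAL
};

#define SK_BATT_HIGH_ABOVE	50	/* taken from the diff */
#define SK_BATT_LOW_ABOVE	25	/* taken from the diff's new default */

/* Map a state-of-charge percentage to one of three battery states. */
static enum sk_batt_state
sk_batt_state(unsigned int percent)
{
	if (percent > SK_BATT_HIGH_ABOVE)
		return SK_BATT_HIGH;
	if (percent > SK_BATT_LOW_ABOVE)
		return SK_BATT_LOW;
	return SK_BATT_CRITICAL;
}
```

This keeps the mapping independent of the device tree, so a missing
"cellwise,alert-level" property no longer matters.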

Re: simpleaudio: set sysclk before using it

2021-04-05 Thread Patrick Wildt

On Sun, Apr 04, 2021 at 11:17:54PM +0200, Mark Kettenis wrote:
> > Date: Sun, 4 Apr 2021 22:24:57 +0200
> > From: Klemens Nanni 
> > Cc: Mark Kettenis 
> > 
> > On Sun, Apr 04, 2021 at 10:01:50PM +0200, Mark Kettenis wrote:
> > > > Date: Sun, 4 Apr 2021 21:08:09 +0200
> > > > From: Klemens Nanni 
> > > > 
> > > > Feedback? Objections? OK?
> > > 
> > > Explanation?
> > > 
> > > Not sure what happened here, but the reply-to was completely garbled...
> > Sorry, I must've crippled the body before sending.
> > 
> > simpleaudio_set_params() calls set_params() which reads sysclk off the
> > "i2s_clk" property before it sets that very clock's dd_set_sysclk()
> > (in case there's a multiplier specified).
> > 
> > Hence reverse the order so set_params() picks up the newly set rate.
> > 
> > The rate is still off on the Pinebook Pro, but I came across this when
> > reading the code and it seemed wrong, so I also checked NetBSD which did
> > the same with
> > 
> > sys/dev/fdt/ausoc.c r1.6
> > "Set sysclk rate at set_format time, so the link set_format callback can read the new sysclk"
> > https://github.com/NetBSD/src/commit/ac8f47d1e5f46949b081c8e9d95211cdfda1e327
> 
> OK. So NetBSD's _set_format() is the equivalent of our _set_params().
> So changing the order makes us match how NetBSD does things.  That
> makes some sense.
> 
> ok kettenis@, but give Patrick a chance to comment as well.
> 

ok patrick@

> 
> > Index: dev/fdt/simpleaudio.c
> > ===
> > RCS file: /cvs/src/sys/dev/fdt/simpleaudio.c,v
> > retrieving revision 1.1
> > diff -u -p -r1.1 simpleaudio.c
> > --- dev/fdt/simpleaudio.c   10 Jun 2020 23:55:19 -  1.1
> > +++ dev/fdt/simpleaudio.c   4 Apr 2021 20:23:39 -
> > @@ -300,24 +300,6 @@ simpleaudio_set_params(void *cookie, int
> > uint32_t rate;
> > int error;
> >  
> > -   dai = sc->sc_dai_cpu;
> > -   hwif = dai->dd_hw_if;
> > -   if (hwif->set_params) {
> > -   error = hwif->set_params(dai->dd_cookie,
> > -   setmode, usemode, play, rec);
> > -   if (error)
> > -   return error;
> > -   }
> > -
> > -   dai = sc->sc_dai_codec;
> > -   hwif = dai->dd_hw_if;
> > -   if (hwif->set_params) {
> > -   error = hwif->set_params(dai->dd_cookie,
> > -   setmode, usemode, play, rec);
> > -   if (error)
> > -   return error;
> > -   }
> > -
> > if (sc->sc_mclk_fs) {
> > if (setmode & AUMODE_PLAY)
> > rate = play->sample_rate * sc->sc_mclk_fs;
> > @@ -337,6 +319,24 @@ simpleaudio_set_params(void *cookie, int
> > if (error)
> > return error;
> > }
> > +   }
> > +
> > +   dai = sc->sc_dai_cpu;
> > +   hwif = dai->dd_hw_if;
> > +   if (hwif->set_params) {
> > +   error = hwif->set_params(dai->dd_cookie,
> > +   setmode, usemode, play, rec);
> > +   if (error)
> > +   return error;
> > +   }
> > +
> > +   dai = sc->sc_dai_codec;
> > +   hwif = dai->dd_hw_if;
> > +   if (hwif->set_params) {
> > +   error = hwif->set_params(dai->dd_cookie,
> > +   setmode, usemode, play, rec);
> > +   if (error)
> > +   return error;
> > }
> >  
> > return 0;
> > 
>

Re: uvideo(4) new quirk flag UVIDEO_FLAG_NOATTACH

2021-04-05 Thread Patrick Wildt

On Mon, Apr 05, 2021 at 11:19:02PM +0200, Mark Kettenis wrote:
> > Date: Mon, 5 Apr 2021 23:15:23 +0200
> > From: Marcus Glocker 
> > 
> > On Mon, Apr 05, 2021 at 07:30:43AM -0700, Greg Steuck wrote:
> > 
> > > OK gnezdo with a usability question inline.
> > 
> > Thanks.  See below.
> >  
> > > Marcus Glocker  writes:
> > > 
> > > > martijn@ has recently reported that in his machine he has two cams
> > > > of which one is doing IR, which isn't really supported by uvideo(4).
> > > > This IR device attaches always first as uvideo0, so he needs to swap
> > > > that regularly with his working cam which by default attaches to 
> > > > uvideo1.
> > > >
> > > > I came up with a new quirk flag to *not* attach certain devices.  Tested
> > > > successfully by martijn@, the IR cam attaches to ugen0 and the supported
> > > > cam to uvideo0.
> > > >
> > > > This patch shouldn't affect any supported uvideo(4) devices.
> > > >
> > > > OK?
> > > >
> > > > Index: sys/dev/usb/uvideo.c
> > > > ===
> > > > RCS file: /cvs/src/sys/dev/usb/uvideo.c,v
> > > > retrieving revision 1.211
> > > > diff -u -p -u -p -r1.211 uvideo.c
> > > > --- sys/dev/usb/uvideo.c27 Jan 2021 17:28:19 -  1.211
> > > > +++ sys/dev/usb/uvideo.c8 Mar 2021 22:06:51 -
> > > > @@ -307,6 +307,7 @@ struct video_hw_if uvideo_hw_if = {
> > > >  #define UVIDEO_FLAG_ISIGHT_STREAM_HEADER   0x1
> > > >  #define UVIDEO_FLAG_REATTACH   0x2
> > > >  #define UVIDEO_FLAG_VENDOR_CLASS   0x4
> > > > +#define UVIDEO_FLAG_NOATTACH   0x8
> > > >  struct uvideo_devs {
> > > > struct usb_devno uv_dev;
> > > > char*ucode_name;
> > > > @@ -382,6 +383,12 @@ struct uvideo_devs {
> > > > NULL,
> > > > UVIDEO_FLAG_VENDOR_CLASS
> > > > },
> > > > +   {   /* Infrared camera not supported */
> > > > +   { USB_VENDOR_CHICONY, USB_PRODUCT_CHICONY_IRCAMERA },
> > > > +   NULL,
> > > > +   NULL,
> > > > +   UVIDEO_FLAG_NOATTACH
> > > > +   },
> > > >  };
> > > >  #define uvideo_lookup(v, p) \
> > > > ((struct uvideo_devs *)usb_lookup(uvideo_devs, v, p))
> > > > @@ -480,13 +487,12 @@ uvideo_match(struct device *parent, void
> > > > if (id == NULL)
> > > > return (UMATCH_NONE);
> > > >  
> > > > -   if (id->bInterfaceClass == UICLASS_VIDEO &&
> > > > -   id->bInterfaceSubClass == UISUBCLASS_VIDEOCONTROL)
> > > > -   return (UMATCH_VENDOR_PRODUCT_CONF_IFACE);
> > > > -
> > > > -   /* quirk devices which we want to attach */
> > > > +   /* quirk devices */
> > > > quirk = uvideo_lookup(uaa->vendor, uaa->product);
> > > > if (quirk != NULL) {
> > > > +   if (quirk->flags & UVIDEO_FLAG_NOATTACH)
> > > 
> > > How common is it to explain the system behavior in cases like this?
> > > Would it be less surprising (generate less misc@ traffic) if we printed
> > > a note explaining why the camera was skipped?
> > 
> > I wouldn't print a specific message per unsupported device, but I think
> > a generic message that the video device isn't supported would make
> > sense.  Something like that maybe?  Obviously this can print more than
> > once, but I'm not sure if it's worth to make that a unique print.
> > 
> > E.g.:
> > 
> > vmm0 at mainbus0: VMX/EPT
> > uvideo: device 13d3:56b2 isn't supported
> > uvideo: device 13d3:56b2 isn't supported
> > ugen0 at uhub0 port 8 "SunplusIT Inc Integrated Camera" rev 2.01/17.11 addr 2
> 
> No; match functions shouldn't print stuff.

Agreed.  If something like that was wanted, I'd rather have uvideo(4)
attach, but print that it's not supported, and then not attach video(4)
to it.
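
As a rough sketch of what a silent match function looks like: the quirk table is consulted first, a NOATTACH entry returns "no match" without printing, and everything else falls through to the class check.  All names, the device IDs and the MATCH_* values below are hypothetical and only mirror the shape of the diff; this is not the uvideo(4) code.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FLAG_NOATTACH	0x8	/* same bit as in the diff */
#define SKETCH_MATCH_NONE	0	/* hypothetical match values */
#define SKETCH_MATCH_OK		1

struct sketch_quirk {
	uint16_t vendor;
	uint16_t product;
	int flags;
};

/* Hypothetical table entry; the IDs are placeholders. */
static const struct sketch_quirk sketch_quirks[] = {
	{ 0x1234, 0x5678, SKETCH_FLAG_NOATTACH },
};

static const struct sketch_quirk *
sketch_lookup(uint16_t vendor, uint16_t product)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_quirks) / sizeof(sketch_quirks[0]); i++) {
		if (sketch_quirks[i].vendor == vendor &&
		    sketch_quirks[i].product == product)
			return &sketch_quirks[i];
	}
	return NULL;
}

/* No printf here: the match function only reports a result. */
static int
sketch_match(uint16_t vendor, uint16_t product, int is_video_class)
{
	const struct sketch_quirk *quirk = sketch_lookup(vendor, product);

	if (quirk != NULL && (quirk->flags & SKETCH_FLAG_NOATTACH))
		return SKETCH_MATCH_NONE;

	return is_video_class ? SKETCH_MATCH_OK : SKETCH_MATCH_NONE;
}
```

Any user-visible message would then belong in the attach path, not here.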

> 
> > Index: sys/dev/usb/uvideo.c
> > ===
> > RCS file: /cvs/src/sys/dev/usb/uvideo.c,v
> > retrieving revision 1.212
> > diff -u -p -u -p -r1.212 uvideo.c
> > --- sys/dev/usb/uvideo.c5 Apr 2021 20:45:49 -   1.212
> > +++ sys/dev/usb/uvideo.c5 Apr 2021 21:09:36 -
> > @@ -490,8 +490,11 @@ uvideo_match(struct device *parent, void
> > /* quirk devices */
> > quirk = uvideo_lookup(uaa->vendor, uaa->product);
> > if (quirk != NULL) {
> > -   if (quirk->flags & UVIDEO_FLAG_NOATTACH)
> > +   if (quirk->flags & UVIDEO_FLAG_NOATTACH) {
> > +   printf("uvideo: device %x:%x isn't supported\n",
> > +   uaa->vendor, uaa->product);
> > return (UMATCH_NONE);
> > +   }
> >  
> > if (quirk->flags & UVIDEO_FLAG_REATTACH)
> > return (UMATCH_VENDOR_PRODUCT_CONF_IFACE);
> > 
> > 
>

Re: arm64 pwmbl(4): simplify ramp case

2022-11-10 Thread Patrick Wildt

On Mon, Jul 04, 2022 at 06:47:33PM +, Miod Vallat wrote:
> When the fdt does not provide a list of brightness states, pwmbl(4)
> builds a 256 state ramp (i.e. state[i] = i with 0 <= i < 256).
> 
> The following diff keeps that behaviour, but gets rid of the malloc
> call for that ramp, since the values are trivially known.
> 
> Compiles but not tested due to the lack of such hardware.
> 
> Index: sys/dev/fdt/pwmbl.c
> ===
> RCS file: /OpenBSD/src/sys/dev/fdt/pwmbl.c,v
> retrieving revision 1.6
> diff -u -p -r1.6 pwmbl.c
> --- sys/dev/fdt/pwmbl.c   24 Oct 2021 17:52:26 -  1.6
> +++ sys/dev/fdt/pwmbl.c   4 Jul 2022 18:45:16 -
> @@ -35,7 +35,7 @@ struct pwmbl_softc {
>   struct device   sc_dev;
>   uint32_t        *sc_pwm;
>   int             sc_pwm_len;
> - uint32_t        *sc_levels;
> + uint32_t        *sc_levels; /* NULL if simple ramp */
>   int             sc_nlevels;
>   uint32_t        sc_max_level;
>   uint32_t        sc_def_level;
> @@ -73,7 +73,7 @@ pwmbl_attach(struct device *parent, stru
>   struct pwmbl_softc *sc = (struct pwmbl_softc *)self;
>   struct fdt_attach_args *faa = aux;
>   uint32_t *gpios;
> - int i, len;
> + int len;
>  
>   len = OF_getproplen(faa->fa_node, "pwms");
>   if (len < 0) {
> @@ -95,7 +95,7 @@ pwmbl_attach(struct device *parent, stru
>   }
>  
>   len = OF_getproplen(faa->fa_node, "brightness-levels");
> - if (len > 0) {
> + if (len >= sizeof(uint32_t)) {

This actually breaks my machine.  malloc() is saying allocation too
large.  OF_getproplen will return -1 on that.  Is it possible that
len is treated as uint64_t as it is an int and sizeof is effectively
uint64_t?

Moving len to ssize_t doesn't fix it, but doing

if (len >= (int)sizeof(uint32_t)) {

works.  So I wonder if

if (len > 0 && len >= sizeof(uint32_t)) {

would work as well.  Or maybe let's just keep it as it is?
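
For reference, a small standalone sketch of the three variants.  The function names are made up for illustration.  The explicit (size_t) cast in the first and third function spells out what the compiler does implicitly when an int is compared against a sizeof result on a platform where size_t is at least as wide as int.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Comparison as in the first diff.  len is converted to size_t,
 * so -1 becomes SIZE_MAX and the check passes.
 */
static int
len_ok_unsigned(int len)
{
	return (size_t)len >= sizeof(uint32_t);
}

/* Comparison with sizeof cast to int: both operands stay signed. */
static int
len_ok_signed(int len)
{
	return len >= (int)sizeof(uint32_t);
}

/* Guarded variant: negative values are rejected before the conversion. */
static int
len_ok_guarded(int len)
{
	return len > 0 && (size_t)len >= sizeof(uint32_t);
}
```

With len = -1 only the first variant accepts the value, which matches the huge malloc() size seen above.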

>   sc->sc_levels = malloc(len, M_DEVBUF, M_WAITOK);
>   OF_getpropintarray(faa->fa_node, "brightness-levels",
>   sc->sc_levels, len);
> @@ -107,13 +107,9 @@ pwmbl_attach(struct device *parent, stru
>   sc->sc_def_level = sc->sc_nlevels - 1;
>   sc->sc_def_level = sc->sc_levels[sc->sc_def_level];
>   } else {
> + /* No levels, assume a simple 0..255 ramp. */
>   sc->sc_nlevels = 256;
> - sc->sc_levels = mallocarray(sc->sc_nlevels,
> - sizeof(uint32_t), M_DEVBUF, M_WAITOK);
> - for (i = 0; i < sc->sc_nlevels; i++)
> - sc->sc_levels[i] = i;
> - sc->sc_max_level = sc->sc_levels[sc->sc_nlevels - 1];
> - sc->sc_def_level = sc->sc_levels[sc->sc_nlevels - 1];
> + sc->sc_max_level = sc->sc_def_level = sc->sc_nlevels - 1;
>   }
>  
>   printf("\n");
> @@ -144,17 +140,22 @@ pwmbl_find_brightness(struct pwmbl_softc
>   uint32_t mid;
>   int i;

Might be easier to have a check like:

if (sc->sc_levels == NULL)
return level < sc->sc_nlevels ? level : sc->sc_nlevels - 1;

Then you don't need to indent the whole block.  Makes the diff smaller
and a bit easier to understand?
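
Roughly what I have in mind, as a standalone sketch.  struct ramp_softc and ramp_find_brightness() are hypothetical names that only carry the two fields needed here; the loop body is copied from the existing function, and the casts are only there to keep the comparison unsigned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cut-down softc with just the fields used below. */
struct ramp_softc {
	uint32_t *sc_levels;	/* NULL if simple ramp */
	int sc_nlevels;
};

static uint32_t
ramp_find_brightness(const struct ramp_softc *sc, uint32_t level)
{
	uint32_t mid;
	int i;

	/* Simple ramp: the level is its own value, clamped to the top. */
	if (sc->sc_levels == NULL)
		return level < (uint32_t)sc->sc_nlevels ?
		    level : (uint32_t)sc->sc_nlevels - 1;

	/* Table case: snap to the nearest of two neighbouring entries. */
	for (i = 0; i < sc->sc_nlevels - 1; i++) {
		mid = (sc->sc_levels[i] + sc->sc_levels[i + 1]) / 2;
		if (sc->sc_levels[i] <= level && level <= mid)
			return sc->sc_levels[i];
		if (mid < level && level <= sc->sc_levels[i + 1])
			return sc->sc_levels[i + 1];
	}
	if (level < sc->sc_levels[0])
		return sc->sc_levels[0];
	return sc->sc_levels[i];
}
```

The table loop stays at its original indentation, so the diff only adds the early return.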

Cheers,
Patrick

>  
> - for (i = 0; i < sc->sc_nlevels - 1; i++) {
> - mid = (sc->sc_levels[i] + sc->sc_levels[i + 1]) / 2;
> - if (sc->sc_levels[i] <= level && level <= mid)
> + if (sc->sc_levels) {
> + for (i = 0; i < sc->sc_nlevels - 1; i++) {
> + mid = (sc->sc_levels[i] + sc->sc_levels[i + 1]) / 2;
> + if (sc->sc_levels[i] <= level && level <= mid)
> + return sc->sc_levels[i];
> + if (mid < level && level <= sc->sc_levels[i + 1])
> + return sc->sc_levels[i + 1];
> + }
> + if (level < sc->sc_levels[0])
> + return sc->sc_levels[0];
> + else
>   return sc->sc_levels[i];
> - if (mid < level && level <= sc->sc_levels[i + 1])
> - return sc->sc_levels[i + 1];
> +
> + } else {
> + return level < sc->sc_nlevels ? level : sc->sc_nlevels - 1;
>   }
> - if (level < sc->sc_levels[0])
> - return sc->sc_levels[0];
> - else
> - return sc->sc_levels[i];
>  }
>  
>  int
>

Re: qcpmic.4, qcpmicgpio.4, etc.: Sort SEE ALSO

2022-11-10 Thread Patrick Wildt

There are other drivers with intro after others. Are we sorting those 
alphabetically or by relevancy?

Sent from my iPhone

> On 10.11.2022 at 16:46, Josiah Frentsos wrote:
> 
> Index: qcpmic.4
> ===
> RCS file: /cvs/src/share/man/man4/qcpmic.4,v
> retrieving revision 1.2
> diff -u -p -r1.2 qcpmic.4
> --- qcpmic.410 Nov 2022 13:08:57 -1.2
> +++ qcpmic.410 Nov 2022 16:43:07 -
> @@ -34,12 +34,12 @@ Snapdragon SoCs.
> The functionality for the hardware blocks found in each PMIC is
> implemented in the children attaching to this driver.
> .Sh SEE ALSO
> -.Xr qcspmi 4 ,
> +.Xr intro 4 ,
> .Xr qcpmicgpio 4 ,
> .Xr qcpon 4 ,
> .Xr qcpwm 4 ,
> .Xr qcrtc 4 ,
> -.Xr intro 4
> +.Xr qcspmi 4
> .Sh HISTORY
> The
> .Nm
> Index: qcpmicgpio.4
> ===
> RCS file: /cvs/src/share/man/man4/qcpmicgpio.4,v
> retrieving revision 1.1
> diff -u -p -r1.1 qcpmicgpio.4
> --- qcpmicgpio.410 Nov 2022 12:57:08 -1.1
> +++ qcpmicgpio.410 Nov 2022 16:43:07 -
> @@ -30,8 +30,8 @@ SoCs.
> It does not provide direct device driver entry points but makes its
> functions available to other drivers.
> .Sh SEE ALSO
> -.Xr qcpmic 4 ,
> -.Xr intro 4
> +.Xr intro 4 ,
> +.Xr qcpmic 4
> .Sh HISTORY
> The
> .Nm
> Index: qcpon.4
> ===
> RCS file: /cvs/src/share/man/man4/qcpon.4,v
> retrieving revision 1.1
> diff -u -p -r1.1 qcpon.4
> --- qcpon.410 Nov 2022 13:08:57 -1.1
> +++ qcpon.410 Nov 2022 16:43:07 -
> @@ -26,10 +26,11 @@
> The
> .Nm
> driver provides support for the PON controllers integrated on various
> -Qualcomm Snapdragon SoCs.  This controller contains the power button.
> +Qualcomm Snapdragon SoCs.
> +This controller contains the power button.
> .Sh SEE ALSO
> -.Xr qcpmic 4 ,
> -.Xr intro 4
> +.Xr intro 4 ,
> +.Xr qcpmic 4
> .Sh HISTORY
> The
> .Nm
> Index: qcpwm.4
> ===
> RCS file: /cvs/src/share/man/man4/qcpwm.4,v
> retrieving revision 1.1
> diff -u -p -r1.1 qcpwm.4
> --- qcpwm.410 Nov 2022 13:08:57 -1.1
> +++ qcpwm.410 Nov 2022 16:43:07 -
> @@ -28,8 +28,8 @@ The
> driver provides support for the PWM controllers integrated on various
> Qualcomm Snapdragon SoCs.
> .Sh SEE ALSO
> -.Xr qcpmic 4 ,
> -.Xr intro 4
> +.Xr intro 4 ,
> +.Xr qcpmic 4
> .Sh HISTORY
> The
> .Nm
> Index: qcrtc.4
> ===
> RCS file: /cvs/src/share/man/man4/qcrtc.4,v
> retrieving revision 1.1
> diff -u -p -r1.1 qcrtc.4
> --- qcrtc.410 Nov 2022 13:08:57 -1.1
> +++ qcrtc.410 Nov 2022 16:43:07 -
> @@ -28,8 +28,8 @@ The
> driver provides support for the RTC integrated on various Qualcomm
> Snapdragon SoCs.
> .Sh SEE ALSO
> -.Xr qcpmic 4 ,
> -.Xr intro 4
> +.Xr intro 4 ,
> +.Xr qcpmic 4
> .Sh HISTORY
> The
> .Nm
> Index: qcspmi.4
> ===
> RCS file: /cvs/src/share/man/man4/qcspmi.4,v
> retrieving revision 1.1
> diff -u -p -r1.1 qcspmi.4
> --- qcspmi.410 Nov 2022 12:57:08 -1.1
> +++ qcspmi.410 Nov 2022 16:43:07 -
> @@ -29,8 +29,8 @@ The
> driver provides support for the SPMI controller found on various
> Qualcomm Snapdragon SoCs.
> .Sh SEE ALSO
> -.Xr qcpmic 4 ,
> -.Xr intro 4
> +.Xr intro 4 ,
> +.Xr qcpmic 4
> .Sh HISTORY
> The
> .Nm
>

Re: arm64 pwmbl(4): simplify ramp case

2022-11-11 Thread Patrick Wildt

On Fri, Nov 11, 2022 at 06:48:21AM +, Miod Vallat wrote:
> > This actually breaks my machine.  malloc() is saying allocation too
> > large.  OF_getproplen will return -1 on that.  Is it possible that
> > len is treated as uint64_t as it is an int and sizeof is effectively
> > uint64_t?
> 
> Ah, yes; size_t is unsigned and wider than int on 64-bit platforms,
> therefore int is converted to unsigned for the comparison. Casting
> sizeof to int will do.
> 
> > Might be easier to have a check like:
> > 
> > if (sc->sc_levels == NULL)
> > return level < sc->sc_nlevels ? level : sc->sc_nlevels - 1;
> > 
> > Then you don't need to indent the whole block.  Makes the diff smaller
> > and a bit easier to understand?
> 
> Sure; what about this new version, then?

Works for me, thanks!  ok patrick@

> 
> Index: pwmbl.c
> ===
> RCS file: /OpenBSD/src/sys/dev/fdt/pwmbl.c,v
> retrieving revision 1.6
> diff -u -p -r1.6 pwmbl.c
> --- pwmbl.c   24 Oct 2021 17:52:26 -  1.6
> +++ pwmbl.c   11 Nov 2022 06:46:41 -
> @@ -35,7 +35,7 @@ struct pwmbl_softc {
>   struct device   sc_dev;
>   uint32_t        *sc_pwm;
>   int             sc_pwm_len;
> - uint32_t        *sc_levels;
> + uint32_t        *sc_levels; /* NULL if simple ramp */
>   int             sc_nlevels;
>   uint32_t        sc_max_level;
>   uint32_t        sc_def_level;
> @@ -73,7 +73,7 @@ pwmbl_attach(struct device *parent, stru
>   struct pwmbl_softc *sc = (struct pwmbl_softc *)self;
>   struct fdt_attach_args *faa = aux;
>   uint32_t *gpios;
> - int i, len;
> + int len;
>  
>   len = OF_getproplen(faa->fa_node, "pwms");
>   if (len < 0) {
> @@ -95,7 +95,7 @@ pwmbl_attach(struct device *parent, stru
>   }
>  
>   len = OF_getproplen(faa->fa_node, "brightness-levels");
> - if (len > 0) {
> + if (len >= (int)sizeof(uint32_t)) {
>   sc->sc_levels = malloc(len, M_DEVBUF, M_WAITOK);
>   OF_getpropintarray(faa->fa_node, "brightness-levels",
>   sc->sc_levels, len);
> @@ -107,13 +107,9 @@ pwmbl_attach(struct device *parent, stru
>   sc->sc_def_level = sc->sc_nlevels - 1;
>   sc->sc_def_level = sc->sc_levels[sc->sc_def_level];
>   } else {
> + /* No levels, assume a simple 0..255 ramp. */
>   sc->sc_nlevels = 256;
> - sc->sc_levels = mallocarray(sc->sc_nlevels,
> - sizeof(uint32_t), M_DEVBUF, M_WAITOK);
> - for (i = 0; i < sc->sc_nlevels; i++)
> - sc->sc_levels[i] = i;
> - sc->sc_max_level = sc->sc_levels[sc->sc_nlevels - 1];
> - sc->sc_def_level = sc->sc_levels[sc->sc_nlevels - 1];
> + sc->sc_max_level = sc->sc_def_level = sc->sc_nlevels - 1;
>   }
>  
>   printf("\n");
> @@ -143,6 +139,9 @@ pwmbl_find_brightness(struct pwmbl_softc
>  {
>   uint32_t mid;
>   int i;
> +
> + if (sc->sc_levels == NULL)
> + return level < sc->sc_nlevels ? level : sc->sc_nlevels - 1;
>  
>   for (i = 0; i < sc->sc_nlevels - 1; i++) {
>   mid = (sc->sc_levels[i] + sc->sc_levels[i + 1]) / 2;
>

Re: Some bwfm(4) diffs

2023-10-09 Thread Patrick Wildt

On Sun, Oct 08, 2023 at 07:42:54PM +0200, Mark Kettenis wrote:
> Hector Martin has added support for the BCM4388 that is found on the
> last generation of Apple Macs.  Based on his commits I've managed to
> get it working on my M2 Pro mini.  I still have to clean up some of
> that stuff, but here is a first batch of two diffs.
> 
> The changes to dev/ic/bwfm.c correspond to:
> 
> https://github.com/AsahiLinux/linux/commit/81e3cc7bec8b9d9c436f63662d8fcfda4f637807

The changes here look good, ok patrick@ for this part

> The changes to dev/pci/if_bwfm_pci.c correspond to:
> 
> https://github.com/AsahiLinux/linux/commit/8190add8671fc49c12d04b5ac8fced70f835e69f
> 
> Both changes seem to be a good idea and potentially affect other chips
> as well.  So if you have a machine with bwfm(4), please test.

This scares me a little, I'm gonna find a machine and put a bwfm(4) in
it to test.

> ok?
> 
> 
> Index: dev/ic/bwfm.c
> ===
> RCS file: /cvs/src/sys/dev/ic/bwfm.c,v
> retrieving revision 1.109
> diff -u -p -r1.109 bwfm.c
> --- dev/ic/bwfm.c 28 Mar 2023 14:01:42 -  1.109
> +++ dev/ic/bwfm.c 8 Oct 2023 17:29:35 -
> @@ -1089,15 +1089,9 @@ void
>  bwfm_chip_ai_reset(struct bwfm_softc *sc, struct bwfm_core *core,
>  uint32_t prereset, uint32_t reset, uint32_t postreset)
>  {
> - struct bwfm_core *core2 = NULL;
>   int i;
>  
> - if (core->co_id == BWFM_AGENT_CORE_80211)
> - core2 = bwfm_chip_get_core_idx(sc, BWFM_AGENT_CORE_80211, 1);
> -
>   bwfm_chip_ai_disable(sc, core, prereset, reset);
> - if (core2)
> - bwfm_chip_ai_disable(sc, core2, prereset, reset);
>  
>   for (i = 50; i > 0; i--) {
>   if ((sc->sc_buscore_ops->bc_read(sc,
> @@ -1110,32 +1104,12 @@ bwfm_chip_ai_reset(struct bwfm_softc *sc
>   }
>   if (i == 0)
>   printf("%s: timeout on core reset\n", DEVNAME(sc));
> - if (core2) {
> - for (i = 50; i > 0; i--) {
> - if ((sc->sc_buscore_ops->bc_read(sc,
> - core2->co_wrapbase + BWFM_AGENT_RESET_CTL) &
> - BWFM_AGENT_RESET_CTL_RESET) == 0)
> - break;
> - sc->sc_buscore_ops->bc_write(sc,
> - core2->co_wrapbase + BWFM_AGENT_RESET_CTL, 0);
> - delay(60);
> - }
> - if (i == 0)
> - printf("%s: timeout on core reset\n", DEVNAME(sc));
> - }
>  
>   sc->sc_buscore_ops->bc_write(sc,
>   core->co_wrapbase + BWFM_AGENT_IOCTL,
>   postreset | BWFM_AGENT_IOCTL_CLK);
>   sc->sc_buscore_ops->bc_read(sc,
>   core->co_wrapbase + BWFM_AGENT_IOCTL);
> - if (core2) {
> - sc->sc_buscore_ops->bc_write(sc,
> - core2->co_wrapbase + BWFM_AGENT_IOCTL,
> - postreset | BWFM_AGENT_IOCTL_CLK);
> - sc->sc_buscore_ops->bc_read(sc,
> - core2->co_wrapbase + BWFM_AGENT_IOCTL);
> - }
>  }
>  
>  void
> @@ -1338,6 +1312,7 @@ bwfm_chip_ca7_set_passive(struct bwfm_so
>  {
>   struct bwfm_core *core;
>   uint32_t val;
> + int i = 0;
>  
>   core = bwfm_chip_get_core(sc, BWFM_AGENT_CORE_ARM_CA7);
>   val = sc->sc_buscore_ops->bc_read(sc,
> @@ -1347,10 +1322,11 @@ bwfm_chip_ca7_set_passive(struct bwfm_so
>   BWFM_AGENT_IOCTL_ARMCR4_CPUHALT,
>   BWFM_AGENT_IOCTL_ARMCR4_CPUHALT);
>  
> - core = bwfm_chip_get_core(sc, BWFM_AGENT_CORE_80211);
> - sc->sc_chip.ch_core_reset(sc, core, BWFM_AGENT_D11_IOCTL_PHYRESET |
> - BWFM_AGENT_D11_IOCTL_PHYCLOCKEN, BWFM_AGENT_D11_IOCTL_PHYCLOCKEN,
> - BWFM_AGENT_D11_IOCTL_PHYCLOCKEN);
> + while ((core = bwfm_chip_get_core_idx(sc, BWFM_AGENT_CORE_80211, i++)))
> + sc->sc_chip.ch_core_disable(sc, core,
> + BWFM_AGENT_D11_IOCTL_PHYRESET |
> + BWFM_AGENT_D11_IOCTL_PHYCLOCKEN,
> + BWFM_AGENT_D11_IOCTL_PHYCLOCKEN);
>  }
>  
>  int
> Index: dev/pci/if_bwfm_pci.c
> ===
> RCS file: /cvs/src/sys/dev/pci/if_bwfm_pci.c,v
> retrieving revision 1.75
> diff -u -p -r1.75 if_bwfm_pci.c
> --- dev/pci/if_bwfm_pci.c 30 Dec 2022 14:10:17 -  1.75
> +++ dev/pci/if_bwfm_pci.c 8 Oct 2023 17:29:35 -
> @@ -134,6 +134,10 @@ struct bwfm_pci_softc {
>   bus_space_handle_t   sc_reg_ioh;
>   bus_size_t   sc_reg_ios;
>  
> + bus_space_tag_t  sc_pcie_iot;
> + bus_space_handle_t   sc_pcie_ioh;
> + bus_size_t   sc_pcie_ios;
> +
>   bus_space_tag_t  sc_tcm_iot;
>   bus_space_handle_t   sc_tcm_ioh;
>   bus_size_t   sc_tcm_ios;
> @@ -379,6 +383,10 @@ bwfm_pci_attach(struct device *parent, s
>   goto bar1;
>   }
>  
> + sc->sc_pcie_iot = sc->sc_reg_i

com(4) at pci

2020-03-03 Thread Patrick Wildt

Hi,

I would like to use the serial ports on the Apollo Lake machine on
my desk.  My first attempt was adjusting puc(4), but then I realized
that this would bloat puc(4) too much.

Intel has been removing legacy I/O ports on recent machines.  I think
that started with the Skylake PCH, which added the LPSS (Low Power
Subsystem).  Now the LPSS is used for three different kinds of
controllers:  I2C, UART, SPI.

Each LPSS controller contains the actual device, and some registers to
control clocks, resets, etc.  These private registers need to be saved
and restored upon suspend/resume.  Also we should read the current
clock settings to calculate the frequency supplied to the device.
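
As a rough idea of the frequency calculation, here is a sketch that decodes the M and N fields of the LPSS clock register, using the shifts and masks from the defines in the diff.  Both the formula (base * M / N, i.e. a fractional divider) and the base clock passed in by the caller are assumptions made for illustration; neither is spelled out in this mail.  The function name is hypothetical.

```c
#include <stdint.h>

/* Field layout as defined in the diff. */
#define LPSS_CLK_MDIV_SHIFT	1
#define LPSS_CLK_MDIV_MASK	0x3fff
#define LPSS_CLK_NDIV_SHIFT	16
#define LPSS_CLK_NDIV_MASK	0x3fff

/*
 * Assumption: the UART input clock is base_hz * M / N.  If either
 * field reads as zero, fall back to the undivided base clock.
 */
static uint64_t
sketch_lpss_freq(uint32_t clk, uint64_t base_hz)
{
	uint32_t m = (clk >> LPSS_CLK_MDIV_SHIFT) & LPSS_CLK_MDIV_MASK;
	uint32_t n = (clk >> LPSS_CLK_NDIV_SHIFT) & LPSS_CLK_NDIV_MASK;

	if (m == 0 || n == 0)
		return base_hz;

	return base_hz * m / n;
}
```

The resulting value is what com(4) would need as its input frequency to compute baud rate divisors.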

So the dance for the LPSS is the same for all three types of
controllers, and not surprisingly our dwiic@pci attachment driver already
implements a less sophisticated version of what is in the attached
diff.  Anyway, dwiic@pci is also only for the Intel LPSS-based I2C.

The UART is also the Synopsys Designware version, which means we can
have com(4) attach to it with the same options as we use on all those
ARM boards with a Designware based com(4).

So my second attempt was writing dwuart(4) with com(4) attaching at
dwuart(4).  kettenis@ argued it might be better to just implement it
as com@pci, which is the third attempt that you can read here.

Feedback?

Patrick

diff --git a/sys/arch/amd64/conf/GENERIC b/sys/arch/amd64/conf/GENERIC
index e64388b5815..7312452c046 100644
--- a/sys/arch/amd64/conf/GENERIC
+++ b/sys/arch/amd64/conf/GENERIC
@@ -338,6 +338,7 @@ bwfm*   at uhub?# Broadcom FullMAC
 
 puc*   at pci? # PCI "universal" communication device
 com*   at cardbus?
+com*   at pci?
 
 sdhc*  at pci? # SD Host Controller
 sdmmc* at sdhc?# SD/MMC bus
diff --git a/sys/dev/pci/com_pci.c b/sys/dev/pci/com_pci.c
new file mode 100644
index 000..cfb6e238bea
--- /dev/null
+++ b/sys/dev/pci/com_pci.c
@@ -0,0 +1,242 @@
+/* $OpenBSD$ */
+/*
+ * Copyright (c) 2020 Patrick Wildt 
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#define com_usr 31 /* Synopsys DesignWare UART */
+
+/* Intel Low Power Subsystem */
+#define LPSS_CLK   0x200
+#define  LPSS_CLK_GATE (1 << 0)
+#define  LPSS_CLK_MDIV_SHIFT   1
+#define  LPSS_CLK_MDIV_MASK0x3fff
+#define  LPSS_CLK_NDIV_SHIFT   16
+#define  LPSS_CLK_NDIV_MASK0x3fff
+#define  LPSS_CLK_UPDATE   (1U << 31)
+#define LPSS_RESETS0x204
+#define  LPSS_RESETS_FUNC  (3 << 0)
+#define  LPSS_RESETS_IDMA  (1 << 2)
+#define LPSS_ACTIVELTR 0x210
+#define LPSS_IDLELTR   0x214
+#define  LPSS_LTR_VALUE_MASK   (0x3ff << 0)
+#define  LPSS_LTR_SCALE_MASK   (0x3 << 10)
+#define  LPSS_LTR_SCALE_1US(2 << 10)
+#define  LPSS_LTR_SCALE_32US   (3 << 10)
+#define  LPSS_LTR_REQ  (1 << 15)
+#define LPSS_SSP   0x220
+#define  LPSS_SSP_DIS_DMA_FIN  (1 << 0)
+#define LPSS_REMAP_ADDR0x240
+#define LPSS_CAPS  0x2fc
+#define  LPSS_CAPS_TYPE_I2C(0 << 4)
+#define  LPSS_CAPS_TYPE_UART   (1 << 4)
+#define  LPSS_CAPS_TYPE_SPI(2 << 4)
+#define  LPSS_CAPS_TYPE_MASK   (0xf << 4)
+#define  LPSS_CAPS_NO_IDMA (1 << 8)
+
+#define LPSS_REG_OFF   0x200
+#define LPSS_REG_SIZE  0x100
+#define LPSS_REG_NUM   (LPSS_REG_SIZE / sizeof(uint32_t))
+
+#define HREAD4(sc, reg)                                                \
+   (bus_space_read_4((sc)->sc.sc_iot, (sc)->sc.sc_ioh, (reg)))
+#define HWRITE4(sc, reg, val)  \
+   bus_space_write_4((sc)->sc.sc_iot, (sc)->sc.sc_ioh, (reg), (val))
+#define HSET4(sc, reg, bits)   \
+   HWRITE4((sc), (reg), HREAD4((sc), (reg)) | (bits))
+#define HCLR4(sc, reg, bits)

Re: New bwfm device (bcm43341)

2020-03-13 Thread Patrick Wildt

On Fri, Mar 13, 2020 at 02:49:33PM +0100, Rob Schmersel wrote:
> Hi,
> 
> I got a new toy to play with recently which uses a bcm43341 wireless
> chipset. This chipset is the same as the bcm43340 apart from an
> additional nfc mode and can re-use the brcmfmac43340-sdio.bin firmware.
> (I simply added a symbolic link from firmware supplied in the 
> bwfm-firmware package)
> 
> In order to get the chipset recognized however I needed to make some
> changes as attached

Thanks for the diff!  Since Linux uses the same firmware (43340) for
both 43340 and 43341, I committed the diff slightly differently to
follow their scheme.  Thus there's no need for a symbolic link.
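
In other words, both chip IDs map to the same firmware name.  A minimal sketch of that mapping, with a hypothetical function name; the real switch in bwfm_sdio_preinit() handles many more chips:

```c
#include <stddef.h>

#define BRCM_CC_43340_CHIP_ID	43340
#define BRCM_CC_43341_CHIP_ID	43341

/* Returns the chip part of the firmware name, or NULL if unknown. */
static const char *
sketch_fw_chip_name(int chip_id)
{
	switch (chip_id) {
	case BRCM_CC_43340_CHIP_ID:
	case BRCM_CC_43341_CHIP_ID:
		/* 43341 reuses the 43340 firmware. */
		return "43340";
	default:
		return NULL;
	}
}
```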

Patrick

Index: ic/bwfmvar.h
===
RCS file: /cvs/src/sys/dev/ic/bwfmvar.h,v
retrieving revision 1.18
diff -u -p -u -r1.18 bwfmvar.h
--- ic/bwfmvar.h6 Mar 2020 08:41:57 -   1.18
+++ ic/bwfmvar.h13 Mar 2020 15:28:04 -
@@ -27,6 +27,7 @@
 #define BRCM_CC_4330_CHIP_ID   0x4330
 #define BRCM_CC_4334_CHIP_ID   0x4334
 #define BRCM_CC_43340_CHIP_ID  43340
+#define BRCM_CC_43341_CHIP_ID  43341
 #define BRCM_CC_43362_CHIP_ID  43362
 #define BRCM_CC_4335_CHIP_ID   0x4335
 #define BRCM_CC_4339_CHIP_ID   0x4339
Index: sdmmc/if_bwfm_sdio.c
===
RCS file: /cvs/src/sys/dev/sdmmc/if_bwfm_sdio.c,v
retrieving revision 1.33
diff -u -p -u -r1.33 if_bwfm_sdio.c
--- sdmmc/if_bwfm_sdio.c7 Mar 2020 09:56:46 -   1.33
+++ sdmmc/if_bwfm_sdio.c13 Mar 2020 15:28:04 -
@@ -372,6 +372,7 @@ bwfm_sdio_preinit(struct bwfm_softc *bwf
chip = "43455";
break;
case BRCM_CC_43340_CHIP_ID:
+   case BRCM_CC_43341_CHIP_ID:
chip = "43340";
break;
case BRCM_CC_4335_CHIP_ID:

Re: New bwfm device (bcm43341)

2020-03-13 Thread Patrick Wildt

On Fri, Mar 13, 2020 at 05:54:01PM +0100, Rob Schmersel wrote:
> On Fri, 13 Mar 2020 16:31:18 +0100
> Patrick Wildt  wrote:
> > Thanks for the diff!  Since Linux uses the same firmware (43340) for
> > both 43340 and 43341, I committed the diff slightly differently to
> > follow their scheme.  Thus there's no need for a symbolic link.
> > 
> > Patrick
> 
> Reason I had it as a separate identifier is that the NVRAM file I
> found for the 43341 chip was different compared to the 43340 chip
> Might be a slump 
> 
> BR/Rob

Yeah, but the NVRAM is highly specific anyway, as I explained in your
NVRAM thread on misc. :)

Patrick

Re: Fix brightness control on ASUS 1005PXD

2020-03-16 Thread Patrick Wildt

On Sat, Mar 14, 2020 at 04:28:26AM +0100, Alexandre Ratchov wrote:
> On ASUS 1001PXD, _BQC returns an out of range value which makes
> acpivout_get_brightness() return -1, in turn breaking the
> display.brightness control (wsconsctl displays a mangled value).
> 
> This diff ignores the out of range value and makes the brightness
> control just work again.
> 
> OK?

With the current code _BQC is only called on attach, where we
probably want a value higher than the lowest one.  I wonder if
maybe we should move this check into the attach functions, and
in case of an error like this, set a reasonable brightness?

Otherwise, if we want to do that in acpivout_get_brightness(),
I guess we can update acpivout_select_brightness() and its caller
to remove the check for -1, since there will be no -1 anymore?
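
To make the second option concrete, a standalone sketch of the clamping from the diff.  The function name is hypothetical, and it assumes the level table is sorted in ascending order with at least one entry.

```c
/*
 * Clamp a reported level into the range covered by the table.
 * Assumes bcl[] is sorted ascending and bcl_len >= 1.
 */
static int
sketch_clamp_level(int level, const int *bcl, int bcl_len)
{
	if (level < bcl[0])
		return bcl[0];
	if (level > bcl[bcl_len - 1])
		return bcl[bcl_len - 1];
	return level;
}
```

Since this never yields -1, a caller-side check for -1 would be dead code.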

> Index: acpivout.c
> ===
> RCS file: /cvs/src/sys/dev/acpi/acpivout.c,v
> retrieving revision 1.19
> diff -u -p -r1.19 acpivout.c
> --- acpivout.c8 Feb 2020 19:08:17 -   1.19
> +++ acpivout.c14 Mar 2020 03:19:02 -
> @@ -227,8 +227,10 @@ acpivout_get_brightness(struct acpivout_
>   aml_freevalue(&res);
>   DPRINTF(("%s: BQC = %d\n", DEVNAME(sc), level));
>  
> - if (level < sc->sc_bcl[0] || level > sc->sc_bcl[sc->sc_bcl_len -1])
> - level = -1;
> + if (level < sc->sc_bcl[0])
> + level = sc->sc_bcl[0];
> + else if (level > sc->sc_bcl[sc->sc_bcl_len - 1])
> + level = sc->sc_bcl[sc->sc_bcl_len - 1];
>  
>   return (level);
>  }
>

usb(4): use cacheable buffers for data transfers (massive speedup)

2020-03-18 Thread Patrick Wildt

Hi,

I've spent a few days investigating why USB ethernet adapters are so
horribly slow on my ARMs.  Using dt(4) I realized that it was spending
most of its time in memcpy.  But, why?  As it turns out, all USB data
buffers are mapped COHERENT, which on some/most ARMs means uncached.
Using cached data buffers makes the performance rise from 20 mbit/s to
200 mbit/s.  Quite a difference.

sys/dev/usb/usb_mem.c:
error = bus_dmamem_map(tag, p->segs, p->nsegs, p->size,
   &p->kaddr, BUS_DMA_NOWAIT|BUS_DMA_COHERENT);

On x86, COHERENT is essentially a no-op.  On ARM, it depends on the SoC.
Some SoCs have cache-coherent USB controllers, some don't.  Mine does
not, so mapping it COHERENT means uncached and thus slow.

Why do we do that?  Well, when the code was imported in 1999, it was
already there.  Since then we have gained infrastructure for DMA
syncs in the USB stack, which I think are proper.

sys/dev/usb/usbdi.c - usbd_transfer() (before transfer)

if (!usbd_xfer_isread(xfer)) {
if ((xfer->flags & USBD_NO_COPY) == 0)
memcpy(KERNADDR(&xfer->dmabuf, 0), xfer->buffer,
xfer->length);
usb_syncmem(&xfer->dmabuf, 0, xfer->length,
BUS_DMASYNC_PREWRITE);
} else
usb_syncmem(&xfer->dmabuf, 0, xfer->length,
BUS_DMASYNC_PREREAD);
err = pipe->methods->transfer(xfer);

sys/dev/usb/usbdi.c - usb_transfer_complete() (after transfer)

if (xfer->actlen != 0) {
if (usbd_xfer_isread(xfer)) {
usb_syncmem(&xfer->dmabuf, 0, xfer->actlen,
BUS_DMASYNC_POSTREAD);
if (!(xfer->flags & USBD_NO_COPY))
memcpy(xfer->buffer, KERNADDR(&xfer->dmabuf, 0),
xfer->actlen);
} else
usb_syncmem(&xfer->dmabuf, 0, xfer->actlen,
BUS_DMASYNC_POSTWRITE);
}

We cannot just remove COHERENT, since some drivers, like ehci(4), use
the same backend to allocate their rings.  And I can't vouch for those
drivers' sanity.

As a first step, I would like to go ahead with another solution, which
is based on a diff from Marius Strobl, who added those syncs in the
first place.  Essentially it splits the memory handling into cacheable
and non-cacheable blocks.  The USB data transfers and everyone who uses
usbd_alloc_buffer() then use cacheable buffers, while code like ehci(4)
still doesn't.  This is a bit of a safer approach imho, since we don't
hurt the controller drivers, but speed up the data buffers.
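
The core of the split is that a free block may only be reused for a request with the same cacheability.  Here is a standalone sketch of that lookup, using a plain singly linked list instead of the LIST_* macros; struct sketch_blk and sketch_freelist_take() are hypothetical names, and the DMA tag comparison is left out.

```c
#include <stddef.h>

/* Hypothetical cut-down version of a DMA block descriptor. */
struct sketch_blk {
	size_t size;
	size_t align;
	int cacheable;
	struct sketch_blk *next;
};

/*
 * Unlink and return the first free block that is large enough,
 * sufficiently aligned and of the requested cacheability, or NULL.
 */
static struct sketch_blk *
sketch_freelist_take(struct sketch_blk **head, size_t size, size_t align,
    int cacheable)
{
	struct sketch_blk **pp;

	for (pp = head; *pp != NULL; pp = &(*pp)->next) {
		struct sketch_blk *p = *pp;

		if (p->size >= size && p->align >= align &&
		    p->cacheable == cacheable) {
			*pp = p->next;
			p->next = NULL;
			return p;
		}
	}
	return NULL;
}
```

Without the cacheable comparison a ring allocation could pick up a block that was mapped cacheable for a data buffer, which is exactly what we want to avoid.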

Once we have verified that there are no regressions, we can adjust
ehci(4) and the like, add proper syncs, make sure they still work as
well as before, and maybe then back this out again.

Note that this is all a no-op on x86, but all the other archs will
profit from this.

ok?

Patrick

diff --git a/sys/dev/usb/usb_mem.c b/sys/dev/usb/usb_mem.c
index c65906b43f4..95993093b5a 100644
--- a/sys/dev/usb/usb_mem.c
+++ b/sys/dev/usb/usb_mem.c
@@ -72,7 +72,7 @@ struct usb_frag_dma {
 };
 
 usbd_status    usb_block_allocmem(bus_dma_tag_t, size_t, size_t,
-   struct usb_dma_block **);
+   struct usb_dma_block **, int);
 void   usb_block_freemem(struct usb_dma_block *);
 
 LIST_HEAD(, usb_dma_block) usb_blk_freelist =
@@ -84,7 +84,7 @@ LIST_HEAD(, usb_frag_dma) usb_frag_freelist =
 
 usbd_status
 usb_block_allocmem(bus_dma_tag_t tag, size_t size, size_t align,
-struct usb_dma_block **dmap)
+struct usb_dma_block **dmap, int cacheable)
 {
int error;
 struct usb_dma_block *p;
@@ -96,7 +96,8 @@ usb_block_allocmem(bus_dma_tag_t tag, size_t size, size_t align,
s = splusb();
/* First check the free list. */
for (p = LIST_FIRST(&usb_blk_freelist); p; p = LIST_NEXT(p, next)) {
-   if (p->tag == tag && p->size >= size && p->align >= align) {
+   if (p->tag == tag && p->size >= size && p->align >= align &&
+   p->cacheable == cacheable) {
LIST_REMOVE(p, next);
usb_blk_nfree--;
splx(s);
@@ -116,6 +117,7 @@ usb_block_allocmem(bus_dma_tag_t tag, size_t size, size_t align,
p->tag = tag;
p->size = size;
p->align = align;
+   p->cacheable = cacheable;
error = bus_dmamem_alloc(tag, p->size, align, 0,
 p->segs, nitems(p->segs),
 &p->nsegs, BUS_DMA_NOWAIT);
@@ -123,7 +125,8 @@ usb_block_allocmem(bus_dma_tag_t tag, size_t size, size_t align,
goto free0;
 
error = bus_dmamem_map(tag, p->segs, p->nsegs, p->size,
-  &p->kaddr, BUS_DMA_NOWAIT|BUS_DMA_COHERENT);
+  &p->kaddr, BUS_DMA_NOWAIT |

Re: usb(4): use cacheable buffers for data transfers (massive speedup)

2020-03-18 Thread Patrick Wildt

On Wed, Mar 18, 2020 at 11:22:40AM +0100, Patrick Wildt wrote:
> Hi,
> 
> I've spent a few days investigating why USB ethernet adapters are so
> horribly slow on my ARMs.  Using dt(4) I realized that it was spending
> most of its time in memcpy.  But, why?  As it turns out, all USB data
> buffers are mapped COHERENT, which on some/most ARMs means uncached.
> Using cached data buffers makes the performance rise from 20 mbit/s to
> 200 mbit/s.  Quite a difference.
> 
> sys/dev/usb/usb_mem.c:
>   error = bus_dmamem_map(tag, p->segs, p->nsegs, p->size,
>  &p->kaddr, BUS_DMA_NOWAIT|BUS_DMA_COHERENT);
> 
> On x86, COHERENT is essentially a no-op.  On ARM, it depends on the SoC.
> Some SoCs have cache-coherent USB controllers, some don't.  Mine does
> not, so mapping it COHERENT means uncached and thus slow.
> 
> Why do we do that?  Well, when the code was imported in 99, it was
> already there.  Since then we have gained infrastructure for DMA
> syncs in the USB stack, which I think are proper.
> 
> sys/dev/usb/usbdi.c - usbd_transfer() (before transfer)
> 
>   if (!usbd_xfer_isread(xfer)) {
>   if ((xfer->flags & USBD_NO_COPY) == 0)
>   memcpy(KERNADDR(&xfer->dmabuf, 0), xfer->buffer,
>   xfer->length);
>   usb_syncmem(&xfer->dmabuf, 0, xfer->length,
>   BUS_DMASYNC_PREWRITE);
>   } else
>   usb_syncmem(&xfer->dmabuf, 0, xfer->length,
>   BUS_DMASYNC_PREREAD);
>   err = pipe->methods->transfer(xfer);
> 
> sys/dev/usb/usbdi.c - usb_transfer_complete() (after transfer)
> 
>   if (xfer->actlen != 0) {
>   if (usbd_xfer_isread(xfer)) {
>   usb_syncmem(&xfer->dmabuf, 0, xfer->actlen,
>   BUS_DMASYNC_POSTREAD);
>   if (!(xfer->flags & USBD_NO_COPY))
>   memcpy(xfer->buffer, KERNADDR(&xfer->dmabuf, 0),
>   xfer->actlen);
>   } else
>   usb_syncmem(&xfer->dmabuf, 0, xfer->actlen,
>   BUS_DMASYNC_POSTWRITE);
>   }
> 
> We cannot just remove COHERENT, since some drivers, like ehci(4), use
> the same backend to allocate their rings.  And I can't vouch for those
> drivers' sanity.
> 
> As a first step, I would like to go ahead with another solution, which
> is based on a diff from Marius Strobl, who added those syncs in the
> first place.  Essentially it splits the memory handling into cacheable
> and non-cacheable blocks.  The USB data transfers and everyone who uses
> usbd_alloc_buffer() then use cacheable buffers, while code like ehci(4)
> still doesn't.  This is a bit of a safer approach imho, since we don't
> hurt the controller drivers, but speed up the data buffers.
> 
> Once we have verified that there are no regressions, we can adjust
> ehci(4) and the like, add proper syncs, make sure they still work as
> well as before, and maybe then back this out again.
> 
> Note that this is all a no-op on x86, but all the other archs will
> profit from this.
> 
> ok?
> 
> Patrick

Updated diff with inverted logic.  kettenis@ argues that we should
invert the logic: those who need COHERENT memory should ask for it
explicitly, since it also has to be passed explicitly to
bus_dmamem_map().  This also points out all the users that call
usb_allocmem() internally, where we might want to check whether
COHERENT is actually needed, or whether the code can be refactored
in another way.
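
To illustrate the inverted logic, here is a minimal sketch, not the
actual usb_mem.c code.  All SK_* names and struct sk_block are
hypothetical stand-ins (for BUS_DMA_NOWAIT, BUS_DMA_COHERENT and
USB_DMA_COHERENT).  It only shows the idea: buffers are cacheable by
default, a caller opts in to a coherent mapping, and a free-list block
is only reused when its coherency matches the request.

```c
#include <stddef.h>

/* Hypothetical stand-ins for BUS_DMA_NOWAIT and BUS_DMA_COHERENT. */
#define SK_MAP_NOWAIT	0x01
#define SK_MAP_COHERENT	0x02
/* Hypothetical stand-in for the caller-visible USB_DMA_COHERENT flag. */
#define SK_USB_COHERENT	0x01

/* Reduced model of a cached DMA block on the free list. */
struct sk_block {
	size_t	size;
	size_t	align;
	int	coherent;	/* 1 if the block was mapped coherent */
};

/* Translate the caller's request into mapping flags: cacheable by default. */
static int
sk_map_flags(int usb_flags)
{
	int flags = SK_MAP_NOWAIT;

	if (usb_flags & SK_USB_COHERENT)
		flags |= SK_MAP_COHERENT;
	return flags;
}

/* A free block may only be reused if its coherency matches the request. */
static int
sk_block_matches(const struct sk_block *b, size_t size, size_t align,
    int usb_flags)
{
	int coherent = (usb_flags & SK_USB_COHERENT) != 0;

	return b->size >= size && b->align >= align &&
	    b->coherent == coherent;
}
```

With this shape a data buffer request passes no flag and gets a cached
mapping, while a controller ring sets the flag before allocating.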

diff --git a/sys/dev/usb/dwc2/dwc2.c b/sys/dev/usb/dwc2/dwc2.c
index 6f035467213..099dfa26da1 100644
--- a/sys/dev/usb/dwc2/dwc2.c
+++ b/sys/dev/usb/dwc2/dwc2.c
@@ -473,6 +473,7 @@ dwc2_open(struct usbd_pipe *pipe)
switch (xfertype) {
case UE_CONTROL:
pipe->methods = &dwc2_device_ctrl_methods;
+   dpipe->req_dma.flags |= USB_DMA_COHERENT;
err = usb_allocmem(&sc->sc_bus, sizeof(usb_device_request_t),
0, &dpipe->req_dma);
if (err)
diff --git a/sys/dev/usb/dwc2/dwc2_hcd.c b/sys/dev/usb/dwc2/dwc2_hcd.c
index 7e5c91481d5..d44e3196e61 100644
--- a/sys/dev/usb/dwc2/dwc2_hcd.c
+++ b/sys/dev/usb/dwc2/dwc2_hcd.c
@@ -679,6 +679,7 @@ STATIC int dwc2_hc_setup_align_buf(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 
qh->dw_align_buf = NULL;
qh->dw_align_buf_dma = 0;
+   qh->dw_align_buf_usbdma.flags |= USB_DMA_COHERENT;
err = usb_allocmem(&hsotg->hsotg_sc->sc_bus, buf_si

Re: usb(4): use cacheable buffers for data transfers (massive speedup)

2020-04-01 Thread Patrick Wildt

On Wed, Apr 01, 2020 at 04:47:10PM +1100, Jonathan Gray wrote:
> On Wed, Apr 01, 2020 at 12:58:23PM +1100, Jonathan Gray wrote:
> > On Wed, Mar 18, 2020 at 01:41:06PM +0100, Patrick Wildt wrote:
> > > On Wed, Mar 18, 2020 at 11:22:40AM +0100, Patrick Wildt wrote:

Re: usb(4): use cacheable buffers for data transfers (massive speedup)

2020-04-01 Thread Patrick Wildt

On Wed, Apr 01, 2020 at 09:22:07AM +0200, Patrick Wildt wrote:
> On Wed, Apr 01, 2020 at 04:47:10PM +1100, Jonathan Gray wrote:
> > On Wed, Apr 01, 2020 at 12:58:23PM +1100, Jonathan Gray wrote:
> > > On Wed, Mar 18, 2020 at 01:41:06PM +0100, Patrick Wildt wrote:
> > > > On Wed, Mar 18, 2020 at 11:22:40AM +0100, Patrick Wildt wrote:

Re: usb(4): use cacheable buffers for data transfers (massive speedup)

2020-04-01 Thread Patrick Wildt

On Wed, Apr 01, 2020 at 09:40:06AM +0200, Patrick Wildt wrote:
> On Wed, Apr 01, 2020 at 09:22:07AM +0200, Patrick Wildt wrote:
> > On Wed, Apr 01, 2020 at 04:47:10PM +1100, Jonathan Gray wrote:
> > > On Wed, Apr 01, 2020 at 12:58:23PM +1100, Jonathan Gray wrote:
> > > > On Wed, Mar 18, 2020 at 01:41:06PM +0100, Patrick Wildt wrote:
> > > > > On Wed, Mar 18, 2020 at 11:22:40AM +0100, Patrick Wildt wrote:

Re: usb(4): use cacheable buffers for data transfers (massive speedup)

2020-04-01 Thread Patrick Wildt

On Wed, Apr 01, 2020 at 12:04:25PM +0200, Patrick Wildt wrote:
> On Wed, Apr 01, 2020 at 09:40:06AM +0200, Patrick Wildt wrote:
> > On Wed, Apr 01, 2020 at 09:22:07AM +0200, Patrick Wildt wrote:
> > > On Wed, Apr 01, 2020 at 04:47:10PM +1100, Jonathan Gray wrote:
> > > > On Wed, Apr 01, 2020 at 12:58:23PM +1100, Jonathan Gray wrote:
> > > > > On Wed, Mar 18, 2020 at 01:41:06PM +0100, Patrick Wildt wrote:
> > > > > > On Wed, Mar 18, 2020 at 11:22:40AM +0100, Patrick Wildt wrote:

Re: usb(4): use cacheable buffers for data transfers (massive speedup)

2020-04-02 Thread Patrick Wildt

On Wed, Apr 01, 2020 at 12:23:53PM +0200, Patrick Wildt wrote:
> On Wed, Apr 01, 2020 at 12:04:25PM +0200, Patrick Wildt wrote:
> > On Wed, Apr 01, 2020 at 09:40:06AM +0200, Patrick Wildt wrote:
> > > On Wed, Apr 01, 2020 at 09:22:07AM +0200, Patrick Wildt wrote:
> > > > On Wed, Apr 01, 2020 at 04:47:10PM +1100, Jonathan Gray wrote:
> > > > > On Wed, Apr 01, 2020 at 12:58:23PM +1100, Jonathan Gray wrote:
> > > > > > On Wed, Mar 18, 2020 at 01:41:06PM +0100, Patrick Wildt wrote:
> > > > > > > On Wed, Mar 18, 2020 at 11:22:40AM +0100, Patrick Wildt wrote:

coherent em(4) descriptor rings (to be able to run on my arm64 machine)

2020-04-26 Thread Patrick Wildt

Hi,

I have a HummingBoard Pulse, which is an NXP i.MX8MQ based board
featuring two ethernets, with the second ethernet being an em(4)
on a PCIe controller.

I had trouble getting it to work, but realized that the issue is the
descriptor ring coherency.  I looked into the code, tried to find if
there are incorrect flushes, or something else, but nothing worked.

Looking at the Linux driver I realized that their descriptor rings are
allocated coherent.  Some arm64 machines are fine with em(4), since the
PCIe controller is coherent, but on my machine it is not.  Explicitly
mapping the rings coherent made my machine happy, so I believe that,
given the way em(4) works, we need to make sure the rings are coherent.

So I'd propose the following diff, which *only* makes the descriptor
rings coherent; the packets stay cached and fast.  This allows me to
push plenty of traffic through my machine!

This is a no-op on all x86, and on arm64-machines with coherent PCIe
controllers.

Opinions? ok?

Patrick

diff --git a/sys/dev/pci/if_em.c b/sys/dev/pci/if_em.c
index aca6b4bb02f..27d0630bf9f 100644
--- a/sys/dev/pci/if_em.c
+++ b/sys/dev/pci/if_em.c
@@ -2113,7 +2113,7 @@ em_dma_malloc(struct em_softc *sc, bus_size_t size, struct em_dma_alloc *dma)
goto destroy;
 
r = bus_dmamem_map(sc->sc_dmat, &dma->dma_seg, dma->dma_nseg, size,
-   &dma->dma_vaddr, BUS_DMA_WAITOK);
+   &dma->dma_vaddr, BUS_DMA_WAITOK | BUS_DMA_COHERENT);
if (r != 0)
goto free;

Re: [patch] Check for -1 explicitly in getpeereid.c

2020-04-26 Thread Patrick Wildt

Hi,

I don't know userland very well, so I have a question.  In the middle of
2019 there were plenty of changes that turned checks of syscall return
values from < 0 into a stricter == -1, like this one in isakmpd:

revision 1.26
date: 2019/06/28 13:32:44;  author: deraadt;  state: Exp;  lines: +2 -2;  commitid: JJ6Ck4WTrgUiEjJp;
When system calls indicate an error they return -1, not some arbitrary
value < 0.  errno is only updated in this case.  Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.

getsockopt(), I think, is also a system call.  And the manpage indicates
that a failure is always -1, and not some arbitrary number:

RETURN VALUES
 Upon successful completion, the value 0 is returned; otherwise the
 value -1 is returned and the global variable errno is set to indicate the
 error.

What is the difference between the diff in this mail, and the changes
done in the middle of last year?  getsockopt() isn't allowed to return
anything else but 0 and -1, right?  Though I guess the current check
(error != 0) is the one that also catches instances where getsockopt()
isn't behaving well, even though it shouldn't.  But then, with the -1
check, wouldn't we be catching more instances of syscalls misbehaving
if we checked for < 0?
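
To make the comparison concrete, here is a small sketch of the three
styles of check being discussed.  The function names are made up for
illustration.  The return values 50 and -2 used in the note after it
are hypothetical misbehaviour, not something getsockopt(2) is
documented to return.

```c
/* Style used by getpeereid.c today: anything but 0 is a failure. */
static int
fails_nonzero(int r)
{
	return r != 0;
}

/* Style of the proposed diff: only the documented -1 is a failure. */
static int
fails_minus_one(int r)
{
	return r == -1;
}

/* Older style that the 2019 changes moved away from. */
static int
fails_negative(int r)
{
	return r < 0;
}
```

For 0 and -1 all three agree.  A stray 50 is treated as a failure only
by the != 0 check, and a stray -2 is missed only by the == -1 check.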

Patrick

On Sun, Apr 26, 2020 at 02:45:54PM -0600, Theo de Raadt wrote:
> If it returns 50 then the creds structure is not valid, and can't be copied
> from.  It only returns valid creds *IF* success is indicated by 0.  But then
> you convert 50 to a return value of 0, and hide any indication that things
> went weird?
> 
> No way, I'm not buying your argument.   I think the code is making the
> correct choice now.
> 
> Martin Vahlensieck  wrote:
> 
> > Hi there
> > 
> > The getsockopt(2) manual page says getsockopt(2) returns -1 on
> > error and 0 on success. Also getpeereid(3) only lists those 2 values.
> > This diff makes the return value check in getpeereid explicit. I guess
> > this is how it is done elsewhere in the tree (there is a commit turning
> > a bunch of "... < 0" to "== -1" I think this falls under that category).
> > 
> > Best,
> > 
> > Martin
> > 
> > Index: net/getpeereid.c
> > ===
> > RCS file: /cvs/src/lib/libc/net/getpeereid.c,v
> > retrieving revision 1.1
> > diff -u -p -r1.1 getpeereid.c
> > --- net/getpeereid.c1 Jul 2010 19:15:30 -   1.1
> > +++ net/getpeereid.c26 Apr 2020 20:28:50 -
> > @@ -28,7 +28,7 @@ getpeereid(int s, uid_t *euid, gid_t *eg
> >  
> > error = getsockopt(s, SOL_SOCKET, SO_PEERCRED,
> > &creds, &credslen);
> > -   if (error)
> > +   if (error == -1)
> > return (error);
> > *euid = creds.uid;
> > *egid = creds.gid;
> > 
>

sdmmc: CIS tuple can have empty body

2020-04-28 Thread Patrick Wildt

Hi,

on my i.MX8MM EVK there's an ath10k-based WiFi chip which we
unfortunately do not support (yet?).  But the SD/MMC CIS parser
complains:

sdmmc0: CIS parse error at 4136, tuple code 0x14, length 0
manufacturer 0x0271, product 0x0701 at sdmmc0 function 1 not configured

It's not a transmission bug though, since I saw prints from a Linux
dmesg on the web[0] stating something similar:

<4>mmc0: queuing unknown CIS tuple 0x01 (3 bytes)
<4>mmc0: queuing unknown CIS tuple 0x1a (5 bytes)
<4>mmc0: queuing unknown CIS tuple 0x1b (8 bytes)
<4>mmc0: queuing unknown CIS tuple 0x14 (0 bytes)

I guess the ath10k chips use some vendor-specific tuples in the CIS
structure.  The thing that our CIS parser complains about is the tuple
without a length.

Section 16 of the SDIO Simplified Specification 3.0[1] describes the CIS
formats, with Section 16.2 describing the Basic Tuple Format.  What our
code calls "tpllen", the tuple body length, is called "link field" in
the specification.

The specification explicitly says that an empty tuple body is fine:
"If the link field is 0, then the tuple body is empty."

Thus I propose that instead of complaining about that tuple, we just
continue our job.  I guess sometimes the existence of a code is
information enough.
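
As a rough sketch of the tuple layout described above (code byte, link
byte, then a body of "link" bytes), the walker below accepts a link
field of 0.  It is not the sdmmc_cis.c code.  It parses an in-memory
buffer instead of reading the card, and the names are hypothetical.
The end-of-chain code 0xff and the null tuple 0x00, which has no link
byte, are assumptions taken from the usual CIS conventions.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_TPL_NULL	0x00	/* assumed: null tuple, no link byte follows */
#define SK_TPL_END	0xff	/* assumed: end-of-chain tuple */

/*
 * Walk a CIS image and count its tuples.  Tuples with an empty body
 * (link field 0) are valid and are counted in *nempty.  Returns the
 * number of tuples, or -1 if a tuple runs past the end of the buffer.
 */
static int
sk_count_tuples(const uint8_t *buf, size_t len, int *nempty)
{
	size_t pos = 0;
	int n = 0;

	*nempty = 0;
	while (pos < len) {
		uint8_t code = buf[pos++];
		uint8_t link;

		if (code == SK_TPL_END)
			break;
		if (code == SK_TPL_NULL)
			continue;
		if (pos >= len)
			return -1;	/* link byte missing */
		link = buf[pos++];
		if (link > len - pos)
			return -1;	/* body truncated */
		if (link == 0)
			(*nempty)++;	/* empty body, nothing to skip */
		pos += link;
		n++;
	}
	return n;
}
```

Fed a chain with tuples 0x01 (3 bytes), 0x14 (0 bytes) and 0x22
(1 byte), this counts three tuples, one of them empty, instead of
stopping at 0x14.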

ok?

Patrick

[0] https://linuxlists.cc/l/9/linux-wireless/t/3226749/ath10k-sdio:_failed_to_load_firmware
[1] https://www.sdcard.org/downloads/pls/pdf/index.php?p=PartE1_SDIO_Simplified_Specification_Ver3.00.jpg&f=PartE1_SDIO_Simplified_Specification_Ver3.00.pdf&e=EN_SSE1

diff --git a/sys/dev/sdmmc/sdmmc_cis.c b/sys/dev/sdmmc/sdmmc_cis.c
index 21cf530b24f..09f3a70af40 100644
--- a/sys/dev/sdmmc/sdmmc_cis.c
+++ b/sys/dev/sdmmc/sdmmc_cis.c
@@ -76,12 +76,8 @@ sdmmc_read_cis(struct sdmmc_function *sf, struct sdmmc_cis *cis)
continue;
 
tpllen = sdmmc_io_read_1(sf0, reg++);
-   if (tpllen == 0) {
-   printf("%s: CIS parse error at %d, "
-   "tuple code %#x, length %d\n",
-   DEVNAME(sf->sc), reg, tplcode, tpllen);
-   break;
-   }
+   if (tpllen == 0)
+   continue;
 
switch (tplcode) {
case SD_IO_CISTPL_FUNCID:

Re: sdmmc: CIS tuple can have empty body

2020-04-28 Thread Patrick Wildt

On Tue, Apr 28, 2020 at 11:16:43PM +0200, Patrick Wildt wrote:
> Hi,
> 
> on my i.MX8MM EVK there's an ath10k-based WiFi chip which we
> unfortunately do not support (yet?).  But the SD/MMC CIS parser
> complains:
> 
> sdmmc0: CIS parse error at 4136, tuple code 0x14, length 0
> manufacturer 0x0271, product 0x0701 at sdmmc0 function 1 not configured
> 
> It's not a transmission bug though, since I saw prints from a Linux
> dmesg on the web[0] stating something similar:
> 
> <4>mmc0: queuing unknown CIS tuple 0x01 (3 bytes)
> <4>mmc0: queuing unknown CIS tuple 0x1a (5 bytes)
> <4>mmc0: queuing unknown CIS tuple 0x1b (8 bytes)
> <4>mmc0: queuing unknown CIS tuple 0x14 (0 bytes)
> 
> I guess the ath10k chips use some vendor-specific tuples in the CIS
> structure.  The thing that our CIS parser complains about is the tuple
> without a length.
> 
> Section 16 of the SDIO Simplified Specification 3.0[1] describes the CIS
> formats, with Section 16.2 describing the Basic Tuple Format.  What our
> code calls "tpllen", the tuple body length, is called "link field" in
> the specification.
> 
> The specification explicitly says that an empty tuple body is fine:
> "If the link field is 0, then the tuple body is empty."
> 
> Thus I propose that instead of complaining about that tuple, we just
> continue our job.  I guess sometimes the existence of a code is
> information enough.
> 
> ok?
> 
> Patrick
> 
> [0] https://linuxlists.cc/l/9/linux-wireless/t/3226749/ath10k-sdio:_failed_to_load_firmware
> [1] https://www.sdcard.org/downloads/pls/pdf/index.php?p=PartE1_SDIO_Simplified_Specification_Ver3.00.jpg&f=PartE1_SDIO_Simplified_Specification_Ver3.00.pdf&e=EN_SSE1

Actually it would be even better to just remove the check, since
then we would fall into the default case which, given SDMMC_DEBUG
is on, also lets us know of the unknown tuple:

sdmmc0: unknown tuple code 0x1, length 3
sdmmc0: unknown tuple code 0x22, length 4
sdmmc0: unknown tuple code 0x1a, length 5
sdmmc0: unknown tuple code 0x1b, length 8
sdmmc0: unknown tuple code 0x14, length 0
sdmmc0: unknown tuple code 0x22, length 42
sdmmc0: unknown tuple code 0x80, length 1
sdmmc0: unknown tuple code 0x81, length 1
sdmmc0: unknown tuple code 0x82, length 1

The individual switch-cases are checking tpllen properly, so there
is no harm.

ok?

Patrick

diff --git a/sys/dev/sdmmc/sdmmc_cis.c b/sys/dev/sdmmc/sdmmc_cis.c
index 21cf530b24f..70e5b6283a7 100644
--- a/sys/dev/sdmmc/sdmmc_cis.c
+++ b/sys/dev/sdmmc/sdmmc_cis.c
@@ -76,12 +76,6 @@ sdmmc_read_cis(struct sdmmc_function *sf, struct sdmmc_cis *cis)
continue;
 
tpllen = sdmmc_io_read_1(sf0, reg++);
-   if (tpllen == 0) {
-   printf("%s: CIS parse error at %d, "
-   "tuple code %#x, length %d\n",
-   DEVNAME(sf->sc), reg, tplcode, tpllen);
-   break;
-   }
 
switch (tplcode) {
case SD_IO_CISTPL_FUNCID:

libsa's in_cksum() cannot handle payload of odd-length?

2020-05-18 Thread Patrick Wildt

Hi,

I was trying to tftpboot and had an issue with files of odd length.
As it turns out, I think the in_cksum() that's called for the UDP
payload cannot handle a payload length that's not a multiple of 16 bits,
i.e. an odd number of bytes.

I don't know how in_cksum() is supposed to work exactly, but it looks
like the first step is summing up all bytes.  The code is using 16-bit
blocks, apart from some oddbyte magic.

First of all, why is there a while loop around code that already
consumes the whole length?  That can be done in a single step
without the loop.  Why does it continue if there's an "oddbyte"?

If I simplify that whole construct, consuming in 16-bit steps
until there's only one byte left, then summing that one, in_cksum()
works for me.
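
For reference, here is a minimal sketch of how an Internet checksum can
handle an odd trailing byte, following the description in RFC 1071.
It is neither the libsa code nor the diff below.  The function name is
made up, and it reads byte by byte in network order so that alignment
and host endianness do not matter.  It assumes fewer than 2^16 bytes
are summed.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * One's-complement sum over 16-bit big-endian words.  A trailing odd
 * byte is treated as the high byte of a word padded with a zero byte.
 */
static uint16_t
cksum_sketch(const void *p, size_t len)
{
	const uint8_t *cp = p;
	uint32_t sum = 0;

	while (len > 1) {
		sum += (uint32_t)cp[0] << 8 | cp[1];
		cp += 2;
		len -= 2;
	}
	if (len == 1)
		sum += (uint32_t)cp[0] << 8;	/* pad the odd byte */

	/* Fold the accumulated carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum >> 16) + (sum & 0xffff);
	return (uint16_t)~sum;
}
```

For the example bytes from RFC 1071 (00 01 f2 03 f4 f5 f6 f7) this
yields 0x220d.  The returned value is in host order and would still
have to be stored in network byte order.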

Can someone please help me have a look?

Patrick

diff --git a/sys/lib/libsa/in_cksum.c b/sys/lib/libsa/in_cksum.c
index d3f2e6ac978..57ded38a7b7 100644
--- a/sys/lib/libsa/in_cksum.c
+++ b/sys/lib/libsa/in_cksum.c
@@ -59,31 +59,24 @@
 int
 in_cksum(const void *p, int len)
 {
-   int sum = 0, oddbyte = 0, v = 0;
const u_char *cp = p;
+   int sum = 0;
 
/* we assume < 2^16 bytes being summed */
-   while (len > 0) {
-   if (oddbyte) {
-   sum += v + *cp++;
-   len--;
-   }
+   while (len > 1) {
if (((long)cp & 1) == 0) {
-   while ((len -= 2) >= 0) {
-   sum += *(const u_short *)cp;
-   cp += 2;
-   }
+   sum += *(const u_short *)cp;
+   cp += 2;
} else {
-   while ((len -= 2) >= 0) {
-   sum += *cp++ << 8;
-   sum += *cp++;
-   }
+   sum += *cp++ << 8;
+   sum += *cp++;
}
-   if ((oddbyte = len & 1) != 0)
-   v = *cp << 8;
+   len -= 2;
+   }
+   if (len > 0) {
+   sum += *cp++;
+   len--;
}
-   if (oddbyte)
-   sum += v;
sum = (sum >> 16) + (sum & 0xffff); /* add in accumulated carries */
sum += sum >> 16;   /* add potential last carry */
return (0xffff & ~sum);

Re: libsa's in_cksum() cannot handle payload of odd-length?

2020-05-18 Thread Patrick Wildt

On Mon, May 18, 2020 at 05:50:28PM +0200, Claudio Jeker wrote:
> On Mon, May 18, 2020 at 03:50:05PM +0200, Patrick Wildt wrote:
> > Hi,
> > 
> > I was trying to tftpboot and had an issue with files of odd-length.
> > As it turns out, I think the in_cksum() that's called for UDP payload
> > cannot handle a payload length that's not aligned to 16 bits.
> > 
> > I don't know how in_cksum() is supposed to work exactly, but it looks
> > like the first step is summing up all bytes.  The code is using 16-bit
> > words, apart from some oddbyte magic.
> > 
> > First of all, why is there a while loop around code that already
> > consumes the whole length?  That can be done in a single step
> > without the loop.  Why does it continue if there's an "oddbyte"?
> > 
> > If I simplify that whole construct, consuming in 16-bit steps
> > until there's only one byte left, then summing that one, in_cksum()
> > works for me.
> > 
> > Can someone please have a look?
> 
> There are other versions of in_cksum in our tree.
> Like: ./usr.sbin/ospfd/in_cksum.c
> 
> I'm surprised that the libsa code does no htons / ntohs conversions.
> Also after looking at the ospfd/in_cksum.c code I wonder if the htons /
> ntohs are actually reversed in that code...

I copied ospfd's file, re-added the header includes, re-added const to
the pointer, adjusted the prototype for u_int16_t and size_t, and
replaced the fatal with a printf and return -1.  Why -1?  Well, the
checks always do if (in_cksum() != 0) { error(); }.  Unless they
create a packet, but I guess in that case I hope we don't create a
packet that's too big *in our bootloader*.
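
A small sketch of why -1 works as an error value with the new
u_int16_t return type: the conversion turns it into 0xffff, which is
non-zero, so callers that only test "!= 0" treat the packet as bad.
The function names below are hypothetical stand-ins, not the libsa
code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the size check in the copied code. */
uint16_t
cksum_too_big(size_t l)
{
	if (l >= (1 << 16))
		return -1;	/* converted to 0xffff by the return type */
	return 0;		/* pretend the checksum verified fine */
}

/* Callers verifying a packet only test for a non-zero result. */
int
packet_ok(size_t l)
{
	return cksum_too_big(l) == 0;
}
```

The catch mentioned above remains: when a checksum is being generated
rather than verified, 0xffff is indistinguishable from a real result.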

Since the implementation was replaced, taken from ospfd, I guess the
4th-clause can go as well?  I mean, there's no 4th-clause in ospfd and
I just copied the file.

This fixes my issue and works for me.

Opinions? ok?

Patrick

diff --git a/sys/lib/libsa/in_cksum.c b/sys/lib/libsa/in_cksum.c
index d3f2e6ac978..c4b12e01c04 100644
--- a/sys/lib/libsa/in_cksum.c
+++ b/sys/lib/libsa/in_cksum.c
@@ -1,4 +1,4 @@
-/* $OpenBSD: in_cksum.c,v 1.5 2014/11/19 20:28:56 miod Exp $   */
+/* $OpenBSD: in_cksum.c,v 1.7 2014/07/20 20:27:19 tobias Exp $ */
 /* $NetBSD: in_cksum.c,v 1.3 1995/04/22 13:53:48 cgd Exp $ */
 
 /*
@@ -17,11 +17,7 @@
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
- * 3. All advertising materials mentioning features or use of this software
- *    must display the following acknowledgement:
- *	This product includes software developed by the University of
- *	California, Lawrence Berkeley Laboratory and its contributors.
- * 4. Neither the name of the University nor the names of its contributors
+ * 3. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
@@ -56,35 +52,38 @@
  * code and should be modified for each CPU to be as fast as possible.
  * In particular, it should not be this one.
  */
-int
-in_cksum(const void *p, int len)
+u_int16_t
+in_cksum(const void *p, size_t l)
 {
-	int sum = 0, oddbyte = 0, v = 0;
+	unsigned int sum = 0;
+	int len;
 	const u_char *cp = p;
 
-	/* we assume < 2^16 bytes being summed */
-	while (len > 0) {
-		if (oddbyte) {
-			sum += v + *cp++;
-			len--;
+	/* ensure that < 2^16 bytes being summed */
+	if (l >= (1 << 16)) {
+		printf("in_cksum: packet too big\n");
+		return -1;
+	}
+	len = (int)l;
+
+	if (((long)cp & 1) == 0) {
+		while (len > 1) {
+			sum += htons(*(u_short *)cp);
+			cp += 2;
+			len -= 2;
 		}
-		if (((long)cp & 1) == 0) {
-			while ((len -= 2) >= 0) {
-				sum += *(const u_short *)cp;
-				cp += 2;
-			}
-		} else {
-			while ((len -= 2) >= 0) {
-				sum += *cp++ << 8;
-				sum += *cp++;
-			}
+	} else {
+		while (len > 1) {
+			sum += *cp++ << 8;
+			sum += *cp++;
+			len -= 2;
 		}
-		if ((oddbyte = len & 1) != 0)
-			v = *cp << 8;
 	}
-	if (oddbyte)
-

Re: libsa's in_cksum() cannot handle payload of odd-length?

2020-05-18 Thread Patrick Wildt

On Mon, May 18, 2020 at 10:16:27AM -0600, Theo de Raadt wrote:
> I suspect there are other inconsistencies in all these versions.
> 
> ./sys/arch/arm/arm/in_cksum_arm.S
> ./sys/arch/i386/i386/in_cksum.s
> ./sys/arch/sparc64/sparc64/in_cksum.S
> ./sys/arch/sh/sh/in_cksum.S

These are assembly and for the kernel; they use mbufs.

> ./sys/arch/alpha/alpha/in_cksum.c

Has quite a bit of magic.

> ./sys/arch/hppa/hppa/in_cksum.c

Uses assembly in macros.

> ./sys/arch/m88k/m88k/in_cksum.c

Similar to hppa, but less obscure.

> ./sys/arch/powerpc/powerpc/in_cksum.c

Plenty of powerpc assembly.

> ./sys/netinet/in_cksum.c

The default for the kernel.

> ./usr.sbin/dvmrpd/in_cksum.c
> ./usr.sbin/ospfd/in_cksum.c

Finally some userland tools, so no mbufs.  These are the same files,
apart from dvmrpd.h/ospfd.h include.

> ./usr.sbin/eigrpd/in_cksum.c

This one uses unsigned char instead of u_char, but is otherwise the same
as dvmrpd/ospfd.

> ./usr.sbin/tcpdump/in_cksum.c

Similar but has different requirements.

> ./sys/lib/libsa/in_cksum.c

Bootloader, no mbufs.  Code best taken from one of the userland tools.
ospfd is still the best bet.

powerpc64: Target Info in clang for OpenBSD

2020-05-19 Thread Patrick Wildt

Hi,

drahn@ was complaining to me that his cross-compiler wasn't defining
__OpenBSD__ or __ELF__, and I think the fix is pretty simple.  We're
just missing a case in a switch-case.
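
For anyone testing, a quick check like the following can be compiled
with the cross-compiler to see whether the two macros are predefined.
This is only a sketch; target_macros() is a made-up helper, and the
bit assignment is arbitrary.

```c
/*
 * Returns a bitmask describing the compiler's predefined macros:
 * bit 0 is set if __OpenBSD__ is defined, bit 1 if __ELF__ is.
 * target_macros() is a hypothetical helper for testing only.
 */
int
target_macros(void)
{
	int mask = 0;

#ifdef __OpenBSD__
	mask |= 1;
#endif
#ifdef __ELF__
	mask |= 2;
#endif
	return mask;
}
```

With a correctly configured OpenBSD target both bits should be set.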

The .cpp file itself still compiles, but I haven't built a full clang
with it.  Please give it a go and report back.

I'm already asking for OKs, but will only commit once I get
positive feedback.  :)

ok?

Patrick

diff --git a/gnu/llvm/tools/clang/lib/Basic/Targets.cpp b/gnu/llvm/tools/clang/lib/Basic/Targets.cpp
index 3c139d72479..5bff08ad70d 100644
--- a/gnu/llvm/tools/clang/lib/Basic/Targets.cpp
+++ b/gnu/llvm/tools/clang/lib/Basic/Targets.cpp
@@ -349,6 +349,8 @@ TargetInfo *AllocateTarget(const llvm::Triple &Triple,
   return new FreeBSDTargetInfo(Triple, Opts);
 case llvm::Triple::NetBSD:
   return new NetBSDTargetInfo(Triple, Opts);
+case llvm::Triple::OpenBSD:
+  return new OpenBSDTargetInfo(Triple, Opts);
 default:
   return new PPC64TargetInfo(Triple, Opts);
 }
@@ -359,6 +361,8 @@ TargetInfo *AllocateTarget(const llvm::Triple &Triple,
   return new LinuxTargetInfo(Triple, Opts);
 case llvm::Triple::NetBSD:
   return new NetBSDTargetInfo(Triple, Opts);
+case llvm::Triple::OpenBSD:
+  return new OpenBSDTargetInfo(Triple, Opts);
 default:
   return new PPC64TargetInfo(Triple, Opts);
 }

Re: sparc64 boot issue on qemu

2020-05-30 Thread Patrick Wildt

On Sat, May 30, 2020 at 07:21:15PM +, Miod Vallat wrote:
> Yet another case where the emulator does not match the real hardware.
> 
> Why bother with them?
> 
> Get qemu to fix their shit so that the frame buffer metrics variables are
> aligned on 64-bit boundaries. There might not be a written specification
> for this requirement, but that's the way real hardware behaves, and it
> makes complete sense (the variables are OFW cells, which are 64-bit
> values and 64-bit aligned).

I'm not sure if sparc's OFW is different, but in the device trees as
used on arm and probably mips as well, 64-bit values are represented
using two 32-bit cells.  So I think requiring 64-bit alignment would
not be correct anyway, and it should be 32-bit instead.
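
As a sketch of what "two 32-bit cells" means in a flattened device
tree: cells are stored big-endian, and a 64-bit value is the high cell
followed by the low cell.  The helper names are hypothetical, and the
code reads byte by byte so it makes no alignment assumption at all.

```c
#include <stdint.h>

/* Read one big-endian 32-bit cell from a property buffer. */
static uint32_t
cell_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Combine two consecutive cells (high, then low) into a 64-bit value. */
uint64_t
cells_to_u64(const uint8_t *prop)
{
	return ((uint64_t)cell_be32(prop) << 32) | cell_be32(prop + 4);
}
```

Whether sparc64 OFW properties follow the same layout is exactly the
open question above; the sketch only covers the FDT case.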

I saw a mail thread on the U-Boot lists regarding alignment
requirements of some payloads.  Even though they confused the
difference between "alignment of the payload that will be put
somewhere" and "alignment of where the payload actually ends up", it
seems like they also only require 32-bit alignment.  The tools that
create the image with the payloads, which they discussed, also make
sure all payloads are aligned to 32 bits.

Re: WireGuard patchset for OpenBSD, rev. 3

2020-06-21 Thread Patrick Wildt

On Sun, Jun 21, 2020 at 10:06:52AM -0400, Sonic wrote:
> Along that line, does wireguard have any problems using alias
> addresses? It's not a problem with IKEv1 but it is with IKEv2.
> 
> Thanks!
> 
> Chris

I still don't see how this is a problem with IKEv2, so don't spread any
rumours and instead have a look at my response to your mail on misc@.

Patrick

Re: SSE in kernel?

2020-06-23 Thread Patrick Wildt

On Tue, Jun 23, 2020 at 06:51:20AM -0400, Bryan Steele wrote:
> On Mon, Jun 22, 2020 at 11:10:10PM -0700, jo...@armadilloaerospace.com wrote:
> > Are SSE instructions allowed in the AMD64 kernel?  Is #ifdef __SSE__
> > a sufficient guard?
> > 
> > I have a rasops32 putchar with SSE that is 2x faster.
> 
> No, in general you cannot use FP instructions in the kernel, also the
> kernel is often compiled with -msoft-float on platforms that support it.

Exceptions are being made for amdgpu drm, where some of the files are
compiled with sse enabled, though the code is guarded with
fpu_kernel_enter().  DC_FP_START() and DC_FP_END() to be precise, which
are macros pointing to the kernel functions.

umass(4): consistently use sc_xfer_flags for polling mode

2020-06-23 Thread Patrick Wildt

Hi,

when powering down, sd(4) will trigger a powerdown on its umass(4)
USB stick.  If the device fails to respond, for whatever reason, the
umass(4) code will try multiple reset mechanisms, and one of those uses
a control transfer.  Unfortunately the control transfer is not passed
the sc_xfer_flags, which are *only* used to supply USBD_SYNCHRONOUS
to allow the polling mode to work.

Without USBD_SYNCHRONOUS, umass_polled_transfer()'s call to
usbd_transfer() will immediately return, and it will never complete.

The code will return to scsi, where it will wait until the "cookie"
is cleared.  Since this is polling mode, there's no asynchronous
callback, and the cookie will never be cleared.  Thus we will msleep
and wait forever.

By also using sc->sc_xfer_flags on the control transfer, it will run
synchronously.  There's still another bug that happens when even more
transfers fail, since then umass_bbb_transfer()'s call to
umass_bbb_reset() will cause the SCSI done handler to be called a
second time, resulting in a panic.

But that's a bug in the state machine for error handling, which can be
fixed later on.  Also a panic during powerdown is better than hanging
indefinitely.

ok?

Patrick

diff --git a/sys/dev/usb/umass.c b/sys/dev/usb/umass.c
index 53d783ff396..f871a3d9c41 100644
--- a/sys/dev/usb/umass.c
+++ b/sys/dev/usb/umass.c
@@ -789,18 +789,18 @@ umass_setup_ctrl_transfer(struct umass_softc *sc, usb_device_request_t *req,
 	/* Initialise a USB control transfer and then schedule it */
 
 	usbd_setup_default_xfer(xfer, sc->sc_udev, (void *) sc,
-	    USBD_DEFAULT_TIMEOUT, req, buffer, buflen, flags,
-	    sc->sc_methods->wire_state);
+	    USBD_DEFAULT_TIMEOUT, req, buffer, buflen,
+	    flags | sc->sc_xfer_flags, sc->sc_methods->wire_state);
 
 	if (sc->sc_udev->bus->use_polling) {
 		DPRINTF(UDMASS_XFER,("%s: start polled ctrl xfer buffer=%p "
 		    "buflen=%d flags=0x%x\n", sc->sc_dev.dv_xname, buffer,
-		    buflen, flags));
+		    buflen, flags | sc->sc_xfer_flags));
 		err = umass_polled_transfer(sc, xfer);
 	} else {
 		DPRINTF(UDMASS_XFER,("%s: start ctrl xfer buffer=%p buflen=%d "
 		    "flags=0x%x\n", sc->sc_dev.dv_xname, buffer, buflen,
-		    flags));
+		    flags | sc->sc_xfer_flags));
 		err = usbd_transfer(xfer);
 	}
 	if (err && err != USBD_IN_PROGRESS) {

xhci(4): acknowledge interrupts before calling usb_schedsoftintr()

2020-06-23 Thread Patrick Wildt

Hi,

I had issues with a machine hanging on powerdown.  The issue is caused
by sd(4)'s suspend method trying to "power down" my umass(4) USB stick.

The symptom was that during powerdown, when running in "polling mode",
the first transaction (send command to power down to USB stick) works:
We enqueue a transfer, and then poll for an event in xhci(4).  On the
first transfer, we see an event.  On the second transfer, which is to
read the status from the USB stick, we get a timeout.  We poll for an
event, but we never see it.

"Polling" for an event in xhci(4) means checking its interrupt status
for *any* bit.  But, the interrupt status register never had one set
for the second transaction.  Using a USB debugger, one could see that
the second transaction actually completed, but we just did not get an
interrupt for that completed transfer.

The issue is actually in xhci(4)'s interrupt handler, which is also
called for the polling mode.  There we first acknowledge the pending
interrupts in the USB status register, then we call usb_schedsoftintr(),
and afterwards we acknowledge the interrupt manager regarding
"level-triggered" interrupts.

In polling mode, usb_schedsoftintr() calls the xhci_softintr() method
right away, which will dequeue an event from the event queue and thus
complete transfers.  The important aspect there is that dequeuing an
event also means touching xhci's registers to inform it that we have
dequeued an event.

In non-polling mode, usb_schedsoftintr() will only schedule a
soft-interrupt, which means that in regards to "touching" the xhci
hardware, the first thing that happens is acknowledging the interrupt
manager bits.

Moving the call to usb_schedsoftintr() to be after the interrupt ACKs
resolves my problem.  With this change, the first thing that happens,
polling and non-polling, is acknowledging the interrupts, with no other
register touching.  And that's also what Linux is doing: ACK first,
handle events later.
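
The general hazard can be shown with a toy model.  This is not xhci(4)
register semantics; it just assumes a single "pending" bit and a device
that signals a further completion while the handler is dequeuing.  All
names are hypothetical.

```c
#include <stdbool.h>

/*
 * Toy model only: not xhci(4) register semantics.  One "pending" bit,
 * and a device that raises a further event while we dequeue.
 */
struct toy_hc {
	bool pending;	/* interrupt status bit */
	int queued;	/* events waiting to be dequeued */
	int upcoming;	/* events the device raises during dequeue */
};

static void
toy_ack(struct toy_hc *hc)
{
	hc->pending = false;	/* clears whatever is pending right now */
}

static void
toy_dequeue(struct toy_hc *hc)
{
	hc->queued = 0;
	if (hc->upcoming > 0) {
		hc->upcoming--;
		hc->queued++;
		hc->pending = true;	/* new completion signalled */
	}
}

/* Handle first, ACK last: the new completion's bit gets wiped. */
bool
toy_handle_then_ack(struct toy_hc *hc)
{
	toy_dequeue(hc);
	toy_ack(hc);
	return hc->pending;
}

/* ACK first, handle later: the new completion stays visible. */
bool
toy_ack_then_handle(struct toy_hc *hc)
{
	toy_ack(hc);
	toy_dequeue(hc);
	return hc->pending;
}
```

In the first ordering an event is left queued with no pending bit, so
a poller that only looks at the status bit never sees it.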

With this, the next xhci_poll() actually sees an interrupt and the
second transfer can succeed.  Thus my machine finally shuts down and
no longer hangs indefinitely.

Comments?

Patrick

diff --git a/sys/dev/usb/xhci.c b/sys/dev/usb/xhci.c
index 2d65208f3db..ba5ee56502c 100644
--- a/sys/dev/usb/xhci.c
+++ b/sys/dev/usb/xhci.c
@@ -624,13 +624,13 @@ xhci_intr1(struct xhci_softc *sc)
 		return (1);
 	}
 
-	XOWRITE4(sc, XHCI_USBSTS, intrs); /* Acknowledge */
-	usb_schedsoftintr(&sc->sc_bus);
-
-	/* Acknowledge PCI interrupt */
+	/* Acknowledge interrupts */
+	XOWRITE4(sc, XHCI_USBSTS, intrs);
 	intrs = XRREAD4(sc, XHCI_IMAN(0));
 	XRWRITE4(sc, XHCI_IMAN(0), intrs | XHCI_IMAN_INTR_PEND);
 
+	usb_schedsoftintr(&sc->sc_bus);
+
 	return (1);
 }

Re: xhci: zero length multi-TRB inbound xfer does not work

2020-06-24 Thread Patrick Wildt

On Tue, Jun 16, 2020 at 06:55:27AM +, sc.dy...@gmail.com wrote:
> hi,
> 
> The function xhci_event_xfer_isoc() of sys/dev/usb/xhci.c at line 954
> does not work with zero length multi-TRB inbound transfer.
> 
>    949  /*
>    950   * If we queued two TRBs for a frame and this is the second TRB,
>    951   * check if the first TRB needs accounting since it might not have
>    952   * raised an interrupt in case of full data received.
>    953   */
>    954  if ((letoh32(xp->ring.trbs[trb_idx].trb_flags) & XHCI_TRB_TYPE_MASK) ==
>    955      XHCI_TRB_TYPE_NORMAL) {
>    956          frame_idx--;
>    957          if (trb_idx == 0)
>    958                  trb0_idx = xp->ring.ntrb - 2;
>    959          else
>    960                  trb0_idx = trb_idx - 1;
>    961          if (xfer->frlengths[frame_idx] == 0) {
>    962                  xfer->frlengths[frame_idx] = XHCI_TRB_LEN(letoh32(
>    963                      xp->ring.trbs[trb0_idx].trb_status));
>    964          }
>    965  }
>    966
>    967  xfer->frlengths[frame_idx] +=
>    968      XHCI_TRB_LEN(letoh32(xp->ring.trbs[trb_idx].trb_status)) - remain;
>    969  xfer->actlen += xfer->frlengths[frame_idx];
> 
> When a multi-TRB inbound transfer TD completes with transfer length = 0,
> the HC should generate two events: the 1st event for the ISOCH TRB w/
> ISP|CHAIN and the 2nd event for the NORMAL TRB w/ ISP|IOC.
> The Transfer Length field (actually the remaining length) of each event
> is the same as the requested length, i.e., the transferred length is 0.
> So when the first event is raised, frlengths is set to 0 at line 967.
> That is correct.
> On the second event, as the comment describes, xhci.c tries to calculate
> the 1st TRB's xfer length at lines 954-965. The requested length of the
> 1st TRB is stored into frlengths, even though the xfer len is 0.
> 
> If frlengths = 0, we cannot distinguish the case where the first event
> was not raised from the case where the transferred length is 0.
> frlengths is already 0, so the requested length of the 1st TRB is stored.

That's a really good find!  I actually do wonder if we could have the
same issue with the non-isoc transfers, when the first TRB throws a
short and then we get another event for the last TRB.

Maybe it would make sense to record the idx of the last TRB that we have
received an event for?  Then we could check if we already processed that
TRB.
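
Roughly what I have in mind, as a sketch only.  The struct and its
last_evt_idx field are hypothetical and do not exist in xhci.c; the
"previous TRB" computation mirrors the quoted lines 957-960, and
whether the ntrb - 2 step is there to skip a reserved last slot is an
assumption.

```c
/* Hypothetical ring bookkeeping; names are not from xhci.c. */
struct toy_ring_state {
	int ntrb;		/* ring size, last slot assumed reserved */
	int last_evt_idx;	/* TRB index of last event seen, -1 if none */
};

/* Index of the TRB preceding trb_idx, as in the quoted code. */
int
toy_prev_trb(const struct toy_ring_state *rs, int trb_idx)
{
	return (trb_idx == 0) ? rs->ntrb - 2 : trb_idx - 1;
}

/* Non-zero if the first TRB of the pair still needs accounting. */
int
toy_first_trb_unaccounted(const struct toy_ring_state *rs, int trb_idx)
{
	return rs->last_evt_idx != toy_prev_trb(rs, trb_idx);
}
```

With that, a zero-length first TRB that did raise an event would no
longer be confused with one that never raised an event.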

Patrick

> For example, I applied debug printf [*1], and run
> mplayer tv:// for my webcam.
> I see...
> 
> #25 remain 1024 type 5 origlen 1024 frlengths[25] 0
> #25 (omitted) frlen[25] 1024
> #26 remain 2048 type 1 origlen 2048 frlengths[25] 1024
> 
> These console logs show a 3072 byte frame that is split into
> two TRBs and received 0 bytes. The first TRB transfers 0 bytes and
> the second TRB transfers 0, too, but it results in 1024 bytes.
> 
> My proposed patch [*2] adds a flag to xhci_xfer that indicates that the
> TRB previously processed by xhci.c has the CHAIN bit set, and updates
> frlengths only when that flag is not set.
> 
> 
> 
> [*1]
> debug printf.
> It shows only split isochronous TDs.
> 
> --- sys/dev/usb/xhci.c.orig	Sun Apr  5 10:12:37 2020
> +++ sys/dev/usb/xhci.c	Fri May 29 04:13:36 2020
> @@ -961,12 +961,23 @@ xhci_event_xfer_isoc(struct usbd_xfer *xfer, struct xh
>   if (xfer->frlengths[frame_idx] == 0) {
>   xfer->frlengths[frame_idx] = XHCI_TRB_LEN(letoh32(
>   xp->ring.trbs[trb0_idx].trb_status));
> + printf("#%d (omitted) frlen[%d] %u\n",
> + trb0_idx, frame_idx, xfer->frlengths[frame_idx]);
>   }
>   }
>  
>   xfer->frlengths[frame_idx] +=
>   XHCI_TRB_LEN(letoh32(xp->ring.trbs[trb_idx].trb_status)) - remain;
>   xfer->actlen += xfer->frlengths[frame_idx];
> + uint32_t trb_flags = letoh32(xp->ring.trbs[trb_idx].trb_flags);
> + if ((trb_flags & XHCI_TRB_CHAIN) ||
> + (trb_flags & XHCI_TRB_TYPE_MASK) == XHCI_TRB_TYPE_NORMAL) {
> + printf("#%d remain %u type %u origlen %u frlengths[%d] %hu\n",
> + trb_idx, remain,
> + XHCI_TRB_TYPE(trb_flags),
> + XHCI_TRB_LEN(le32toh(xp->ring.trbs[trb_idx].trb_status)),
> + frame_idx, xfer->frlengths[frame_idx]);
> + }
>  
>   if (xx->index != trb_idx)
>   return (1);
> 
> [*2]
> patch
> 
> --- sys/dev/usb/xhcivar.h.orig	Sun Oct  6 21:19:28 2019
> +++ sys/dev/usb/xhcivar.h	Fri May 22 04:19:57 2020
> @@ -40,6 +40,7 @@ struct xhci_xfer {
>   struct usbd_xfer xfer;
>   int  index; /* Index of the last TRB */
>   size_t   ntr

virtio(4) at fdt: version 2 for Parallels 16 on Mac (M1)

2021-04-14 Thread Patrick Wildt

Hi,

Parallels 16 for Mac supports the Apple M1 SoC now, and since it does
provide an EFI 'BIOS', our images boot out of the box (once converted
to 'hdd' or supplied as USB stick).

Unfortunately virtio doesn't attach, because Parallels seems to provide
a 'new' version 2.  The following diff adds support for version 2 and
I used it to install the VM over vio(4) network.  And I was able to
install packages over vio(4) network.  Disk is ahci(4), USB passthrough
is xhci(4), so that works nicely out of the box.
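
The core of the version 2 change is that queue addresses are handed
over as 64-bit values through pairs of 32-bit registers, instead of a
page frame number.  A minimal sketch of the split and of the version
check, with hypothetical helper names that are not part of the diff:

```c
#include <stdint.h>

/* Hypothetical container for the two 32-bit register values. */
struct addr_halves {
	uint32_t lo;	/* value for a ..._LOW register */
	uint32_t hi;	/* value for a ..._HIGH register */
};

/* Split a 64-bit address into its low and high 32-bit halves. */
struct addr_halves
split_addr(uint64_t addr)
{
	struct addr_halves h;

	h.lo = (uint32_t)addr;
	h.hi = (uint32_t)(addr >> 32);
	return h;
}

/* Same range check as in the attach routine: versions 1 and 2 only. */
int
version_supported(uint32_t version)
{
	return version >= 1 && version <= 2;
}
```

In the diff the truncation to the low half happens implicitly, since
bus_space_write_4() only takes a 32-bit value.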

Not sure if we want this for 6.9 or not.  I think it wouldn't break the
current version 1, so I think it shouldn't hurt.

If you're wondering why I'm 'so late' with this: jcs@ asked me to have
a look at the official Parallels for M1 release, and I just did that.
So I couldn't be any faster than this anyway.

Opinions?

Patrick

diff --git a/sys/dev/fdt/virtio_mmio.c b/sys/dev/fdt/virtio_mmio.c
index 88e45436c00..e474bad9e6b 100644
--- a/sys/dev/fdt/virtio_mmio.c
+++ b/sys/dev/fdt/virtio_mmio.c
@@ -58,10 +58,17 @@
 #define VIRTIO_MMIO_QUEUE_NUM          0x038
 #define VIRTIO_MMIO_QUEUE_ALIGN        0x03c
 #define VIRTIO_MMIO_QUEUE_PFN          0x040
+#define VIRTIO_MMIO_QUEUE_READY        0x044
 #define VIRTIO_MMIO_QUEUE_NOTIFY       0x050
 #define VIRTIO_MMIO_INTERRUPT_STATUS   0x060
 #define VIRTIO_MMIO_INTERRUPT_ACK      0x064
 #define VIRTIO_MMIO_STATUS             0x070
+#define VIRTIO_MMIO_QUEUE_DESC_LOW     0x080
+#define VIRTIO_MMIO_QUEUE_DESC_HIGH    0x084
+#define VIRTIO_MMIO_QUEUE_AVAIL_LOW    0x090
+#define VIRTIO_MMIO_QUEUE_AVAIL_HIGH   0x094
+#define VIRTIO_MMIO_QUEUE_USED_LOW     0x0a0
+#define VIRTIO_MMIO_QUEUE_USED_HIGH    0x0a4
 #define VIRTIO_MMIO_CONFIG             0x100
 
 #define VIRTIO_MMIO_INT_VRING          (1 << 0)
@@ -106,6 +113,7 @@ struct virtio_mmio_softc {
 	void			*sc_ih;
 
 	int			sc_config_offset;
+	uint32_t		sc_version;
 };
 
 struct cfattach virtio_mmio_ca = {
@@ -159,10 +167,31 @@ virtio_mmio_setup_queue(struct virtio_softc *vsc, struct virtqueue *vq,
 	    vq->vq_index);
 	bus_space_write_4(sc->sc_iot, sc->sc_ioh, VIRTIO_MMIO_QUEUE_NUM,
 	    bus_space_read_4(sc->sc_iot, sc->sc_ioh, VIRTIO_MMIO_QUEUE_NUM_MAX));
-	bus_space_write_4(sc->sc_iot, sc->sc_ioh, VIRTIO_MMIO_QUEUE_ALIGN,
-	    PAGE_SIZE);
-	bus_space_write_4(sc->sc_iot, sc->sc_ioh, VIRTIO_MMIO_QUEUE_PFN,
-	    addr / VIRTIO_PAGE_SIZE);
+	if (sc->sc_version == 1) {
+		bus_space_write_4(sc->sc_iot, sc->sc_ioh,
+		    VIRTIO_MMIO_QUEUE_ALIGN, PAGE_SIZE);
+		bus_space_write_4(sc->sc_iot, sc->sc_ioh,
+		    VIRTIO_MMIO_QUEUE_PFN, addr / VIRTIO_PAGE_SIZE);
+	} else {
+		bus_space_write_4(sc->sc_iot, sc->sc_ioh,
+		    VIRTIO_MMIO_QUEUE_DESC_LOW, addr);
+		bus_space_write_4(sc->sc_iot, sc->sc_ioh,
+		    VIRTIO_MMIO_QUEUE_DESC_HIGH, addr >> 32);
+		bus_space_write_4(sc->sc_iot, sc->sc_ioh,
+		    VIRTIO_MMIO_QUEUE_AVAIL_LOW,
+		    addr + vq->vq_availoffset);
+		bus_space_write_4(sc->sc_iot, sc->sc_ioh,
+		    VIRTIO_MMIO_QUEUE_AVAIL_HIGH,
+		    (addr + vq->vq_availoffset) >> 32);
+		bus_space_write_4(sc->sc_iot, sc->sc_ioh,
+		    VIRTIO_MMIO_QUEUE_USED_LOW,
+		    addr + vq->vq_usedoffset);
+		bus_space_write_4(sc->sc_iot, sc->sc_ioh,
+		    VIRTIO_MMIO_QUEUE_USED_HIGH,
+		    (addr + vq->vq_usedoffset) >> 32);
+		bus_space_write_4(sc->sc_iot, sc->sc_ioh,
+		    VIRTIO_MMIO_QUEUE_READY, 1);
+	}
 }
 
 void
@@ -192,7 +221,7 @@ virtio_mmio_attach(struct device *parent, struct device *self, void *aux)
 	struct fdt_attach_args *faa = aux;
 	struct virtio_mmio_softc *sc = (struct virtio_mmio_softc *)self;
 	struct virtio_softc *vsc = &sc->sc_sc;
-	uint32_t id, magic, version;
+	uint32_t id, magic;
 
 	if (faa->fa_nreg < 1) {
 		printf(": no register data\n");
@@ -213,17 +242,19 @@ virtio_mmio_attach(struct device *parent, struct device *self, void *aux)
 		return;
 	}
 
-	version = bus_space_read_4(sc->sc_iot, sc->sc_ioh, VIRTIO_MMIO_VERSION);
-	if (version != 1) {
-		printf(": unknown version 0x%02x; giving up\n", version);
+	sc->sc_version = bus_space_read_4(sc->sc_iot, sc->sc_ioh,
+	    VIRTIO_MMIO_VERSION);
+	if (sc->sc_version < 1 || sc->sc_version > 2) {
+		printf(": unknown version 0x%02x; giving up\n", sc->sc_version);
 		return;
 	}
 
 	id = bus_space_read_4(sc->sc_iot, sc->sc_ioh, VIRTIO_MMIO_DEVICE_ID);
 	printf(": Virtio %s Device", virtio_device_string(id));
 
-   bus_space_write_4(sc->sc_iot, sc->sc_ioh, VIRTIO_MMIO_GUEST_P

Re: virtio(4) at fdt: version 2 for Parallels 16 on Mac (M1)

2021-04-14 Thread Patrick Wildt

Am Wed, Apr 14, 2021 at 10:17:58PM +0200 schrieb Patrick Wildt:
> Hi,
> 
> Parallels 16 for Mac supports the Apple M1 SoC now, and since it does
> provide an EFI 'BIOS', our images boot out of the box (once converted
> to 'hdd' or supplied as USB stick).
> 
> Unfortunately virtio doesn't attach, because Parallels seems to provide
> a 'new' version 2.  The following diff adds support for version 2 and
> I used it to install the VM over vio(4) network.  And I was able to
> install packages over vio(4) network.  Disk is ahci(4), USB passthrough
> is xhci(4), so that works nicely out of the box.
> 
> Not sure if we want this for 6.9 or not.  I think it wouldn't break the
> current version 1, so I think it shouldn't hurt.
> 
> If you're wondering why I'm 'so late' with this: jcs@ asked me to have
> a look at the official Parallels for M1 release, and I just did that.
> So I couldn't be any faster than this anyway.
> 
> Opinions?
> 
> Patrick

Obviously I forgot to pay dmesg tax ;)

OpenBSD 6.9 (GENERIC.MP) #295: Wed Apr 14 22:06:35 CEST 2021
patr...@lx2k.blueri.se:/usr/src/sys/arch/arm64/compile/GENERIC.MP
real mem  = 516423680 (492MB)
avail mem = 468152320 (446MB)
random: boothowto does not indicate good seed
mainbus0 at root: Parallels ARM Virtual Machine
psci0 at mainbus0: PSCI 1.0
cpu0 at mainbus0 mpidr 0: Unknown, MIDR 0x410f
cpu0: 192KB 64b/line 6-way L1 PIPT I-cache, 128KB 64b/line 8-way L1 D-cache
cpu0: 12288KB 128b/line 12-way L2 cache
cpu0: 
TLBIOS+IRANGE,TS+AXFLAG,FHM,DP,SHA3,RDM,Atomic,CRC32,SHA2+SHA512,SHA1,AES+PMULL,SPECRES,SB,FRINTTS,GPI,LRCPC+LDAPUR,FCMA,JSCVT,API+PAC,DPB,SpecSEI,PAN+ATS1E1,LO,HPDS,CSV3,CSV2
cpu1 at mainbus0 mpidr 1: Unknown, MIDR 0x410f
cpu1: 192KB 64b/line 6-way L1 PIPT I-cache, 128KB 64b/line 8-way L1 D-cache
cpu1: 12288KB 128b/line 12-way L2 cache
cpu1: 
TLBIOS+IRANGE,TS+AXFLAG,FHM,DP,SHA3,RDM,Atomic,CRC32,SHA2+SHA512,SHA1,AES+PMULL,SPECRES,SB,FRINTTS,GPI,LRCPC+LDAPUR,FCMA,JSCVT,API+PAC,DPB,SpecSEI,PAN+ATS1E1,LO,HPDS,CSV3,CSV2
efi0 at mainbus0: UEFI 2.7
efi0: EDK II rev 0x1
smbios0 at efi0: SMBIOS 3.0.0
smbios0: vendor Parallels Software International Inc. version "16.5.0 (50692)" 
date Mar 25 2021
smbios0: Parallels Parallels ARM Virtual Machine
apm0 at mainbus0
ampintc0 at mainbus0 nirq 128, ncpu 2 ipi: 0, 1: "interrupt-controller"
agtimer0 at mainbus0: 24000 kHz
"soc" at mainbus0 not configured
"clk24mhz" at mainbus0 not configured
pluart0 at mainbus0
ahci0 at mainbus0: AHCI 1.1
ahci0: port 0: 1.5Gb/s
ahci0: port 1: 1.5Gb/s
scsibus0 at ahci0: 32 targets
sd0 at scsibus0 targ 0 lun 0:  
t10.ATA_OpenBSD-0_SSD_Y15J1M1AG4B4S7VMR61V
sd0: 8192MB, 512 bytes/sector, 16777216 sectors, thin
sd1 at scsibus0 targ 1 lun 0:  
t10.ATA_miniroot69_NQGN5C6P8H5MSAW6W3PG
sd1: 33MB, 512 bytes/sector, 67584 sectors, thin
ehci0 at mainbus0
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 configuration 1 interface 0 "Generic EHCI root hub" rev 2.00/1.00 
addr 1
xhci0 at mainbus0, xHCI 1.10
usb1 at xhci0: USB revision 3.0
uhub1 at usb1 configuration 1 interface 0 "Generic xHCI root hub" rev 3.00/1.00 
addr 1
"gpu" at mainbus0 not configured
"toolgate" at mainbus0 not configured
virtio0 at mainbus0: Virtio Memory Balloon Device
viomb0 at virtio0
virtio1 at mainbus0: Virtio Network Device
vio0 at virtio1: address 00:1c:42:8a:67:34
simplefb0 at mainbus0: 1024x768, 32bpp
wsdisplay0 at simplefb0 mux 1: console (std, vt100 emulation)
wsdisplay0: screen 1-5 added (std, vt100 emulation)
uhidev0 at uhub1 port 1 configuration 1 interface 0 "Parallels Virtual Mouse" 
rev 3.00/1.00 addr 2
uhidev0: iclass 3/0, 1 report id
ums0 at uhidev0 reportid 1: 8 buttons, Z and W dir
wsmouse0 at ums0 mux 0
uhidev1 at uhub1 port 1 configuration 1 interface 1 "Parallels Virtual Mouse" 
rev 3.00/1.00 addr 2
uhidev1: iclass 3/0, 2 report ids
ums1 at uhidev1 reportid 2: 8 buttons, Z and W dir
wsmouse1 at ums1 mux 0
uhidev2 at uhub1 port 2 configuration 1 interface 0 "Parallels Virtual 
Keyboard" rev 3.00/1.00 addr 3
uhidev2: iclass 3/1
ukbd0 at uhidev2: 8 variable keys, 5 key codes
wskbd0 at ukbd0: console keyboard, using wsdisplay0
uvideo0 at uhub1 port 3 configuration 1 interface 0 "Parallels FaceTime HD 
Camera" rev 3.10/1.00 addr 4
video0 at uvideo0
vscsi0 at root
scsibus1 at vscsi0: 256 targets
softraid0 at root
scsibus2 at softraid0: 256 targets
root on sd0a (4a24f1a721df244f.a) swap on sd0b dump on sd0b
Automatic boot in progress: starting file system checks.
/dev/sd0a (4a24f1a721df244f.a): file system is clean; not checking
/dev/sd0e (4a24f1a721df244f.e): file system is clean; not checking
/dev/sd0d (4a24f1a721df244f.d): file system is clean; not checking
pf enabled
starting network
vio0: 10.211.55.4 lease accepted from 10.211.55.1 (00:1c:42:00:0

Re: virtio(4) at fdt: version 2 for Parallels 16 on Mac (M1)

2021-04-14 Thread Patrick Wildt

Am Wed, Apr 14, 2021 at 10:55:14PM +0200 schrieb Mark Kettenis:
> > Date: Wed, 14 Apr 2021 22:25:16 +0200
> > From: Patrick Wildt 
> > 
> > Am Wed, Apr 14, 2021 at 10:17:58PM +0200 schrieb Patrick Wildt:
> > > Hi,
> > > 
> > > Parallels 16 for Mac supports the Apple M1 SoC now, and since it does
> > > provide an EFI 'BIOS', our images boot out of the box (once converted
> > > to 'hdd' or supplied as USB stick).
> > > 
> > > Unfortunately virtio doesn't attach, because Parallels seems to provide
> > > a 'new' version 2.  The following diff adds support for version 2 and
> > > I used it to install the VM over vio(4) network.  And I was able to
> > > install packages over vio(4) network.  Disk is ahci(4), USB passthrough
> > > is xhci(4), so that works nicely out of the box.
> > > 
> > > Not sure if we want this for 6.9 or not.  I think it wouldn't break the
> > > current version 1, so I think it shouldn't hurt.
> > > 
> > > If you're wondering why I'm 'so late' with this: jcs@ asked me to have
> > > a look at the official Parallels for M1 release, and I just did that.
> > > So I couldn't be any faster than this anyway.
> > > 
> > > Opinions?
> > > 
> > > Patrick
> > 
> > Obviously I forgot to pay dmesg tax ;)
> > 
> > OpenBSD 6.9 (GENERIC.MP) #295: Wed Apr 14 22:06:35 CEST 2021
> > patr...@lx2k.blueri.se:/usr/src/sys/arch/arm64/compile/GENERIC.MP
> > real mem  = 516423680 (492MB)
> > avail mem = 468152320 (446MB)
> > random: boothowto does not indicate good seed
> > mainbus0 at root: Parallels ARM Virtual Machine
> > psci0 at mainbus0: PSCI 1.0
> > cpu0 at mainbus0 mpidr 0: Unknown, MIDR 0x410f
> > cpu0: 192KB 64b/line 6-way L1 PIPT I-cache, 128KB 64b/line 8-way L1 D-cache
> > cpu0: 12288KB 128b/line 12-way L2 cache
> > cpu0: 
> > TLBIOS+IRANGE,TS+AXFLAG,FHM,DP,SHA3,RDM,Atomic,CRC32,SHA2+SHA512,SHA1,AES+PMULL,SPECRES,SB,FRINTTS,GPI,LRCPC+LDAPUR,FCMA,JSCVT,API+PAC,DPB,SpecSEI,PAN+ATS1E1,LO,HPDS,CSV3,CSV2
> > cpu1 at mainbus0 mpidr 1: Unknown, MIDR 0x410f
> > cpu1: 192KB 64b/line 6-way L1 PIPT I-cache, 128KB 64b/line 8-way L1 D-cache
> > cpu1: 12288KB 128b/line 12-way L2 cache
> > cpu1: 
> > TLBIOS+IRANGE,TS+AXFLAG,FHM,DP,SHA3,RDM,Atomic,CRC32,SHA2+SHA512,SHA1,AES+PMULL,SPECRES,SB,FRINTTS,GPI,LRCPC+LDAPUR,FCMA,JSCVT,API+PAC,DPB,SpecSEI,PAN+ATS1E1,LO,HPDS,CSV3,CSV2
> > efi0 at mainbus0: UEFI 2.7
> > efi0: EDK II rev 0x1
> > smbios0 at efi0: SMBIOS 3.0.0
> > smbios0: vendor Parallels Software International Inc. version "16.5.0 
> > (50692)" date Mar 25 2021
> > smbios0: Parallels Parallels ARM Virtual Machine
> > apm0 at mainbus0
> > ampintc0 at mainbus0 nirq 128, ncpu 2 ipi: 0, 1: "interrupt-controller"
> > agtimer0 at mainbus0: 24000 kHz
> > "soc" at mainbus0 not configured
> > "clk24mhz" at mainbus0 not configured
> > pluart0 at mainbus0
> > ahci0 at mainbus0: AHCI 1.1
> > ahci0: port 0: 1.5Gb/s
> > ahci0: port 1: 1.5Gb/s
> > scsibus0 at ahci0: 32 targets
> > sd0 at scsibus0 targ 0 lun 0:  
> > t10.ATA_OpenBSD-0_SSD_Y15J1M1AG4B4S7VMR61V
> > sd0: 8192MB, 512 bytes/sector, 16777216 sectors, thin
> > sd1 at scsibus0 targ 1 lun 0:  
> > t10.ATA_miniroot69_NQGN5C6P8H5MSAW6W3PG
> > sd1: 33MB, 512 bytes/sector, 67584 sectors, thin
> > ehci0 at mainbus0
> > usb0 at ehci0: USB revision 2.0
> > uhub0 at usb0 configuration 1 interface 0 "Generic EHCI root hub" rev 
> > 2.00/1.00 addr 1
> > xhci0 at mainbus0, xHCI 1.10
> > usb1 at xhci0: USB revision 3.0
> > uhub1 at usb1 configuration 1 interface 0 "Generic xHCI root hub" rev 
> > 3.00/1.00 addr 1
> > "gpu" at mainbus0 not configured
> > "toolgate" at mainbus0 not configured
> > virtio0 at mainbus0: Virtio Memory Balloon Device
> > viomb0 at virtio0
> > virtio1 at mainbus0: Virtio Network Device
> > vio0 at virtio1: address 00:1c:42:8a:67:34
> > simplefb0 at mainbus0: 1024x768, 32bpp
> > wsdisplay0 at simplefb0 mux 1: console (std, vt100 emulation)
> > wsdisplay0: screen 1-5 added (std, vt100 emulation)
> > uhidev0 at uhub1 port 1 configuration 1 interface 0 "Parallels Virtual 
> > Mouse" rev 3.00/1.00 addr 2
> > uhidev0: iclass 3/0, 1 report id
> > ums0 at uhidev0 reportid 1: 8 buttons, Z and W dir
> > wsmouse0 at ums0 mux 0
> > uhidev1 at uhub1 port 1 configuration 1 interface 1 "Par

Re: virtio(4) at fdt: version 2 for Parallels 16 on Mac (M1)

2021-04-14 Thread Patrick Wildt

On Wed, Apr 14, 2021 at 11:20:56PM +0200, Patrick Wildt wrote:
> Am Wed, Apr 14, 2021 at 10:55:14PM +0200 schrieb Mark Kettenis:
> > > Date: Wed, 14 Apr 2021 22:25:16 +0200
> > > From: Patrick Wildt 
> > > 
> > > On Wed, Apr 14, 2021 at 10:17:58PM +0200, Patrick Wildt wrote:
> > > > Hi,
> > > > 
> > > > Parallels 16 for Mac supports the Apple M1 SoC now, and since it does
> > > > provide an EFI 'BIOS', our images boot out of the box (once converted
> > > > to 'hdd' or supplied as USB stick).
> > > > 
> > > > Unfortunately virtio doesn't attach, because Parallels seems to provide
> > > > a 'new' version 2.  The following diff adds support for version 2 and
> > > > I used it to install the VM over vio(4) network.  And I was able to
> > > > install packages over vio(4) network.  Disk is ahci(4), USB passthrough
> > > > is xhci(4), so that works nicely out of the box.
> > > > 
> > > > Not sure if we want this for 6.9 or not.  I think it wouldn't break the
> > > > current version 1, so I think it shouldn't hurt.
> > > > 
> > > > If you're wondering why I'm 'so late' with this: jcs@ asked me to have
> > > > a look at the official Parallels for M1 release, and I just did that.
> > > > So I couldn't be any faster than this anyway.
> > > > 
> > > > Opinions?
> > > > 
> > > > Patrick
> > > 
> > > Obviously I forgot to pay dmesg tax ;)
> > > 

Things change a little when you run 'machine acpi' in efiboot.

OpenBSD 6.9 (GENERIC.MP) #295: Wed Apr 14 22:06:35 CEST 2021
patr...@lx2k.blueri.se:/usr/src/sys/arch/arm64/compile/GENERIC.MP
real mem  = 516284416 (492MB)
avail mem = 468021248 (446MB)
random: good seed from bootblocks
mainbus0 at root: ACPI
psci0 at mainbus0: PSCI 1.0
cpu0 at mainbus0 mpidr 0: Unknown, MIDR 0x410f
cpu0: 192KB 64b/line 6-way L1 PIPT I-cache, 128KB 64b/line 8-way L1 D-cache
cpu0: 12288KB 128b/line 12-way L2 cache
cpu0: TLBIOS+IRANGE,TS+AXFLAG,FHM,DP,SHA3,RDM,Atomic,CRC32,SHA2+SHA512,SHA1,AES+PMULL,SPECRES,SB,FRINTTS,GPI,LRCPC+LDAPUR,FCMA,JSCVT,API+PAC,DPB,SpecSEI,PAN+ATS1E1,LO,HPDS,CSV3,CSV2
cpu1 at mainbus0 mpidr 1: Unknown, MIDR 0x410f
cpu1: 192KB 64b/line 6-way L1 PIPT I-cache, 128KB 64b/line 8-way L1 D-cache
cpu1: 12288KB 128b/line 12-way L2 cache
cpu1: TLBIOS+IRANGE,TS+AXFLAG,FHM,DP,SHA3,RDM,Atomic,CRC32,SHA2+SHA512,SHA1,AES+PMULL,SPECRES,SB,FRINTTS,GPI,LRCPC+LDAPUR,FCMA,JSCVT,API+PAC,DPB,SpecSEI,PAN+ATS1E1,LO,HPDS,CSV3,CSV2
efi0 at mainbus0: UEFI 2.7
efi0: EDK II rev 0x1
smbios0 at efi0: SMBIOS 3.0.0
smbios0: vendor Parallels Software International Inc. version "16.5.0 (50692)" date Mar 25 2021
smbios0: Parallels Parallels ARM Virtual Machine
apm0 at mainbus0
ampintc0 at mainbus0 nirq 128, ncpu 2 ipi: 0, 1: "interrupt-controller"
agtimer0 at mainbus0: 24000 kHz
acpi0 at mainbus0: ACPI 6.1
acpi0: sleep states
acpi0: tables DSDT FACP DBG2 GTDT APIC
acpi0: wakeup devices
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
acpibtn0 at acpi0: PWRB
acpige0 at acpi0 irq 48
"PRL4005" at acpi0 not configured
"PRL4000" at acpi0 not configured
"PRL4006" at acpi0 not configured
"PRL4009" at acpi0 not configured
"PNP0D20" at acpi0 not configured
ahci0 at acpi0 AHC0 addr 0x214/0x2000 irq 34: AHCI 1.1
ahci0: port 0: 1.5Gb/s
ahci0: port 1: 1.5Gb/s
ahci0: port 2: 1.5Gb/s
scsibus0 at ahci0: 32 targets
sd0 at scsibus0 targ 0 lun 0:  
t10.ATA_OpenBSD-0_SSD_SM7VVT660WEM5DW1GGN0
sd0: 65536MB, 512 bytes/sector, 134217728 sectors, thin
sd1 at scsibus0 targ 1 lun 0:  
t10.ATA_miniroot69_NQGN5C6P8H5MSAW6W3PG
sd1: 33MB, 512 bytes/sector, 67584 sectors, thin
cd0 at scsibus0 targ 2 lun 0: <, Virtual DVD-ROM, R103> removable
"ACPI000E" at acpi0 not configured
xhci0 at acpi0 XHC0 addr 0x216/0x1000 irq 36, xHCI 1.10
usb0 at xhci0: USB revision 3.0
uhub0 at usb0 configuration 1 interface 0 "Generic xHCI root hub" rev 3.00/1.00 addr 1
simplefb0 at mainbus0: 1024x768, 32bpp
wsdisplay0 at simplefb0 mux 1: console (std, vt100 emulation)
wsdisplay0: screen 1-5 added (std, vt100 emulation)
uhidev0 at uhub0 port 1 configuration 1 interface 0 "Parallels Virtual Mouse" rev 3.00/1.00 addr 2
uhidev0: iclass 3/0, 1 report id
ums0 at uhidev0 reportid 1: 8 buttons, Z and W dir
wsmouse0 at u

Re: virtio(4) at fdt: version 2 for Parallels 16 on Mac (M1)

2021-04-14 Thread Patrick Wildt

On Thu, Apr 15, 2021 at 12:47:44AM +0200, Patrick Wildt wrote:
> On Wed, Apr 14, 2021 at 11:20:56PM +0200, Patrick Wildt wrote:
> > On Wed, Apr 14, 2021 at 10:55:14PM +0200, Mark Kettenis wrote:
> > > > Date: Wed, 14 Apr 2021 22:25:16 +0200
> > > > From: Patrick Wildt 
> > > > 
> > > > On Wed, Apr 14, 2021 at 10:17:58PM +0200, Patrick Wildt wrote:
> > > > > Hi,
> > > > > 
> > > > > Parallels 16 for Mac supports the Apple M1 SoC now, and since it does
> > > > > provide an EFI 'BIOS', our images boot out of the box (once converted
> > > > > to 'hdd' or supplied as USB stick).
> > > > > 
> > > > > > Unfortunately virtio doesn't attach, because Parallels seems to
> > > > > > provide a 'new' version 2.  The following diff adds support for
> > > > > > version 2 and I used it to install the VM over vio(4) network.
> > > > > > And I was able to install packages over vio(4) network.  Disk is
> > > > > > ahci(4), USB passthrough is xhci(4), so that works nicely out of
> > > > > > the box.
> > > > > > 
> > > > > > Not sure if we want this for 6.9 or not.  I think it wouldn't break
> > > > > > the current version 1, so I think it shouldn't hurt.
> > > > > 
> > > > > If you're wondering why I'm 'so late' with this: jcs@ asked me to have
> > > > > a look at the official Parallels for M1 release, and I just did that.
> > > > > So I couldn't be any faster than this anyway.
> > > > > 
> > > > > Opinions?
> > > > > 
> > > > > Patrick
> > > > 
> > > > Obviously I forgot to pay dmesg tax ;)
> > > > 
> 
> Things change a little when you run 'machine acpi' in efiboot.

And here's the DSDT (thanks to jcs@).  I think audio (HDEF) should be
easy, I believe we only need to attach Azalia to that device, rest
hopefully 'just works'.  PNP0D20 is ehci, that should be easy as
well.  Not sure about video...  There's also an Ethernet device.

Btw, this is just a bit of ACPI info dump, which is unrelated to this
diff.

/*
 * Intel ACPI Component Architecture
 * AML/ASL+ Disassembler version 20200925 (64-bit version)
 * Copyright (c) 2000 - 2020 Intel Corporation
 * 
 * Disassembling to symbolic ASL+ operators
 *
 * Disassembly of DSDT.2, Wed Apr 14 17:46:36 2021
 *
 * Original Table Header:
 *     Signature        "DSDT"
 *     Length           0x0AC6 (2758)
 *     Revision         0x02
 *     Checksum         0x9D
 *     OEM ID           "PRLS  "
 *     OEM Table ID     "PRLS_OEM"
 *     OEM Revision     0x0003 (3)
 *     Compiler ID      "INTL"
 *     Compiler Version 0x20160527 (538314023)
 */
DefinitionBlock ("", "DSDT", 2, "PRLS  ", "PRLS_OEM", 0x0003)
{
Scope (_SB)
{
OperationRegion (MBOX, SystemMemory, 0x0210, 0x4000)
Field (MBOX, DWordAcc, NoLock, Preserve)
{
MVER,   32,
RAM,    32,
CPUM,   32,
MUSB,   32,
MUFS,   32,
MAHC,   32,
VRAM,   32,
MSER,   32,
MHDA,   32,
GPU,    32,
LOON,   32,
TOOL,   32,
NET,    32
}

Device (CP00)
{
Name (_HID, "ACPI0007" /* Processor Device */)  // _HID: Hardware ID
Name (_UID, Zero)  // _UID: Unique ID
}

Device (CP01)
{
Name (_HID, "ACPI0007" /* Processor Device */)  // _HID: Hardware ID
Name (_UID, One)  // _UID: Unique ID
}

Device (CP02)
{
Name (_HID, "ACPI0007" /* Processor Device */)  // _HID: Hardware ID
Name (_UID, 0x02)  // _UID: Unique ID
}

Device (CP03)
{
Name (_HID, "ACPI0007" /* Processor Device */)  // _HID: Hardware ID
Name (_UID, 0x03)  // _UID: Unique ID
}

Device (CP04)
{
Name (_HID, "ACPI0007" /* Processor Device */)  // _HID: Hardware ID
Name (_UID, 0x04)  // _UID: Unique ID
}

Device (CP05)
{
Name (_HID, "ACPI0007" /* Processor Device */)  // _HID: Hardware ID
Name (_UID, 0x05)  // _UID: Unique ID
}

Device (CP06)
{
Name (_HID, "ACPI0007" /* Processor Device */)  // _HID: Hardware ID
Name (_UID, 0x06)

Re: Change umb(4) devclass from DV_DULL to DV_IFNET

2021-04-20 Thread Patrick Wildt

On Mon, Apr 19, 2021 at 10:25:39AM +0200, Tilo Stritzky wrote:
> On 10/04/21 22:56  Tilo Stritzky wrote:
> > umb interfaces advertise themselves as generic devices.
> > Network makes a lot more sense, I think.
> > tested on amd64.
> 
> Having seen no response on this one, I'd like to expand a little
> further.
> 
> This value defines how a device is identified in hotplug events,
> eventually showing up in userland as argv to /etc/hotplug/attach.
> It doesn't seem to be used by the kernel itself.
> 
> Currently umb(4) mobile broadband interfaces identify as ``generic''.
> This is not exactly wrong, but there is a ``network interface'' class
> which is a much tighter match. All other USB network interfaces set
> it to DV_IFNET.
> 
> The change lets me group umb related stuff together with other
> hotplugged network devices in /etc/hotplug/attach.
> 
> Alas, this might break some existing hotplugd setups.
> 
> tilo

That does indeed make sense.  umb(4) is some kind of a network device,
and especially in hotplug I think it makes a lot more sense there as
well.  So, I'm in favour.

Objections?  ok?

> Index: if_umb.c
> ===
> RCS file: /cvs/src/sys/dev/usb/if_umb.c,v
> retrieving revision 1.43
> diff -u -p -r1.43 if_umb.c
> --- if_umb.c  1 Apr 2021 08:39:52 -   1.43
> +++ if_umb.c  10 Apr 2021 20:14:59 -
> @@ -212,7 +212,7 @@ uint8_t umb_uuid_qmi_mbim[] = MBIM_UUI
>  uint32_t  umb_session_id = 0;
> 
>  struct cfdriver umb_cd = {
> - NULL, "umb", DV_DULL
> + NULL, "umb", DV_IFNET
>  };
> 
>  const struct cfattach umb_ca = {
> 
> 
> 
> 
> umb0 at uhub0 port 1 configuration 1 interface 0 "MediaTek Inc Product" rev 2.00/3.00 addr 2
> umsm0 at uhub0 port 1 configuration 1 interface 2 "MediaTek Inc Product" rev 2.00/3.00 addr 2
> ucom0 at umsm0
> umsm1 at uhub0 port 1 configuration 1 interface 3 "MediaTek Inc Product" rev 2.00/3.00 addr 2
> ucom1 at umsm1
> umsm2 at uhub0 port 1 configuration 1 interface 4 "MediaTek Inc Product" rev 2.00/3.00 addr 2
> ucom2 at umsm2
> umsm3 at uhub0 port 1 configuration 1 interface 5 "MediaTek Inc Product" rev 2.00/3.00 addr 2
> ucom3 at umsm3
> umass0 at uhub0 port 1 configuration 1 interface 6 "MediaTek Inc Product" rev 2.00/3.00 addr 2
> umass0: using SCSI over Bulk-Only
> scsibus2 at umass0: 2 targets, initiator 0
> sd1 at scsibus2 targ 1 lun 0:  removable
>
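
To illustrate why the device class matters for hotplug handling, here is a
minimal sketch in C of how an attach handler might group devices by class.
The enum is assumed to mirror the order of enum devclass in sys/device.h,
and attach_group() is a hypothetical helper, not part of hotplugd(8).

```c
#include <assert.h>
#include <string.h>

/* Assumed to mirror the order of enum devclass in sys/device.h. */
enum devclass { DV_DULL, DV_CPU, DV_DISK, DV_IFNET, DV_TAPE, DV_TTY };

/*
 * Hypothetical helper: pick the group an attach handler would use
 * for the device class it was handed as its first argument.
 */
static const char *
attach_group(int devclass)
{
	switch (devclass) {
	case DV_DISK:
		return "disk";
	case DV_IFNET:
		return "network";
	default:
		/* DV_DULL and everything else fall through to here. */
		return "generic";
	}
}
```

With DV_DULL, umb(4) would land in the "generic" branch; with DV_IFNET it
is handled alongside the other USB network interfaces.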

Re: tmpfs & UVM aobj

2021-04-22 Thread Patrick Wildt

On Thu, Apr 22, 2021 at 11:19:22AM +0200, Martin Pieuchot wrote:
> uao_shrink() and uao_grow() are only used by TMPFS, ok to place them
> under an #ifdef?  This saves some bytes on RAMDISKs.

sure, ok patrick@

> Index: uvm/uvm_aobj.c
> ===
> RCS file: /cvs/src/sys/uvm/uvm_aobj.c,v
> retrieving revision 1.94
> diff -u -p -r1.94 uvm_aobj.c
> --- uvm/uvm_aobj.c31 Mar 2021 08:53:39 -  1.94
> +++ uvm/uvm_aobj.c22 Apr 2021 09:00:27 -
> @@ -416,6 +416,7 @@ uao_free(struct uvm_aobj *aobj)
>   * pager functions
>   */
>  
> +#ifdef TMPFS
>  /*
>   * Shrink an aobj to a given number of pages. The procedure is always the same:
>   * assess the necessity of data structure conversion (hash to array), secure
> @@ -692,6 +693,7 @@ uao_grow(struct uvm_object *uobj, int pa
>   else
>   return uao_grow_convert(uobj, pages);
>  }
> +#endif /* TMPFS */
>  
>  /*
>   * uao_create: create an aobj of the given size and return its uvm_object.
>

enable dt(4)

2021-04-26 Thread Patrick Wildt

Hi,

as proposed by bluhm@ recently, this is the diff to enable dt(4) in
GENERIC.  The overhead should be small, and I have been using it on
arm64 to successfully debug issues for a while now.

I can't vouch that it builds for all architectures... Did anyone do
that?  Number 1 rule: don't break Theo's build.

Patrick

diff --git a/sys/conf/GENERIC b/sys/conf/GENERIC
index 33d0f368968..c47bea90cde 100644
--- a/sys/conf/GENERIC
+++ b/sys/conf/GENERIC
@@ -82,7 +82,7 @@ pseudo-device msts1   # MSTS line discipline
 pseudo-device  endrun  1   # EndRun line discipline
 pseudo-device  vnd 4   # vnode disk devices
 pseudo-device  ksyms   1   # kernel symbols device
-#pseudo-device dt  # Dynamic Tracer
+pseudo-device  dt  # Dynamic Tracer
 
 # clonable devices
 pseudo-device  bpfilter# packet filter

hvn(4): don't input mbufs if interface is not running

2021-05-12 Thread Patrick Wildt

Hi,

when hvn(4) attaches it sends commands and waits for replies to come
back in, hence the interrupt function is being polled.  Unfortunately
it seems that the 'receive pipe' has both command completion and data
packets.  As it turns out, while hvn(4) is just setting up the pipes,
it can already receive packets, which I have seen happening on Hyper-V.

This essentially means that if_input() is being called *before* the
card is set up (or UP).  This seems wrong.  Apparently on drivers like
em(4) we only read packets if IFF_RUNNING is set.  I think in the case
of hvn(4), we should drop packets unless IFF_RUNNING is set.

Opinions?

Patrick

diff --git a/sys/dev/pv/if_hvn.c b/sys/dev/pv/if_hvn.c
index f12e2f935ca..4306f717baf 100644
--- a/sys/dev/pv/if_hvn.c
+++ b/sys/dev/pv/if_hvn.c
@@ -1470,7 +1470,10 @@ hvn_rndis_input(struct hvn_softc *sc, uint64_t tid, void *arg)
}
hvn_nvs_ack(sc, tid);
 
-   if_input(ifp, &ml);
+   if (ifp->if_flags & IFF_RUNNING)
+   if_input(ifp, &ml);
+   else
+   ml_purge(&ml);
 }
 
 static inline struct mbuf *
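
As a stand-alone illustration of the check in the diff above, the following
sketch models the receive path in userland C.  The types ifnet_model and
pkt_list and the function rx_deliver() are hypothetical stand-ins for
struct ifnet, struct mbuf_list and the tail of hvn_rndis_input(); only the
IFF_RUNNING test itself is taken from the diff.

```c
#include <assert.h>
#include <stddef.h>

#define IFF_RUNNING	0x40	/* stand-in for the flag from <net/if.h> */

/* Hypothetical stand-in for struct mbuf_list: just a packet count. */
struct pkt_list {
	size_t len;
};

/* Hypothetical stand-in for struct ifnet with two counters. */
struct ifnet_model {
	unsigned int if_flags;
	size_t delivered;
	size_t dropped;
};

static void
rx_deliver(struct ifnet_model *ifp, struct pkt_list *ml)
{
	if (ifp->if_flags & IFF_RUNNING)
		ifp->delivered += ml->len;	/* models if_input(ifp, &ml) */
	else
		ifp->dropped += ml->len;	/* models ml_purge(&ml) */
	ml->len = 0;	/* the list is consumed either way */
}
```

Packets that arrive while the pipes are still being set up are counted as
dropped here; once the flag is set they reach the stack.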

Re: move copyout() in DIOCGETSTATES outside of NET_LOCK() and state_lock

2021-05-20 Thread Patrick Wildt

On Thu, May 20, 2021 at 11:28:19AM +0200, Claudio Jeker wrote:
> On Thu, May 20, 2021 at 09:37:38AM +0200, Martin Pieuchot wrote:
> > On 20/05/21(Thu) 03:23, Alexandr Nedvedicky wrote:
> > > Hrvoje gave the experimental diff a try, which trades rw-locks in
> > > pf(4) for mutexes [1]. Hrvoje soon discovered that the machine panics
> > > when doing 'pfctl -ss'.
> > > The callstack looks as follows:
> > >
> > > [...]
> > > specific to experimental diff [1]. However, this made me think that
> > > it's not a good idea to do copyout() while holding NET_LOCK() and
> > > state_lock.
> > 
> > malloc(9) and copyout(9) are kind of ok while using the NET_LOCK() but
> > if a deadlock occurs while a global rwlock is held, debugging becomes
> > harder.
> > 
> > As long as the `state_lock' and PF_LOCK() are mutexes all allocations
> > and copyin/copyout(9) must be done without holding them.
> 
> One way to reduce the problems with copyout(9) is to use uvm_vslock() to
> lock the pages. This is what sysctl does but it comes with its own set of
> issues.
> 
> In general exporting large collections from the kernel needs to be
> rethought. The system should not grab a lock for a long time to
> serve a userland process.
>  
> > > The diff below moves copyout() at line 1784 outside the protection of
> > > both locks.
> > > The approach I took is relatively straightforward:
> > > 
> > > let DIOCGETSTATES allocate a hold_states array, which will keep
> > > references to states.
> > > 
> > > grab locks and take references, keep those references in hold
> > > array.
> > > 
> > > drop locks, export states and do copyout, while walking
> > > array of references.
> > > 
> > > drop references, release hold_states array.
> > > 
> > > does it make sense? If we agree that this approach makes sense
> > 
> > In my opinion it does.  The other approach would be to (ab)use the
> > NET_LOCK() to serialize updates, like bluhm@'s diff does.  Both
> > approaches have pros and cons.
> > 
> 
> I really think adding more to the NET_LOCK() is a step in the wrong
> direction. It will just creep into everything and grow the size of the
> kernel lock. For me the cons outweigh the pros.

What we do at genua probably isn't particularly relevant to the OpenBSD
approach to unlocking the network subsystem, but there we have a rwlock
that protects the network configuration.  For ioctls like DIOCGETSTATES
we do indeed call uvm_vslock(), then take the lock for reading or
writing, depending on which ioctl it is, and unlock it prior to return.

Since this is only the network conf lock, taking it with a 'read' for
something like DIOCGETSTATES does not really influence the network
receive/transmit path, I think.
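
For readers following the thread, the hold-references approach quoted
above (allocate, lock and take references, unlock and copy out, drop
references) can be sketched in userland C as follows.  This is only a
model under stated assumptions: struct lock, struct state, struct table
and export_states() are hypothetical names, the lock is a plain flag
rather than a mutex or rwlock, and the plain assignment stands in for
copyout(9).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical lock model: a flag that records whether it is held. */
struct lock {
	int held;
};

static void lock_enter(struct lock *l) { assert(!l->held); l->held = 1; }
static void lock_exit(struct lock *l) { assert(l->held); l->held = 0; }

/* Hypothetical state entry with a reference count. */
struct state {
	int refcnt;
	int id;
};

struct table {
	struct lock lock;
	struct state **states;
	size_t nstates;
};

/* Export up to max state ids into out; returns the number exported. */
static size_t
export_states(struct table *t, int *out, size_t max)
{
	struct state **hold;
	size_t i, n;

	/* 1. Allocate the hold array before taking any lock. */
	hold = calloc(max ? max : 1, sizeof(*hold));
	if (hold == NULL)
		return 0;

	/* 2. Take the lock only long enough to grab references. */
	lock_enter(&t->lock);
	n = t->nstates < max ? t->nstates : max;
	for (i = 0; i < n; i++) {
		hold[i] = t->states[i];
		hold[i]->refcnt++;
	}
	lock_exit(&t->lock);

	/* 3. Copy out with the lock dropped (models copyout(9)). */
	assert(!t->lock.held);
	for (i = 0; i < n; i++)
		out[i] = hold[i]->id;

	/* 4. Drop the references and release the hold array. */
	lock_enter(&t->lock);
	for (i = 0; i < n; i++)
		hold[i]->refcnt--;
	lock_exit(&t->lock);
	free(hold);

	return n;
}
```

The point of the pattern is that step 3, the only step that may fault or
sleep, runs without the lock, at the cost of one extra allocation and two
short locked passes.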

Re: xhci early enumeration

2021-05-21 Thread Patrick Wildt

On Wed, May 19, 2021 at 07:15:50AM +, Christian Ludwig wrote:
> The usb(4) driver allows enumerating the bus early during boot by
> setting its driver flags to 0x1 in UKC. This mechanism can enable a USB
> console keyboard early during autoconf(9), which can come in handy at
> times. This needs USB polling mode to work, which is a bit broken. Here
> is my attempt to fix it for xhci(4) controllers.
> 
> According to the xHCI specification section 4.2 "Host Controller
> Initialization", the host controller must be fully initialized before
> descending into device enumeration. Then xhci(4) sends command TRBs to
> open new pipes during enumeration. They wait for completion using
> tsleep(). This is bad when in polling mode at boot. And finally, the
> behavior should be the same on resume as it is at boot. Therefore also
> enumerate USB devices during resume when the flag is set.
> 
> I am specifically looking for tests on xhci controllers with usb(4)
> flags set to 1 in UKC.
> 
> So long,
> 
> 
>  - Christian
> 
> 
> diff --git a/sys/arch/armv7/marvell/mvxhci.c b/sys/arch/armv7/marvell/mvxhci.c
> index 38a636fd123..2137f68b816 100644
> --- a/sys/arch/armv7/marvell/mvxhci.c
> +++ b/sys/arch/armv7/marvell/mvxhci.c
> @@ -155,12 +155,12 @@ mvxhci_attach(struct device *parent, struct device *self, void *aux)
>   goto disestablish_ret;
>   }
>  
> - /* Attach usb device. */
> - config_found(self, &sc->sc.sc_bus, usbctlprint);
> -
>   /* Now that the stack is ready, config' the HC and enable interrupts. */
>   xhci_config(&sc->sc);
>  
> + /* Attach usb device. */
> + config_found(self, &sc->sc.sc_bus, usbctlprint);
> +
>   return;
>  
>  disestablish_ret:
> diff --git a/sys/dev/acpi/xhci_acpi.c b/sys/dev/acpi/xhci_acpi.c
> index 95e69cee896..d762f69a00e 100644
> --- a/sys/dev/acpi/xhci_acpi.c
> +++ b/sys/dev/acpi/xhci_acpi.c
> @@ -112,12 +112,12 @@ xhci_acpi_attach(struct device *parent, struct device *self, void *aux)
>   goto disestablish_ret;
>   }
>  
> - /* Attach usb device. */
> - config_found(self, &sc->sc.sc_bus, usbctlprint);
> -
>   /* Now that the stack is ready, config' the HC and enable interrupts. */
>   xhci_config(&sc->sc);
>  
> + /* Attach usb device. */
> + config_found(self, &sc->sc.sc_bus, usbctlprint);
> +
>   return;
>  
>  disestablish_ret:
> diff --git a/sys/dev/fdt/xhci_fdt.c b/sys/dev/fdt/xhci_fdt.c
> index 38c976a6b24..84e00bdadc5 100644
> --- a/sys/dev/fdt/xhci_fdt.c
> +++ b/sys/dev/fdt/xhci_fdt.c
> @@ -116,12 +116,12 @@ xhci_fdt_attach(struct device *parent, struct device *self, void *aux)
>   goto disestablish_ret;
>   }
>  
> - /* Attach usb device. */
> - config_found(self, &sc->sc.sc_bus, usbctlprint);
> -
>   /* Now that the stack is ready, config' the HC and enable interrupts. */
>   xhci_config(&sc->sc);
>  
> + /* Attach usb device. */
> + config_found(self, &sc->sc.sc_bus, usbctlprint);
> +
>   return;
>  
>  disestablish_ret:
> diff --git a/sys/dev/pci/xhci_pci.c b/sys/dev/pci/xhci_pci.c
> index fa3271b0d30..0b46083b705 100644
> --- a/sys/dev/pci/xhci_pci.c
> +++ b/sys/dev/pci/xhci_pci.c
> @@ -195,12 +195,12 @@ xhci_pci_attach(struct device *parent, struct device *self, void *aux)
>   if (PCI_VENDOR(psc->sc_id) == PCI_VENDOR_INTEL)
>   xhci_pci_port_route(psc);
>  
> - /* Attach usb device. */
> - config_found(self, &psc->sc.sc_bus, usbctlprint);
> -
>   /* Now that the stack is ready, config' the HC and enable interrupts. */
>   xhci_config(&psc->sc);
>  
> + /* Attach usb device. */
> + config_found(self, &psc->sc.sc_bus, usbctlprint);
> +
>   return;
>  
>  disestablish_ret:

The interesting thing is that xhci_config() used to be part of
xhci_init() and was explicitly taken out from it to fix a panic
that showed up when enumeration happened afterwards.

https://github.com/openbsd/src/commit/48155c88d2b90737b892a715e56d81bc73254308

Is it possible that this works in polling mode, but not without?

While I agree that moving xhci_config() before enumeration creates
consistency with the others, this change was done deliberately and
we should find out why.

mpi, do you still happen to have the logs or the machine for that
particular issue?

Patrick

> diff --git a/sys/dev/usb/usb.c b/sys/dev/usb/usb.c
> index b8943882d0a..f9aff94bfee 100644
> --- a/sys/dev/usb/usb.c
> +++ b/sys/dev/usb/usb.c
> @@ -911,8 +911,19 @@ usb_activate(struct device *self, int act)
>* hub transfers do not need to sleep.
>*/
>   sc->sc_bus->use_polling++;
> - if (!usb_attach_roothub(sc))
> + if (!usb_attach_roothub(sc)) {
> + struct usbd_device *dev = sc->sc_bus->root_hub;
> +#if 1
> + /*
> +  * Turning this code off will delay attachment of USB devices
> +  * until

Re: xhci early enumeration

2021-05-21 Thread Patrick Wildt

On Fri, May 21, 2021 at 06:18:40PM +0200, Martin Pieuchot wrote:
> On 21/05/21(Fri) 10:48, Patrick Wildt wrote:
> > On Wed, May 19, 2021 at 07:15:50AM +, Christian Ludwig wrote:
> > > The usb(4) driver allows enumerating the bus early during boot by
> > > setting its driver flags to 0x1 in UKC. This mechanism can enable a USB
> > > console keyboard early during autoconf(9), which can come in handy at
> > > times. This needs USB polling mode to work, which is a bit broken. Here
> > > is my attempt to fix it for xhci(4) controllers.
> > > 
> > > According to the xHCI specification section 4.2 "Host Controller
> > > Initialization", the host controller must be fully initialized before
> > > descending into device enumeration. Then xhci(4) sends command TRBs to
> > > open new pipes during enumeration. They wait for completion using
> > > tsleep(). This is bad when in polling mode at boot. And finally, the
> > > behavior should be the same on resume as it is at boot. Therefore also
> > > enumerate USB devices during resume when the flag is set.
> > > 
> > > I am specifically looking for tests on xhci controllers with usb(4)
> > > flags set to 1 in UKC.
> > > 
> > > So long,
> > > 
> > > 
> > >  - Christian
> > > 
> > > 
> > > diff --git a/sys/arch/armv7/marvell/mvxhci.c b/sys/arch/armv7/marvell/mvxhci.c
> > > index 38a636fd123..2137f68b816 100644
> > > --- a/sys/arch/armv7/marvell/mvxhci.c
> > > +++ b/sys/arch/armv7/marvell/mvxhci.c
> > > @@ -155,12 +155,12 @@ mvxhci_attach(struct device *parent, struct device *self, void *aux)
> > >   goto disestablish_ret;
> > >   }
> > >  
> > > - /* Attach usb device. */
> > > - config_found(self, &sc->sc.sc_bus, usbctlprint);
> > > -
> > >   /* Now that the stack is ready, config' the HC and enable interrupts. */
> > >   xhci_config(&sc->sc);
> > >  
> > > + /* Attach usb device. */
> > > + config_found(self, &sc->sc.sc_bus, usbctlprint);
> > > +
> > >   return;
> > >  
> > >  disestablish_ret:
> > > diff --git a/sys/dev/acpi/xhci_acpi.c b/sys/dev/acpi/xhci_acpi.c
> > > index 95e69cee896..d762f69a00e 100644
> > > --- a/sys/dev/acpi/xhci_acpi.c
> > > +++ b/sys/dev/acpi/xhci_acpi.c
> > > @@ -112,12 +112,12 @@ xhci_acpi_attach(struct device *parent, struct device *self, void *aux)
> > >   goto disestablish_ret;
> > >   }
> > >  
> > > - /* Attach usb device. */
> > > - config_found(self, &sc->sc.sc_bus, usbctlprint);
> > > -
> > >   /* Now that the stack is ready, config' the HC and enable interrupts. */
> > >   xhci_config(&sc->sc);
> > >  
> > > + /* Attach usb device. */
> > > + config_found(self, &sc->sc.sc_bus, usbctlprint);
> > > +
> > >   return;
> > >  
> > >  disestablish_ret:
> > > diff --git a/sys/dev/fdt/xhci_fdt.c b/sys/dev/fdt/xhci_fdt.c
> > > index 38c976a6b24..84e00bdadc5 100644
> > > --- a/sys/dev/fdt/xhci_fdt.c
> > > +++ b/sys/dev/fdt/xhci_fdt.c
> > > @@ -116,12 +116,12 @@ xhci_fdt_attach(struct device *parent, struct device *self, void *aux)
> > >   goto disestablish_ret;
> > >   }
> > >  
> > > - /* Attach usb device. */
> > > - config_found(self, &sc->sc.sc_bus, usbctlprint);
> > > -
> > >   /* Now that the stack is ready, config' the HC and enable interrupts. */
> > >   xhci_config(&sc->sc);
> 
> > >  
> > > + /* Attach usb device. */
> > > + config_found(self, &sc->sc.sc_bus, usbctlprint);
> > > +
> > >   return;
> > >  
> > >  disestablish_ret:
> > > diff --git a/sys/dev/pci/xhci_pci.c b/sys/dev/pci/xhci_pci.c
> > > index fa3271b0d30..0b46083b705 100644
> > > --- a/sys/dev/pci/xhci_pci.c
> > > +++ b/sys/dev/pci/xhci_pci.c
> > > @@ -195,12 +195,12 @@ xhci_pci_attach(struct device *parent, struct device *self, void *aux)
> > >   if (PCI_VENDOR(psc->sc_id) == PCI_VENDOR_INTEL)
> > >   xhci_pci_port_route(psc);
> > >  
> > > - /* Attach usb device. */
> > > - config_found(self, &psc->sc.sc_bus, usbctlprint);
> > > -
> > >   /* Now that the stack is ready, config' the HC and enable in

Re: xhci early enumeration

2021-05-21 Thread Patrick Wildt

On Fri, May 21, 2021 at 07:24:59PM +0200, Mark Kettenis wrote:
> > Date: Fri, 21 May 2021 19:01:39 +0200
> > From: Patrick Wildt 
> > 
> > On Fri, May 21, 2021 at 06:18:40PM +0200, Martin Pieuchot wrote:
> > > On 21/05/21(Fri) 10:48, Patrick Wildt wrote:
> > > > On Wed, May 19, 2021 at 07:15:50AM +, Christian Ludwig wrote:
> > > > > The usb(4) driver allows enumerating the bus early during boot by
> > > > > setting its driver flags to 0x1 in UKC. This mechanism can enable a
> > > > > USB console keyboard early during autoconf(9), which can come in
> > > > > handy at times. This needs USB polling mode to work, which is a bit
> > > > > broken. Here is my attempt to fix it for xhci(4) controllers.
> > > > > 
> > > > > According to the xHCI specification section 4.2 "Host Controller
> > > > > Initialization", the host controller must be fully initialized before
> > > > > descending into device enumeration. Then xhci(4) sends command TRBs to
> > > > > open new pipes during enumeration. They wait for completion using
> > > > > tsleep(). This is bad when in polling mode at boot. And finally, the
> > > > > behavior should be the same on resume as it is at boot. Therefore also
> > > > > enumerate USB devices during resume when the flag is set.
> > > > > 
> > > > > I am specifically looking for tests on xhci controllers with usb(4)
> > > > > flags set to 1 in UKC.
> > > > > 
> > > > > So long,
> > > > > 
> > > > > 
> > > > >  - Christian
> > > > > 
> > > > > 
> > > > > diff --git a/sys/arch/armv7/marvell/mvxhci.c b/sys/arch/armv7/marvell/mvxhci.c
> > > > > index 38a636fd123..2137f68b816 100644
> > > > > --- a/sys/arch/armv7/marvell/mvxhci.c
> > > > > +++ b/sys/arch/armv7/marvell/mvxhci.c
> > > > > @@ -155,12 +155,12 @@ mvxhci_attach(struct device *parent, struct device *self, void *aux)
> > > > >   goto disestablish_ret;
> > > > >   }
> > > > >  
> > > > > - /* Attach usb device. */
> > > > > - config_found(self, &sc->sc.sc_bus, usbctlprint);
> > > > > -
> > > > >   /* Now that the stack is ready, config' the HC and enable interrupts. */
> > > > >   xhci_config(&sc->sc);
> > > > >  
> > > > > + /* Attach usb device. */
> > > > > + config_found(self, &sc->sc.sc_bus, usbctlprint);
> > > > > +
> > > > >   return;
> > > > >  
> > > > >  disestablish_ret:
> > > > > diff --git a/sys/dev/acpi/xhci_acpi.c b/sys/dev/acpi/xhci_acpi.c
> > > > > index 95e69cee896..d762f69a00e 100644
> > > > > --- a/sys/dev/acpi/xhci_acpi.c
> > > > > +++ b/sys/dev/acpi/xhci_acpi.c
> > > > > @@ -112,12 +112,12 @@ xhci_acpi_attach(struct device *parent, struct device *self, void *aux)
> > > > >   goto disestablish_ret;
> > > > >   }
> > > > >  
> > > > > - /* Attach usb device. */
> > > > > - config_found(self, &sc->sc.sc_bus, usbctlprint);
> > > > > -
> > > > >   /* Now that the stack is ready, config' the HC and enable interrupts. */
> > > > >   xhci_config(&sc->sc);
> > > > >  
> > > > > + /* Attach usb device. */
> > > > > + config_found(self, &sc->sc.sc_bus, usbctlprint);
> > > > > +
> > > > >   return;
> > > > >  
> > > > >  disestablish_ret:
> > > > > diff --git a/sys/dev/fdt/xhci_fdt.c b/sys/dev/fdt/xhci_fdt.c
> > > > > index 38c976a6b24..84e00bdadc5 100644
> > > > > --- a/sys/dev/fdt/xhci_fdt.c
> > > > > +++ b/sys/dev/fdt/xhci_fdt.c
> > > > > @@ -116,12 +116,12 @@ xhci_fdt_attach(struct device *parent, struct device *self, void *aux)
> > > > >   goto disestablish_ret;
> > > > >   }
> > > > >  
> > > > > - /* Attach usb device. */
> > > > &

Re: amd64: softintr_dispatch: remove kernel lock

2021-05-22 Thread Patrick Wildt

On Sat, May 22, 2021 at 02:33:47PM +0200, Mark Kettenis wrote:
> > Date: Sat, 22 May 2021 11:11:38 +
> > From: Visa Hankala 
> > 
> > On Wed, May 19, 2021 at 05:11:09PM -0500, Scott Cheloha wrote:
> > > Hi,
> > > 
> > > visa@ says I need to unlock softintr_dispatch() before I can
> > > unlock softclock(), so let's do that.
> > > 
> > > Additionally, when we call softintr_disestablish() we want to wait for
> > > the underlying softintr handle to finish running if it is running.
> > > 
> > > We can start with amd64.
> > > 
> > > I think this approach will work:
> > > 
> > > - Keep a pointer to the running softintr, if any, in the queue.  NULL
> > >   the pointer when we return from sih_func().
> > > 
> > > - Take/release the kernel lock if the SI_MPSAFE flag is present when
> > >   we enter/leave sih_func().
> > > 
> > > - If the handle is running when you call softintr_disestablish(), spin
> > >   until the handle isn't running anymore and retry.
> > > 
> > > There is no softintr manpage but I think it is understood that
> > > softintr_disestablish() is only safe to call from a process context,
> > > otherwise you may deadlock.  Maybe we should do splassert(IPL_NONE)?
> > > 
> > > We could probably sleep here instead of spinning.  We'd have to change
> > > every softintr_disestablish() implementation to do that, though.
> > > Otherwise you'd have different behavior on different platforms.
> > 
> > I think your diff does not pay enough attention to the fact that soft
> > interrupts are handled by all CPUs. I think the diff that I posted
> > a while ago [1] is better in that respect.
> > 
> > Two biggest things that I do not like in my original diff are
> > synchronization of handler execution, and use of the SMR barrier.
> > 
> > [1] https://marc.info/?l=openbsd-tech&m=162092714911609
> > 
> > The kernel lock has guaranteed that at most one CPU is able to run
> > a given soft interrupt handler at a time. My diff used a mutex to
> > prevent concurrent execution. However, it is wasteful to spin. It would
> > be more economical to let the current runner of the handler re-execute
> > the code.
> > 
> > The SMR barrier in softintr_disestablish() was a trick to drain any
> > pending activity. However, it made me feel uneasy because I have not
> > checked every caller of softintr_disestablish(). My main worry is not
> > the latency but unexpected side effects.
> > 
> > Below is a revised diff that improves the above two points.
> > 
> > When a soft interrupt handler is scheduled, it is assigned to a CPU.
> > That CPU will keep running the handler as long as there are pending
> > requests. Once all pending requests have been drained, the CPU
> > relinquishes its hold of the handler. This provides natural
> > serialization.
> > 
> > Now softintr_disestablish() uses spinning for draining activity.
> > I still have slight qualms about this, though, because the feature
> > has not been so explicit before. Integration with witness(4) might be
> > in order.
> > 
> > softintr_disestablish() uses READ_ONCE() to enforce reloading of the
> > value in the busy-wait loop. This way the variable does not need to be
> > volatile. (As yet another option, CPU_BUSY_CYCLE() could always
> > imply memory clobbering, which should make an optimizing compiler
> > redo the load.) For consistency with this READ_ONCE(), WRITE_ONCE() is
> > used whenever the variable is written, excluding the initialization.
> > 
> > The patch uses a single mutex for access serialization. The old code
> > used one mutex per soft IPL level, but I am not sure how
> > useful that has been. I think it would be better to have a separate
> > mutex for each CPU. However, the increased code complexity might not
> > be worthwhile at the moment. Even having the per-CPU queues has
> > a whiff of being somewhat overkill.
> 
> A few comments:
> 
> * Looking at amd64 in isolation does not make sense.  Like a lot of MD
>   code in OpenBSD the softintr code was copied from whatever
>   Net/FreeBSD had at the time, with no attempt at unification (it
>   works, check it in, don't go back to clean it up).  However, with
>   powerpc64 and riscv64 we try to do things a little bit better in
>   that area.  So arm64, powerpc64 and riscv64 share the same softintr
>   implementation that already implements softintr_establish_flags()
>   with SOFTINTR_ESTABLISH_MPSAFE.  Now we haven't used that flag
>   anywhere in our tree yet, so the code might be completely busted.
>   But it may make a lot of sense to migrate other architectures to the
>   same codebase.
> 
> * The softintr_disestablish() function isn't used a lot in our tree.
>   It may make sense to postpone worrying about safely disestablishing
>   mpsafe soft interrupts for now and simply panic if someone tries to
>   do this.
> 
> * Wouldn't it make sense for an mpsafe soft interrupt to protect
>   itself from running simultaneously on multiple CPUs?  It probably
>   already needs some sort of lock to protect th

mcx(4): sync only received length on RX

2021-05-31 Thread Patrick Wildt

Hi,

mcx(4) seems to sync the whole mapsize on processing a received packet.
As far as I know, we usually only sync the actual size that we have
received.  Noticed this when doing bounce buffer tests, seeing that
it copied a lot more data than is necessary.

That's because the RX buffer size is the maximum supported MTU, which is
about 9500 bytes or so.  For small packets, or regular 1500 bytes,
this adds overhead.
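
To put a rough number on that overhead, here is a minimal sketch; it is
not the driver code.  The helper names rx_sync_len() and rx_sync_saved()
are made up for illustration, and the clamping to the map size is an
extra assumption that the diff in this mail does not make.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of bytes to sync for one received packet.
 * 'mapsize' is the size of the whole RX buffer mapping, 'pktlen' the
 * length reported by the hardware.  The clamp is an assumption added
 * for safety in this sketch only.
 */
static size_t
rx_sync_len(size_t mapsize, size_t pktlen)
{
	assert(mapsize > 0);
	return (pktlen < mapsize ? pktlen : mapsize);
}

/* Bytes that no longer have to be copied through a bounce buffer. */
static size_t
rx_sync_saved(size_t mapsize, size_t pktlen)
{
	return (mapsize - rx_sync_len(mapsize, pktlen));
}
```

With a buffer of about 9500 bytes, a regular 1500 byte packet would save
roughly 8000 bytes of copying per packet on a bounce-buffer setup.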

This diff should not change anything for ARM machines that have a
cache coherent PCIe bus or x86.

ok?

Patrick

diff --git a/sys/dev/pci/if_mcx.c b/sys/dev/pci/if_mcx.c
index 38437e54897..065855d46d3 100644
--- a/sys/dev/pci/if_mcx.c
+++ b/sys/dev/pci/if_mcx.c
@@ -6800,20 +6800,20 @@ mcx_process_rx(struct mcx_softc *sc, struct mcx_rx *rx,
 {
struct mcx_slot *ms;
struct mbuf *m;
-   uint32_t flags;
+   uint32_t flags, len;
int slot;
 
+   len = bemtoh32(&cqe->cq_byte_cnt);
slot = betoh16(cqe->cq_wqe_count) % (1 << MCX_LOG_RQ_SIZE);
 
ms = &rx->rx_slots[slot];
-   bus_dmamap_sync(sc->sc_dmat, ms->ms_map, 0, ms->ms_map->dm_mapsize,
-   BUS_DMASYNC_POSTREAD);
+   bus_dmamap_sync(sc->sc_dmat, ms->ms_map, 0, len, BUS_DMASYNC_POSTREAD);
bus_dmamap_unload(sc->sc_dmat, ms->ms_map);
 
m = ms->ms_m;
ms->ms_m = NULL;
 
-   m->m_pkthdr.len = m->m_len = bemtoh32(&cqe->cq_byte_cnt);
+   m->m_pkthdr.len = m->m_len = len;
 
if (cqe->cq_rx_hash_type) {
m->m_pkthdr.ph_flowid = betoh32(cqe->cq_rx_hash);

nvme(4): fix prpl sync length

2021-05-31 Thread Patrick Wildt

Hi,

this call to sync the DMA memory wants to sync N - 1 prpl entries,
as the first segment is configured regularly, while the addresses
of the following segments (if there are more than 2) are in a
special DMA memory.

The code currently removes a single byte, instead of an entry.
This just means that it is syncing more than it should.
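
The difference is plain C operator precedence.  A minimal sketch,
assuming 8-byte prpl entries (uint64_t stands in for the entry type
here, and the function names are made up for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length as currently written: '*' binds tighter than '-'. */
static size_t
prpl_len_old(size_t nsegs)
{
	assert(nsegs >= 1);
	return (sizeof(uint64_t) * nsegs - 1);
}

/* Length as intended: one entry fewer than the number of segments. */
static size_t
prpl_len_new(size_t nsegs)
{
	assert(nsegs >= 1);
	return (sizeof(uint64_t) * (nsegs - 1));
}
```

For four segments the old expression yields 31 bytes, the new one 24,
which is exactly three entries.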

ok?

Patrick

diff --git a/sys/dev/ic/nvme.c b/sys/dev/ic/nvme.c
index 62b8e40c626..6db25260ef0 100644
--- a/sys/dev/ic/nvme.c
+++ b/sys/dev/ic/nvme.c
@@ -629,7 +629,7 @@ nvme_scsi_io(struct scsi_xfer *xs, int dir)
bus_dmamap_sync(sc->sc_dmat,
NVME_DMA_MAP(sc->sc_ccb_prpls),
ccb->ccb_prpl_off,
-   sizeof(*ccb->ccb_prpl) * dmap->dm_nsegs - 1,
+   sizeof(*ccb->ccb_prpl) * (dmap->dm_nsegs - 1),
BUS_DMASYNC_PREWRITE);
}
 
@@ -691,7 +691,7 @@ nvme_scsi_io_done(struct nvme_softc *sc, struct nvme_ccb 
*ccb,
bus_dmamap_sync(sc->sc_dmat,
NVME_DMA_MAP(sc->sc_ccb_prpls),
ccb->ccb_prpl_off,
-   sizeof(*ccb->ccb_prpl) * dmap->dm_nsegs - 1,
+   sizeof(*ccb->ccb_prpl) * (dmap->dm_nsegs - 1),
BUS_DMASYNC_POSTWRITE);
}

Re: sdmmc(4): check and retry bus width change

2021-06-01 Thread Patrick Wildt

Am Mon, Feb 22, 2021 at 08:10:21PM +0100 schrieb Patrick Wildt:
> Hi,
> 
> it seems like some eMMCs are not capable of doing 8-bit operation,
> even if the controller supports it.  I was questioning our drivers
> first, but it looks like it's the same on Linux.  In the case that
> 8-bit doesn't work, they seem to fall back to lower values to make
> that HW work.
> 
> This diff implements a mechanism that tries 8-bit, if available,
> then 4-bit and in the end falls back to 1-bit.  This makes my HW
> work, but I would like to have this tested by a broader audience.
> 
> Apparently there's a "bus test" command, but it isn't implemented
> on all host controllers.  Hence I simply try to read the EXT_CSD
> to make sure the transfer works.
> 
> For testing, a print like
> 
> printf("%s: using %u-bit width\n", DEVNAME(sc), width);
> 
> could be added at line 928.
> 
> What could possible regressions be?  The width could become smaller
> than previously.  This would reduce the read/write transfer speed.
> Also it's possible that eMMCs are not recognized/initialized anymore.
> 
> What could possible improvements be?  eMMCs that previously didn't
> work now work, with at least 1-bit or 4-bit wide transfers.
> 
> Please note that this only works for eMMCs.  SD cards are *not* using
> this code path.  SD cards have a different initialization code path.
> 
> Please report any changes or non-changes.  If nothing changes, that's
> perfect.
> 
> Patrick

Anyone want to give this a try?  It's basically relevant for all
ARM machines with eMMC ('soldered SD cards'), and they should work
as well as before.
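
The fallback order from the quoted mail can be sketched roughly as
follows.  This is a simplified model, not the diff itself: the probe
callback, the caps8/caps4 parameters and all names are assumptions
made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical probe callback: returns 0 if transfers work at 'width'. */
typedef int (*width_probe_t)(void *arg, int width);

/*
 * Try the widest supported bus width first and fall back step by step.
 * 'caps8' and 'caps4' stand in for the controller capability bits.
 * Returns the selected width, or -1 if even 1-bit fails.
 */
static int
select_bus_width(width_probe_t probe, void *arg, int caps8, int caps4)
{
	assert(probe != NULL);
	if (caps8 && probe(arg, 8) == 0)
		return (8);
	if (caps4 && probe(arg, 4) == 0)
		return (4);
	if (probe(arg, 1) == 0)
		return (1);
	return (-1);
}

/* Example probe: the card only works up to the width stored in *arg. */
static int
probe_max_width(void *arg, int width)
{
	return (width <= *(int *)arg ? 0 : 1);
}
```

An eMMC that only manages 4-bit behind an 8-bit capable controller
ends up with width 4 instead of failing to attach.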

> diff --git a/sys/dev/sdmmc/sdmmc_mem.c b/sys/dev/sdmmc/sdmmc_mem.c
> index 59bcb1b4a11..5856b9bb1b3 100644
> --- a/sys/dev/sdmmc/sdmmc_mem.c
> +++ b/sys/dev/sdmmc/sdmmc_mem.c
> @@ -56,6 +56,8 @@ int sdmmc_mem_signal_voltage(struct sdmmc_softc *, int);
>  
>  int  sdmmc_mem_sd_init(struct sdmmc_softc *, struct sdmmc_function *);
>  int  sdmmc_mem_mmc_init(struct sdmmc_softc *, struct sdmmc_function *);
> +int  sdmmc_mem_mmc_select_bus_width(struct sdmmc_softc *,
> + struct sdmmc_function *, int);
>  int  sdmmc_mem_single_read_block(struct sdmmc_function *, int, u_char *,
>   size_t);
>  int  sdmmc_mem_read_block_subr(struct sdmmc_function *, bus_dmamap_t,
> @@ -908,31 +910,20 @@ sdmmc_mem_mmc_init(struct sdmmc_softc *sc, struct 
> sdmmc_function *sf)
>   ext_csd[EXT_CSD_CARD_TYPE]);
>   }
>  
> - if (ISSET(sc->sc_caps, SMC_CAPS_8BIT_MODE)) {
> + if (ISSET(sc->sc_caps, SMC_CAPS_8BIT_MODE) &&
> + sdmmc_mem_mmc_select_bus_width(sc, sf, 8) == 0)
>   width = 8;
> - value = EXT_CSD_BUS_WIDTH_8;
> - } else if (ISSET(sc->sc_caps, SMC_CAPS_4BIT_MODE)) {
> + else if (ISSET(sc->sc_caps, SMC_CAPS_4BIT_MODE) &&
> + sdmmc_mem_mmc_select_bus_width(sc, sf, 4) == 0)
>   width = 4;
> - value = EXT_CSD_BUS_WIDTH_4;
> - } else {
> - width = 1;
> - value = EXT_CSD_BUS_WIDTH_1;
> - }
> -
> - if (width != 1) {
> - error = sdmmc_mem_mmc_switch(sf, EXT_CSD_CMD_SET_NORMAL,
> - EXT_CSD_BUS_WIDTH, value);
> - if (error == 0)
> - error = sdmmc_chip_bus_width(sc->sct,
> - sc->sch, width);
> - else {
> + else {
> + error = sdmmc_mem_mmc_select_bus_width(sc, sf, 1);
> + if (error != 0) {
>   DPRINTF(("%s: can't change bus width"
>   " (%d bit)\n", DEVNAME(sc), width));
>   return error;
>   }
> -
> - /* : need bus test? (using by CMD14 & CMD19) */
> - sdmmc_delay(1);
> + width = 1;
>   }
>  
>   if (timing != SDMMC_TIMING_LEGACY) {
> @@ -1041,6 +1032,59 @@ sdmmc_mem_mmc_init(struct sdmmc_softc *sc, struct 
> sdmmc_function *sf)
>   return error;
>  }
>  
> +int
> +sdmmc_mem_mmc_select_bus_width(struct sdmmc_softc *sc, struct sdmmc_function 
> *sf,
> +int width)
> +{
> + u_int8_t ext_csd[512];
> + int error, value;
> +
> + switch (width) {
> + case 8:
> + value = EXT_CSD_BUS_WIDTH_8;
> +

Re: 10gbase-r support for mvpp(4)

2021-06-02 Thread Patrick Wildt

Am Wed, Jun 02, 2021 at 10:37:36PM +0200 schrieb Mark Kettenis:
> Linux folks changed the device tree to use 10gbase-r instead of
> 10gbase-kr since "it is more correct".  Then the UEFI folks synched
> their device trees to Linux and the 10G ports broke.  So accept both
> in the code.
> 
> ok?
> 

Sigh. sure, ok patrick@.

Though you could probably put it into a single if (!strncmp ||
!strncmp), but I don't mind either way.
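
Roughly like this; the helper name is_10gbase_r() is made up for
illustration, in the driver the condition would simply sit in the
existing if statement:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if 'phy_mode' starts with either spelling of the 10G mode. */
static int
is_10gbase_r(const char *phy_mode)
{
	assert(phy_mode != NULL);
	return (!strncmp(phy_mode, "10gbase-r", strlen("10gbase-r")) ||
	    !strncmp(phy_mode, "10gbase-kr", strlen("10gbase-kr")));
}
```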

> 
> Index: dev/fdt/if_mvpp.c
> ===
> RCS file: /cvs/src/sys/dev/fdt/if_mvpp.c,v
> retrieving revision 1.44
> diff -u -p -r1.44 if_mvpp.c
> --- dev/fdt/if_mvpp.c 12 Dec 2020 11:48:52 -  1.44
> +++ dev/fdt/if_mvpp.c 2 Jun 2021 20:34:30 -
> @@ -1354,7 +1354,9 @@ mvpp2_port_attach(struct device *parent,
>  
>   phy_mode = malloc(len, M_TEMP, M_WAITOK);
>   OF_getprop(sc->sc_node, "phy-mode", phy_mode, len);
> - if (!strncmp(phy_mode, "10gbase-kr", strlen("10gbase-kr")))
> + if (!strncmp(phy_mode, "10gbase-r", strlen("10gbase-r")))
> + sc->sc_phy_mode = PHY_MODE_10GBASER;
> + else if (!strncmp(phy_mode, "10gbase-kr", strlen("10gbase-kr")))
>   sc->sc_phy_mode = PHY_MODE_10GBASER;
>   else if (!strncmp(phy_mode, "2500base-x", strlen("2500base-x")))
>   sc->sc_phy_mode = PHY_MODE_2500BASEX;
>

Re: Patch: Send AUTHENTICATION_FAILED in case of unexpected auth method or auth data not being accessible

2021-06-29 Thread Patrick Wildt

Am Tue, Jun 29, 2021 at 10:39:06AM + schrieb Claudia Priesterjahn:
> We added two AUTHENTICATION_FAILED notifications for the cases that
> the peer used an unexpected authentication method and for the case
> that the peer's authentication data is not accessible.

Bit of a spacing issue, but that can be fixed prior to commit.  I'd also
drop the comments, since the function call seems self explanatory.  With
that changed, ok patrick@.

> diff --git a/sbin/iked/ikev2.c b/sbin/iked/ikev2.c
> index 9e890979110..1870dc18459 100644
> --- a/sbin/iked/ikev2.c
> +++ b/sbin/iked/ikev2.c
> @@ -805,6 +805,8 @@ ikev2_auth_verify(struct iked *env, struct iked_sa *sa)
> ikev2_auth_map),
> print_map(ikeauth.auth_method,
> ikev2_auth_map));
> +   /* send N(AUTHENTICATION_FAILED) back */
> +   ikev2_send_auth_failed(env, sa);
> return (-1);
> }
> ikeauth.auth_method = sa->sa_peerauth.id_type;
> @@ -813,6 +815,8 @@ ikev2_auth_verify(struct iked *env, struct iked_sa *sa)
> sa->sa_hdr.sh_initiator)) == NULL) {
> log_debug("%s: failed to get auth data",
> __func__);
> +   /* send N(AUTHENTICATION_FAILED) back */
> +   ikev2_send_auth_failed(env, sa);
> return (-1);
> }
>

dwiic(4): wait for tx empty when hitting tx limit

2021-07-04 Thread Patrick Wildt

Hi,

I had trouble interfacing with a machine's IPMI through dwiic(4).  What
I saw was that when sending 'bigger' commands, it would never receive
the STOP bit interrupt.

The trouble is, as can be seen in the log, that we want to send (it
says read, but it's a write OP, so it's send) 20 bytes, but the tx
limit says 14.

What we should do is send 14 bytes, then wait for it to send us the
tx empty interrupt (like we do when we first enable the controller),
and then re-read the tx limit.  The last line in the log is some
debug print I added for myself, but is not part of the diff.
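
As a toy model of that idea (not the driver code): the struct, the
function names and the "drain completely on wait" behaviour are
assumptions.  A depth of 16 with 2 entries already queued is chosen
only because it gives the limit of 14 seen in the log.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a TX FIFO; not the real controller registers. */
struct fifo_model {
	size_t depth;	/* total FIFO entries */
	size_t level;	/* entries currently queued */
};

/* Free entries, computed as depth minus the current fill level. */
static size_t
fifo_tx_limit(const struct fifo_model *f)
{
	assert(f->level <= f->depth);
	return (f->depth - f->level);
}

/* Stand-in for sleeping until the tx empty interrupt arrives. */
static void
fifo_wait_empty(struct fifo_model *f)
{
	f->level = 0;
}

/*
 * Queue 'len' bytes, pausing whenever the limit is used up and
 * re-reading the limit afterwards.  Returns the number of waits.
 */
static int
fifo_send(struct fifo_model *f, size_t len)
{
	size_t x, tx_limit;
	int waits = 0;

	assert(f->depth > 0);
	tx_limit = fifo_tx_limit(f);
	for (x = 0; x < len; ) {
		if (tx_limit == 0) {
			fifo_wait_empty(f);
			tx_limit = fifo_tx_limit(f);
			waits++;
			continue;
		}
		f->level++;
		tx_limit--;
		x++;
	}
	return (waits);
}
```

In this model, sending 20 bytes with a limit of 14 needs exactly one
wait, while the 3 byte transfer goes through without any.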

With this, I was finally able to change the IPMI password and regain
access to the web interface after updating the BMC's firmware...

dwiic0: dwiic_i2c_exec: op 7, addr 0x10, cmdlen 2, len 3, flags 0x00
dwiic0: dwiic_i2c_exec: need to read 3 bytes, can send 14 read reqs
dwiic0: dwiic_i2c_exec: op 5, addr 0x10, cmdlen 1, len 33, flags 0x00
dwiic0: dwiic_i2c_exec: need to read 33 bytes, can send 15 read reqs
dwiic0: dwiic_i2c_exec: op 7, addr 0x10, cmdlen 2, len 20, flags 0x00
dwiic0: dwiic_i2c_exec: need to read 20 bytes, can send 14 read reqs
dwiic0: new tx limit 8

Opinions? ok?

Patrick

diff --git a/sys/dev/ic/dwiic.c b/sys/dev/ic/dwiic.c
index 84d97b8645b..d04a7b03979 100644
--- a/sys/dev/ic/dwiic.c
+++ b/sys/dev/ic/dwiic.c
@@ -416,6 +416,21 @@ dwiic_i2c_exec(void *cookie, i2c_op_t op, i2c_addr_t addr, 
const void *cmdbuf,
tx_limit = sc->tx_fifo_depth -
dwiic_read(sc, DW_IC_TXFLR);
}
+
+   if (I2C_OP_WRITE_P(op) && tx_limit == 0 && x < len) {
+   s = splbio();
+   dwiic_read(sc, DW_IC_CLR_INTR);
+   dwiic_write(sc, DW_IC_INTR_MASK, DW_IC_INTR_TX_EMPTY);
+
+   if (tsleep_nsec(&sc->sc_writewait, PRIBIO, "dwiic",
+   MSEC_TO_NSEC(500)) != 0)
+   printf("%s: timed out waiting for tx_empty "
+   "intr\n", sc->sc_dev.dv_xname);
+   splx(s);
+
+   tx_limit = sc->tx_fifo_depth -
+   dwiic_read(sc, DW_IC_TXFLR);
+   }
}
 
if (I2C_OP_STOP_P(op) && I2C_OP_WRITE_P(op)) {

Re: dwiic(4): wait for tx empty when hitting tx limit

2021-07-05 Thread Patrick Wildt

Am Mon, Jul 05, 2021 at 06:34:31PM +0200 schrieb Mark Kettenis:
> > Date: Mon, 5 Jul 2021 00:04:24 +0200
> > From: Patrick Wildt 
> > 
> > Hi,
> > 
> > I had trouble interfacing with a machine's IPMI through dwiic(4).  What
> > I saw was that when sending 'bigger' commands, it would never receive
> > the STOP bit interrupt.
> > 
> > The trouble is, as can be seen in the log, that we want to send (it
> > says read, but it's a write OP, so it's send) 20 bytes, but the tx
> > limit says 14.
> > 
> > What we should do is send 14 bytes, then wait for it to send us the
> > tx empty interrupt (like we do when we first enable the controller),
> > and then re-read the tx limit.  The last line in the log is some
> > debug print I added for myself, but is not part of the diff.
> > 
> > With this, I was finally able to change the IPMI password and regain
> > access to the web interface after updating the BMC's firmware...
> > 
> > dwiic0: dwiic_i2c_exec: op 7, addr 0x10, cmdlen 2, len 3, flags 0x00
> > dwiic0: dwiic_i2c_exec: need to read 3 bytes, can send 14 read reqs
> > dwiic0: dwiic_i2c_exec: op 5, addr 0x10, cmdlen 1, len 33, flags 0x00
> > dwiic0: dwiic_i2c_exec: need to read 33 bytes, can send 15 read reqs
> > dwiic0: dwiic_i2c_exec: op 7, addr 0x10, cmdlen 2, len 20, flags 0x00
> > dwiic0: dwiic_i2c_exec: need to read 20 bytes, can send 14 read reqs
> > dwiic0: new tx limit 8
> > 
> > Opinions? ok?
> 
> I think you're on to something.  But this needs to handle I2C_F_POLL.

True that.  The previous code, which waits for the controller to accept
commands, just does DELAY(200), but I'm not sure that's good enough for
in between transfers.  Though apparently one can poll through the raw
interrupt status register, where the interrupt mask isn't applied.  So
maybe like that?  Guess I should try setting ipmi to polling mode...

> > diff --git a/sys/dev/ic/dwiic.c b/sys/dev/ic/dwiic.c
> > index 84d97b8645b..d04a7b03979 100644
> > --- a/sys/dev/ic/dwiic.c
> > +++ b/sys/dev/ic/dwiic.c
> > @@ -416,6 +416,21 @@ dwiic_i2c_exec(void *cookie, i2c_op_t op, i2c_addr_t 
> > addr, const void *cmdbuf,
> > tx_limit = sc->tx_fifo_depth -
> > dwiic_read(sc, DW_IC_TXFLR);
> > }
> > +
> > +   if (I2C_OP_WRITE_P(op) && tx_limit == 0 && x < len) {
> > +   s = splbio();
> > +   dwiic_read(sc, DW_IC_CLR_INTR);
> > +   dwiic_write(sc, DW_IC_INTR_MASK, DW_IC_INTR_TX_EMPTY);
> > +
> > +   if (tsleep_nsec(&sc->sc_writewait, PRIBIO, "dwiic",
> > +   MSEC_TO_NSEC(500)) != 0)
> > +   printf("%s: timed out waiting for tx_empty "
> > +   "intr\n", sc->sc_dev.dv_xname);
> > +   splx(s);
> > +
> > +   tx_limit = sc->tx_fifo_depth -
> > +   dwiic_read(sc, DW_IC_TXFLR);
> > +   }
> > }
> >  
> > if (I2C_OP_STOP_P(op) && I2C_OP_WRITE_P(op)) {
> > 
> > 
>

Re: dwiic(4): wait for tx empty when hitting tx limit

2021-07-05 Thread Patrick Wildt

Am Mon, Jul 05, 2021 at 07:07:24PM +0200 schrieb Mark Kettenis:
> > Date: Mon, 5 Jul 2021 19:02:32 +0200
> > From: Patrick Wildt 
> > 
> > Am Mon, Jul 05, 2021 at 06:34:31PM +0200 schrieb Mark Kettenis:
> > > > Date: Mon, 5 Jul 2021 00:04:24 +0200
> > > > From: Patrick Wildt 
> > > > 
> > > > Hi,
> > > > 
> > > > I had trouble interfacing with a machine's IPMI through dwiic(4).  What
> > > > I saw was that when sending 'bigger' commands, it would never receive
> > > > the STOP bit interrupt.
> > > > 
> > > > The trouble is, as can be seen in the log, that we want to send (it
> > > > says read, but it's a write OP, so it's send) 20 bytes, but the tx
> > > > limit says 14.
> > > > 
> > > > What we should do is send 14 bytes, then wait for it to send us the
> > > > tx empty interrupt (like we do when we first enable the controller),
> > > > and then re-read the tx limit.  The last line in the log is some
> > > > debug print I added for myself, but is not part of the diff.
> > > > 
> > > > With this, I was finally able to change the IPMI password and regain
> > > > access to the web interface after updating the BMC's firmware...
> > > > 
> > > > dwiic0: dwiic_i2c_exec: op 7, addr 0x10, cmdlen 2, len 3, flags 0x00
> > > > dwiic0: dwiic_i2c_exec: need to read 3 bytes, can send 14 read reqs
> > > > dwiic0: dwiic_i2c_exec: op 5, addr 0x10, cmdlen 1, len 33, flags 0x00
> > > > dwiic0: dwiic_i2c_exec: need to read 33 bytes, can send 15 read reqs
> > > > dwiic0: dwiic_i2c_exec: op 7, addr 0x10, cmdlen 2, len 20, flags 0x00
> > > > dwiic0: dwiic_i2c_exec: need to read 20 bytes, can send 14 read reqs
> > > > dwiic0: new tx limit 8
> > > > 
> > > > Opinions? ok?
> > > 
> > > I think you're on to something.  But this needs to handle I2C_F_POLL.
> > 
> > True that.  The previous code, which waits for the controller to accept
> > commands, just does DELAY(200), but I'm not sure that's good enough for
> > in between transfers.  Though apparently one can poll through the raw
> > interrupt status register, where the interrupt mask isn't applied.  So
> > maybe like that?  Guess I should try setting ipmi to polling mode...
> 
> Polling the interrupt status register should work I suppose.  But for
> read operations we actually poll the DW_IC_RXFLR register.

Yeah, that would work for TX as well.  Maybe something like this, but
then the diff still needs to address what happens when we time out and
there's still no tx_limit > 0.  Maybe time out like the read stuff:

if (rx_avail == 0) {
printf("%s: timed out reading remaining %d\n",
sc->sc_dev.dv_xname, (int)(len - readpos));
sc->sc_i2c_xfer.error = 1;
sc->sc_busy = 0;

return (1);
}
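
A bounded polling loop with such a timeout could look roughly like the
sketch below.  It is a standalone model, not the driver: the accessor
type, the function names and the "return 0 on timeout" convention are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register accessor: returns the current TX FIFO level. */
typedef unsigned int (*txflr_read_t)(void *arg);

/*
 * Poll until at least one FIFO entry is free or the retries run out.
 * Returns the number of free entries, or 0 on timeout so the caller
 * can flag an error, like the read path does when rx_avail is 0.
 */
static unsigned int
poll_tx_limit(txflr_read_t rd, void *arg, unsigned int depth, int retries)
{
	unsigned int level;

	assert(rd != NULL);
	for (; retries > 0; retries--) {
		level = rd(arg);
		if (level < depth)
			return (depth - level);
		/* a real driver would DELAY() here between reads */
	}
	return (0);
}

/* Example accessor: the FIFO drains by one entry per read. */
static unsigned int
draining_fifo(void *arg)
{
	unsigned int *level = arg;
	unsigned int cur = *level;

	if (*level > 0)
		(*level)--;
	return (cur);
}
```

A caller that gets 0 back would then set the transfer error, clear the
busy flag and return, mirroring the rx_avail == 0 case quoted above.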

diff --git a/sys/dev/ic/dwiic.c b/sys/dev/ic/dwiic.c
index 84d97b8645b..d5d77a52b73 100644
--- a/sys/dev/ic/dwiic.c
+++ b/sys/dev/ic/dwiic.c
@@ -416,6 +416,33 @@ dwiic_i2c_exec(void *cookie, i2c_op_t op, i2c_addr_t addr, 
const void *cmdbuf,
tx_limit = sc->tx_fifo_depth -
dwiic_read(sc, DW_IC_TXFLR);
}
+
+   if (I2C_OP_WRITE_P(op) && tx_limit == 0 && x < len) {
+   if (flags & I2C_F_POLL) {
+   for (retries = 1000; retries > 0; retries--) {
+   tx_limit = sc->tx_fifo_depth -
+   dwiic_read(sc, DW_IC_TXFLR);
+   if (tx_limit > 0)
+   break;
+   DELAY(50);
+   }
+   } else {
+   s = splbio();
+   dwiic_read(sc, DW_IC_CLR_INTR);
+   dwiic_write(sc, DW_IC_INTR_MASK,
+   DW_IC_INTR_TX_EMPTY);
+
+   if (tsleep_nsec(&sc->sc_writewait, PRIBIO,
+   "dwiic", MSEC_TO_NSEC(500)) != 0)
+   printf("%s: timed out waiting for "
+   "tx_empty intr\n",
+   sc->sc_dev.dv_xname);
+   splx(s);
+
+   tx_limit = sc->tx_fifo_depth -
+   dwiic_read(sc, DW_IC_TXFLR);
+   }
+   }
}
 
if (I2C_OP_STOP_P(op) && I2C_OP_WRITE_P(op)) {

Re: dwiic(4): wait for tx empty when hitting tx limit

2021-07-13 Thread Patrick Wildt

Am Mon, Jul 05, 2021 at 07:52:28PM +0200 schrieb Mark Kettenis:
> > Date: Mon, 5 Jul 2021 19:30:28 +0200
> > From: Patrick Wildt 
> > 
> > Am Mon, Jul 05, 2021 at 07:07:24PM +0200 schrieb Mark Kettenis:
> > > > Date: Mon, 5 Jul 2021 19:02:32 +0200
> > > > From: Patrick Wildt 
> > > > 
> > > > Am Mon, Jul 05, 2021 at 06:34:31PM +0200 schrieb Mark Kettenis:
> > > > > > Date: Mon, 5 Jul 2021 00:04:24 +0200
> > > > > > From: Patrick Wildt 
> > > > > > 
> > > > > > Hi,
> > > > > > 
> > > > > > I had trouble interfacing with a machine's IPMI through dwiic(4).  
> > > > > > What
> > > > > > I saw was that when sending 'bigger' commands, it would never 
> > > > > > receive
> > > > > > the STOP bit interrupt.
> > > > > > 
> > > > > > The trouble is, as can be seen in the log, that we want to send (it
> > > > > > says read, but it's a write OP, so it's send) 20 bytes, but the tx
> > > > > > limit says 14.
> > > > > > 
> > > > > > What we should do is send 14 bytes, then wait for it to send us the
> > > > > > tx empty interrupt (like we do when we first enable the controller),
> > > > > > and then re-read the tx limit.  The last line in the log is some
> > > > > > debug print I added for myself, but is not part of the diff.
> > > > > > 
> > > > > > With this, I was finally able to change the IPMI password and regain
> > > > > > access to the web interface after updating the BMC's firmware...
> > > > > > 
> > > > > > dwiic0: dwiic_i2c_exec: op 7, addr 0x10, cmdlen 2, len 3, flags 0x00
> > > > > > dwiic0: dwiic_i2c_exec: need to read 3 bytes, can send 14 read reqs
> > > > > > dwiic0: dwiic_i2c_exec: op 5, addr 0x10, cmdlen 1, len 33, flags 
> > > > > > 0x00
> > > > > > dwiic0: dwiic_i2c_exec: need to read 33 bytes, can send 15 read reqs
> > > > > > dwiic0: dwiic_i2c_exec: op 7, addr 0x10, cmdlen 2, len 20, flags 
> > > > > > 0x00
> > > > > > dwiic0: dwiic_i2c_exec: need to read 20 bytes, can send 14 read reqs
> > > > > > dwiic0: new tx limit 8
> > > > > > 
> > > > > > Opinions? ok?
> > > > > 
> > > > > I think you're on to something.  But this needs to handle I2C_F_POLL.
> > > > 
> > > > True that.  The previous code, which waits for the controller to accept
> > > > commands, just does DELAY(200), but I'm not sure that's good enough for
> > > > in between transfers.  Though apparently one can poll through the raw
> > > > interrupt status register, where the interrupt mask isn't applied.  So
> > > > maybe like that?  Guess I should try setting ipmi to polling mode...
> > > 
> > > Polling the interrupt status register should work I suppose.  But for
> > > read operations we actually poll the DW_IC_RXFLR register.
> > 
> > Yeah, that would work for TX as well.  Maybe something like this, but
> > then the diff still needs to address what happens when we time out and
> > there's still no tx_limit > 0.  Maybe time out like the read stuff:
> > 
> > if (rx_avail == 0) {
> > printf("%s: timed out reading remaining %d\n",
> > sc->sc_dev.dv_xname, (int)(len - readpos));
> > sc->sc_i2c_xfer.error = 1;
> > sc->sc_busy = 0;
> > 
> > return (1);
> > }
> 
> Yes.

This works for me. ok?

Patrick

diff --git a/sys/dev/ic/dwiic.c b/sys/dev/ic/dwiic.c
index 84d97b8645b..8588b0905ea 100644
--- a/sys/dev/ic/dwiic.c
+++ b/sys/dev/ic/dwiic.c
@@ -416,6 +416,42 @@ dwiic_i2c_exec(void *cookie, i2c_op_t op, i2c_addr_t addr, 
const void *cmdbuf,
tx_limit = sc->tx_fifo_depth -
dwiic_read(sc, DW_IC_TXFLR);
}
+
+   if (I2C_OP_WRITE_P(op) && tx_limit == 0 && x < len) {
+   if (flags & I2C_F_POLL) {
+   for (retries = 1000; retries > 0; retries--) {
+   tx_limit = sc->tx_fifo_depth -
+   dwiic_read(sc, DW_IC_TXFLR);
+   if (tx_limit > 0)
+

Re: pf icmp reflect

2021-07-26 Thread Patrick Wildt

On Mon, Jul 26, 2021 at 06:41:42PM +0200, Alexander Bluhm wrote:
> Hi,
> 
> The mbuf header cleanup I added in revision 1.173 of ip_icmp.c is
> too strict.  ICMP error packets generated by pf are not passed
> immediately, but may be blocked.  Preserve PF_TAG_GENERATED flag
> in icmp_reflect() and icmp6_reflect().
> 
> ok?

While I do prefer uint8_t, the struct member is defined as u_int8_t, so
I guess for consistency we can use that.
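
For readers skimming the diff, the pattern is "save, reset, restore a
masked subset".  A minimal standalone sketch, assuming a toy header
struct and made-up flag values (the real ones live in the pf and mbuf
headers):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative flag values only; not the real pf definitions. */
#define EX_TAG_GENERATED	0x01
#define EX_TAG_OTHER		0x02

/* Toy packet header with just the two fields this sketch needs. */
struct ex_pkthdr {
	unsigned int	rtableid;
	uint8_t		pf_flags;
};

/* Reset the header but carry over the routing table and one flag. */
static void
ex_reflect_reset(struct ex_pkthdr *h)
{
	unsigned int rtableid;
	uint8_t pfflags;

	assert(h != NULL);
	rtableid = h->rtableid;
	pfflags = h->pf_flags;
	memset(h, 0, sizeof(*h));	/* stands in for m_resethdr() */
	h->rtableid = rtableid;
	h->pf_flags = pfflags & EX_TAG_GENERATED;
}
```

Only the generated bit survives the reset; every other flag is cleared
as before.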

ok patrick@

> bluhm
> 
> Index: netinet/ip_icmp.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/ip_icmp.c,v
> retrieving revision 1.186
> diff -u -p -r1.186 ip_icmp.c
> --- netinet/ip_icmp.c 30 Mar 2021 08:37:10 -  1.186
> +++ netinet/ip_icmp.c 26 Jul 2021 14:10:37 -
> @@ -691,6 +691,7 @@ icmp_reflect(struct mbuf *m, struct mbuf
>   struct rtentry *rt = NULL;
>   int optlen = (ip->ip_hl << 2) - sizeof(struct ip);
>   u_int rtableid;
> + u_int8_t pfflags;
>  
>   if (!in_canforward(ip->ip_src) &&
>   ((ip->ip_src.s_addr & IN_CLASSA_NET) !=
> @@ -704,8 +705,10 @@ icmp_reflect(struct mbuf *m, struct mbuf
>   return (ELOOP);
>   }
>   rtableid = m->m_pkthdr.ph_rtableid;
> + pfflags = m->m_pkthdr.pf.flags;
>   m_resethdr(m);
>   m->m_pkthdr.ph_rtableid = rtableid;
> + m->m_pkthdr.pf.flags = pfflags & PF_TAG_GENERATED;
>  
>   /*
>* If the incoming packet was addressed directly to us,
> Index: netinet6/icmp6.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/icmp6.c,v
> retrieving revision 1.235
> diff -u -p -r1.235 icmp6.c
> --- netinet6/icmp6.c  10 Mar 2021 10:21:49 -  1.235
> +++ netinet6/icmp6.c  26 Jul 2021 15:42:33 -
> @@ -1052,6 +1052,7 @@ icmp6_reflect(struct mbuf **mp, size_t o
>   struct in6_addr t, *src = NULL;
>   struct sockaddr_in6 sa6_src, sa6_dst;
>   u_int rtableid;
> + u_int8_t pfflags;
>  
>   CTASSERT(sizeof(struct ip6_hdr) + sizeof(struct icmp6_hdr) <= MHLEN);
>  
> @@ -1069,8 +1070,10 @@ icmp6_reflect(struct mbuf **mp, size_t o
>   return (ELOOP);
>   }
>   rtableid = m->m_pkthdr.ph_rtableid;
> + pfflags = m->m_pkthdr.pf.flags;
>   m_resethdr(m);
>   m->m_pkthdr.ph_rtableid = rtableid;
> + m->m_pkthdr.pf.flags = pfflags & PF_TAG_GENERATED;
>  
>   /*
>* If there are extra headers between IPv6 and ICMPv6, strip
>

Re: [please test] amd64: schedule clock interrupts against system clock

2021-07-27 Thread Patrick Wildt

On Thu, Jun 24, 2021 at 09:50:07PM -0500, Scott Cheloha wrote:
> Hi,
> 
> I'm looking for testers for the attached patch.  You need an amd64
> machine with a lapic.
> 
> This includes:
> 
> - All "real" amd64 machines ever made
> - amd64 VMs running on hypervisors that provide a virtual lapic
> 
> Note that this does *not* include:
> 
> - amd64 VMs running on OpenBSD's vmm(4).
> 
> (I will ask for a separate round of testing for vmm(4) VMs, don't
> worry.)
> 
> The patch adds a new machine-independent clock interrupt scheduling
> layer (hereafter, "clockintr") to the kernel in kern/kern_clockintr.c,
> configures GENERIC amd64 kernels to use clockintr, and changes
> amd64/lapic.c to use clockintr instead of calling hardclock(9)
> directly.
> 
> Please apply the patch and make sure to reconfigure your kernel before
> recompiling/installing it to test.  I am especially interested in
> whether this breaks suspend/resume or hibernate/unhibernate.
> Suspend/resume is unaffected on my Lenovo X1C7, is the same true for
> your machine?  Please include a dmesg with your results.
> 
> Stats for the clockintr subsystem are exposed via sysctl(2).  If
> you're interested in providing them you can compile and run the
> program attached inline in my next mail.  A snippet of the output from
> across a suspend/resume is especially useful.
> 
> This is the end of the mail if you just want to test this.  If you are
> interested in the possible behavior changes or a description of how
> clockintr works, keep reading.
> 
> Thanks,
> 
> -Scott
> 
> --
> 
> There are some behavior changes, but I have found them to be small,
> harmless, and/or useful.  The first one is the most significant:
> 
> - Clockintr schedules events against the system clock, so hardclock(9)
>   ticks are pegged to the system clock, so the length of a tick is now
>   subject to NTP adjustment via adjtime(2) and adjfreq(2).
> 
>   In practice, NTP adjustment is very conservative.  In my testing the
>   delta between the raw frequency and the NTP frequency is small
>   when ntpd(8) is doing coarse correction with adjtime(2) and invisible
>   when ntpd(8) is doing fine correction with adjfreq(2).
> 
>   The upshot of the frequency difference is sometimes you will get
>   some spurious ("early") interrupts while ntpd(8) is correcting the
>   clock.  They go away when ntpd(8) finishes synchronizing.
> 
>   FWIW: Linux, FreeBSD, and DragonflyBSD have all made this jump.
> 
> - hardclock(9) will run simultaneously on every CPU in the system.
>   This seems to be fine, but there might be some subtle contention
>   that gets worse as you add more CPUs.  Worth investigating.
> 
> - amd64 gets a pseudorandom statclock().  This is desirable, right?
> 
> - "Lost" or delayed ticks are handled by the clockintr layer
>   transparently.  This means that if the clock interrupt is delayed
>   due to e.g. hypervisor delays, we don't "lose" ticks and the
>   timeout schedule does not decay.
> 
>   This is super relevant for vmm(4), but it may also be relevant for
>   other hypervisors.

This sounds pretty good.  I remember someone having horrible time issues
because his VM was missing plenty of ticks, as the hypervisor was
overprovisioned, or so.

So my X395 still suspends/resumes fine with your change, so that's nice.
Not sure if there's anything other than that which I should test, but so
far I don't notice any regression.
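
My understanding of the "no lost ticks" part, as a minimal sketch: this
is not code from the patch, and the function name and the plain uint64_t
time values are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Advance '*expiration' past 'now' in steps of 'period' and return how
 * many periods have elapsed.  A delayed interrupt yields a count above
 * 1 instead of silently dropping ticks.
 */
static uint64_t
ex_advance(uint64_t *expiration, uint64_t period, uint64_t now)
{
	uint64_t elapsed;

	assert(period > 0);
	if (now < *expiration)
		return (0);
	elapsed = (now - *expiration) / period + 1;
	*expiration += elapsed * period;
	return (elapsed);
}
```

If the interrupt that was due at time 10 only runs at time 35 with a
period of 10, three ticks are accounted for and the next expiration is
40, so the schedule stays aligned.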

> 
> --
> 
> Last, here are notes for people interested in the design or the actual
> code.  Ask questions if something about my approach seems off, I have
> never added a subsystem to the kernel before.  The code has not
> changed much in the last six months so I think I am nearing a stable
> design.  I will document these interfaces in a real manpage soon.
> 
> - Clockintr prototypes are declared in .
> 
> - Clockintr depends on the timecounter and the system clock to do its
>   scheduling.  If there is no working timecounter the machine will
>   hang, as multitasking preemption will cease.
> 
> - Global clockintr initialization is done via clockintr_init().  You
>   call this from cpu_initclocks() on the primary CPU *after* you install
>   a timecounter.  The function sets a global frequency for the
>   hardclock(9) (required), a global frequency for the statclock()
>   (or zero if you don't want a statclock), and sets global behavior
>   flags.
> 
>   There is only one flag right now, CI_RNDSTAT, which toggles whether
>   the statclock() has a pseudorandom period.  If the platform has a
>   one-shot clock (e.g. amd64, arm64, etc.) it makes sense to set
>   CI_RNDSTAT.  If the platform does not have a one-shot clock (e.g.
>   alpha) there is no point in setting CI_RNDSTAT as the hardware
>   cannot provide the feature.
> 
> - Per-CPU clockintr initialization is done via clockintr_cpu_init().
>   On the primary CPU, call this immediately *after* you call
>   clockintr_init().  Secondary CPUs should call this late in
>   cpu_hatch(), probably right

Re: iked(8): Increase the default Child SA data lifetime limit

2021-08-03 Thread Patrick Wildt

Am Tue, Aug 03, 2021 at 01:40:51PM +0200 schrieb Tobias Heider:
> On Tue, Aug 03, 2021 at 12:17:38PM +0100, Stuart Henderson wrote:
> > On 2021/08/03 01:12, Vitaliy Makkoveev wrote:
> > > iked(8) uses 3 hours and 512 megabytes of processed data as default
> > > lifetime hard limits for Child SA. Also it sets 85-95% of these values as
> > > soft limit. iked(8) should perform rekeying before we reach hard limit
> > > otherwise this SA will be killed and the tunnel stopped. With default
> > > values the window is only 25-52 megabytes and we easily consume them
> > > before rekeying and the tunnel stops.
> > > 
> > > Hrvoje Popovski complained about such stops when he tested ipsec(4)
> > > related diffs. I also tried iked(8) with my macos and found that a simple
> > > "ping -f ..." makes rekeying impossible.
> > > 
> > > The hard limit can be modified in iked.conf(5) by setting "lifetime
> > > xxx bytes yyy", but the 5% difference between the hard and soft limits
> > > forces the byte limit to be set quite high, about 4G or more, which
> > > could be bad for security reasons.
> > > 
> > > I propose to increase the default hard limit at least up to 1G. Also I
> > > propose to decrease the soft limit down to 50-60% of hard limit. This
> > > keeps the rekeying frequency but increases the update window to 410-512
> > > megabytes. This also allows keeping the bytes value in the "lifetime"
> > > setting small enough.
> > 
> > I have a couple of comments;
> > 
> > - this isn't a problem I've run into with real-world usage or when
> > running tcpbench over (moderately fast) internet connections - I'm not
> > saying it doesn't happen, but it seems relatively uncommon, with
> > connections at LAN speeds of course it's much more likely
> > 
> > - a 50% lower limit feels too low to me
> > 
> > - your jitter change affects lifetime both in seconds and in bytes,
> > I think changing the jitter for the seconds lifetime is a mistake
> > 
> > - the jitter change could result in some really short rekey intervals
> > if somebody has manually specified lifetimes which are the same as or less
> > than the current default
> > 
> > - looking at other IKEv2 implementations: if bytes lifetime is supported
> > at all (several implementations don't have it, only time-based lifetime),
> > the default settings rarely seem to use it
> > 
> > - 512MB is not really a lot of data
> > 
> > My first thought now I know about this problem is just to increase the
> > default byte limit and leave the jitter range as-is. I don't know enough
> > about the security requirements of IKEv2 to know what demands it places
> > on rekeying, but given the above (especially that other implementations
> > mostly don't use byte limits at all), the figure I'd pull out of the air
> > would be more like 4GB.
> > 
> 
> I agree with Stuart here. In my experience raising the limit to 4 GB is enough
> to solve the problem and the current jitter works well enough.
> 
> In a next step we can think about relaxing the limits even further for "safe"
> algorithms like Theo proposed, but this would need a bit more work.
> 
> FWIW here's a diff I sent to bluhm a few weeks ago.  We didn't commit it yet
> because the low limit helped us find and reproduce a PMTU bug (that should
> be gone now).

ok patrick@

> Index: types.h
> ===
> RCS file: /cvs/src/sbin/iked/types.h,v
> retrieving revision 1.43
> diff -u -p -r1.43 types.h
> --- types.h   13 May 2021 15:20:48 -  1.43
> +++ types.h   3 Aug 2021 11:35:26 -
> @@ -67,7 +67,7 @@
>  #define IKED_CYCLE_BUFFERS   8   /* # of static buffers for mapping */
>  #define IKED_PASSWORD_SIZE   256 /* limited by most EAP types */
>  
> -#define IKED_LIFETIME_BYTES  536870912 /* 512 Mb */
> +#define IKED_LIFETIME_BYTES  4294967296 /* 4 GB */
>  #define IKED_LIFETIME_SECONDS  10800 /* 3 hours */
>  
>  #define IKED_E   0x1000  /* Decrypted flag */
>
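
The window arithmetic discussed in this thread can be sketched in a few
lines of C.  This is only an illustration, not iked(8) code: rekey_window()
and its percentage parameter are made-up names, and it assumes the soft
limit is a plain percentage of the hard limit.

```c
#include <assert.h>
#include <stdint.h>

#define MIB	(UINT64_C(1) << 20)

/*
 * Hypothetical helper, not part of iked(8): bytes left between a soft
 * limit placed at soft_pct percent of the hard limit and the hard limit.
 */
static uint64_t
rekey_window(uint64_t hard_bytes, unsigned int soft_pct)
{
	assert(soft_pct <= 100);
	/* 64-bit math: 4294967296 (4 GB) does not fit in 32 bits. */
	return hard_bytes - hard_bytes * soft_pct / 100;
}
```

With a 1 GB hard limit and a soft limit at 50-60% this gives roughly 409
to 512 megabytes, in line with the 410-512 figure quoted above.  With the
4 GB limit from the diff and a soft limit at 95% the window is still
about 204 megabytes.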

Re: vmx(4): remove useless code

2021-08-06 Thread Patrick Wildt

Am Thu, Aug 05, 2021 at 02:33:01PM +0200 schrieb Jan Klemkow:
> Hi,
> 
> The following diff removes useless code from the driver.  As discussed
> here [1] and committed there [2], the hypervisor doesn't do anything
> with the data structures.  We even just set NULL to the pointer since
> the initial commit of vmx(4).  So, I guess it's better to remove all of
> these.  The variables are bzero'd in vmxnet3_dma_allocmem() anyway.
> 
> OK?

My main concern was whether the structs are getting zeroed correctly, but
they are, so that's fine.

That said, it looks like Linux sets the pointer to ~0ULL, not 0.  Should
we follow Linux' pattern there and do that as well?

Patrick

> bye,
> Jan
> 
> [1]: https://www.lkml.org/lkml/2021/1/19/1225
> [2]: 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/vmxnet3/vmxnet3_drv.c?id=de1da8bcf40564a2adada2d5d5426e05355f66e8
> 
> Index: dev/pci/if_vmx.c
> ===
> RCS file: /cvs/src/sys/dev/pci/if_vmx.c,v
> retrieving revision 1.66
> diff -u -p -r1.66 if_vmx.c
> --- dev/pci/if_vmx.c  23 Jul 2021 00:29:14 -  1.66
> +++ dev/pci/if_vmx.c  5 Aug 2021 11:12:26 -
> @@ -157,7 +157,6 @@ struct vmxnet3_softc {
>  #define WRITE_BAR1(sc, reg, val) \
>   bus_space_write_4((sc)->sc_iot1, (sc)->sc_ioh1, reg, val)
>  #define WRITE_CMD(sc, cmd) WRITE_BAR1(sc, VMXNET3_BAR1_CMD, cmd)
> -#define vtophys(va) 0/* XXX ok? */
>  
>  int vmxnet3_match(struct device *, void *, void *);
>  void vmxnet3_attach(struct device *, struct device *, void *);
> @@ -468,8 +467,6 @@ vmxnet3_dma_init(struct vmxnet3_softc *s
>   ds->vmxnet3_revision = 1;
>   ds->upt_version = 1;
>   ds->upt_features = UPT1_F_CSUM | UPT1_F_VLAN;
> - ds->driver_data = vtophys(sc);
> - ds->driver_data_len = sizeof(struct vmxnet3_softc);
>   ds->queue_shared = qs_pa;
>   ds->queue_shared_len = qs_len;
>   ds->mtu = VMXNET3_MAX_MTU;
> @@ -546,8 +543,6 @@ vmxnet3_alloc_txring(struct vmxnet3_soft
>   ts->cmd_ring_len = NTXDESC;
>   ts->comp_ring = comp_pa;
>   ts->comp_ring_len = NTXCOMPDESC;
> - ts->driver_data = vtophys(tq);
> - ts->driver_data_len = sizeof *tq;
>   ts->intr_idx = intr;
>   ts->stopped = 1;
>   ts->error = 0;
> @@ -598,8 +593,6 @@ vmxnet3_alloc_rxring(struct vmxnet3_soft
>   rs->cmd_ring_len[1] = NRXDESC;
>   rs->comp_ring = comp_pa;
>   rs->comp_ring_len = NRXCOMPDESC;
> - rs->driver_data = vtophys(rq);
> - rs->driver_data_len = sizeof *rq;
>   rs->intr_idx = intr;
>   rs->stopped = 1;
>   rs->error = 0;
>

Re: vmx(4): remove useless code

2021-08-06 Thread Patrick Wildt

On Fri, Aug 06, 2021 at 11:05:53AM +0200, Patrick Wildt wrote:
> Am Thu, Aug 05, 2021 at 02:33:01PM +0200 schrieb Jan Klemkow:
> > Hi,
> > 
> > The following diff removes useless code from the driver.  As discussed
> > here [1] and committed there [2], the hypervisor doesn't do anything
> > with the data structures.  We even just set NULL to the pointer since
> > the initial commit of vmx(4).  So, I guess it's better to remove all of
> > these.  The variables are bzero'd in vmxnet3_dma_allocmem() anyway.
> > 
> > OK?
> 
> My main concern was whether the structs are getting zeroed correctly, but
> they are, so that's fine.
> 
> That said, it looks like Linux sets the pointer to ~0ULL, not 0.  Should
> we follow Linux' pattern there and do that as well?
> 

Thinking about it a little more, I think we should do that as well.  And
maybe explicitly set driver_data_len to 0 even though it's already zero.
Basically for readability.
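
A minimal sketch of that suggestion, assuming a cut-down stand-in for the
shared structure.  Only the two field names come from the diff; the field
widths, the struct name and the helper name are assumptions made for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Cut-down stand-in; only the two field names are taken from the diff. */
struct shared_hdr {
	uint64_t driver_data;		/* assumed 64-bit address field */
	uint32_t driver_data_len;	/* assumed 32-bit length field */
};

/* Hypothetical helper: mark the driver data area as unused. */
static void
shared_hdr_no_driver_data(struct shared_hdr *h)
{
	h->driver_data = ~0ULL;		/* all ones, the value Linux appears to use */
	h->driver_data_len = 0;		/* explicit, even if already zeroed */
}
```

Writing both fields explicitly costs two stores at attach time and makes
the intent visible without having to know that the memory was zeroed.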

> 
> > bye,
> > Jan
> > 
> > [1]: https://www.lkml.org/lkml/2021/1/19/1225
> > [2]: 
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/vmxnet3/vmxnet3_drv.c?id=de1da8bcf40564a2adada2d5d5426e05355f66e8
> > 
> > Index: dev/pci/if_vmx.c
> > ===
> > RCS file: /cvs/src/sys/dev/pci/if_vmx.c,v
> > retrieving revision 1.66
> > diff -u -p -r1.66 if_vmx.c
> > --- dev/pci/if_vmx.c  23 Jul 2021 00:29:14 -  1.66
> > +++ dev/pci/if_vmx.c  5 Aug 2021 11:12:26 -
> > @@ -157,7 +157,6 @@ struct vmxnet3_softc {
> >  #define WRITE_BAR1(sc, reg, val) \
> > bus_space_write_4((sc)->sc_iot1, (sc)->sc_ioh1, reg, val)
> >  #define WRITE_CMD(sc, cmd) WRITE_BAR1(sc, VMXNET3_BAR1_CMD, cmd)
> > -#define vtophys(va) 0  /* XXX ok? */
> >  
> >  int vmxnet3_match(struct device *, void *, void *);
> >  void vmxnet3_attach(struct device *, struct device *, void *);
> > @@ -468,8 +467,6 @@ vmxnet3_dma_init(struct vmxnet3_softc *s
> > ds->vmxnet3_revision = 1;
> > ds->upt_version = 1;
> > ds->upt_features = UPT1_F_CSUM | UPT1_F_VLAN;
> > -   ds->driver_data = vtophys(sc);
> > -   ds->driver_data_len = sizeof(struct vmxnet3_softc);
> > ds->queue_shared = qs_pa;
> > ds->queue_shared_len = qs_len;
> > ds->mtu = VMXNET3_MAX_MTU;
> > @@ -546,8 +543,6 @@ vmxnet3_alloc_txring(struct vmxnet3_soft
> > ts->cmd_ring_len = NTXDESC;
> > ts->comp_ring = comp_pa;
> > ts->comp_ring_len = NTXCOMPDESC;
> > -   ts->driver_data = vtophys(tq);
> > -   ts->driver_data_len = sizeof *tq;
> > ts->intr_idx = intr;
> > ts->stopped = 1;
> > ts->error = 0;
> > @@ -598,8 +593,6 @@ vmxnet3_alloc_rxring(struct vmxnet3_soft
> > rs->cmd_ring_len[1] = NRXDESC;
> > rs->comp_ring = comp_pa;
> > rs->comp_ring_len = NRXCOMPDESC;
> > -   rs->driver_data = vtophys(rq);
> > -   rs->driver_data_len = sizeof *rq;
> > rs->intr_idx = intr;
> > rs->stopped = 1;
> > rs->error = 0;
> > 
>

virtio(4): don't require legacy mode to have an I/O BAR

2021-08-23 Thread Patrick Wildt

Hi,

so on the new Parallels version, when using the 'Other' OS setting,
virtio(4) won't attach.  Apparently it's not Virtio 1.0, because even
Fedora 34 falls back to the 'legacy' driver.

While our code expects (and requires) an I/O BAR, it seems that
the PCI device only provides two memory BARs.

Linux still works, probably because they don't care about the type.
So I figured, let's just do that as well.  With the following diff,
virtio(4) attaches again, and I can install over vio(4).

I don't know if this violates any official virtio(4) spec, but on
the other hand... it fixes a bug, makes it work, and just loosens
up the requirement a little.

ok?

Patrick

diff --git a/sys/dev/pci/virtio_pci.c b/sys/dev/pci/virtio_pci.c
index c99f50136cd..0a29293e16c 100644
--- a/sys/dev/pci/virtio_pci.c
+++ b/sys/dev/pci/virtio_pci.c
@@ -508,7 +508,10 @@ int
 virtio_pci_attach_09(struct virtio_pci_softc *sc, struct pci_attach_args *pa)
 {
struct virtio_softc *vsc = &sc->sc_sc;
-   if (pci_mapreg_map(pa, PCI_MAPREG_START, PCI_MAPREG_TYPE_IO, 0,
+   pcireg_t type;
+
+   type = pci_mapreg_type(pa->pa_pc, pa->pa_tag, PCI_MAPREG_START);
+   if (pci_mapreg_map(pa, PCI_MAPREG_START, type, 0,
&sc->sc_iot, &sc->sc_ioh, NULL, &sc->sc_iosize, 0)) {
printf("%s: can't map i/o space\n", vsc->sc_dev.dv_xname);
return EIO;
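
The diff asks the device which kind of BAR it has instead of assuming
I/O space.  As background, the low bits of a raw BAR value encode that
kind.  The sketch below decodes them in plain C; bar_classify() and the
enum are hypothetical names, and this is not meant to describe how
pci_mapreg_type() is implemented.

```c
#include <assert.h>
#include <stdint.h>

enum bar_kind { BAR_IO, BAR_MEM32, BAR_MEM64, BAR_OTHER };

/* Hypothetical helper: classify a raw BAR value by its low bits. */
static enum bar_kind
bar_classify(uint32_t bar)
{
	if (bar & 0x1)			/* bit 0 set: I/O space */
		return BAR_IO;
	switch ((bar >> 1) & 0x3) {	/* bits 2:1: memory type */
	case 0x0:
		return BAR_MEM32;
	case 0x2:
		return BAR_MEM64;
	default:
		return BAR_OTHER;	/* legacy or reserved encodings */
	}
}
```

A legacy virtio device that exposes only memory BARs would land in one of
the two memory cases here, which is the situation described for Parallels.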

Re: [please test] amd64: schedule clock interrupts against system clock

2021-09-06 Thread Patrick Wildt

Am Fri, Jul 30, 2021 at 07:55:29PM +0200 schrieb Alexander Bluhm:
> On Mon, Jul 26, 2021 at 08:12:39AM -0500, Scott Cheloha wrote:
> > On Fri, Jun 25, 2021 at 06:09:27PM -0500, Scott Cheloha wrote:
> > 1 month bump.  I really appreciate the tests I've gotten so far, thank
> > you.
> 
> On my Xeon machine it works and all regress tests pass.
> 
> But it fails on my old Opteron machine.  It hangs after attaching
> cpu1.

This seems to be caused by contention on the mutex in i8254's gettick().

With Scott's diff, delay_func is i8254_delay() on that old AMD machine.
Its gettick() implementation uses a mutex to protect I/O access to the
i8254.

When secondary CPUs come up, they will wait for CPU0 to let them boot up
further by checking for a flag:

	/*
	 * We need to wait until we can identify, otherwise dmesg
	 * output will be messy.
	 */
	while ((ci->ci_flags & CPUF_IDENTIFY) == 0)
		delay(10);

Now that machine has 3 secondary cores that are spinning like that.  At
the same time CPU0 waits for the core to come up:

	/* wait for it to identify */
	for (i = 200; (ci->ci_flags & CPUF_IDENTIFY) && i > 0; i--)
		delay(10);

That means we have 3-4 cores spinning just to be able to delay().  Our
mutex implementation isn't fair, which means whoever manages to claim
the free mutex wins.  Now if CPU2 and CPU3 are spinning all the time,
CPU1 identifies and needs delay() and CPU0 waits for CPU1, maybe the
one that needs to make progress never gets it.

I changed those delay(10) in cpu_hatch() to CPU_BUSY_CYCLE() and it went
ahead a bit better instead of hanging forever.

Then I remembered an idea from years ago: fair kernel mutexes,
so basically mutexes implemented as ticket locks, like our kernel lock.

I did a quick diff, which probably contains a million bugs, but with
this, bluhm's machine boots up well.

I'm not saying this is the solution, but it might be.

Patrick

diff --git a/sys/kern/kern_lock.c b/sys/kern/kern_lock.c
index 5cc55bb256a..c6a284beb51 100644
--- a/sys/kern/kern_lock.c
+++ b/sys/kern/kern_lock.c
@@ -248,6 +248,8 @@ __mtx_init(struct mutex *mtx, int wantipl)
mtx->mtx_owner = NULL;
mtx->mtx_wantipl = wantipl;
mtx->mtx_oldipl = IPL_NONE;
+   mtx->mtx_ticket = 0;
+   mtx->mtx_cur = 0;
 }
 
 #ifdef MULTIPROCESSOR
@@ -255,15 +257,26 @@ void
 mtx_enter(struct mutex *mtx)
 {
struct schedstate_percpu *spc = &curcpu()->ci_schedstate;
+   struct cpu_info *ci = curcpu();
+   unsigned int t;
 #ifdef MP_LOCKDEBUG
int nticks = __mp_lock_spinout;
 #endif
+   int s;
+
+   /* Avoid deadlocks after panic or in DDB */
+   if (panicstr || db_active)
+   return;
 
WITNESS_CHECKORDER(MUTEX_LOCK_OBJECT(mtx),
LOP_EXCLUSIVE | LOP_NEWORDER, NULL);
 
+   if (mtx->mtx_wantipl != IPL_NONE)
+   s = splraise(mtx->mtx_wantipl);
+
spc->spc_spinning++;
-   while (mtx_enter_try(mtx) == 0) {
+   t = atomic_inc_int_nv(&mtx->mtx_ticket) - 1;
+   while (READ_ONCE(mtx->mtx_cur) != t) {
CPU_BUSY_CYCLE();
 
 #ifdef MP_LOCKDEBUG
@@ -275,12 +288,21 @@ mtx_enter(struct mutex *mtx)
 #endif
}
spc->spc_spinning--;
+
+   mtx->mtx_owner = curcpu();
+   if (mtx->mtx_wantipl != IPL_NONE)
+   mtx->mtx_oldipl = s;
+#ifdef DIAGNOSTIC
+   ci->ci_mutex_level++;
+#endif
+   WITNESS_LOCK(MUTEX_LOCK_OBJECT(mtx), LOP_EXCLUSIVE);
 }
 
 int
 mtx_enter_try(struct mutex *mtx)
 {
-   struct cpu_info *owner, *ci = curcpu();
+   struct cpu_info *ci = curcpu();
+   unsigned int t;
int s;
 
/* Avoid deadlocks after panic or in DDB */
@@ -290,13 +312,15 @@ mtx_enter_try(struct mutex *mtx)
if (mtx->mtx_wantipl != IPL_NONE)
s = splraise(mtx->mtx_wantipl);
 
-   owner = atomic_cas_ptr(&mtx->mtx_owner, NULL, ci);
 #ifdef DIAGNOSTIC
-   if (__predict_false(owner == ci))
+   if (__predict_false(mtx->mtx_owner == ci))
panic("mtx %p: locking against myself", mtx);
 #endif
-   if (owner == NULL) {
+
+   t = READ_ONCE(mtx->mtx_cur);
+   if (atomic_cas_uint(&mtx->mtx_ticket, t, t + 1) == t) {
membar_enter_after_atomic();
+   mtx->mtx_owner = curcpu();
if (mtx->mtx_wantipl != IPL_NONE)
mtx->mtx_oldipl = s;
 #ifdef DIAGNOSTIC
@@ -369,6 +393,9 @@ mtx_leave(struct mutex *mtx)
membar_exit();
 #endif
mtx->mtx_owner = NULL;
+#ifdef MULTIPROCESSOR
+   atomic_inc_int_nv(&mtx->mtx_cur);
+#endif
if (mtx->mtx_wantipl != IPL_NONE)
splx(s);
 }
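
For readers who have not met ticket locks before, here is a userland
sketch of the idea using C11 atomics.  It is not the kernel code from
the diff: the struct, the function names and the memory orderings are
choices made for this illustration only.

```c
#include <assert.h>
#include <stdatomic.h>

/* Userland sketch of a ticket lock; names are illustrative only. */
struct ticket_lock {
	atomic_uint next;	/* next ticket to hand out */
	atomic_uint cur;	/* ticket currently being served */
};

static void
ticket_init(struct ticket_lock *l)
{
	atomic_init(&l->next, 0);
	atomic_init(&l->cur, 0);
}

/* Take a ticket; the caller owns the lock once cur reaches it. */
static unsigned int
ticket_take(struct ticket_lock *l)
{
	return atomic_fetch_add_explicit(&l->next, 1, memory_order_relaxed);
}

static int
ticket_is_served(struct ticket_lock *l, unsigned int t)
{
	return atomic_load_explicit(&l->cur, memory_order_acquire) == t;
}

static void
ticket_lock(struct ticket_lock *l)
{
	unsigned int t = ticket_take(l);

	while (!ticket_is_served(l, t))
		;	/* spin; a kernel would use CPU_BUSY_CYCLE() here */
}

/* Succeeds only if nobody holds or waits for the lock (next == cur). */
static int
ticket_trylock(struct ticket_lock *l)
{
	unsigned int t = atomic_load_explicit(&l->cur, memory_order_relaxed);
	unsigned int expected = t;

	return atomic_compare_exchange_strong_explicit(&l->next, &expected,
	    t + 1, memory_order_acquire, memory_order_relaxed);
}

static void
ticket_unlock(struct ticket_lock *l)
{
	/* Serve the next ticket in line: waiters are admitted in FIFO order. */
	atomic_fetch_add_explicit(&l->cur, 1, memory_order_release);
}
```

Because every waiter spins on its own ticket number, a CPU that arrived
first cannot be overtaken by one that arrived later, which is the
fairness property missing from a compare-and-swap loop.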

mutex(9): initialize some more mutexes before use?

2021-09-07 Thread Patrick Wildt

Hi,

I was playing around a little with the mutex code and found that on
arm64 there are some uninitialized mutexes out there.

I think the arm64 specific one is comparatively easy to solve.  We
either initialize the mtx when we initialize the rest of the pmap, or
we move it into the global definition of those.  I opted for the former
version.

The other one probably needs more discussion/debugging.  So uvm_init()
first calls pmap_init() and then uvm_km_page_init().  The latter does
initialize the mutex, but arm64's pmap_init() already uses pools, which
uses km_alloc, which then uses that mutex.  Now one easy fix would be
to just initialize the definition right away instead of during runtime.

But that raises the question of whether arm64's pmap is allowed to use
pools and km_alloc during pmap_init.

Patrick

#0  0xff800073f984 in mtx_enter (mtx=0xff8000f3b048) at /usr/src/sys/kern/kern_lock.c:281
#1  0xff8000937e6c in km_alloc (sz=<error reading variable: Unhandled dwarf expression opcode 0xa3>, kv=0xff8000da6a30, kp=0xff8000da6a48, kd=0xff8000e934d8) at /usr/src/sys/uvm/uvm_km.c:899
#2  0xff800084d804 in pool_page_alloc (pp=<error reading variable: Unhandled dwarf expression opcode 0xa3>, flags=<error reading variable: Unhandled dwarf expression opcode 0xa3>, slowdown=<error reading variable: Unhandled dwarf expression opcode 0xa3>) at /usr/src/sys/kern/subr_pool.c:1633
#3  0xff800084f8dc in pool_allocator_alloc (pp=0xff8000ea6e40, flags=65792, slowdown=0xff80026cd098) at /usr/src/sys/kern/subr_pool.c:1602
#4  0xff800084ef08 in pool_p_alloc (pp=0xff8000ea6e40, flags=2, slowdown=0xff8000e9359c) at /usr/src/sys/kern/subr_pool.c:926
#5  0xff800084f808 in pool_prime (pp=, n=<error reading variable: Unhandled dwarf expression opcode 0xa3>) at /usr/src/sys/kern/subr_pool.c:896
#6  0xff800048c20c in pmap_init () at /usr/src/sys/arch/arm64/arm64/pmap.c:1682
#7  0xff80009384dc in uvm_init () at /usr/src/sys/uvm/uvm_init.c:118
#8  0xff800048e664 in main (framep=<error reading variable: Unhandled dwarf expression opcode 0xa3>) at /usr/src/sys/kern/init_main.c:235

diff --git a/sys/arch/arm64/arm64/pmap.c b/sys/arch/arm64/arm64/pmap.c
index 79a344cc84e..f070f4540ec 100644
--- a/sys/arch/arm64/arm64/pmap.c
+++ b/sys/arch/arm64/arm64/pmap.c
@@ -1308,10 +1308,12 @@ pmap_bootstrap(long kvo, paddr_t lpt1, long kernelstart, long kernelend,
pmap_kernel()->pm_vp.l1 = (struct pmapvp1 *)va;
pmap_kernel()->pm_privileged = 1;
pmap_kernel()->pm_asid = 0;
+   mtx_init(&pmap_kernel()->pm_mtx, IPL_VM);
 
pmap_tramp.pm_vp.l1 = (struct pmapvp1 *)va + 1;
pmap_tramp.pm_privileged = 1;
pmap_tramp.pm_asid = 0;
+   mtx_init(&pmap_tramp.pm_mtx, IPL_VM);
 
/* Mark ASID 0 as in-use. */
pmap_asid[0] |= (3U << 0);
diff --git a/sys/uvm/uvm_km.c b/sys/uvm/uvm_km.c
index 4a60377e9d7..e77afeda832 100644
--- a/sys/uvm/uvm_km.c
+++ b/sys/uvm/uvm_km.c
@@ -644,7 +644,7 @@ uvm_km_page_lateinit(void)
  * not zero filled.
  */
 
-struct uvm_km_pages uvm_km_pages;
+struct uvm_km_pages uvm_km_pages = { .mtx = MUTEX_INITIALIZER(IPL_VM) };
 
 void uvm_km_createthread(void *);
 void uvm_km_thread(void *);
@@ -664,7 +664,6 @@ uvm_km_page_init(void)
int len, bulk;
vaddr_t addr;
 
-   mtx_init(&uvm_km_pages.mtx, IPL_VM);
if (!uvm_km_pages.lowat) {
/* based on physmem, calculate a good value here */
uvm_km_pages.lowat = physmem / 256;
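
The ordering problem and the static-initializer fix can be shown with a
toy model in plain C.  Everything here is hypothetical (the toy_ names,
the "ready" flag); it only illustrates why an object initialized at
compile time is safe to use before any init function has run.

```c
#include <assert.h>

/* Toy lock, illustrative only: it can tell whether it was set up. */
struct toy_lock {
	int ready;	/* nonzero once initialized */
	int held;
};
#define TOY_LOCK_INITIALIZER	{ .ready = 1 }

struct toy_pages {
	struct toy_lock mtx;
	int lowat;
};

/* Usable from the first instruction: initialized at compile time. */
static struct toy_pages pages_static = { .mtx = TOY_LOCK_INITIALIZER };
/* Zero-filled until toy_pages_init() runs. */
static struct toy_pages pages_runtime;

static void
toy_pages_init(struct toy_pages *p)
{
	p->mtx.ready = 1;
	p->mtx.held = 0;
}

/* Return 0 on success, -1 if the lock was never initialized. */
static int
toy_lock_enter(struct toy_lock *l)
{
	if (!l->ready)
		return -1;
	l->held = 1;
	return 0;
}
```

An early caller that takes pages_runtime.mtx before toy_pages_init() has
run fails, while the same call on pages_static.mtx works regardless of
call order.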

Re: [please test] amd64: schedule clock interrupts against system clock

2021-09-07 Thread Patrick Wildt

Am Mon, Sep 06, 2021 at 09:43:29PM +0200 schrieb Patrick Wildt:
> Am Fri, Jul 30, 2021 at 07:55:29PM +0200 schrieb Alexander Bluhm:
> > On Mon, Jul 26, 2021 at 08:12:39AM -0500, Scott Cheloha wrote:
> > > On Fri, Jun 25, 2021 at 06:09:27PM -0500, Scott Cheloha wrote:
> > > 1 month bump.  I really appreciate the tests I've gotten so far, thank
> > > you.
> > 
> > On my Xeon machine it works and all regress tests pass.
> > 
> > But it fails on my old Opteron machine.  It hangs after attaching
> > cpu1.
> 
> This seems to be caused by contention on the mutex in i8254's gettick().
> 
> With Scott's diff, delay_func is i8254_delay() on that old AMD machine.
> Its gettick() implementation uses a mutex to protect I/O access to the
> i8254.
> 
> When secondary CPUs come up, they will wait for CPU0 to let them boot up
> further by checking for a flag:
> 
>   /*
>* We need to wait until we can identify, otherwise dmesg
>* output will be messy.
>*/
>   while ((ci->ci_flags & CPUF_IDENTIFY) == 0)
>   delay(10);
> 
> Now that machine has 3 secondary cores that are spinning like that.  At
> the same time CPU0 waits for the core to come up:
> 
>   /* wait for it to identify */
>   for (i = 200; (ci->ci_flags & CPUF_IDENTIFY) && i > 0; i--)
>   delay(10);
> 
> That means we have 3-4 cores spinning just to be able to delay().  Our
> mutex implementation isn't fair, which means whoever manages to claim
> the free mutex wins.  Now if CPU2 and CPU3 are spinning all the time,
> CPU1 identifies and needs delay() and CPU0 waits for CPU1, maybe the
> one that needs to make progress never gets it.
> 
> I changed those delay(10) in cpu_hatch() to CPU_BUSY_CYCLE() and it went
> ahead a bit better instead of hanging forever.
> 
> Then I remembered an idea from years ago: fair kernel mutexes,
> so basically mutexes implemented as ticket locks, like our kernel lock.
> 
> I did a quick diff, which probably contains a million bugs, but with
> this bluhm's machine boots up well.
> 
> I'm not saying this is the solution, but it might be.
> 
> Patrick

Cleaned the diff up a little, changes since last time:

* Rename the struct members to be the same as mplock.
* Change the code to use ticket/user numbers like mplock.  This
  has one obvious downside: If a mutex is not initialized, trying
  to get this mutex will result in a hang.  At least that just let
  me find some uninitialized mutexes.
* More consistent use of the 'ci' variable.
* Definitely compiles with/without DIAGNOSTIC.
* Made sure mtx_enter() still has the membar.
* No need for READ_ONCE() when members are volatile.
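
The hang on an uninitialized mutex follows directly from the numbering:
users starts at 0, ticket starts at 1, and the first caller draws ticket
1.  A zero-filled mutex has ticket 0, so ticket 1 is never served.  The
sketch below models this in userland C with a bounded spin so that it
terminates; the tmutex names and the spin bound are assumptions made for
the example.

```c
#include <assert.h>
#include <stdatomic.h>

/* Illustrative only; mirrors the users/ticket numbering described above. */
struct tmutex {
	atomic_uint users;	/* tickets handed out so far */
	atomic_uint ticket;	/* ticket currently being served */
};

#define TMUTEX_INITIALIZER	{ 0, 1 }

/*
 * Spin at most max_spins times; return 1 if the lock was acquired.
 * On failure the drawn ticket is simply abandoned, which is fine for a
 * demonstration but would wedge a real lock.
 */
static int
tmutex_enter_bounded(struct tmutex *m, unsigned int max_spins)
{
	/* New value after the increment, like atomic_inc_int_nv(). */
	unsigned int t = atomic_fetch_add(&m->users, 1) + 1;

	while (atomic_load(&m->ticket) != t) {
		if (max_spins-- == 0)
			return 0;	/* a real mtx_enter() would spin forever */
	}
	return 1;
}

static void
tmutex_leave(struct tmutex *m)
{
	atomic_fetch_add(&m->ticket, 1);	/* serve the next ticket */
}
```

A properly initialized tmutex is acquired on the first check, while a
zero-filled one exhausts the spin budget every time.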

Apart from being fair, this diff also changes the behaviour while
spinning for a lock.  Previously mtx_enter called mtx_enter_try
in a loop until it got the lock.  mtx_enter_try does splraise,
try lock, splx.  This diff currently spins with the SPL raised,
so that's a change in behaviour.  I'm sure I can change the diff
to splraise/splx while looping, if we prefer that behaviour.

Patrick

diff --git a/sys/kern/kern_lock.c b/sys/kern/kern_lock.c
index 5cc55bb256a..1eeb30e0c40 100644
--- a/sys/kern/kern_lock.c
+++ b/sys/kern/kern_lock.c
@@ -248,6 +248,8 @@ __mtx_init(struct mutex *mtx, int wantipl)
mtx->mtx_owner = NULL;
mtx->mtx_wantipl = wantipl;
mtx->mtx_oldipl = IPL_NONE;
+   mtx->mtx_users = 0;
+   mtx->mtx_ticket = 1;
 }
 
 #ifdef MULTIPROCESSOR
@@ -255,15 +257,26 @@ void
 mtx_enter(struct mutex *mtx)
 {
struct schedstate_percpu *spc = &curcpu()->ci_schedstate;
+   struct cpu_info *ci = curcpu();
 #ifdef MP_LOCKDEBUG
int nticks = __mp_lock_spinout;
 #endif
+   u_int t;
+   int s;
+
+   /* Avoid deadlocks after panic or in DDB */
+   if (panicstr || db_active)
+   return;
 
WITNESS_CHECKORDER(MUTEX_LOCK_OBJECT(mtx),
LOP_EXCLUSIVE | LOP_NEWORDER, NULL);
 
+   if (mtx->mtx_wantipl != IPL_NONE)
+   s = splraise(mtx->mtx_wantipl);
+
spc->spc_spinning++;
-   while (mtx_enter_try(mtx) == 0) {
+   t = atomic_inc_int_nv(&mtx->mtx_users);
+   while (mtx->mtx_ticket != t) {
CPU_BUSY_CYCLE();
 
 #ifdef MP_LOCKDEBUG
@@ -275,12 +288,22 @@ mtx_enter(struct mutex *mtx)
 #endif
}
spc->spc_spinning--;
+
+   membar_enter_after_atomic();
+   mtx->mtx_owner = ci;
+   if (mtx->mtx_wantipl != IPL_NONE)
+   mtx->mtx_oldipl = s;
+#ifdef DIAGNOSTIC
+   ci->ci_mutex_level++;
+#endif
+   WITNESS_LOCK(MUTEX_LOCK_OBJECT(mtx), LOP_EXCLUSIVE);
 }
 
 int
 mtx_enter_try(struct mutex *mtx)
 {
-   str

Re: [please test] amd64: schedule clock interrupts against system clock

2021-09-07 Thread Patrick Wildt

Am Tue, Sep 07, 2021 at 02:43:22PM +0200 schrieb Patrick Wildt:
> Am Mon, Sep 06, 2021 at 09:43:29PM +0200 schrieb Patrick Wildt:
> > Am Fri, Jul 30, 2021 at 07:55:29PM +0200 schrieb Alexander Bluhm:
> > > On Mon, Jul 26, 2021 at 08:12:39AM -0500, Scott Cheloha wrote:
> > > > On Fri, Jun 25, 2021 at 06:09:27PM -0500, Scott Cheloha wrote:
> > > > 1 month bump.  I really appreciate the tests I've gotten so far, thank
> > > > you.
> > > 
> > > On my Xeon machine it works and all regress tests pass.
> > > 
> > > But it fails on my old Opteron machine.  It hangs after attaching
> > > cpu1.
> > 
> > This seems to be caused by contention on the mutex in i8254's gettick().
> > 
> > With Scott's diff, delay_func is i8254_delay() on that old AMD machine.
> > Its gettick() implementation uses a mutex to protect I/O access to the
> > i8254.
> > 
> > When secondary CPUs come up, they will wait for CPU0 to let them boot up
> > further by checking for a flag:
> > 
> > /*
> >  * We need to wait until we can identify, otherwise dmesg
> >  * output will be messy.
> >  */
> > while ((ci->ci_flags & CPUF_IDENTIFY) == 0)
> > delay(10);
> > 
> > Now that machine has 3 secondary cores that are spinning like that.  At
> > the same time CPU0 waits for the core to come up:
> > 
> > /* wait for it to identify */
> > for (i = 200; (ci->ci_flags & CPUF_IDENTIFY) && i > 0; i--)
> > delay(10);
> > 
> > That means we have 3-4 cores spinning just to be able to delay().  Our
> > mutex implementation isn't fair, which means whoever manages to claim
> > the free mutex wins.  Now if CPU2 and CPU3 are spinning all the time,
> > CPU1 identifies and needs delay() and CPU0 waits for CPU1, maybe the
> > one that needs to make progress never gets it.
> > 
> > I changed those delay(10) in cpu_hatch() to CPU_BUSY_CYCLE() and it went
> > ahead a bit better instead of hanging forever.
> > 
> > Then I remembered an idea from years ago: fair kernel mutexes,
> > so basically mutexes implemented as ticket locks, like our kernel lock.
> > 
> > I did a quick diff, which probably contains a million bugs, but with
> > this bluhm's machine boots up well.
> > 
> > I'm not saying this is the solution, but it might be.
> > 
> > Patrick
> 
> Cleaned the diff up a little, changes since last time:
> 
> * Rename the struct members to be the same as mplock.
> * Change the code to use ticket/user numbers like mplock.  This
>   has one obvious downside: If a mutex is not initialized, trying
>   to get this mutex will result in a hang.  At least that just let
>   me find some uninitialized mutexes.
> * More consistent use of the 'ci' variable.
> * Definitely compiles with/without DIAGNOSTIC.
> * Made sure mtx_enter() still has the membar.
> * No need for READ_ONCE() when members are volatile.
> 
> Apart from being fair, this diff also changes the behaviour while
> spinning for a lock.  Previously mtx_enter called mtx_enter_try
> in a loop until it got the lock.  mtx_enter_try does splraise,
> try lock, splx.  This diff currently spins with the SPL raised,
> so that's a change in behaviour.  I'm sure I can change the diff
> to splraise/splx while looping, if we prefer that behaviour.
> 
> Patrick

make -j17 seems to use less system time with the diff, so it seems to
have made the machine slightly faster:

old: make -j17  1160.01s user 3244.58s system 1288% cpu 5:41.96 total
new: make -j17  1171.80s user 3059.67s system 1295% cpu 5:26.65 total

I'll change the diff to do splraise/splx while looping, to make the
behaviour more similar to before, and then re-do my testing.
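
To put "slightly faster" into numbers, the relative change can be
computed from the two lines above.  reduction_pct() is a made-up helper,
and since each figure comes from a single run, the result says nothing
about run-to-run noise.

```c
#include <assert.h>

/* Relative reduction, in percent, going from before to after. */
static double
reduction_pct(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}
```

System time drops from 3244.58s to 3059.67s, about 5.7%, and wall time
from 341.96s (5:41.96) to 326.65s (5:26.65), about 4.5%.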

Patrick

Re: [please test] amd64: schedule clock interrupts against system clock

2021-09-07 Thread Patrick Wildt

Am Tue, Sep 07, 2021 at 09:52:42PM +0200 schrieb Martin Pieuchot:
> On 07/09/21(Tue) 21:47, Patrick Wildt wrote:
> > Am Tue, Sep 07, 2021 at 02:43:22PM +0200 schrieb Patrick Wildt:
> > > Am Mon, Sep 06, 2021 at 09:43:29PM +0200 schrieb Patrick Wildt:
> > > > Am Fri, Jul 30, 2021 at 07:55:29PM +0200 schrieb Alexander Bluhm:
> > > > > On Mon, Jul 26, 2021 at 08:12:39AM -0500, Scott Cheloha wrote:
> > > > > > On Fri, Jun 25, 2021 at 06:09:27PM -0500, Scott Cheloha wrote:
> > > > > > > 1 month bump.  I really appreciate the tests I've gotten so far,
> > > > > > > thank you.
> > > > > 
> > > > > On my Xeon machine it works and all regress tests pass.
> > > > > 
> > > > > But it fails on my old Opteron machine.  It hangs after attaching
> > > > > cpu1.
> > > > 
> > > > This seems to be caused by contention on the mutex in i8254's gettick().
> > > > 
> > > > With Scott's diff, delay_func is i8254_delay() on that old AMD machine.
> > > > Its gettick() implementation uses a mutex to protect I/O access to the
> > > > i8254.
> > > > 
> > > > When secondary CPUs come up, they will wait for CPU0 to let them boot up
> > > > further by checking for a flag:
> > > > 
> > > > /*
> > > >  * We need to wait until we can identify, otherwise dmesg
> > > >  * output will be messy.
> > > >  */
> > > > while ((ci->ci_flags & CPUF_IDENTIFY) == 0)
> > > > delay(10);
> > > > 
> > > > Now that machine has 3 secondary cores that are spinning like that.  At
> > > > the same time CPU0 waits for the core to come up:
> > > > 
> > > > /* wait for it to identify */
> > > > for (i = 200; (ci->ci_flags & CPUF_IDENTIFY) && i > 0; i--)
> > > > delay(10);
> > > > 
> > > > That means we have 3-4 cores spinning just to be able to delay().  Our
> > > > mutex implementation isn't fair, which means whoever manages to claim
> > > > the free mutex wins.  Now if CPU2 and CPU3 are spinning all the time,
> > > > CPU1 identifies and needs delay() and CPU0 waits for CPU1, maybe the
> > > > one that needs to make progress never gets it.
> > > > 
> > > > I changed those delay(10) in cpu_hatch() to CPU_BUSY_CYCLE() and it went
> > > > ahead a bit better instead of hanging forever.
> > > > 
> > > > Then I remembered an idea from years ago: fair kernel mutexes,
> > > > so basically mutexes implemented as ticket locks, like our kernel lock.
> > > > 
> > > > I did a quick diff, which probably contains a million bugs, but with
> > > > this bluhm's machine boots up well.
> > > > 
> > > > I'm not saying this is the solution, but it might be.
> > > > 
> > > > Patrick
> > > 
> > > Cleaned the diff up a little, changes since last time:
> > > 
> > > * Rename the struct members to be the same as mplock.
> > > * Change the code to use ticket/user numbers like mplock.  This
> > >   has one obvious downside: If a mutex is not initialized, trying
> > >   to get this mutex will result in a hang.  At least that just let
> > >   me find some uninitialized mutexes.
> > > * More consistent use of the 'ci' variable.
> > > * Definitely compiles with/without DIAGNOSTIC.
> > > * Made sure mtx_enter() still has the membar.
> > > * No need for READ_ONCE() when members are volatile.
> > > 
> > > Apart from being fair, this diff also changes the behaviour while
> > > spinning for a lock.  Previously mtx_enter called mtx_enter_try
> > > in a loop until it got the lock.  mtx_enter_try does splraise,
> > > try lock, splx.  This diff currently spins with the SPL raised,
> > > so that's a change in behaviour.  I'm sure I can change the diff
> > > to splraise/splx while looping, if we prefer that behaviour.
> > > 
> > > Patrick
> 
> This change makes sense on its own as the contention is switching away
> from KERNEL_LOCK() to mutexes.
> 
> Note that hppa has its own mutex implementation in case somebody wants
> to keep in sync.
>  
> > make -j17 seems to have used less system time, so that seemed to have
> > made the machine slightly faster:
> > 
> > old: make -j17  1160.01s user 3244.58s system 1288% cpu 5:41.96 total
> > new: make -j17  1171.80s user 3059.67s system 1295% cpu 5:26.65 total
> 
> Is it with -current or with the UVM unlocking diff that put more
> pressure on mutexes?

It's with the UVM unlocking diff that put more pressure on mutexes.

> > I'll change the diff to do splraise/splx while looping, to make the
> > behaviour more similar to before, and then re-do my testing.
> 
> That'd be nice.  You could also start a new thread to get attention of
> more people, maybe dlg@, visa@ or kettenis@ have an opinion on this.

You can read my mind. :)

Re: mutex(9): initialize some more mutexes before use?

2021-09-09 Thread Patrick Wildt

Am Thu, Sep 09, 2021 at 12:55:13PM +0200 schrieb Mark Kettenis:
> > Date: Wed, 8 Sep 2021 10:45:53 +0200
> > From: Martin Pieuchot 
> > 
> > On 07/09/21(Tue) 14:19, Patrick Wildt wrote:
> > > Hi,
> > > 
> > > I was playing around a little with the mutex code and found that on
> > > arm64 there are some uninitialized mutexes out there.
> > > 
> > > I think the arm64 specific one is comparatively easy to solve.  We
> > > either initialize the mtx when we initialize the rest of the pmap, or
> > > we move it into the global definition of those.  I opted for the former
> > > version.
> > 
> > Is the kernel pmap mutex supposed to be used?  On i386 it isn't so the
> > mutex's IPL is set to -1 and we added a KASSERT() in splraise() to spot
> > any mistake.
> 
> Indeed.  The kernel pmap is special:
> 
> * It can never disappear.
> 
> * Page table pages are pre-allocated and are never freed.
> 
> * Mappings are (largely) unmanaged (by uvm).
> 
> Therefore the per-pmap lock isn't used for the kernel map on most
> (all?) architectures.

The one that 'crashed' was pmap_tramp.  I only changed the kernel pmap
because it was like 5 lines above (or below) and seemed to be missing it
as well.

> > > The other one prolly needs more discussion/debugging.  So uvm_init()
> > > calls first pmap_init() and then uvm_km_page_init().  The latter does
> > > initialize the mutex, but arm64's pmap_init() already uses pools, which
> > > uses km_alloc, which then uses that mutex.  Now one easy fix would be
> > > to just initialize the definition right away instead of during runtime.
> > > 
> > > But there might be the question if arm64's pmap is allowed to use pools
> > > and km_alloc during pmap_init.
> > 
> > That's a common question for the family of pmaps calling pool_setlowat()
> > in pmap_init().  That's where pool_prime() is called from.
> > 
> > > > #0  0xff800073f984 in mtx_enter (mtx=0xff8000f3b048) at /usr/src/sys/kern/kern_lock.c:281
> > > > #1  0xff8000937e6c in km_alloc (sz=<error reading variable: Unhandled dwarf expression opcode 0xa3>, kv=0xff8000da6a30, kp=0xff8000da6a48, kd=0xff8000e934d8) at /usr/src/sys/uvm/uvm_km.c:899
> > > > #2  0xff800084d804 in pool_page_alloc (pp=<error reading variable: Unhandled dwarf expression opcode 0xa3>, flags=<error reading variable: Unhandled dwarf expression opcode 0xa3>, slowdown=<error reading variable: Unhandled dwarf expression opcode 0xa3>) at /usr/src/sys/kern/subr_pool.c:1633
> > > > #3  0xff800084f8dc in pool_allocator_alloc (pp=0xff8000ea6e40, flags=65792, slowdown=0xff80026cd098) at /usr/src/sys/kern/subr_pool.c:1602
> > > > #4  0xff800084ef08 in pool_p_alloc (pp=0xff8000ea6e40, flags=2, slowdown=0xff8000e9359c) at /usr/src/sys/kern/subr_pool.c:926
> > > > #5  0xff800084f808 in pool_prime (pp=, n=<error reading variable: Unhandled dwarf expression opcode 0xa3>) at /usr/src/sys/kern/subr_pool.c:896
> > > > #6  0xff800048c20c in pmap_init () at /usr/src/sys/arch/arm64/arm64/pmap.c:1682
> > > > #7  0xff80009384dc in uvm_init () at /usr/src/sys/uvm/uvm_init.c:118
> > > > #8  0xff800048e664 in main (framep=<error reading variable: Unhandled dwarf expression opcode 0xa3>) at /usr/src/sys/kern/init_main.c:235
> > > 
> > > diff --git a/sys/arch/arm64/arm64/pmap.c b/sys/arch/arm64/arm64/pmap.c
> > > index 79a344cc84e..f070f4540ec 100644
> > > --- a/sys/arch/arm64/arm64/pmap.c
> > > +++ b/sys/arch/arm64/arm64/pmap.c
> > > @@ -1308,10 +1308,12 @@ pmap_bootstrap(long kvo, paddr_t lpt1, long 
> > > kernelstart, long kernelend,
> > >   pmap_kernel()->pm_vp.l1 = (struct pmapvp1 *)va;
> > >   pmap_kernel()->pm_privileged = 1;
> > >   pmap_kernel()->pm_asid = 0;
> > > + mtx_init(&pmap_kernel()->pm_mtx, IPL_VM);
> > >  
> > >   pmap_tramp.pm_vp.l1 = (struct pmapvp1 *)va + 1;
> > >   pmap_tramp.pm_privileged = 1;
> > >   pmap_tramp.pm_asid = 0;
> > > + mtx_init(&pmap_tramp.pm_mtx, IPL_VM);
> > >  
> > >   /* Mark ASID 0 as in-use. */
> > >   pmap_asid[0] |= (3U << 0);
> > > diff --git a/sys/uvm/uvm_km.c b/sys/uvm/uvm_km.c
> > > index 4a60377e9d7..e77afeda832 100644
> > > --- a/sys/uvm/uvm_km.c
> > > +++ b/sys/uvm/uvm_km.c
> > > @@ -644,7 +644,7 @@ uvm_km_page_lateinit(void)
> > >   * not zero filled.
> > >   */
> > >  
> > > -struct uvm_km_pages uvm_km_pages;
> > > +struct uvm_km_pages uvm_km_pages = { .mtx = MUTEX_INITIALIZER(IPL_VM) };
> > >  
> > >  void uvm_km_createthread(void *);
> > >  void uvm_km_thread(void *);
> > > @@ -664,7 +664,6 @@ uvm_km_page_init(void)
> > >   int len, bulk;
> > >   vaddr_t addr;
> > >  
> > > - mtx_init(&uvm_km_pages.mtx, IPL_VM);
> > >   if (!uvm_km_pages.lowat) {
> > >   /* based on physmem, calculate a good value here */
> > >   uvm_km_pages.lowat = physmem / 256;
> > > 
> > 
> > 
>
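
The init-order problem in the backtrace above can be illustrated with a
small userland sketch.  This is not kernel code: toy_mtx,
TOY_MTX_INITIALIZER and the helpers are made-up names that only mimic
the shape of mtx_init() and MUTEX_INITIALIZER().  It shows why a lock
initialized in its definition is usable by early callers, while one
initialized at runtime is not.

```c
#include <assert.h>

/* Toy stand-in for a kernel mutex; all names here are hypothetical. */
struct toy_mtx {
	int	initialized;	/* set by toy_mtx_init() or the initializer */
	int	wantipl;	/* IPL the lock is supposed to raise to */
	int	locked;
};

#define TOY_IPL_VM			5	/* arbitrary value for the sketch */
#define TOY_MTX_INITIALIZER(ipl)	{ 1, (ipl), 0 }

/* Initialized at runtime: unusable until toy_mtx_init() has run. */
struct toy_mtx late_mtx;

/* Initialized in the definition: usable by the very first caller. */
struct toy_mtx early_mtx = TOY_MTX_INITIALIZER(TOY_IPL_VM);

void
toy_mtx_init(struct toy_mtx *m, int ipl)
{
	m->initialized = 1;
	m->wantipl = ipl;
	m->locked = 0;
}

/* Returns 0 on success, -1 if the lock is used before initialization. */
int
toy_mtx_enter(struct toy_mtx *m)
{
	if (!m->initialized)
		return -1;
	assert(!m->locked);
	m->locked = 1;
	return 0;
}

void
toy_mtx_leave(struct toy_mtx *m)
{
	assert(m->locked);
	m->locked = 0;
}
```

In this model an early caller can take early_mtx right away, whereas
late_mtx fails until toy_mtx_init() has been called on it.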

Re: Turn SCHED_LOCK() into a mutex

2021-09-09 Thread Patrick Wildt

Am Wed, Nov 04, 2020 at 09:13:22AM -0300 schrieb Martin Pieuchot:
> Diff below removes the recursive attribute of the SCHED_LOCK() by
> turning it into a IPL_NONE mutex.  I'm not intending to commit it
> yet as it raises multiple questions, see below. 
> 
> This work has been started by art@ more than a decade ago and I'm
> willing to finish it as I believe it's the easiest way to reduce
> the scope of this lock.  Having a global mutex is the first step to
> have a per runqueue and per sleepqueue mutex. 
> 
> This is also a way to avoid lock ordering problems exposed by the recent
> races in single_thread_set().
> 
> About the diff:
> 
>  The diff below includes a (hugly) refactoring of rw_exit() to avoid a
>  recursion on the SCHED_LOCK().  In this case the lock is used to protect
>  the global sleepqueue and is grabbed in sleep_setup().
> 
>  The same pattern can be observed in single_thread_check().  However in
>  this case the lock is used to protect different fields so there's no
>  "recursive access" to the same data structure.
> 
>  assertwaitok() has been moved down in mi_switch() which isn't ideal.
> 
>  It becomes obvious that the per-CPU and per-thread accounting fields
>  updated in mi_switch() won't need a separate mutex as proposed last
>  year and that splitting this global mutex will be enough.
> 
> It's unclear to me if/how WITNESS should be modified to handle this lock
> change.
> 
> This has been tested on sparc64 and amd64.  I'm not convinced it exposed
> all the recursions.  So if you want to give it a go and can break it, it
> is more than welcome.
> 
> Comments?  Questions?

I think that's a good direction.  This then allows us to add a few
smaller mutexes and move some users to those.

In the meantime apparently some stuff has changed a little bit.  I had
to fix one reject, which was very easy because only the context around
the diff changed a tiny bit.

I then found one schedlock recursion:

single_thread_check() takes the sched lock, and we reach it through
sleep_signal_check().  That call happens twice: once with the sched
lock held and once without.  My 'quick' fix is to introduce a locked
version that calls single_thread_check_locked().  There might be
a nicer way to do that, which I don't know yet, but this diff seems
to work fine on my arm64 machine with amdgpu.

Build times increase a little, which might be because mutexes are
not 'fair' like mplocks.  If we are looking for 'instant gratification'
we might have to do two steps at once (mplock -> mtx + move contention
to multiple mtxs).  In any case, I believe this is a good step to take
and I'll have a look at reducing the contention.

Patrick

diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c
index 1e51f71301a..4e4e1fc2a27 100644
--- a/sys/kern/kern_fork.c
+++ b/sys/kern/kern_fork.c
@@ -671,7 +671,7 @@ void
 proc_trampoline_mp(void)
 {
SCHED_ASSERT_LOCKED();
-   __mp_unlock(&sched_lock);
+   mtx_leave(&sched_lock);
spl0();
SCHED_ASSERT_UNLOCKED();
KERNEL_ASSERT_UNLOCKED();
diff --git a/sys/kern/kern_lock.c b/sys/kern/kern_lock.c
index 5cc55bb256a..2206ac53321 100644
--- a/sys/kern/kern_lock.c
+++ b/sys/kern/kern_lock.c
@@ -97,9 +97,6 @@ ___mp_lock_init(struct __mp_lock *mpl, const struct lock_type 
*type)
if (mpl == &kernel_lock)
mpl->mpl_lock_obj.lo_flags = LO_WITNESS | LO_INITIALIZED |
LO_SLEEPABLE | (LO_CLASS_KERNEL_LOCK << LO_CLASSSHIFT);
-   else if (mpl == &sched_lock)
-   mpl->mpl_lock_obj.lo_flags = LO_WITNESS | LO_INITIALIZED |
-   LO_RECURSABLE | (LO_CLASS_SCHED_LOCK << LO_CLASSSHIFT);
WITNESS_INIT(&mpl->mpl_lock_obj, type);
 #endif
 }
diff --git a/sys/kern/kern_rwlock.c b/sys/kern/kern_rwlock.c
index d79b59748e8..2feeff16943 100644
--- a/sys/kern/kern_rwlock.c
+++ b/sys/kern/kern_rwlock.c
@@ -129,36 +129,6 @@ rw_enter_write(struct rwlock *rwl)
}
 }
 
-void
-rw_exit_read(struct rwlock *rwl)
-{
-   unsigned long owner;
-
-   rw_assert_rdlock(rwl);
-   WITNESS_UNLOCK(&rwl->rwl_lock_obj, 0);
-
-   membar_exit_before_atomic();
-   owner = rwl->rwl_owner;
-   if (__predict_false((owner & RWLOCK_WAIT) ||
-   rw_cas(&rwl->rwl_owner, owner, owner - RWLOCK_READ_INCR)))
-   rw_do_exit(rwl, 0);
-}
-
-void
-rw_exit_write(struct rwlock *rwl)
-{
-   unsigned long owner;
-
-   rw_assert_wrlock(rwl);
-   WITNESS_UNLOCK(&rwl->rwl_lock_obj, LOP_EXCLUSIVE);
-
-   membar_exit_before_atomic();
-   owner = rwl->rwl_owner;
-   if (__predict_false((owner & RWLOCK_WAIT) ||
-   rw_cas(&rwl->rwl_owner, owner, 0)))
-   rw_do_exit(rwl, RWLOCK_WRLOCK);
-}
-
 #ifdef DIAGNOSTIC
 /*
  * Put the diagnostic functions here to keep the main code free
@@ -313,9 +283,10 @@ retry:
 }
 
 void
-rw_exit(struct rwlock *rwl)
+_rw_exit(struct rwlock *rwl, int locked)
 {
unsigned long wrlock;
+   unsigned long owner, set;

Re: [please test] amd64: schedule clock interrupts against system clock

2021-09-09 Thread Patrick Wildt

Am Thu, Sep 09, 2021 at 04:10:31PM +0200 schrieb Mark Kettenis:
> > Date: Mon, 6 Sep 2021 21:43:29 +0200
> > From: Patrick Wildt 
> > 
> > Am Fri, Jul 30, 2021 at 07:55:29PM +0200 schrieb Alexander Bluhm:
> > > On Mon, Jul 26, 2021 at 08:12:39AM -0500, Scott Cheloha wrote:
> > > > On Fri, Jun 25, 2021 at 06:09:27PM -0500, Scott Cheloha wrote:
> > > > 1 month bump.  I really appreciate the tests I've gotten so far, thank
> > > > you.
> > > 
> > > On my Xeon machine it works and all regress tests pass.
> > > 
> > > But it fails on my old Opteron machine.  It hangs after attaching
> > > cpu1.
> > 
> > This seems to be caused by contention on the mutex in i8254's gettick().
> > 
> > With Scott's diff, delay_func is i8254_delay() on that old AMD machine.
> > Its gettick() implementation uses a mutex to protect I/O access to the
> > i8254.
> > 
> > When secondary CPUs come up, they will wait for CPU0 to let them boot up
> > further by checking for a flag:
> > 
> > /*
> >  * We need to wait until we can identify, otherwise dmesg
> >  * output will be messy.
> >  */
> > while ((ci->ci_flags & CPUF_IDENTIFY) == 0)
> > delay(10);
> > 
> > Now that machine has 3 secondary cores that are spinning like that.  At
> > the same time CPU0 waits for the core to come up:
> > 
> > /* wait for it to identify */
> > for (i = 200; (ci->ci_flags & CPUF_IDENTIFY) && i > 0; i--)
> > delay(10);
> > 
> > That means we have 3-4 cores spinning just to be able to delay().  Our
> > mutex implementation isn't fair, which means whoever manages to claim
> > the free mutex wins.  Now if CPU2 and CPU3 are spinning all the time,
> > CPU1 identifies and needs delay() and CPU0 waits for CPU1, maybe the
> > one that needs to make progress never gets it.
> > 
> > I changed those delay(10) in cpu_hatch() to CPU_BUSY_CYCLE() and it went
> > ahead a bit better instead of hanging forever.
> > 
> > Then I remembered an idea from years ago: fair kernel mutexes,
> > so basically mutexes implemented as ticket locks, like our kernel lock.
> > 
> > I did a quick diff, which probably contains a million bugs, but with
> > this bluhm's machine boots up well.
> > 
> > I'm not saying this is the solution, but it might be.
> 
> So the idea really is that the kernel mutexes are cheap and simple
> spin locks.  The assumption has always been that there shouldn't be a
> lot of contention on them.  If you have contention, your locking
> probably isn't fine-grained enough, or you're using the wrong lock
> type.  Note that our mpsafe pmaps use a per-page mutex.  So increasing
> the size of struct mutex is going to have a significant impact.
> 
> Maybe we need another lock type, although we already have one that
> tries to be "fair": struct __mp_lock, which is what we use for the
> kernel lock and the scheduler lock.  A non-recursive version of that
> might make sense.

After further testing I have come to the conclusion that changing
our mutexes to become a ticket lock is a flawed concept.  There are
lock order issues with the kernel lock, because the spinning path
can be interrupted by code that takes the kernel lock.

CPU0: has the mutex, will soon release it
CPU1: gets the next ticket, spins on the mutex, gets an IRQ that tries
  to take the kernel lock
CPU2: has the kernel lock, tries to get the mutex, gets the ticket after
  CPU1, spins on the mutex

So now CPU2 is waiting for CPU1 to be done with the mutex, but since
CPU1 is waiting for CPU2 to be done with the kernel lock, we're stuck.

As long as a mutex can be interrupted by stuff that takes the kernel
lock, this concept won't work.
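
To make the ordering problem concrete, here is a minimal single-threaded
model of a ticket lock.  It is only a sketch: the toy_ticket_* names are
hypothetical, there are no atomics or memory barriers, and it is not the
kernel's __mp_lock code.  It only shows that tickets are served strictly
in order, which is what gets CPU2 stuck behind CPU1.

```c
#include <assert.h>

/* Toy ticket lock state; hypothetical names, no atomics. */
struct toy_ticket_lock {
	unsigned int	next_ticket;	/* next ticket to hand out */
	unsigned int	now_serving;	/* ticket allowed to hold the lock */
};

/* Take a ticket; the caller then spins until toy_ticket_ready(). */
unsigned int
toy_ticket_take(struct toy_ticket_lock *l)
{
	return l->next_ticket++;
}

/* Non-zero once it is this ticket's turn. */
int
toy_ticket_ready(const struct toy_ticket_lock *l, unsigned int ticket)
{
	return l->now_serving == ticket;
}

/* Release the lock and pass it on to the next ticket in line. */
void
toy_ticket_release(struct toy_ticket_lock *l)
{
	/* Somebody must hold the lock for a release to make sense. */
	assert(l->now_serving != l->next_ticket);
	l->now_serving++;
}
```

Replaying the scenario: CPU0 takes ticket 0 and holds the lock, CPU1
takes ticket 1, CPU2 takes ticket 2.  When CPU0 releases, only ticket 1
becomes ready.  If CPU1 is sitting in an interrupt handler waiting for
the kernel lock that CPU2 holds, CPU2 can never be served.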

Back to the original problem: contention on the i8254_delay lock.

Changing it to an mp lock could work, because then it would be fair.
Does it need to be fair?  If we reduce the contention, there's no need
to change the lock.

So there are two things one could have a look at.  First, make use of
CPU_BUSY_CYCLE() instead of delay(10) when spinning up cores.  Second,
try to not use i8254_delay as delay_func.

Patrick

avoid sched lock recursion in sleep_signal_check()

2021-09-09 Thread Patrick Wildt

Hi,

one step to (at some point) change the sched lock to a mutex is to start
avoiding recursion on the sched lock.

single_thread_check() always takes the sched lock.  If we want to avoid
recursion, we need to call the locked version and take the sched lock
ourselves if we need to.

Another option might be to just put SCHED_LOCK()/SCHED_UNLOCK() around
the call to sleep_signal_check(), but that might cause confusion in
the future if we ever wonder why we are taking the lock there in the
first place.
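
The locked/unlocked split described above is a common pattern for
non-recursive locks.  Here is a minimal userland sketch of it.  All
toy_* names are hypothetical, and a plain flag stands in for the sched
lock; the asserts play the role of SCHED_ASSERT_LOCKED().

```c
#include <assert.h>

/* Toy non-recursive lock; hypothetical stand-in for the sched lock. */
static int toy_sched_locked;

void
toy_sched_lock(void)
{
	assert(!toy_sched_locked);	/* recursion would trip here */
	toy_sched_locked = 1;
}

void
toy_sched_unlock(void)
{
	assert(toy_sched_locked);
	toy_sched_locked = 0;
}

/* Does the real work; the caller must already hold the lock. */
int
toy_check_locked(int pending)
{
	assert(toy_sched_locked);
	return pending ? 1 : 0;
}

/* Wrapper for callers that do not hold the lock yet. */
int
toy_check(int pending)
{
	int err;

	toy_sched_lock();
	err = toy_check_locked(pending);
	toy_sched_unlock();

	return err;
}
```

A caller that already holds the lock uses toy_check_locked(), everybody
else uses toy_check(), so the lock is never taken twice.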

Opinions?

Patrick

diff --git a/sys/kern/kern_synch.c b/sys/kern/kern_synch.c
index b476a6b4253..030b3a38326 100644
--- a/sys/kern/kern_synch.c
+++ b/sys/kern/kern_synch.c
@@ -66,6 +66,7 @@
 #endif
 
 intsleep_signal_check(void);
+intsleep_signal_check_locked(int s);
 intthrsleep(struct proc *, struct sys___thrsleep_args *);
 intthrsleep_unlock(void *);
 
@@ -410,7 +411,7 @@ sleep_finish(struct sleep_state *sls, int do_sleep)
 * that case we need to unwind immediately.
 */
atomic_setbits_int(&p->p_flag, P_SINTR);
-   if ((error = sleep_signal_check()) != 0) {
+   if ((error = sleep_signal_check_locked(sls->sls_s)) != 0) {
p->p_stat = SONPROC;
sls->sls_catch = 0;
do_sleep = 0;
@@ -470,14 +471,30 @@ sleep_finish(struct sleep_state *sls, int do_sleep)
 
 /*
  * Check and handle signals and suspensions around a sleep cycle.
+ *
+ * single_thread_check() always takes the sched lock.  To avoid
+ * recursion on the sched lock, call the locked version and take
+ * it ourselves if we need to.
  */
 int
 sleep_signal_check(void)
+{
+   int err, s;
+
+   SCHED_LOCK(s);
+   err = sleep_signal_check_locked(s);
+   SCHED_UNLOCK(s);
+
+   return err;
+}
+
+int
+sleep_signal_check_locked(int s)
 {
struct proc *p = curproc;
int err, sig;
 
-   if ((err = single_thread_check(p, 1)) != 0)
+   if ((err = single_thread_check_locked(p, 1, s)) != 0)
return err;
if ((sig = cursig(p)) != 0) {
if (p->p_p->ps_sigacts->ps_sigintr & sigmask(sig))
diff --git a/sys/sys/proc.h b/sys/sys/proc.h
index 4dbc097f242..978fd5632cd 100644
--- a/sys/sys/proc.h
+++ b/sys/sys/proc.h
@@ -605,6 +605,7 @@ int single_thread_set(struct proc *, enum 
single_thread_mode, int);
 intsingle_thread_wait(struct process *, int);
 void   single_thread_clear(struct proc *, int);
 intsingle_thread_check(struct proc *, int);
+intsingle_thread_check_locked(struct proc *, int, int);
 
 void   child_return(void *);

ssdfb(4): SSD1309 OLED display (128x64)

2018-07-27 Thread Patrick Wildt

Hi,

I have an SPI-connected SSD1309 OLED display (128x64) that I would like
to support.  At some point I'd like to attach a graphical program to it,
so that a userland tool can draw graphics (or maybe something like X11).

I don't know the display subsystem very well, so this diff attempts to
attach wsdisplay(4) to ssdfb(4).  Since it's SPI connected and there is
no memory mapped framebuffer I have to manually update the display.  One
attempt I tried was hooking up the putchar/col/row functions, but I have
simplified it into a 100ms periodic timeout.  This seems usable enough.
Is there a better approach?

With the integrated 8x16 font I do get 4 rows with 16 characters per
row.  This is rather hard to use; a 5x8 font would make some more room.
But at the moment I don't intend to change that.
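
The row and column counts follow directly from integer division of the
panel size by the font cell size.  A small sketch of that arithmetic
(text_grid and text_grid_for() are made-up names, not part of rasops):

```c
#include <assert.h>

struct text_grid {
	int	cols;
	int	rows;
};

/* Whole character cells that fit on the panel; partial cells are dropped. */
struct text_grid
text_grid_for(int disp_w, int disp_h, int font_w, int font_h)
{
	struct text_grid g;

	assert(font_w > 0 && font_h > 0);
	g.cols = disp_w / font_w;
	g.rows = disp_h / font_h;
	return g;
}
```

For the 128x64 panel this gives 16 columns by 4 rows with the 8x16
font, matching the numbers above, and 25 columns by 8 rows with a 5x8
font (with 3 pixel columns left over).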

Feedback?

Patrick

diff --git a/sys/arch/arm64/conf/GENERIC b/sys/arch/arm64/conf/GENERIC
index d93417f4cce..144e54bd1c8 100644
--- a/sys/arch/arm64/conf/GENERIC
+++ b/sys/arch/arm64/conf/GENERIC
@@ -112,6 +112,8 @@ sdmmc*  at imxesdhc?
 imxpd* at fdt?
 imxdwusb*  at fdt?
 imxspi*at fdt?
+ssdfb* at spi?
+wsdisplay* at ssdfb?
 
 # Raspberry Pi 3
 bcmaux*at fdt?
diff --git a/sys/arch/armv7/conf/GENERIC b/sys/arch/armv7/conf/GENERIC
index 6f43b436c54..7a8cea0c4d0 100644
--- a/sys/arch/armv7/conf/GENERIC
+++ b/sys/arch/armv7/conf/GENERIC
@@ -60,6 +60,8 @@ imxehci*  at fdt? # EHCI
 usb*   at imxehci?
 imxrtc*at fdt? # SNVS RTC
 imxspi*at fdt?
+ssdfb* at spi?
+wsdisplay* at ssdfb?
 
 # OMAP3xxx/OMAP4xxx SoC
 omap0  at mainbus?
diff --git a/sys/dev/fdt/files.fdt b/sys/dev/fdt/files.fdt
index f65d9c3a5b8..52851b404d9 100644
--- a/sys/dev/fdt/files.fdt
+++ b/sys/dev/fdt/files.fdt
@@ -260,3 +260,7 @@ filedev/fdt/ccp_fdt.c   ccp_fdt
 
 attach com at fdt with com_fdt
 file   dev/fdt/com_fdt.c   com_fdt
+
+device ssdfb: wsemuldisplaydev, rasops1
+attach ssdfb at spi
+file   dev/fdt/ssdfb.c ssdfb
diff --git a/sys/dev/fdt/ssdfb.c b/sys/dev/fdt/ssdfb.c
new file mode 100644
index 000..ca136e0a28f
--- /dev/null
+++ b/sys/dev/fdt/ssdfb.c
@@ -0,0 +1,441 @@
+/* $OpenBSD$ */
+/*
+ * Copyright (c) 2018 Patrick Wildt 
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#define SSDFB_SET_LOWER_COLUMN_START_ADRESS0x00
+#define SSDFB_SET_HIGHER_COLUMN_START_ADRESS   0x10
+#define SSDFB_SET_MEMORY_ADRESSING_MODE0x20
+#define SSDFB_SET_START_LINE   0x40
+#define SSDFB_SET_CONTRAST_CONTROL 0x81
+#define SSDFB_SET_COLUMN_DIRECTION_REVERSE 0xa1
+#define SSDFB_SET_MULTIPLEX_RATIO  0xa8
+#define SSDFB_SET_COM_OUTPUT_DIRECTION 0xc0
+#define SSDFB_ENTIRE_DISPLAY_ON0xa4
+#define SSDFB_SET_DISPLAY_MODE_NORMAL  0xa6
+#define SSDFB_SET_DISPLAY_MODE_INVERS  0xa7
+#define SSDFB_SET_DISPLAY_OFF  0xae
+#define SSDFB_SET_DISPLAY_ON   0xaf
+#define SSDFB_SET_DISPLAY_OFFSET   0xd3
+#define SSDFB_SET_DISPLAY_CLOCK_DIVIDE_RATIO   0xd5
+#define SSDFB_SET_PRE_CHARGE_PERIOD0xd9
+#define SSDFB_SET_COM_PINS_HARD_CONF   0xda
+#define SSDFB_SET_VCOM_DESELECT_LEVEL  0xdb
+#define SSDFB_SET_PAGE_START_ADRESS0xb0
+
+#define SSDFB_WIDTH128
+#define SSDFB_HEIGHT   64
+
+struct ssdfb_softc {
+   struct devicesc_dev;
+   spi_tag_tsc_tag;
+   int  sc_node;
+
+   struct spi_configsc_conf;
+   uint32_t*sc_gpio;
+   size_t   sc_gpiolen;
+   int  sc_cd;
+
+   uint8_t *sc_fb;
+   size_t   sc_fbsize;
+   struct rasops_info   sc_rinfo;
+
+   struct task  sc_task;
+   struct timeout   sc_to;
+};
+
+int ssdfb_match(struct device *, void *, void *);
+voidssdfb_attach(struct d

ssdfb(4): mmap(2) support for X11

2018-07-31 Thread Patrick Wildt

Hi,

this adds mmap(2) support for ssdfb(4), so we can map the framebuffer
from userland and have X11 or another graphical program write into it.
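
Since mmap(2) works on whole pages, the diff below allocates the
framebuffer with its size rounded up to PAGE_SIZE and rejects offsets
outside the buffer.  A userland sketch of that arithmetic, assuming a
4096-byte page and a 1 bit-per-pixel buffer (the toy_* names are
hypothetical):

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE	4096	/* assumed page size for the sketch */

/* Round x up to the next multiple of y (y > 0). */
size_t
toy_roundup(size_t x, size_t y)
{
	assert(y > 0);
	return ((x + y - 1) / y) * y;
}

/* Bytes needed for a 1 bit-per-pixel framebuffer. */
size_t
toy_fbsize(size_t width, size_t height)
{
	return (width * height) / 8;
}

/* Returns 1 if off is a valid offset into an fbsize-byte buffer. */
int
toy_mmap_off_ok(long long off, size_t fbsize)
{
	if (off < 0)
		return 0;
	return (unsigned long long)off < fbsize;
}
```

With these assumptions the 128x64 panel needs 1024 bytes, which rounds
up to a single 4096-byte page.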

ok?

Patrick

diff --git a/sys/dev/fdt/ssdfb.c b/sys/dev/fdt/ssdfb.c
index e91762a0ad2..8f141608c91 100644
--- a/sys/dev/fdt/ssdfb.c
+++ b/sys/dev/fdt/ssdfb.c
@@ -24,6 +24,8 @@
 #include 
 #include 
 
+#include 
+
 #include 
 
 #include 
@@ -190,7 +192,8 @@ ssdfb_attach(struct device *parent, struct device *self, 
void *aux)
sc->sc_fb = malloc(sc->sc_fbsize, M_DEVBUF, M_WAITOK | M_ZERO);
 
ri = &sc->sc_rinfo;
-   ri->ri_bits = malloc(sc->sc_fbsize, M_DEVBUF, M_WAITOK | M_ZERO);
+   ri->ri_bits = km_alloc(roundup(sc->sc_fbsize, PAGE_SIZE),
+   &kv_any, &kp_zero, &kd_waitok);
ri->ri_bs = ssdfb_bs;
ri->ri_flg = RI_CLEAR | RI_VCONS;
ri->ri_depth = 1;
@@ -230,7 +233,8 @@ ssdfb_detach(struct device *self, int flags)
struct rasops_info *ri = &sc->sc_rinfo;
timeout_del(&sc->sc_to);
task_del(systq, &sc->sc_task);
-   free(ri->ri_bits, M_DEVBUF, sc->sc_fbsize);
+   km_free(ri->ri_bits, roundup(sc->sc_fbsize, PAGE_SIZE),
+   &kv_any, &kp_zero);
free(sc->sc_fb, M_DEVBUF, sc->sc_fbsize);
free(sc->sc_gpio, M_DEVBUF, sc->sc_gpiolen);
return 0;
@@ -392,7 +396,17 @@ ssdfb_ioctl(void *v, u_long cmd, caddr_t data, int flag, 
struct proc *p)
 paddr_t
 ssdfb_mmap(void *v, off_t off, int prot)
 {
-   return -1;
+   struct ssdfb_softc  *sc = v;
+   struct rasops_info  *ri = &sc->sc_rinfo;
+   paddr_t  pa;
+
+   if (off >= sc->sc_fbsize || off < 0)
+   return (-1);
+
+   if (!pmap_extract(pmap_kernel(), (vaddr_t)ri->ri_bits, &pa))
+   return (-1);
+
+   return (pa + off);
 }
 
 int

Re: ssdfb(4): SSD1309 OLED display (128x64)

2018-08-01 Thread Patrick Wildt

On Tue, Jul 31, 2018 at 03:05:11PM +0200, Mark Kettenis wrote:
> You might want to look at what the udl(4) driver does.  The kernel
> driver lives in sys/dev/usb/udl.c and implements a "damage" ioctl that
> updates a region of the actual framebuffer from the virtual
> framebuffer that lives in physical memory.  X supposedly keeps track
> of the "damage" and the xf86-video-wsudl driver uses that to transfer
> the appropriate pixels.  Maybe this mechanism could be generalized by
> implementing a wscons ioctl?
> 
> Your approach is a bit wasteful since you're doing work even when the
> display is completely static.  Another problem is that your timeout
> might run in the middle of an update of the virtual framebuffer.
> 
> Not sure how this would work with graphics stuff that doesn't go
> through X though.  Something that just draws into a mmap'ed virtual
> framebuffer and issues ioctls should work though.

That is very helpful indeed.  I have come up with a quick diff to test
it, and it makes X work rather smoothly on that SPI-connected OLED.  I'm
rather surprised.

I guess I will have to come up with a diff to try and make this DAMAGE
ioctl more generic.
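
As a rough sketch of what a partial update has to compute: a damage
rectangle in pixels is turned into a column range and a page range.
The names below are hypothetical and this is not the driver's
ssdfb_partial().  It assumes the controller groups 8 pixel rows into
one page, so a 64-row panel has 8 pages.

```c
#include <assert.h>

#define TOY_WIDTH	128
#define TOY_HEIGHT	64
#define TOY_PAGE_ROWS	8	/* assumed: one page covers 8 pixel rows */

struct toy_damage {		/* inclusive pixel rectangle */
	int	x1, x2, y1, y2;
};

struct toy_range {
	int	first, last;
};

static int
toy_clamp(int v, int lo, int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Columns to transfer, clamped to the panel. */
struct toy_range
toy_damage_columns(const struct toy_damage *d)
{
	struct toy_range r;

	assert(d->x1 <= d->x2);
	r.first = toy_clamp(d->x1, 0, TOY_WIDTH - 1);
	r.last = toy_clamp(d->x2, 0, TOY_WIDTH - 1);
	return r;
}

/* Pages to transfer: every page touched by rows y1..y2. */
struct toy_range
toy_damage_pages(const struct toy_damage *d)
{
	struct toy_range r;

	assert(d->y1 <= d->y2);
	r.first = toy_clamp(d->y1, 0, TOY_HEIGHT - 1) / TOY_PAGE_ROWS;
	r.last = toy_clamp(d->y2, 0, TOY_HEIGHT - 1) / TOY_PAGE_ROWS;
	return r;
}
```

Only the affected pages and columns then need to go over SPI, instead
of the whole framebuffer on every timeout.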

Patrick

diff --git a/sys/dev/fdt/ssdfb.c b/sys/dev/fdt/ssdfb.c
index 4c774855147..ea5e213076f 100644
--- a/sys/dev/fdt/ssdfb.c
+++ b/sys/dev/fdt/ssdfb.c
@@ -24,7 +24,10 @@
 #include 
 #include 
 
+#include 
+
 #include 
+#include 
 
 #include 
 #include 
@@ -72,6 +75,7 @@ struct ssdfb_softc {
uint8_t *sc_fb;
size_t   sc_fbsize;
struct rasops_info   sc_rinfo;
+   int  sc_mode;
 
uint8_t  sc_column_range[2];
uint8_t  sc_page_range[2];
@@ -200,7 +204,8 @@ ssdfb_attach(struct device *parent, struct device *self, 
void *aux)
sc->sc_fb = malloc(sc->sc_fbsize, M_DEVBUF, M_WAITOK | M_ZERO);
 
ri = &sc->sc_rinfo;
-   ri->ri_bits = malloc(sc->sc_fbsize, M_DEVBUF, M_WAITOK | M_ZERO);
+   ri->ri_bits = malloc(roundup(sc->sc_fbsize, PAGE_SIZE),
+   M_DEVBUF, M_WAITOK | M_ZERO);
ri->ri_bs = ssdfb_bs;
ri->ri_flg = RI_CLEAR | RI_VCONS;
ri->ri_depth = 1;
@@ -240,7 +245,7 @@ ssdfb_detach(struct device *self, int flags)
struct rasops_info *ri = &sc->sc_rinfo;
timeout_del(&sc->sc_to);
task_del(systq, &sc->sc_task);
-   free(ri->ri_bits, M_DEVBUF, sc->sc_fbsize);
+   free(ri->ri_bits, M_DEVBUF, roundup(sc->sc_fbsize, PAGE_SIZE));
free(sc->sc_fb, M_DEVBUF, sc->sc_fbsize);
free(sc->sc_gpio, M_DEVBUF, sc->sc_gpiolen);
return 0;
@@ -417,13 +422,15 @@ ssdfb_ioctl(void *v, u_long cmd, caddr_t data, int flag, 
struct proc *p)
struct ssdfb_softc  *sc = v;
struct rasops_info  *ri = &sc->sc_rinfo;
struct wsdisplay_fbinfo *wdf;
+   struct udl_ioctl_damage *d;
+   int  mode;
 
switch (cmd) {
case WSDISPLAYIO_GETPARAM:
case WSDISPLAYIO_SETPARAM:
return (-1);
case WSDISPLAYIO_GTYPE:
-   *(u_int *)data = WSDISPLAY_TYPE_UNKNOWN;
+   *(u_int *)data = WSDISPLAY_TYPE_DL;
break;
case WSDISPLAYIO_GINFO:
wdf = (struct wsdisplay_fbinfo *)data;
@@ -436,10 +443,38 @@ ssdfb_ioctl(void *v, u_long cmd, caddr_t data, int flag, 
struct proc *p)
*(u_int *)data = ri->ri_stride;
break;
case WSDISPLAYIO_SMODE:
+   mode = *(u_int *)data;
+   switch (mode) {
+   case WSDISPLAYIO_MODE_EMUL:
+   if (sc->sc_mode != WSDISPLAYIO_MODE_EMUL) {
+   memset(ri->ri_bits, 0, roundup(sc->sc_fbsize,
+   PAGE_SIZE));
+   ssdfb_update(sc);
+   sc->sc_mode = mode;
+   }
+   break;
+   case WSDISPLAYIO_MODE_DUMBFB:
+   if (sc->sc_mode != WSDISPLAYIO_MODE_DUMBFB) {
+   memset(ri->ri_bits, 0, roundup(sc->sc_fbsize,
+   PAGE_SIZE));
+   timeout_del(&sc->sc_to);
+   task_del(systq, &sc->sc_task);
+   sc->sc_mode = mode;
+   }
+   break;
+   case WSDISPLAYIO_MODE_MAPPED:
+   default:
+   return (-1);
+   }
break;
case WSDISPLAYIO_GETSUPPORTEDDEPTH:
*(u_int *)data = WSDISPLAYIO_DEPTH_1;
break;
+   case UDLIO_DAMAGE:
+   d = (struct udl_ioctl_damage *)data;
+   d->status = UDLIO_STATUS_OK;
+   ssdfb_partial(sc, d->x1, d->x2, d->y1, d->y2);
+   break;
default:
return (-1);
}
@@ -

Re: Using shift on external keyboards in softraid passphrases from efiboot

2018-08-24 Thread Patrick Wildt

On Fri, Aug 24, 2018 at 10:47:27AM +0200, Theo Buehler wrote:
> On Fri, Aug 24, 2018 at 11:50:51AM +0900, YASUOKA Masahiko wrote:
> > Hi,
> > 
> > I think the diff should be brought to arm64 as well.  ok?
> 
> ok. But shouldn't armv7 also be kept in sync?

Exactly.

> 
> > 
> > On Thu, 23 Aug 2018 11:21:57 +0900 (JST)
> > YASUOKA Masahiko  wrote:
> > > On Mon, 20 Aug 2018 13:50:13 +0200
> > > Theo Buehler  wrote:
> > >> On Thu, Aug 16, 2018 at 09:51:32PM +0200, Frank Groeneveld wrote:
> > >>> I haven't been able to type the passphrase of my softraid device on
> > >>> boot when using an external keyboard on my Thinkpad X260. Finally I
> > >>> had some time to debug this problem and this is what I discovered.
> > >>> 
> > >>> On a different laptop with EFI, the ReadKeyStroke call will not return
> > >>> a packet when shift is pressed on the external keyboard. On the
> > >>> Thinkpad however, a packet is returned with UnicodeChar == 0, which
> > >>> results in a wrong passphrase being used.
> > >>> 
> > >>> This seems like a bug in the firmware to me, because according to some
> > >>> EFI specifications I found online, this should not return a packet.
> > >>> I've attached a simple patch that fixes this, but I'm not sure whether
> > >>> this might break things on different systems.
> > >> 
> > >> I can't comment on the technical side of this patch but I can confirm
> > >> that it allows me to enter the password from an external keyboard with
> > >> my x280.
> > > 
> > > In the spec,
> > > 
> > > | The UnicodeChar is the actual printable character or is zero if the
> > > | key does not represent a printable character (control key, function
> > > | key, etc.).
> > > 
> > > It seems that UnicodeChar can be zero.  So I think the diff is OK even
> > > on the spec.
> > > 
> > > If there is no further comment I'll commit it.  Thanks.
> > 
> > 
> > Index: sys/arch/arm64/stand/efiboot/efiboot.c
> > ===
> > RCS file: /cvs/src/sys/arch/arm64/stand/efiboot/efiboot.c,v
> > retrieving revision 1.20
> > diff -u -p -r1.20 efiboot.c
> > --- sys/arch/arm64/stand/efiboot/efiboot.c  23 Aug 2018 15:31:12 -  
> > 1.20
> > +++ sys/arch/arm64/stand/efiboot/efiboot.c  24 Aug 2018 02:44:30 -
> > @@ -129,7 +129,7 @@ efi_cons_getc(dev_t dev)
> > }
> >  
> > status = conin->ReadKeyStroke(conin, &key);
> > -   while (status == EFI_NOT_READY) {
> > +   while (status == EFI_NOT_READY || key.UnicodeChar == 0) {
> > if (dev & 0x80)
> > return (0);
> > /*
>
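
The idea behind the change can be shown with a small mock that does not
touch EFI at all.  toy_key and toy_next_char() are made-up names; the
only thing taken from the thread is that a key packet with
UnicodeChar == 0 (shift, for instance) must not be treated as input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an EFI key packet. */
struct toy_key {
	unsigned short	scan_code;
	unsigned short	unicode_char;	/* 0 for keys such as shift */
};

/*
 * Return the next printable character from keys[*pos..nkeys), skipping
 * packets whose unicode_char is 0.  Returns 0 if none is left.
 */
unsigned short
toy_next_char(const struct toy_key *keys, size_t nkeys, size_t *pos)
{
	assert(keys != NULL && pos != NULL);

	while (*pos < nkeys) {
		unsigned short c = keys[(*pos)++].unicode_char;

		if (c != 0)
			return c;
	}
	return 0;
}
```

Fed a sequence like shift, 'A', shift, 'b', this yields 'A' and 'b' and
never hands a zero character to the passphrase reader.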

Re: memory leaks in bwfm

2018-09-18 Thread Patrick Wildt

The code hands off the responsibility for ctl and ctl->buf to bs_txctl,
which will free both buffers if there is an error enqueueing the
command.  We can free them ourselves only if bs_txctl succeeds in
enqueueing and there is a response timeout.  Thus, not ok.  If this
pattern is not understandable then we can work on that, but the diff
as is will add double frees on error.
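
A stripped-down userland sketch of that ownership rule, with made-up
toy_* names and a counter so the number of frees can be checked.  It is
not the bwfm code; it only models "callee frees on enqueue error, caller
frees on timeout".

```c
#include <assert.h>
#include <stdlib.h>

static int toy_frees;	/* counts frees so the sketch can be checked */

struct toy_ctl {
	char	*buf;
};

static void
toy_ctl_free(struct toy_ctl *ctl)
{
	assert(ctl != NULL);
	free(ctl->buf);
	free(ctl);
	toy_frees++;
}

/*
 * Takes ownership of ctl.  On failure it frees ctl itself, so the
 * caller must not free it again.  On success ctl stays queued, which
 * is modelled here by handing it back through *queued.
 */
int
toy_txctl(struct toy_ctl *ctl, int fail, struct toy_ctl **queued)
{
	if (fail) {
		toy_ctl_free(ctl);
		return 1;
	}
	*queued = ctl;
	return 0;
}

/* Caller side: frees only on the path where it owns ctl again. */
int
toy_send(int fail)
{
	struct toy_ctl *ctl, *queued = NULL;

	if ((ctl = malloc(sizeof(*ctl))) == NULL)
		return -1;
	if ((ctl->buf = malloc(16)) == NULL) {
		free(ctl);
		return -1;
	}

	if (toy_txctl(ctl, fail, &queued))
		return 1;	/* already freed by toy_txctl() */

	/* Enqueued, but pretend the response timed out: ours to free. */
	toy_ctl_free(queued);
	return 2;
}
```

Adding a free() on the error return of toy_txctl() in toy_send() would
free ctl a second time, which is the double free described above.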

On Tue, Sep 18, 2018 at 03:52:45PM +1000, Jonathan Gray wrote:
> Index: bwfm.c
> ===
> RCS file: /cvs/src/sys/dev/ic/bwfm.c,v
> retrieving revision 1.54
> diff -u -p -r1.54 bwfm.c
> --- bwfm.c25 Jul 2018 20:37:11 -  1.54
> +++ bwfm.c18 Sep 2018 05:21:30 -
> @@ -1297,6 +1297,7 @@ bwfm_proto_bcdc_query_dcmd(struct bwfm_s
>  
>   if (bwfm_proto_bcdc_txctl(sc, reqid, (char *)dcmd, &size)) {
>   DPRINTF(("%s: tx failed\n", DEVNAME(sc)));
> + free(dcmd, M_TEMP, size);
>   return ret;
>   }
>  
> @@ -1337,6 +1338,7 @@ bwfm_proto_bcdc_set_dcmd(struct bwfm_sof
>  
>   if (bwfm_proto_bcdc_txctl(sc, reqid, (char *)dcmd, &size)) {
>   DPRINTF(("%s: txctl failed\n", DEVNAME(sc)));
> + free(dcmd, M_TEMP, size);
>   return ret;
>   }
>  
> @@ -1361,6 +1363,7 @@ bwfm_proto_bcdc_txctl(struct bwfm_softc 
>  
>   if (sc->sc_bus_ops->bs_txctl(sc, ctl)) {
>   DPRINTF(("%s: tx failed\n", DEVNAME(sc)));
> + free(ctl, M_TEMP, sizeof(*ctl));
>   return 1;
>   }
>  
>

Re: Add "Spleen 5x8" font to wsfont

2018-09-21 Thread Patrick Wildt

On Thu, Sep 20, 2018 at 09:44:09PM +0200, Frederic Cambus wrote:
> Hi tech@,
> 
> Here is a diff to add "Spleen 5x8" to wsfont, a font targeted at small
> OLED displays to be used with devices handled by ssdfb(4). It contains
> all printable ASCII characters (96 glyphs).
> 
> The font is 2-Clause BSD licensed and is my original creation.
> 
> In order to enable and test the font, this option should be added to the
> kernel configuration file: option FONT_SPLEEN5x8
> 
> Screenshot: https://www.cambus.net/files/openbsd/dmesg-spleen5x8.png
> 
> Comments? OK?

I have already tested the other versions and I'm very happy with the
results.  So ok by me.  Thanks for all your efforts!
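
For reference, this is how a glyph is located in a table laid out like
the one in the diff below: one byte per row, 8 rows per glyph, starting
at ' '.  toy_font is a cut-down stand-in with only the fields used
here, not the real struct wsdisplay_font.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in; field names are assumptions for this sketch. */
struct toy_font {
	int	firstchar;
	int	numchars;
	int	fontheight;
	int	stride;		/* bytes per glyph row */
};

/* Total size of the glyph table in bytes. */
size_t
toy_font_datasize(const struct toy_font *f)
{
	return (size_t)f->numchars * f->fontheight * f->stride;
}

/* Byte offset of the glyph for ch, or -1 if the font lacks it. */
long
toy_glyph_offset(const struct toy_font *f, int ch)
{
	assert(f->numchars > 0);
	if (ch < f->firstchar || ch >= f->firstchar + f->numchars)
		return -1;
	return (long)(ch - f->firstchar) * f->fontheight * f->stride;
}
```

With firstchar ' ', 128 - ' ' = 96 glyphs, height 8 and stride 1, the
table is 768 bytes and the glyph for '!' starts at offset 8.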

> Index: sys/dev/wsfont/wsfont.c
> ===
> RCS file: /cvs/src/sys/dev/wsfont/wsfont.c,v
> retrieving revision 1.52
> diff -u -p -r1.52 wsfont.c
> --- sys/dev/wsfont/wsfont.c   8 Sep 2017 05:36:53 -   1.52
> +++ sys/dev/wsfont/wsfont.c   20 Sep 2018 18:52:29 -
> @@ -43,6 +43,11 @@
>  
>  #undef HAVE_FONT
>  
> +#ifdef FONT_SPLEEN5x8
> +#define HAVE_FONT 1
> +#include 
> +#endif
> +
>  #ifdef FONT_BOLD8x16
>  #define HAVE_FONT 1
>  #include 
> @@ -105,6 +110,9 @@ static struct font builtin_fonts[] = {
>  #endif
>  #ifdef FONT_GALLANT12x22
>   BUILTIN_FONT(gallant12x22, 3),
> +#endif
> +#ifdef FONT_SPLEEN5x8
> + BUILTIN_FONT(spleen5x8, 4),
>  #endif
>  #undef BUILTIN_FONT
>  };
> Index: sys/dev/wsfont/spleen5x8.h
> ===
> RCS file: sys/dev/wsfont/spleen5x8.h
> diff -N sys/dev/wsfont/spleen5x8.h
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++ sys/dev/wsfont/spleen5x8.h20 Sep 2018 18:52:29 -
> @@ -0,0 +1,910 @@
> +/*   $OpenBSD$ */
> +
> +/*
> + * Copyright (c) 2018 Frederic Cambus 
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *notice, this list of conditions and the following disclaimer in the
> + *documentation and/or other materials provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> + * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
> + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + */
> +
> +static u_char spleen5x8_data[];
> +
> +struct wsdisplay_font spleen5x8 = {
> + "Spleen 5x8",   /* typeface name */
> + 0,  /* index */
> + ' ',/* firstchar */
> + 128 - ' ',  /* numchars */
> + WSDISPLAY_FONTENC_ISO,  /* encoding */
> + 5,  /* width */
> + 8,  /* height */
> + 1,  /* stride */
> + WSDISPLAY_FONTORDER_L2R,/* bit order */
> + WSDISPLAY_FONTORDER_L2R,/* byte order */
> + NULL,   /* cookie */
> + spleen5x8_data  /* data */
> +};
> +
> +static u_char spleen5x8_data[] = {
> + 0x00,   /* ........ */
> + 0x00,   /* ........ */
> + 0x00,   /* ........ */
> + 0x00,   /* ........ */
> + 0x00,   /* ........ */
> + 0x00,   /* ........ */
> + 0x00,   /* ........ */
> + 0x00,   /* ........ */
> +
> + 0x20,   /* ..*..... */
> + 0x20,   /* ..*..... */
> + 0x20,   /* ..*..... */
> + 0x20,   /* ..*..... */
> + 0x20,   /* ..*..... */
> + 0x00,   /* ........ */
> + 0x20,   /* ..*..... */
> + 0x00,   /* ........ */
> +
> + 0x50,   /* .*.*.... */
> + 0x50,   /* .*.*.... */
> + 0x50,   /* .*.*.... */
> + 0x00,   /* ........ */
> + 0x00,   /* ........ */
> + 0x00,   /* ........ */
> + 0x00,   /* ........ */
> + 0x00,   /* ........ */
> +
> + 0x00,   /* ........ */
> + 0x50,   /* .*.*.... */
> + 0xf8,   /* *****... */
> + 0x50,   /* .*.*.... */
> + 0x50,   /* .*.*.... */
> + 0xf8,   /* *****... */
> + 0x50,   /* .*.*.... */
> + 0x00,   /* ..

lldb: build and install

2018-10-02 Thread Patrick Wildt

Hi,

we already have the sources for LLDB; the only thing left to do is
add the build infrastructure so that we actually compile all the
independent pieces and link them together.  Apparently LLDB also makes
use of some of the clang libraries, so those are part of LLDB's linking
dependencies as well.

Since we have no Python in base we have to explicitly disable Python,
otherwise it will try to use Python headers and probably also link
against it.
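
As a rough illustration of why the define matters, here is a minimal C sketch of a compile-time guard.  It assumes the LLDB sources wrap their Python-dependent code in checks on LLDB_DISABLE_PYTHON; the helpers python_enabled() and check_base_build() are hypothetical names made up for this example and are not LLDB functions.

```c
#include <assert.h>

/*
 * Normally supplied by the build as -DLLDB_DISABLE_PYTHON (see the
 * CPPFLAGS change in the diff); defined here only so that this sketch
 * is standalone.
 */
#define LLDB_DISABLE_PYTHON

#ifndef LLDB_DISABLE_PYTHON
#include <Python.h>	/* never reached in a Python-less build */
#endif

/* Hypothetical helper, not part of LLDB. */
static int
python_enabled(void)
{
#ifdef LLDB_DISABLE_PYTHON
	return 0;
#else
	return 1;
#endif
}

/* Hypothetical sanity check for a build without Python. */
static void
check_base_build(void)
{
	assert(!python_enabled());
}
```

With the macro defined, the preprocessor drops the Python include before the compiler ever looks for the header, so nothing from Python is needed at compile or link time.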

According to kettenis@, debugging core files should work, but actually
running stuff probably won't.  Still, having lldb is a first step.

Compiled on amd64; tests on other clang architectures would be nice.

Feedback?  ok?

Patrick

diff --git a/gnu/usr.bin/clang/Makefile b/gnu/usr.bin/clang/Makefile
index 250fb1d64f5..62490b00968 100644
--- a/gnu/usr.bin/clang/Makefile
+++ b/gnu/usr.bin/clang/Makefile
@@ -43,6 +43,8 @@ SUBDIR+=libLLVMCoverage
 SUBDIR+=libLLVMDebugInfoCodeView
 SUBDIR+=libLLVMDebugInfoDWARF
 SUBDIR+=libLLVMDebugInfoMSF
+SUBDIR+=libLLVMDebugInfoPDB
+SUBDIR+=libLLVMExecutionEngine
 SUBDIR+=libLLVMGlobalISel
 SUBDIR+=libLLVMLTO
 SUBDIR+=libLLVMPasses
@@ -86,5 +88,44 @@ SUBDIR+=liblldELF
 
 SUBDIR+=lld
 
+SUBDIR+=liblldbABI
+SUBDIR+=liblldbAPI
+SUBDIR+=liblldbBreakpoint
+SUBDIR+=liblldbCommands
+SUBDIR+=liblldbCore
+SUBDIR+=liblldbDataFormatters
+SUBDIR+=liblldbExpression
+SUBDIR+=liblldbHostCommon
+SUBDIR+=liblldbHostOpenBSD
+SUBDIR+=liblldbHostPOSIX
+SUBDIR+=liblldbInitialization
+SUBDIR+=liblldbInterpreter
+SUBDIR+=liblldbPluginArchitecture
+SUBDIR+=liblldbPluginDisassembler
+SUBDIR+=liblldbPluginDynamicLoader
+SUBDIR+=liblldbPluginExpressionParser
+SUBDIR+=liblldbPluginInstruction
+SUBDIR+=liblldbPluginInstrumentationRuntime
+SUBDIR+=liblldbPluginJITLoader
+SUBDIR+=liblldbPluginLanguage
+SUBDIR+=liblldbPluginLanguageRuntime
+SUBDIR+=liblldbPluginMemoryHistory
+SUBDIR+=liblldbPluginObjectContainer
+SUBDIR+=liblldbPluginObjectFile
+SUBDIR+=liblldbPluginOperatingSystem
+SUBDIR+=liblldbPluginPlatform
+SUBDIR+=liblldbPluginProcess
+SUBDIR+=liblldbPluginScriptInterpreter
+SUBDIR+=liblldbPluginStructuredData
+SUBDIR+=liblldbPluginSymbolFile
+SUBDIR+=liblldbPluginSymbolVendor
+SUBDIR+=liblldbPluginSystemRuntime
+SUBDIR+=liblldbPluginUnwindAssembly
+SUBDIR+=liblldbSymbol
+SUBDIR+=liblldbTarget
+SUBDIR+=liblldbUtility
+
+SUBDIR+=lldb
+
 .include 
 .include 
diff --git a/gnu/usr.bin/clang/Makefile.inc b/gnu/usr.bin/clang/Makefile.inc
index 0b99edce43d..90fbe660c35 100644
--- a/gnu/usr.bin/clang/Makefile.inc
+++ b/gnu/usr.bin/clang/Makefile.inc
@@ -17,6 +17,8 @@ DEBUG=
 NOPIE=
 
 CLANG_INCLUDES=-I${LLVM_SRCS}/tools/clang/include
+LLDB_INCLUDES= -I${LLVM_SRCS}/tools/lldb/include \
+   -I${LLVM_SRCS}/tools/lldb/source
 CPPFLAGS+= -I${LLVM_SRCS}/include -I${.CURDIR}/../include -I${.OBJDIR} \
        -I${.OBJDIR}/../include
 CPPFLAGS+= -DNDEBUG
@@ -42,6 +44,7 @@ CPPFLAGS+=-DLLVM_NATIVE_DISASSEMBLER=LLVMInitialize${LLVM_ARCH}Disassembler
 CPPFLAGS+=-DLLVM_NATIVE_TARGET=LLVMInitialize${LLVM_ARCH}Target
 CPPFLAGS+=-DLLVM_NATIVE_TARGETINFO=LLVMInitialize${LLVM_ARCH}TargetInfo
 CPPFLAGS+=-DLLVM_NATIVE_TARGETMC=LLVMInitialize${LLVM_ARCH}TargetMC
+CPPFLAGS+=-DLLDB_DISABLE_PYTHON
 
 # upstream defaults
 CFLAGS+=   -ffunction-sections
@@ -57,7 +60,9 @@ CXXFLAGS+=-Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual \
        -Wno-missing-field-initializers -pedantic -Wno-long-long \
        -Wdelete-non-virtual-dtor -Wno-comment
 
+LDADD+=-Wl,--start-group
 .for lib in ${LLVM_LIBDEPS}
 DPADD+=${.OBJDIR}/../lib${lib}/lib${lib}.a
 LDADD+=${.OBJDIR}/../lib${lib}/lib${lib}.a
 .endfor
+LDADD+=-Wl,--end-group
diff --git a/gnu/usr.bin/clang/include/lldb/Host/Config.h b/gnu/usr.bin/clang/include/lldb/Host/Config.h
new file mode 100644
index 000..1fc5396e2bb
--- /dev/null
+++ b/gnu/usr.bin/clang/include/lldb/Host/Config.h
@@ -0,0 +1,29 @@
+//===-- Config.h ---*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#ifndef LLDB_HOST_CONFIG_H
+#define LLDB_HOST_CONFIG_H
+
+#define LLDB_CONFIG_TERMIOS_SUPPORTED
+
+/* #define LLDB_DISABLE_POSIX */
+
+#define HAVE_SYS_EVENT_H 1
+
+#define HAVE_PPOLL 1
+
+#define HAVE_SIGACTION 1
+
+#define HAVE_PROCESS_VM_READV 0
+
+#define HAVE_NR_PROCESS_VM_READV 0
+
+/* #define HAVE_LIBCOMPRESSION */
+
+#endif // #ifndef LLDB_HOST_CONFIG_H
diff --git a/gnu/usr.bin/clang/libLLVMDebugInfoCodeView/Makefile b/gnu/usr.bin/clang/libLLVMDebugInfoCodeView/Makefile
index fbd9cd29083..f4d185d89f6 100644
--- a/gnu/usr.bin/clang/libLLVMDebugInfoCodeView/Makefile
+++ b/gnu/usr.bin/clang/libLLVMDebugInfoCodeView/Makefile
@@ -7,22 +7,41 @@ NOPROFILE=
 CPPFLAGS+= -I${LLVM_SRCS}/include/llvm/DebugInfo/C

Re: lldb: build and install

2018-10-03 Thread Patrick Wildt

On Tue, Oct 02, 2018 at 06:07:22PM +0200, Mark Kettenis wrote:
> > Date: Tue, 2 Oct 2018 17:24:42 +0200
> > From: Patrick Wildt 
> > 
> > Hi,
> > 
> > we already do have the sources for LLDB, the only thing left to do is
> > add the build infrastructure so that we actually compile all the
> > independent pieces and link them together.  Apparently LLDB also makes
> > use of some of the clang libraries, so those are part of LLDB linking
> > dependencies as well.
> > 
> > Since we have no Python in base we have to explicitly disable Python,
> > otherwise it will try to use Python headers and probably also link
> > against it.
> > 
> > According to kettenis@, debugging core files should work, actually
> > running stuff probably won't.  Still, having lldb is a first step.
> > 
> > Compiled on amd64, tests on other clang architectures would be nice.
> > 
> > Feedback?  ok?
> 
> I would like to get this in.  That said, I'm not sure lldb in its
> current state is useful enough to ship in 6.4.

Even then I'd still like to put this in so that we can have a go at
making it useful.  We can still disable it for the release if we see
that shipping it wouldn't be reasonable.

So I'd go ahead and commit tomorrow or so if there are no further
objections.

ure(4): VLANs and Jumbo frames

2018-10-30 Thread Patrick Wildt

Hi,

I recently wanted to use VLANs on ure(4) but failed.  As it turns out,
hardmtu is set to 1500, which apparently does not leave enough space
for the VLAN tag.  It looks like the Gigabit version even supports
jumbo frames.  Thus we can set the maximum frame length on the RX path
to something bigger and enable jumbo frames.
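
The arithmetic behind the new hardmtu values can be sketched in a few lines of standalone C.  The Ethernet constants are restated here with their usual values (14-byte header, 1518-byte maximum frame, 4-byte VLAN encapsulation) so the example compiles outside the kernel; the function ure_hardmtu_sketch() is a hypothetical helper for illustration, not part of the driver.

```c
#include <assert.h>

/* Usual Ethernet values, restated so the sketch is standalone. */
#define ETHER_HDR_LEN		14
#define ETHER_MAX_LEN		1518
#define ETHER_VLAN_ENCAP_LEN	4

/* Frame length limits as proposed in the diff. */
#define URE_MAX_FRAMELEN_8152	(ETHER_MAX_LEN + ETHER_VLAN_ENCAP_LEN)
#define URE_MAX_FRAMELEN_8153	(9 * 1024)

/*
 * Hypothetical helper: derive the largest MTU from the maximum frame
 * length by subtracting the Ethernet header and the VLAN tag, the same
 * subtraction the diff performs in ure_attach().
 */
static unsigned int
ure_hardmtu_sketch(int is_8152)
{
	unsigned int framelen = is_8152 ?
	    URE_MAX_FRAMELEN_8152 : URE_MAX_FRAMELEN_8153;
	unsigned int overhead = ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN;

	assert(framelen > overhead);
	return framelen - overhead;
}
```

Under these assumptions ure_hardmtu_sketch(1) gives 1504 for the RTL8152 and ure_hardmtu_sketch(0) gives 9198 for the RTL8153, so both end up above the old limit of 1500.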

I tried this on a Winyao USB1000F.  It would be nice if people who have
ure(4)s in active use tested this change.

Thanks,
Patrick

diff --git a/sys/dev/usb/if_ure.c b/sys/dev/usb/if_ure.c
index b6c6c99ef34..637fd5eca5f 100644
--- a/sys/dev/usb/if_ure.c
+++ b/sys/dev/usb/if_ure.c
@@ -695,6 +695,9 @@ ure_rtl8152_init(struct ure_softc *sc)
 
    ure_init_fifo(sc);
 
+   /* Set allowed frame size. */
+   ure_write_2(sc, URE_PLA_RMS, URE_MCU_TYPE_PLA, URE_MAX_FRAMELEN_8152);
+
    ure_write_1(sc, URE_USB_TX_AGG, URE_MCU_TYPE_USB,
        URE_TX_AGG_MAX_THRESHOLD);
    ure_write_4(sc, URE_USB_RX_BUF_TH, URE_MCU_TYPE_USB, URE_RX_THR_HIGH);
@@ -835,6 +838,10 @@ ure_rtl8153_init(struct ure_softc *sc)
 
    ure_init_fifo(sc);
 
+   /* Set allowed frame size. */
+   ure_write_2(sc, URE_PLA_RMS, URE_MCU_TYPE_PLA, URE_MAX_FRAMELEN_8153);
+   ure_write_2(sc, URE_PLA_MTPS, URE_MCU_TYPE_PLA, URE_MTPS_JUMBO);
+
    /* Enable Rx aggregation. */
    ure_write_2(sc, URE_USB_USB_CTRL, URE_MCU_TYPE_USB,
        ure_read_2(sc, URE_USB_USB_CTRL, URE_MCU_TYPE_USB) &
@@ -1147,6 +1154,12 @@ ure_attach(struct device *parent, struct device *self, void *aux)
    ifp->if_start = ure_start;
    ifp->if_capabilities = 0;
 
+   if (sc->ure_flags & URE_FLAG_8152)
+       ifp->if_hardmtu = URE_MAX_FRAMELEN_8152;
+   else
+       ifp->if_hardmtu = URE_MAX_FRAMELEN_8153;
+   ifp->if_hardmtu -= (ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN);
+
    mii = &sc->ure_mii;
    mii->mii_ifp = ifp;
    mii->mii_readreg = ure_miibus_readreg;
diff --git a/sys/dev/usb/if_urereg.h b/sys/dev/usb/if_urereg.h
index 2260ec37890..8963d2753ed 100644
--- a/sys/dev/usb/if_urereg.h
+++ b/sys/dev/usb/if_urereg.h
@@ -41,7 +41,8 @@
 #define URE_BYTE_EN_BYTE        0x11
 #define URE_BYTE_EN_SIX_BYTES   0x3f
 
-#define URE_MAX_FRAMELEN        (ETHER_MAX_LEN + ETHER_VLAN_ENCAP_LEN)
+#define URE_MAX_FRAMELEN_8152   (ETHER_MAX_LEN + ETHER_VLAN_ENCAP_LEN)
+#define URE_MAX_FRAMELEN_8153   (9 * 1024)
 
 #define URE_PLA_IDR             0xc000
 #define URE_PLA_RCR             0xc010
@@ -186,6 +187,10 @@
 /* PLA_TCR1 */
 #define URE_VERSION_MASK        0x7cf0
 
+/* PLA_MTPS */
+#define URE_MTPS_JUMBO          (12 * 1024 / 64)
+#define URE_MTPS_DEFAULT        (6 * 1024 / 64)
+
 /* PLA_CR */
 #define URE_CR_RST              0x10
 #define URE_CR_RE               0x08
