Re: ssh-add(1): fix NULL in fprintf

2022-06-17 Thread Darren Tucker
On Fri, 17 Jun 2022 at 04:49, Martin Vahlensieck
 wrote:
> ping, diff attached

Applied, thanks.

-- 
Darren Tucker (dtucker at dtucker.net)
GPG key 11EAA6FA / A86E 3E07 5B19 5880 E860  37F4 9357 ECEF 11EA A6FA (new)
Good judgement comes with experience. Unfortunately, the experience
usually comes from bad judgement.



Re: Lenovo ThinkCentre M910q fails to suspend

2022-06-17 Thread Edd Barrett
On Fri, Jun 17, 2022 at 12:46:49PM -0700, Mike Larkin wrote:
> Sorry, out of ideas.

No worries. Thanks for all of the back and forth :)

-- 
Best Regards
Edd Barrett

https://www.theunixzoo.co.uk



Re: Lenovo ThinkCentre M910q fails to suspend

2022-06-17 Thread Mike Larkin
On Fri, Jun 17, 2022 at 08:32:29PM +0100, Edd Barrett wrote:
> Hi Mike,
>
> On Fri, Jun 17, 2022 at 11:55:51AM -0700, Mike Larkin wrote:
> > >  - disabling xhci in ukc: the system fails to boot multi-user. The first
> > >oddness comes where cpus #1-3 fail to "become ready" (as reported by 
> > > dmesg).
> > >It spends a while thinking about these cores not coming up, before
> > >eventually proceeding, but eventually hard resetting. I guess the 
> > > system
> > >really needs xhci to function then...
> >
> > That really makes no sense. can you try the same experiment again using 
> > GENERIC?
>
> Doing `boot bsd.sp -c` and then `disable xhci` means that the system can at
> least boot with no xhci, but sadly it still won't stay in the suspended state.
>
> That might rule out xhci as a source of the issue, maybe.
>
> --
> Best Regards
> Edd Barrett
>
> https://www.theunixzoo.co.uk

Sorry, out of ideas.



Re: Lenovo ThinkCentre M910q fails to suspend

2022-06-17 Thread Edd Barrett
Hi Mike,

On Fri, Jun 17, 2022 at 11:55:51AM -0700, Mike Larkin wrote:
> >  - disabling xhci in ukc: the system fails to boot multi-user. The first
> >oddness comes where cpus #1-3 fail to "become ready" (as reported by 
> > dmesg).
> >It spends a while thinking about these cores not coming up, before
> >eventually proceeding, but eventually hard resetting. I guess the system
> >really needs xhci to function then...
> 
> That really makes no sense. can you try the same experiment again using 
> GENERIC?

Doing `boot bsd.sp -c` and then `disable xhci` means that the system can at
least boot with no xhci, but sadly it still won't stay in the suspended state.

That might rule out xhci as a source of the issue, maybe.

-- 
Best Regards
Edd Barrett

https://www.theunixzoo.co.uk



Re: Lenovo ThinkCentre M910q fails to suspend

2022-06-17 Thread Mike Larkin
On Fri, Jun 17, 2022 at 06:41:23PM +0100, Edd Barrett wrote:
> Hi Mike,
>
> On Fri, Jun 17, 2022 at 10:20:35AM -0700, Mike Larkin wrote:
> > Oh, didn't read this closely enough the first time. If ZZZ doesn't 
> > instantly wake
> > the machine, then it's one of the two S3 devices described in your next 
> > email.
> > Since one is XHCI, I'd disable xhci in ukc and see if that helps. Or maybe 
> > the
> > other *hci(4)s also.
>
> Alright, so here's some more info.
>
>  - disabling xhci in ukc: the system fails to boot multi-user. The first
>oddness comes where cpus #1-3 fail to "become ready" (as reported by 
> dmesg).
>It spends a while thinking about these cores not coming up, before
>eventually proceeding, but eventually hard resetting. I guess the system
>really needs xhci to function then...

That really makes no sense. can you try the same experiment again using GENERIC?

>
>  - disabling ehci or ahci: no change to sleep behaviour.
>
>  - disabling usb: no change to sleep behaviour.
>
> I don't see any [XEA]HCI options in the BIOS that I could tweak.
>
> Unless you have any other ideas, I'll try disabling random devices in the hope
> that I can narrow it down... I've already tried the network card, it aint 
> that.
>
> Thanks.
>
> --
> Best Regards
> Edd Barrett
>
> https://www.theunixzoo.co.uk
>



Re: Lenovo ThinkCentre M910q fails to suspend

2022-06-17 Thread Edd Barrett
Hi Mike,

On Fri, Jun 17, 2022 at 10:20:35AM -0700, Mike Larkin wrote:
> Oh, didn't read this closely enough the first time. If ZZZ doesn't instantly 
> wake
> the machine, then it's one of the two S3 devices described in your next email.
> Since one is XHCI, I'd disable xhci in ukc and see if that helps. Or maybe the
> other *hci(4)s also.

Alright, so here's some more info.

 - disabling xhci in ukc: the system fails to boot multi-user. The first
   oddness comes where cpus #1-3 fail to "become ready" (as reported by dmesg).
   It spends a while thinking about these cores not coming up, before
   eventually proceeding, but eventually hard resetting. I guess the system
   really needs xhci to function then...

 - disabling ehci or ahci: no change to sleep behaviour.

 - disabling usb: no change to sleep behaviour.

I don't see any [XEA]HCI options in the BIOS that I could tweak.

Unless you have any other ideas, I'll try disabling random devices in the hope
that I can narrow it down... I've already tried the network card, it aint that.

Thanks.

-- 
Best Regards
Edd Barrett

https://www.theunixzoo.co.uk



Re: Lenovo ThinkCentre M910q fails to suspend

2022-06-17 Thread Mike Larkin
On Fri, Jun 17, 2022 at 09:14:45AM +0100, Edd Barrett wrote:
> Hi Mike,
>
> On Thu, Jun 16, 2022 at 09:19:50PM -0700, Mike Larkin wrote:
> > From your original dmesg:
> >
> > > acpi0: wakeup devices PEG0(S4) PEGP(S4) PEG1(S4) PEGP(S4) PEG2(S4) 
> > > PEGP(S4) SIO1(S3) RP09(S4) PXSX(S4) RP10(S4) PXSX(S4) RP11(S4) PXSX(S4) 
> > > RP12(S4) PXSX(S4) RP13(S4) [...]
> >
> > Notice the [...] at the end, this is printed after 16 devices. What I'd 
> > suggest
> > is this:
> >
> > 1. remove the code that truncates this list after 16, and note down all the 
> > wake
> > devices.
> >
> > 2. If there are any in S3, try using ZZZ instead of zzz. If the machine 
> > does not instantly
> > wake, it's possible it's because of one of those S3 devices doing the wake 
> > (since ZZZ
> > uses S4).
>
> I'll try removing the truncation then. Bear with me.
>
> In the meantime, notice that the truncated list does include one S3 item
> `SIO1(S3)`. I don't know if that's what we are looking for?
>
> FWIW, I have already tried `ZZZ` on this machine and it does succeed to
> hibernate, but upon wake up, it hangs when decompressing the memory image. I
> left it decompressing a ~50MB image for more than an hour and concluded it had
> got stuck.

Oh, didn't read this closely enough the first time. If ZZZ doesn't instantly 
wake
the machine, then it's one of the two S3 devices described in your next email.
Since one is XHCI, I'd disable xhci in ukc and see if that helps. Or maybe the
other *hci(4)s also.

Now, why ZZZ fails to unpack is some other problem but the instant wake is not
related to that.

-ml

>
> > 3. If everything is S4, well, you're going to have to trace down those 
> > short names
> > like PEGP, PXSX, etc, and disable one at a time until you find the one that 
> > is
> > doing the wake. And it's possible it's none of these and is a fixed function
> > button or something.
>
> One additional piece of info, which may be worthless. I tried a Debian live 
> USB
> stick, to see if Linux was able to sleep this box. It was able to.
>
> I don't know if that rules out the idea of a fixed-function button?
>
> --
> Best Regards
> Edd Barrett
>
> https://www.theunixzoo.co.uk



Re: Lenovo ThinkCentre M910q fails to suspend

2022-06-17 Thread Mike Larkin
On Fri, Jun 17, 2022 at 09:14:45AM +0100, Edd Barrett wrote:
> Hi Mike,
>
> On Thu, Jun 16, 2022 at 09:19:50PM -0700, Mike Larkin wrote:
> > From your original dmesg:
> >
> > > acpi0: wakeup devices PEG0(S4) PEGP(S4) PEG1(S4) PEGP(S4) PEG2(S4) 
> > > PEGP(S4) SIO1(S3) RP09(S4) PXSX(S4) RP10(S4) PXSX(S4) RP11(S4) PXSX(S4) 
> > > RP12(S4) PXSX(S4) RP13(S4) [...]
> >
> > Notice the [...] at the end, this is printed after 16 devices. What I'd 
> > suggest
> > is this:
> >
> > 1. remove the code that truncates this list after 16, and note down all the 
> > wake
> > devices.
> >
> > 2. If there are any in S3, try using ZZZ instead of zzz. If the machine 
> > does not instantly
> > wake, it's possible it's because of one of those S3 devices doing the wake 
> > (since ZZZ
> > uses S4).
>
> I'll try removing the truncation then. Bear with me.
>
> In the meantime, notice that the truncated list does include one S3 item
> `SIO1(S3)`. I don't know if that's what we are looking for?
>
> FWIW, I have already tried `ZZZ` on this machine and it does succeed to
> hibernate, but upon wake up, it hangs when decompressing the memory image. I
> left it decompressing a ~50MB image for more than an hour and concluded it had
> got stuck.
>
> > 3. If everything is S4, well, you're going to have to trace down those 
> > short names
> > like PEGP, PXSX, etc, and disable one at a time until you find the one that 
> > is
> > doing the wake. And it's possible it's none of these and is a fixed function
> > button or something.
>
> One additional piece of info, which may be worthless. I tried a Debian live 
> USB
> stick, to see if Linux was able to sleep this box. It was able to.
>
> I don't know if that rules out the idea of a fixed-function button?
>
> --
> Best Regards
> Edd Barrett
>
> https://www.theunixzoo.co.uk

You're going to have to play trial and error then disabling devices until
you find the one that hangs. Without the hardware in front of me, that's the
best advice I can offer. Sorry.

-ml




Re: bgpd switch kroute_find to bgpd_addr argument

2022-06-17 Thread Claudio Jeker
On Fri, Jun 17, 2022 at 12:32:50PM +0200, Theo Buehler wrote:
> On Fri, Jun 17, 2022 at 11:37:29AM +0200, Claudio Jeker wrote:
> > On Thu, Jun 16, 2022 at 09:01:33PM +0200, Theo Buehler wrote:
> > > On Thu, Jun 16, 2022 at 07:15:51PM +0200, Claudio Jeker wrote:
> > > > Not much holds us back to switch kroute_find() to use a struct bgpd_addr
> > > > as prefix argument. Only the match code needs some adoption.
> > > > I created applymask() for this which works on bgpd_addrs like
> > > > inet4applymask works on in_addrs.
> > > > 
> > > > I also switched a simple case in session.c over to applymask().
> > > > There is another one in bgpctl which I will send out once this is in.
> > > 
> > > ok
> > > 
> > > A nit and a question below.
> > > 
> > > [...]
> > > 
> > > > +kroute_find(struct ktable *kt, const struct bgpd_addr *prefix,
> > > > +uint8_t prefixlen, uint8_t prio)
> > > >  {
> > > > struct kroute_node  s;
> > > > struct kroute_node  *kn, *tmp;
> > > >  
> > > > -   s.r.prefix.s_addr = prefix;
> > > > +   s.r.prefix= prefix->v4;
> > > 
> > > Missing space before =.
> > > 
> > > [...]
> > > 
> > > > Index: util.c
> > > > ===
> > > > RCS file: /cvs/src/usr.sbin/bgpd/util.c,v
> > > > retrieving revision 1.64
> > > > diff -u -p -r1.64 util.c
> > > > --- util.c  16 Jun 2022 15:33:05 -  1.64
> > > > +++ util.c  16 Jun 2022 16:50:47 -
> > > > @@ -783,6 +783,26 @@ inet6applymask(struct in6_addr *dest, co
> > > > dest->s6_addr[i] = src->s6_addr[i] & mask.s6_addr[i];
> > > >  }
> > > >  
> > > > +void
> > > > +applymask(struct bgpd_addr *dest, const struct bgpd_addr *src, int 
> > > > prefixlen)
> > > > +{
> > > > +   struct bgpd_addr tmp;
> > > > +
> > > > +   /* use temporary storage in case src and dest point to same 
> > > > struct */
> > > > +   tmp = *src;
> > > 
> > > While I'm fine with doing it this way, why is this tmp dance needed? I
> > > would have thought that both inet4applymask() and inet6applymask() work
> > > fine if src and dest point to the same struct. bgpctl's parse.y assumes
> > > that this is the case.
> > 
> > While inet4applymask() and inet6applymask() work the problem comes with the
> > rest of the struct bgpd_addr fields.
> > 
> > Now I could do a trick and copy from offsetof(rd) to sizeof(*src) -
> > offsetof(rd) plus in the IPv4 case the rest of the union ba needs to
> > be cleared as well. So I think while possible to do this without this
> > temporary storage the code will be rather complex.
> > 
> > An alternative option is to just store the v4/v6 values before
> > memmove(dst, src) and then using those values in the inetXapplymask calls.
> > I guess that is a reasonable alternative which is a bit more optimized.
> > 
> > What do you think?
> 
> While I'm not against your alternative option, I think the copies are
> cheap enough that we should not need to optimize them.
> 
> What confuses me is the comment and why we need temporary storage at
> all.
> 
> The simple assignment is well-defined if src and dest point to the same
> struct (https://port70.net/%7Ensz/c/c99/n1256.html#6.5.16.1p3), so I
> would have written applymask() this way:
> 
> void
> applymask(struct bgpd_addr *dest, const struct bgpd_addr *src, int prefixlen)
> {
>   *dest = *src;
>   switch (src->aid) {
>   case AID_INET:
>   case AID_VPN_IPv4:
>   inet4applymask(>v4, >v4, prefixlen);
>   break;
>   case AID_INET6:
>   case AID_VPN_IPv6:
>   inet6applymask(>v6, >v6, prefixlen);
>   break;
>   }
> }
> 

Yes, I blame it on the heat that my brain made me think this is more
complex than it actually is. Indeed that should work. I changed my diff to
that version.

-- 
:wq Claudio



Re: bgpd switch kroute_find to bgpd_addr argument

2022-06-17 Thread Theo Buehler
On Fri, Jun 17, 2022 at 11:37:29AM +0200, Claudio Jeker wrote:
> On Thu, Jun 16, 2022 at 09:01:33PM +0200, Theo Buehler wrote:
> > On Thu, Jun 16, 2022 at 07:15:51PM +0200, Claudio Jeker wrote:
> > > Not much holds us back to switch kroute_find() to use a struct bgpd_addr
> > > as prefix argument. Only the match code needs some adoption.
> > > I created applymask() for this which works on bgpd_addrs like
> > > inet4applymask works on in_addrs.
> > > 
> > > I also switched a simple case in session.c over to applymask().
> > > There is another one in bgpctl which I will send out once this is in.
> > 
> > ok
> > 
> > A nit and a question below.
> > 
> > [...]
> > 
> > > +kroute_find(struct ktable *kt, const struct bgpd_addr *prefix,
> > > +uint8_t prefixlen, uint8_t prio)
> > >  {
> > >   struct kroute_node  s;
> > >   struct kroute_node  *kn, *tmp;
> > >  
> > > - s.r.prefix.s_addr = prefix;
> > > + s.r.prefix= prefix->v4;
> > 
> > Missing space before =.
> > 
> > [...]
> > 
> > > Index: util.c
> > > ===
> > > RCS file: /cvs/src/usr.sbin/bgpd/util.c,v
> > > retrieving revision 1.64
> > > diff -u -p -r1.64 util.c
> > > --- util.c16 Jun 2022 15:33:05 -  1.64
> > > +++ util.c16 Jun 2022 16:50:47 -
> > > @@ -783,6 +783,26 @@ inet6applymask(struct in6_addr *dest, co
> > >   dest->s6_addr[i] = src->s6_addr[i] & mask.s6_addr[i];
> > >  }
> > >  
> > > +void
> > > +applymask(struct bgpd_addr *dest, const struct bgpd_addr *src, int 
> > > prefixlen)
> > > +{
> > > + struct bgpd_addr tmp;
> > > +
> > > + /* use temporary storage in case src and dest point to same struct */
> > > + tmp = *src;
> > 
> > While I'm fine with doing it this way, why is this tmp dance needed? I
> > would have thought that both inet4applymask() and inet6applymask() work
> > fine if src and dest point to the same struct. bgpctl's parse.y assumes
> > that this is the case.
> 
> While inet4applymask() and inet6applymask() work the problem comes with the
> rest of the struct bgpd_addr fields.
> 
> Now I could do a trick and copy from offsetof(rd) to sizeof(*src) -
> offsetof(rd) plus in the IPv4 case the rest of the union ba needs to
> be cleared as well. So I think while possible to do this without this
> temporary storage the code will be rather complex.
> 
> An alternative option is to just store the v4/v6 values before
> memmove(dst, src) and then using those values in the inetXapplymask calls.
> I guess that is a reasonable alternative which is a bit more optimized.
> 
> What do you think?

While I'm not against your alternative option, I think the copies are
cheap enough that we should not need to optimize them.

What confuses me is the comment and why we need temporary storage at
all.

The simple assignment is well-defined if src and dest point to the same
struct (https://port70.net/%7Ensz/c/c99/n1256.html#6.5.16.1p3), so I
would have written applymask() this way:

void
applymask(struct bgpd_addr *dest, const struct bgpd_addr *src, int prefixlen)
{
*dest = *src;
switch (src->aid) {
case AID_INET:
case AID_VPN_IPv4:
inet4applymask(>v4, >v4, prefixlen);
break;
case AID_INET6:
case AID_VPN_IPv6:
inet6applymask(>v6, >v6, prefixlen);
break;
}
}


> 
> > > + switch (src->aid) {
> > > + case AID_INET:
> > > + case AID_VPN_IPv4:
> > > + inet4applymask(, >v4, prefixlen);
> > > + break;
> > > + case AID_INET6:
> > > + case AID_VPN_IPv6:
> > > + inet6applymask(, >v6, prefixlen);
> > > + break;
> > > + }
> > > + *dest = tmp;
> > > +}
> > > +
> > >  /* address family translation functions */
> > >  const struct aid aid_vals[AID_MAX] = AID_VALS;
> > >  
> > > 
> > 
> 
> -- 
> :wq Claudio



Re: bgpd switch kroute_find to bgpd_addr argument

2022-06-17 Thread Claudio Jeker
On Thu, Jun 16, 2022 at 09:01:33PM +0200, Theo Buehler wrote:
> On Thu, Jun 16, 2022 at 07:15:51PM +0200, Claudio Jeker wrote:
> > Not much holds us back to switch kroute_find() to use a struct bgpd_addr
> > as prefix argument. Only the match code needs some adoption.
> > I created applymask() for this which works on bgpd_addrs like
> > inet4applymask works on in_addrs.
> > 
> > I also switched a simple case in session.c over to applymask().
> > There is another one in bgpctl which I will send out once this is in.
> 
> ok
> 
> A nit and a question below.
> 
> [...]
> 
> > +kroute_find(struct ktable *kt, const struct bgpd_addr *prefix,
> > +uint8_t prefixlen, uint8_t prio)
> >  {
> > struct kroute_node  s;
> > struct kroute_node  *kn, *tmp;
> >  
> > -   s.r.prefix.s_addr = prefix;
> > +   s.r.prefix= prefix->v4;
> 
> Missing space before =.
> 
> [...]
> 
> > Index: util.c
> > ===
> > RCS file: /cvs/src/usr.sbin/bgpd/util.c,v
> > retrieving revision 1.64
> > diff -u -p -r1.64 util.c
> > --- util.c  16 Jun 2022 15:33:05 -  1.64
> > +++ util.c  16 Jun 2022 16:50:47 -
> > @@ -783,6 +783,26 @@ inet6applymask(struct in6_addr *dest, co
> > dest->s6_addr[i] = src->s6_addr[i] & mask.s6_addr[i];
> >  }
> >  
> > +void
> > +applymask(struct bgpd_addr *dest, const struct bgpd_addr *src, int 
> > prefixlen)
> > +{
> > +   struct bgpd_addr tmp;
> > +
> > +   /* use temporary storage in case src and dest point to same struct */
> > +   tmp = *src;
> 
> While I'm fine with doing it this way, why is this tmp dance needed? I
> would have thought that both inet4applymask() and inet6applymask() work
> fine if src and dest point to the same struct. bgpctl's parse.y assumes
> that this is the case.

While inet4applymask() and inet6applymask() work the problem comes with the
rest of the struct bgpd_addr fields.

Now I could do a trick and copy from offsetof(rd) to sizeof(*src) -
offsetof(rd) plus in the IPv4 case the rest of the union ba needs to
be cleared as well. So I think while possible to do this without this
temporary storage the code will be rather complex.

An alternative option is to just store the v4/v6 values before
memmove(dst, src) and then using those values in the inetXapplymask calls.
I guess that is a reasonable alternative which is a bit more optimized.

What do you think?

> > +   switch (src->aid) {
> > +   case AID_INET:
> > +   case AID_VPN_IPv4:
> > +   inet4applymask(, >v4, prefixlen);
> > +   break;
> > +   case AID_INET6:
> > +   case AID_VPN_IPv6:
> > +   inet6applymask(, >v6, prefixlen);
> > +   break;
> > +   }
> > +   *dest = tmp;
> > +}
> > +
> >  /* address family translation functions */
> >  const struct aid aid_vals[AID_MAX] = AID_VALS;
> >  
> > 
> 

-- 
:wq Claudio



Re: Only scan device 0 on pcie downstream ports

2022-06-17 Thread Jonathan Gray
On Fri, Jun 17, 2022 at 10:38:08AM +0200, Mark Kettenis wrote:
> > Date: Fri, 17 Jun 2022 17:22:58 +1000
> > From: Jonathan Gray 
> > 
> > On Fri, Jun 17, 2022 at 12:33:47AM +0200, Mark Kettenis wrote:
> > > On the Ampere Altra machines, some PCIe devices show up 32 times; once
> > > for each possible PCI device number.  This is a hardware bug, since a
> > > downstream switch port (or root port) is supposed to terminate
> > > configuration request targeted at device numbers 1-31.  But it is a
> > > somewhat common bug since I have seen this before on other hardware.
> > > Linux and FreeBSD both have code that only scans device 0 on
> > > downstream switch ports.  So let's do that in OpenBSD as well.
> > > 
> > > ok?
> > > 
> > 
> > ok jsg@
> > 
> > Linux and NetBSD also handle additional types as downstream.
> > Referred to in the pcie base spec as:
> > 
> > 0100b Root Port of PCI Express Root Complex*
> > 1000b PCI/PCI-X to PCI Express Bridge*
> > 
> > *This value is only valid for Functions that implement a Type
> > 01h PCI Configuration Space header.
> > 
> > FreeBSD also checks for Root Port but not the bridge.
> 
> Ah, I missed that Linux also checks for the root port.  That's
> probably a good idea.  The PCI-PCIe bridge is probably rare enough not
> to matter, but lets align with Linux and NetBSD here.
> 
> How about this?

sure, still ok with this version

> 
> 
> Index: dev/pci/pci.c
> ===
> RCS file: /cvs/src/sys/dev/pci/pci.c,v
> retrieving revision 1.124
> diff -u -p -r1.124 pci.c
> --- dev/pci/pci.c 11 Mar 2022 18:00:51 -  1.124
> +++ dev/pci/pci.c 17 Jun 2022 08:34:30 -
> @@ -807,11 +807,31 @@ pci_enumerate_bus(struct pci_softc *sc,
>  {
>   pci_chipset_tag_t pc = sc->sc_pc;
>   int device, function, nfunctions, ret;
> + int maxndevs = sc->sc_maxndevs;
>   const struct pci_quirkdata *qd;
> - pcireg_t id, bhlcr;
> + pcireg_t id, bhlcr, cap;
>   pcitag_t tag;
>  
> - for (device = 0; device < sc->sc_maxndevs; device++) {
> + /*
> +  * PCIe downstream ports and root ports should only forward
> +  * configuration requests for device number 0.  However, not
> +  * all hardware implements this correctly, and some devices
> +  * will respond to other device numbers making the device show
> +  * up 32 times.  Prevent this by only scanning a single
> +  * device.
> +  */
> + if (sc->sc_bridgetag && pci_get_capability(pc, *sc->sc_bridgetag,
> + PCI_CAP_PCIEXPRESS, NULL, )) {
> + switch (PCI_PCIE_XCAP_TYPE(cap)) {
> + case PCI_PCIE_XCAP_TYPE_RP:
> + case PCI_PCIE_XCAP_TYPE_DOWN:
> + case PCI_PCIE_XCAP_TYPE_PCI2PCIE:
> + maxndevs = 1;
> + break;
> + }
> + }
> +
> + for (device = 0; device < maxndevs; device++) {
>   tag = pci_make_tag(pc, sc->sc_bus, device, 0);
>  
>   bhlcr = pci_conf_read(pc, tag, PCI_BHLC_REG);
> Index: dev/pci/pcireg.h
> ===
> RCS file: /cvs/src/sys/dev/pci/pcireg.h,v
> retrieving revision 1.60
> diff -u -p -r1.60 pcireg.h
> --- dev/pci/pcireg.h  31 Dec 2021 11:24:24 -  1.60
> +++ dev/pci/pcireg.h  17 Jun 2022 08:34:30 -
> @@ -563,6 +563,10 @@ typedef u_int8_t pci_revision_t;
>  #define PCI_PCIE_XCAP0x00
>  #define PCI_PCIE_XCAP_SI 0x0100
>  #define PCI_PCIE_XCAP_VER(x) (((x) >> 16) & 0x0f)
> +#define PCI_PCIE_XCAP_TYPE(x)(((x) >> 20) & 0x0f)
> +#define  PCI_PCIE_XCAP_TYPE_RP   0x4
> +#define  PCI_PCIE_XCAP_TYPE_DOWN 0x6
> +#define  PCI_PCIE_XCAP_TYPE_PCI2PCIE 0x8
>  #define PCI_PCIE_DCAP0x04
>  #define PCI_PCIE_DCSR0x08
>  #define PCI_PCIE_DCSR_ERO0x0010
> 



Re: Only scan device 0 on pcie downstream ports

2022-06-17 Thread Mark Kettenis
> Date: Fri, 17 Jun 2022 17:22:58 +1000
> From: Jonathan Gray 
> 
> On Fri, Jun 17, 2022 at 12:33:47AM +0200, Mark Kettenis wrote:
> > On the Ampere Altra machines, some PCIe devices show up 32 times; once
> > for each possible PCI device number.  This is a hardware bug, since a
> > downstream switch port (or root port) is supposed to terminate
> > configuration request targeted at device numbers 1-31.  But it is a
> > somewhat common bug since I have seen this before on other hardware.
> > Linux and FreeBSD both have code that only scans device 0 on
> > downstream switch ports.  So let's do that in OpenBSD as well.
> > 
> > ok?
> > 
> 
> ok jsg@
> 
> Linux and NetBSD also handle additional types as downstream.
> Referred to in the pcie base spec as:
> 
> 0100b Root Port of PCI Express Root Complex*
> 1000b PCI/PCI-X to PCI Express Bridge*
> 
> *This value is only valid for Functions that implement a Type
> 01h PCI Configuration Space header.
> 
> FreeBSD also checks for Root Port but not the bridge.

Ah, I missed that Linux also checks for the root port.  That's
probably a good idea.  The PCI-PCIe bridge is probably rare enough not
to matter, but lets align with Linux and NetBSD here.

How about this?


Index: dev/pci/pci.c
===
RCS file: /cvs/src/sys/dev/pci/pci.c,v
retrieving revision 1.124
diff -u -p -r1.124 pci.c
--- dev/pci/pci.c   11 Mar 2022 18:00:51 -  1.124
+++ dev/pci/pci.c   17 Jun 2022 08:34:30 -
@@ -807,11 +807,31 @@ pci_enumerate_bus(struct pci_softc *sc,
 {
pci_chipset_tag_t pc = sc->sc_pc;
int device, function, nfunctions, ret;
+   int maxndevs = sc->sc_maxndevs;
const struct pci_quirkdata *qd;
-   pcireg_t id, bhlcr;
+   pcireg_t id, bhlcr, cap;
pcitag_t tag;
 
-   for (device = 0; device < sc->sc_maxndevs; device++) {
+   /*
+* PCIe downstream ports and root ports should only forward
+* configuration requests for device number 0.  However, not
+* all hardware implements this correctly, and some devices
+* will respond to other device numbers making the device show
+* up 32 times.  Prevent this by only scanning a single
+* device.
+*/
+   if (sc->sc_bridgetag && pci_get_capability(pc, *sc->sc_bridgetag,
+   PCI_CAP_PCIEXPRESS, NULL, )) {
+   switch (PCI_PCIE_XCAP_TYPE(cap)) {
+   case PCI_PCIE_XCAP_TYPE_RP:
+   case PCI_PCIE_XCAP_TYPE_DOWN:
+   case PCI_PCIE_XCAP_TYPE_PCI2PCIE:
+   maxndevs = 1;
+   break;
+   }
+   }
+
+   for (device = 0; device < maxndevs; device++) {
tag = pci_make_tag(pc, sc->sc_bus, device, 0);
 
bhlcr = pci_conf_read(pc, tag, PCI_BHLC_REG);
Index: dev/pci/pcireg.h
===
RCS file: /cvs/src/sys/dev/pci/pcireg.h,v
retrieving revision 1.60
diff -u -p -r1.60 pcireg.h
--- dev/pci/pcireg.h31 Dec 2021 11:24:24 -  1.60
+++ dev/pci/pcireg.h17 Jun 2022 08:34:30 -
@@ -563,6 +563,10 @@ typedef u_int8_t pci_revision_t;
 #define PCI_PCIE_XCAP  0x00
 #define PCI_PCIE_XCAP_SI   0x0100
 #define PCI_PCIE_XCAP_VER(x)   (((x) >> 16) & 0x0f)
+#define PCI_PCIE_XCAP_TYPE(x)  (((x) >> 20) & 0x0f)
+#define  PCI_PCIE_XCAP_TYPE_RP 0x4
+#define  PCI_PCIE_XCAP_TYPE_DOWN   0x6
+#define  PCI_PCIE_XCAP_TYPE_PCI2PCIE   0x8
 #define PCI_PCIE_DCAP  0x04
 #define PCI_PCIE_DCSR  0x08
 #define PCI_PCIE_DCSR_ERO  0x0010



Re: Lenovo ThinkCentre M910q fails to suspend

2022-06-17 Thread Edd Barrett
On Fri, Jun 17, 2022 at 09:14:45AM +0100, Edd Barrett wrote:
> > 1. remove the code that truncates this list after 16, and note down all the
> > wake devices.

Here's the full list:

acpi0: wakeup devices PEG0(S4) PEGP(S4) PEG1(S4) PEGP(S4) PEG2(S4) PEGP(S4) 
SIO1(S3) RP09(S4) PXSX(S4) RP10(S4) PXSX(S4) RP11(S4) PXSX(S4) RP12(S4) 
PXSX(S4) RP13(S4) PXSX(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) 
PXSX(S4) RP04(S4) PXSX(S4) RP05(S4) PXSX(S4) RP06(S4) PXSX(S4) RP07(S4) 
PXSX(S4) RP08(S4) PXSX(S4) RP17(S4) PXSX(S4) RP18(S4) PXSX(S4) RP19(S4) 
PXSX(S4) RP20(S4) PXSX(S4) RP21(S4) PXSX(S4) RP22(S4) PXSX(S4) RP23(S4) 
PXSX(S4) RP24(S4) PXSX(S4) RP14(S4) PXSX(S4) RP15(S4) PXSX(S4) RP16(S4) 
PXSX(S4) GLAN(S4) XHC_(S3) XDCI(S4) HDAS(S4)

So there are just two S3 things listed:
 - SIO1(S3)
 - XHC_(S3)

-- 
Best Regards
Edd Barrett

https://www.theunixzoo.co.uk



Re: Lenovo ThinkCentre M910q fails to suspend

2022-06-17 Thread Edd Barrett
Hi Mike,

On Thu, Jun 16, 2022 at 09:19:50PM -0700, Mike Larkin wrote:
> From your original dmesg:
> 
> > acpi0: wakeup devices PEG0(S4) PEGP(S4) PEG1(S4) PEGP(S4) PEG2(S4) PEGP(S4) 
> > SIO1(S3) RP09(S4) PXSX(S4) RP10(S4) PXSX(S4) RP11(S4) PXSX(S4) RP12(S4) 
> > PXSX(S4) RP13(S4) [...]
> 
> Notice the [...] at the end, this is printed after 16 devices. What I'd 
> suggest
> is this:
> 
> 1. remove the code that truncates this list after 16, and note down all the 
> wake
> devices.
> 
> 2. If there are any in S3, try using ZZZ instead of zzz. If the machine does 
> not instantly
> wake, it's possible it's because of one of those S3 devices doing the wake 
> (since ZZZ
> uses S4).

I'll try removing the truncation then. Bear with me.

In the meantime, notice that the truncated list does include one S3 item
`SIO1(S3)`. I don't know if that's what we are looking for?

FWIW, I have already tried `ZZZ` on this machine and it does succeed to
hibernate, but upon wake up, it hangs when decompressing the memory image. I
left it decompressing a ~50MB image for more than an hour and concluded it had
got stuck.

> 3. If everything is S4, well, you're going to have to trace down those short 
> names
> like PEGP, PXSX, etc, and disable one at a time until you find the one that is
> doing the wake. And it's possible it's none of these and is a fixed function
> button or something.

One additional piece of info, which may be worthless. I tried a Debian live USB
stick, to see if Linux was able to sleep this box. It was able to.

I don't know if that rules out the idea of a fixed-function button?

-- 
Best Regards
Edd Barrett

https://www.theunixzoo.co.uk



Re: Only scan device 0 on pcie downstream ports

2022-06-17 Thread Jonathan Gray
On Fri, Jun 17, 2022 at 12:33:47AM +0200, Mark Kettenis wrote:
> On the Ampere Altra machines, some PCIe devices show up 32 times; once
> for each possible PCI device number.  This is a hardware bug, since a
> downstream switch port (or root port) is supposed to terminate
> configuration request targeted at device numbers 1-31.  But it is a
> somewhat common bug since I have seen this before on other hardware.
> Linux and FreeBSD both have code that only scans device 0 on
> downstream switch ports.  So let's do that in OpenBSD as well.
> 
> ok?
> 

ok jsg@

Linux and NetBSD also handle additional types as downstream.
Referred to in the pcie base spec as:

0100b Root Port of PCI Express Root Complex*
1000b PCI/PCI-X to PCI Express Bridge*

*This value is only valid for Functions that implement a Type
01h PCI Configuration Space header.

FreeBSD also checks for Root Port but not the bridge.

> 
> Index: dev/pci/pci.c
> ===
> RCS file: /cvs/src/sys/dev/pci/pci.c,v
> retrieving revision 1.124
> diff -u -p -r1.124 pci.c
> --- dev/pci/pci.c 11 Mar 2022 18:00:51 -  1.124
> +++ dev/pci/pci.c 16 Jun 2022 22:27:35 -
> @@ -807,11 +807,25 @@ pci_enumerate_bus(struct pci_softc *sc,
>  {
>   pci_chipset_tag_t pc = sc->sc_pc;
>   int device, function, nfunctions, ret;
> + int maxndevs = sc->sc_maxndevs;
>   const struct pci_quirkdata *qd;
> - pcireg_t id, bhlcr;
> + pcireg_t id, bhlcr, cap;
>   pcitag_t tag;
>  
> - for (device = 0; device < sc->sc_maxndevs; device++) {
> + /*
> +  * PCIe downstream ports should only forward configuration
> +  * requests for device number 0.  However, not all hardware
> +  * implements this correctly, and some devices will respond to
> +  * other device numbers making the device show up 32 times.
> +  * Prevent this by only scanning a single device.
> +  */
> + if (sc->sc_bridgetag && pci_get_capability(pc, *sc->sc_bridgetag,
> + PCI_CAP_PCIEXPRESS, NULL, )) {
> + if (PCI_PCIE_XCAP_TYPE(cap) == PCI_PCIE_XCAP_TYPE_DOWN)
> + maxndevs = 1;
> + }
> +
> + for (device = 0; device < maxndevs; device++) {
>   tag = pci_make_tag(pc, sc->sc_bus, device, 0);
>  
>   bhlcr = pci_conf_read(pc, tag, PCI_BHLC_REG);
> Index: dev/pci/pcireg.h
> ===
> RCS file: /cvs/src/sys/dev/pci/pcireg.h,v
> retrieving revision 1.60
> diff -u -p -r1.60 pcireg.h
> --- dev/pci/pcireg.h  31 Dec 2021 11:24:24 -  1.60
> +++ dev/pci/pcireg.h  16 Jun 2022 22:27:35 -
> @@ -563,6 +563,8 @@ typedef u_int8_t pci_revision_t;
>  #define PCI_PCIE_XCAP0x00
>  #define PCI_PCIE_XCAP_SI 0x0100
>  #define PCI_PCIE_XCAP_VER(x) (((x) >> 16) & 0x0f)
> +#define PCI_PCIE_XCAP_TYPE(x)(((x) >> 20) & 0x0f)
> +#define  PCI_PCIE_XCAP_TYPE_DOWN 0x6
>  #define PCI_PCIE_DCAP0x04
>  #define PCI_PCIE_DCSR0x08
>  #define PCI_PCIE_DCSR_ERO0x0010
> 
> 



Re: kernel: remove global "randompid" toggle

2022-06-17 Thread Theo de Raadt
Miod Vallat  wrote:

> > Are we keeping this "randompid" global around to make it possible to
> > disable random PIDs by toggling it in ddb(4)?
> 
> IIRC this logic was needed because some kthreads were created before
> arc4random was operational, but that was before substantial changes to
> the random generator. However since the panic-if-invoked-early has been
> removed in sys/dev/rnd.c 1.126, this is indeed probably no longer
> needed.

Oh!  yes, that's why we don't need this hack anymore.