Re: [Diff] Implement multiple device cloning for hotplug

2021-05-11 Thread Theo de Raadt
Ashton Fagg  wrote:

> On May 11, 2021, at 22:53, Theo de Raadt  wrote:
> > This is because the design of hotplug is completely broken.
> 
> Thank you, Theo, for the comprehensive reply. This was just the sort of 
> feedback I’d hoped to get.
> 
> Ok so we have established this is a terrible idea because of the way hotplug 
> works. :)
> 
> Let’s say someone was interested in solving* this. And let’s assume there was 
> some infrastructure that you signify attach, detach of devices along with 
> basic state (ready, not ready). 
> 
> (* - where solved is some value of “better than what’s there now”)
> 
> […snip…]
> 
> > Unfortunately there is no instrumentation in drivers or subsystems to
> > capture "ready to do work", and create the events at that point instead.
> > hotplug is hooked into subr_autoconf.c, this code is for handling the
> > device heirarchy, but it is not involved in operation, and thus unaware
> > of "readiness".  Fixing this would require a subsystem by subsystem study,
> > figuring out where the glue should exist, and creating the events at
> > the CORRECT time.
> 
> Would it make more sense to have a general solution which aggregates all the 
> information from every device? (Maybe something like /dev/devstate, I dunno, 
> names are hard). I could see that maybe this could be a side channel, though, 
> if it’s somehow useful to know the state of a certain device (though no more 
> so than /dev/hotplug already is admittedly). You’d also be getting a firehose 
> of everything even though you might be only interested in disk devices, for 
> example. 
> 
> Or, have something per subsystem? (Infrastructure could be shared, i guess). 
> So for scsi, you might have something like “/dev/scsistate” which provides 
> information to userland stuff on everything that’s under the scsi umbrella. 
> This feels more friendly in that you would unveil only a particular 
> subsystems monitor device, limiting the firehose and also the potential side 
> channel. You might also want to attach more specific information to the 
> message (for scsi, maybe attach LUN and target info in addition to the device 
> name and type). But you pay for that in extra queues, devices etc. 
> 
> Any thoughts?
> 
> I would be willing to try and develop something, it’s interesting to me and 
> it will hopefully solve my iSCSI mount issues once and for all. :)


We don't need a new subsystem.

We just the operation changed, so it sends the event when the device is ready,
rather than before it is ready.




Re: [Diff] Implement multiple device cloning for hotplug

2021-05-11 Thread Ashton Fagg
On May 11, 2021, at 22:53, Theo de Raadt  wrote:
> This is because the design of hotplug is completely broken.

Thank you, Theo, for the comprehensive reply. This was just the sort of 
feedback I’d hoped to get.

Ok so we have established this is a terrible idea because of the way hotplug 
works. :)

Let’s say someone was interested in solving* this. And let’s assume there was 
some infrastructure that you signify attach, detach of devices along with basic 
state (ready, not ready). 

(* - where solved is some value of “better than what’s there now”)

[…snip…]

> Unfortunately there is no instrumentation in drivers or subsystems to
> capture "ready to do work", and create the events at that point instead.
> hotplug is hooked into subr_autoconf.c, this code is for handling the
> device heirarchy, but it is not involved in operation, and thus unaware
> of "readiness".  Fixing this would require a subsystem by subsystem study,
> figuring out where the glue should exist, and creating the events at
> the CORRECT time.

Would it make more sense to have a general solution which aggregates all the 
information from every device? (Maybe something like /dev/devstate, I dunno, 
names are hard). I could see that maybe this could be a side channel, though, 
if it’s somehow useful to know the state of a certain device (though no more so 
than /dev/hotplug already is admittedly). You’d also be getting a firehose of 
everything even though you might be only interested in disk devices, for 
example. 

Or, have something per subsystem? (Infrastructure could be shared, i guess). So 
for scsi, you might have something like “/dev/scsistate” which provides 
information to userland stuff on everything that’s under the scsi umbrella. 
This feels more friendly in that you would unveil only a particular subsystems 
monitor device, limiting the firehose and also the potential side channel. You 
might also want to attach more specific information to the message (for scsi, 
maybe attach LUN and target info in addition to the device name and type). But 
you pay for that in extra queues, devices etc. 

Any thoughts?

I would be willing to try and develop something, it’s interesting to me and it 
will hopefully solve my iSCSI mount issues once and for all. :)




Re: [Diff] Implement multiple device cloning for hotplug

2021-05-11 Thread Theo de Raadt
Ashton Fagg  wrote:

> The committed diff fixes my original problem. However, I've now
> encountered another one. On one of my machines (far slower than the
> first one), the devices themselves are much slower to attach - this
> leads to the same race condition as before, as "iscsictl reload" is only
> aware that the connections have been initiated, not whether the drives
> themselves are actually attached.

This is because the design of hotplug is completely broken.

The driver does something, but not what people anticipate or understand.

A hotplug event is delivered immediately after a device driver's attach
function is called.  Very few device drivers are intialize fully in one
step.  By very few, I really mean none.  So you get an event, and you
try to act upon it, but the driver still has to perform numerous timing
sensitive operations to actually become ready.

Thus, this incredible mismatch with what the subsystem, and the daemon
promise: Events which do not indicate readiness.

Unfortunately there is no instrumentation in drivers or subsystems to
capture "ready to do work", and create the events at that point instead.
hotplug is hooked into subr_autoconf.c, this code is for handling the
device heirarchy, but it is not involved in operation, and thus unaware
of "readiness".  Fixing this would require a subsystem by subsystem study,
figuring out where the glue should exist, and creating the events at
the CORRECT time.






[Diff] Implement multiple device cloning for hotplug

2021-05-11 Thread Ashton Fagg
Attached is a diff that implements device cloning to allow the
/dev/hotplug device to be cloned (to allow multiple concurrent readers).

Rationale:

Currently, iscsid/iscsictl cannot see when the connections it initiates
results in a device (disk) being attached. Recently, Claudio Jeker
committed a diff I wrote to block "iscsictl reload" until all
connections have been finalized (either connecting successfully, or
notified of failure). This fixes an issue I encountered where iscsi
mounts in /etc/fstab would not be ready at boot time, and would result
in the machine going into single user mode when fsck would try to access
them.

The committed diff fixes my original problem. However, I've now
encountered another one. On one of my machines (far slower than the
first one), the devices themselves are much slower to attach - this
leads to the same race condition as before, as "iscsictl reload" is only
aware that the connections have been initiated, not whether the drives
themselves are actually attached.

What I am hoping here is that I can use hotplug to monitor for device
attach events, and match those to the targets we expect to be attaching
from iscsi. Obviously, we can't just use the hotplug device as is
because potentialy we'd be competing with hotplugd.

I am wondering if this is the right solution...some quick thoughts i'd
love to hear feedback on:

1. Kind of feels like I'm conflating what hotplug is for - this isn't
really "hotplugging" per se. This specific use-case is assuming the
volumes are listed in /etc/fstab.

2. Based on 1, would it be more appropriate to create a separate device
with it's own semantics (potentially very different from /dev/hotplug)
that lets us do this...

Inspiration for this diff was drawn from Joshua Stein [1], seeminly he
had a use-case that was somewhat similar to mine at one point but it
doesn't look like this was ever committed. Nor can I find any discussion
on it.

[1]: https://jcs.org/2018/08/31/surface_go

Feedback greatly welcomed.

Thanks,

Ash

diff --git a/sys/dev/hotplug.c b/sys/dev/hotplug.c
index e1e7bad95c9..522939f7f7a 100644
--- a/sys/dev/hotplug.c
+++ b/sys/dev/hotplug.c
@@ -25,15 +25,24 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
 #define HOTPLUG_MAXEVENTS	64
 
-static int opened;
+struct hotplug_dev {
+	int unit;
+	int evqueue_head;
+	struct selinfo sel_info;
+
+	LIST_ENTRY(hotplug_dev) dev_list;
+};
+
+LIST_HEAD(, hotplug_dev) hotplug_dev_list;
+static int evqueue_head;
 static struct hotplug_event evqueue[HOTPLUG_MAXEVENTS];
-static int evqueue_head, evqueue_tail, evqueue_count;
-static struct selinfo hotplug_sel;
+
 
 void filt_hotplugrdetach(struct knote *);
 int  filt_hotplugread(struct knote *, long);
@@ -45,21 +54,40 @@ const struct filterops hotplugread_filtops = {
 	.f_event	= filt_hotplugread,
 };
 
+#define EVQUEUE_PREV(p) (p == 0 ? HOTPLUG_MAXEVENTS : p - 1)
 #define EVQUEUE_NEXT(p) (p == HOTPLUG_MAXEVENTS - 1 ? 0 : p + 1)
+#define EMPTY_EVENT(ev) (ev.he_type == HOTPLUG_EMPTY)
+#define EVENT_IS_HEAD(p) (p == evqueue_head)
+#define HEAD_EVENT evqueue[evqueue_head]
 
 
 int hotplug_put_event(struct hotplug_event *);
-int hotplug_get_event(struct hotplug_event *);
+int hotplug_get_event(struct hotplug_dev *, struct hotplug_event *);
 
 void hotplugattach(int);
+struct hotplug_dev * hotplug_device_lookup(int);
 
 void
 hotplugattach(int count)
 {
-	opened = 0;
-	evqueue_head = 0;
-	evqueue_tail = 0;
-	evqueue_count = 0;
+	int i;
+
+	for (i = 0; i < HOTPLUG_MAXEVENTS; ++i)
+		evqueue[i].he_type = HOTPLUG_EMPTY;
+
+	LIST_INIT(_dev_list);
+}
+
+struct hotplug_dev *
+hotplug_device_lookup(int unit)
+{
+	struct hotplug_dev *d;
+
+	LIST_FOREACH(d, _dev_list, dev_list)
+	if (d->unit == unit)
+		return d;
+
+	return (NULL);
 }
 
 void
@@ -87,59 +115,79 @@ hotplug_device_detach(enum devclass class, char *name)
 int
 hotplug_put_event(struct hotplug_event *he)
 {
-	if (evqueue_count == HOTPLUG_MAXEVENTS && opened) {
-		printf("hotplug: event lost, queue full\n");
-		return (1);
-	}
+	struct hotplug_dev *d;
 
-	evqueue[evqueue_head] = *he;
+	HEAD_EVENT = *he;
 	evqueue_head = EVQUEUE_NEXT(evqueue_head);
-	if (evqueue_count == HOTPLUG_MAXEVENTS)
-		evqueue_tail = EVQUEUE_NEXT(evqueue_tail);
-	else 
-		evqueue_count++;
+	HEAD_EVENT.he_type = HOTPLUG_EMPTY;
+
+	LIST_FOREACH(d, _dev_list, dev_list) {
+		if (EVENT_IS_HEAD(d->evqueue_head))
+			d->evqueue_head = EVQUEUE_NEXT(evqueue_head);
+		selwakeup(>sel_info);
+	}
+
+
 	wakeup();
-	selwakeup(_sel);
 	return (0);
 }
 
 int
-hotplug_get_event(struct hotplug_event *he)
+hotplug_get_event(struct hotplug_dev *d, struct hotplug_event *he)
 {
-	int s;
+	int s = splbio();
 
-	if (evqueue_count == 0)
-		return (1);
+	if (!EMPTY_EVENT(evqueue[d->evqueue_head])) {
+		*he = evqueue[d->evqueue_head];
+		d->evqueue_head = EVQUEUE_NEXT(d->evqueue_head);
+		splx(s);
+		return (0);
+	}
 
-	s = splbio();
-	*he = evqueue[evqueue_tail];
-	evqueue_tail = EVQUEUE_NEXT(evqueue_tail);
-	

Re: potentially uninitialized string printed by vmd

2021-05-11 Thread Mike Larkin
On Mon, Mar 15, 2021 at 09:29:29AM +, James Cook wrote:
> > The array "base" which is passed to log_warnx might be uninitialized:
> > virtio_get_base doesn't necessarily touch it if it returns -1. Maybe it
> > would be better just omit base from the output, e.g.
> >
> > log_warnx("vm \"%s\" unable to read "
> > "base for disk %s", vcp->vcp_name,
> > vcp->vcp_disks[i]);
>
> Here it is as a patch.
>
> - James
>
> diff --git a/usr.sbin/vmd/config.c b/usr.sbin/vmd/config.c
> index 9ef5dca626e..3ce82052e4a 100644
> --- a/usr.sbin/vmd/config.c
> +++ b/usr.sbin/vmd/config.c
> @@ -393,8 +393,8 @@ config_setvm(struct privsep *ps, struct vmd_vm *vm, 
> uint32_t peerid, uid_t uid)
>   break;
>   if (n == -1) {
>   log_warnx("vm \"%s\" unable to read "
> - "base %s for disk %s", vcp->vcp_name,
> - base, vcp->vcp_disks[i]);
> + "base for disk %s", vcp->vcp_name,
> + vcp->vcp_disks[i]);
>   goto fail;
>   }
>   (void)strlcpy(path, base, sizeof(path));
>

Committed. I was going through old emails and found this. Sorry this took so
long.

Thanks!



Re: services(5): more cleanup

2021-05-11 Thread Kurt Mosiejczuk
On Wed, May 12, 2021 at 01:13:55AM +0200, Jeremie Courreges-Anglas wrote:

> I'd like to drop SWAT, unofficial and dropped by the samba project
> around the switch to samba4.

> > - moved smtps/465 to the standards section (rfc8314)

> The new service was named "submissions".  I guess we should use both
> that and the "smtps" alias.
> https://datatracker.ietf.org/doc/html/rfc8314#section-7.3

> ok?

ok kmos

--Kurt

> 
> Index: services
> ===
> RCS file: /d/cvs/src/etc/services,v
> retrieving revision 1.100
> diff -u -p -p -u -r1.100 services
> --- services  5 May 2021 11:49:17 -   1.100
> +++ services  11 May 2021 23:03:12 -
> @@ -123,7 +123,7 @@ microsoft-ds  445/tcp # 
> Microsoft-DS
>  microsoft-ds 445/udp # Microsoft-DS
>  kpasswd  464/tcp # Kerberos 5 password 
> changing
>  kpasswd  464/udp # Kerberos 5 password 
> changing
> -smtps465/tcp # mail message 
> submission (TLS)
> +submissions  465/tcp smtps   # mail message submission (TLS)
>  photuris 468/tcp # Photuris Key Management
>  photuris 468/udp
>  isakmp   500/udp # ISAKMP key management
> @@ -296,7 +296,6 @@ kerberos_master   751/udp # 
> Kerberos 4 
>  kerberos_master  751/tcp # Kerberos 4 kadmin
>  krb_prop 754/tcp hprop   # Kerberos slave propagation
>  krbupdate760/tcp kreg# BSD Kerberos registration
> -swat 901/tcp # Samba Web Administration Tool
>  datametrics  1645/udp
>  ekshell2 2106/tcp# Encrypted kshell - UColorado, 
> Boulder
>  webster  2627/tcp# Network dictionary
> 
> -- 
> jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE
> 



services(5): more cleanup

2021-05-11 Thread Jeremie Courreges-Anglas
On Wed, May 05 2021, Stuart Henderson  wrote:
> On 2021/05/04 12:07, Jan Klemkow wrote:
>> Hi,
>> 
>> Add missing ftps defaults ports to servies(5).
>> 
>> OK?
>> 
>> bye,
>> Jan
>> 
>> Index: services
>> ===
>> RCS file: /cvs/src/etc/services,v
>> retrieving revision 1.99
>> diff -u -p -r1.99 services
>> --- services 18 Feb 2021 02:30:29 -  1.99
>> +++ services 4 May 2021 10:01:35 -
>> @@ -318,6 +318,10 @@ krb_prop754/tcp hprop   # 
>> Kerberos slav
>>  krbupdate   760/tcp kreg# BSD Kerberos registration
>>  supfilesrv  871/tcp # SUP server
>>  swat901/tcp # Samba Web 
>> Administration Tool
>> +ftps-data   989/tcp # ftp data over TLS/SSL
>> +ftps-data   989/udp # ftp data over TLS/SSL
>> +ftps990/tcp # ftp control over 
>> TLS/SSL
>> +ftps990/udp # ftp control over 
>> TLS/SSL
>
> I'm OK with adding the TCP ones (though ftp-over-tls always makes me
> want to rant...). It's not going to run on UDP though so I think those
> should not be added.

+2

> Every new entry in this file reduces the range available for dynamic
> port selection, so it would seem a good idea to cull a few if we're
> adding some. Here are some likely candidates;
>
> - removed a few UDP entries for protocols that won't use it
>
> - dropped some obsolete protocols

I'd like to drop SWAT, unofficial and dropped by the samba project
around the switch to samba4.

> - moved smtps/465 to the standards section (rfc8314)

The new service was named "submissions".  I guess we should use both
that and the "smtps" alias.
https://datatracker.ietf.org/doc/html/rfc8314#section-7.3

ok?


Index: services
===
RCS file: /d/cvs/src/etc/services,v
retrieving revision 1.100
diff -u -p -p -u -r1.100 services
--- services5 May 2021 11:49:17 -   1.100
+++ services11 May 2021 23:03:12 -
@@ -123,7 +123,7 @@ microsoft-ds445/tcp # 
Microsoft-DS
 microsoft-ds   445/udp # Microsoft-DS
 kpasswd464/tcp # Kerberos 5 password 
changing
 kpasswd464/udp # Kerberos 5 password 
changing
-smtps  465/tcp # mail message submission (TLS)
+submissions465/tcp smtps   # mail message submission (TLS)
 photuris   468/tcp # Photuris Key Management
 photuris   468/udp
 isakmp 500/udp # ISAKMP key management
@@ -296,7 +296,6 @@ kerberos_master 751/udp # 
Kerberos 4 
 kerberos_master751/tcp # Kerberos 4 kadmin
 krb_prop   754/tcp hprop   # Kerberos slave propagation
 krbupdate  760/tcp kreg# BSD Kerberos registration
-swat   901/tcp # Samba Web Administration Tool
 datametrics1645/udp
 ekshell2   2106/tcp# Encrypted kshell - UColorado, 
Boulder
 webster2627/tcp# Network dictionary

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE



Re: athn(4): fix corrupt input frames

2021-05-11 Thread Mikolaj Kucharski
On Tue, May 11, 2021 at 04:12:34PM +0200, Stefan Sperling wrote:
> kettenis@ noticed that an athn(4) node cache can contain bogus clients
> with MAC addreses that look like bit-flipped versions of MAC addresses
> of legitimate clients.
> 
> I think this is due to my recent fix to allow block ack requests to
> pass through athn(4).
> 
> The driver calls ieee80211_find_rxnode() on such frames.
> In hostap mode this function is responsible for creating new node cache
> entries in case the transmitter's address is not known yet. This results
> in bogus cache entries. This patch attempts to fix that issue by searching
> the transmitter address in the cache and dropping the frame if the address
> is unknown.
> 
> In client mode ieee80211_find_rxnode() always returns ic->ic_bss.
> We can simply validate the transmitter address in the frame against
> the AP's MAC adress which is always known.
> 
> In monitor mode we can pass such frames since they will be dropped
> anyway after bpf_mtap.
> 
> Even if the address is correct other parts of the frame might still be
> corrupt. I don't think there is a way to verify the frame's checksum if
> the hardware cannot do it for us.
> I hope that net80211's input path is robust enough to deal with this...
> 
> ok?

Currently testing on:

// Wi-Fi client to pce-0041
pce-0067# dmesg | grep athn
athn0 at pci4 dev 0 function 0 "Atheros AR9281" rev 0x01: apic 5 int 16
athn0: AR9280 rev 2 (2T2R), ROM rev 22, address 04:f0:21:45:6a:c2

// accesspoint
pce-0041# dmesg | grep athn
athn0 at pci4 dev 0 function 0 "Atheros AR9281" rev 0x01: apic 5 int 16
athn0: AR9280 rev 2 (2T2R), ROM rev 22, address 04:f0:21:34:e4:23

// accesspoint
pce-0035# dmesg | grep athn
athn0 at pci4 dev 0 function 0 "Atheros AR9281" rev 0x01: apic 5 int 16
athn0: AR9280 rev 2 (2T2R), ROM rev 22, address 04:f0:21:45:6a:c4

At the time of this mail, they have about 20+ minutes of uptime. No
noticable issues so far.


> diff 61810dc72eb09d7c410d5b2c7982a940951e4dd4 /usr/src
> blob - 816a3b09ee2f59b2a22c4a2d236e39272b10
> file + sys/dev/ic/ar5008.c
> --- sys/dev/ic/ar5008.c
> +++ sys/dev/ic/ar5008.c
> @@ -970,29 +970,52 @@ ar5008_rx_process(struct athn_softc *sc, struct mbuf_l
>   /* Finalize mbuf. */
>   m->m_pkthdr.len = m->m_len = len;
>  
> - /* Grab a reference to the source node. */
>   wh = mtod(m, struct ieee80211_frame *);
> - ni = ieee80211_find_rxnode(ic, wh);
>  
>   if (michael_mic_failure) {
>   /*
>* Check that it is not a control frame
>* (invalid MIC failures on valid ctl frames).
> +  * Validate the transmitter's address to avoid passing
> +  * corrupt frames with bogus addresses to net80211.
>*/
> - if (!(wh->i_fc[0] & IEEE80211_FC0_TYPE_CTL) &&
> - (ic->ic_flags & IEEE80211_F_RSNON) &&
> - (ni->ni_rsncipher == IEEE80211_CIPHER_TKIP ||
> - ni->ni_rsngroupcipher == IEEE80211_CIPHER_TKIP)) {
> - /* Report Michael MIC failures to net80211. */
> - ic->ic_stats.is_rx_locmicfail++;
> - ieee80211_michael_mic_failure(ic, 0);
> + if ((wh->i_fc[0] & IEEE80211_FC0_TYPE_CTL)) {
> + switch (ic->ic_opmode) {
> +#ifndef IEEE80211_STA_ONLY
> + case IEEE80211_M_HOSTAP:
> + if (ieee80211_find_node(ic, wh->i_addr2))
> + michael_mic_failure = 0;
> + break;
> +#endif
> + case IEEE80211_M_STA:
> + if (IEEE80211_ADDR_EQ(wh->i_addr2,
> + ic->ic_bss->ni_macaddr))
> + michael_mic_failure = 0;
> + break;
> + case IEEE80211_M_MONITOR:
> + michael_mic_failure = 0;
> + break;
> + default:
> + break;
> + }
> + }
> +
> + if (michael_mic_failure) {
> + /* Report Michael MIC failures to net80211. */
> + if ((ic->ic_rsnciphers & IEEE80211_CIPHER_TKIP) ||
> + ic->ic_rsngroupcipher == IEEE80211_CIPHER_TKIP) {
> + ic->ic_stats.is_rx_locmicfail++;
> + ieee80211_michael_mic_failure(ic, 0);
> + }
>   ifp->if_ierrors++;
> - ieee80211_release_node(ic, ni);
>   m_freem(m);
>   goto skip;
>   }
>   }
>  
> + /* Grab a reference to the source node. */
> + ni = ieee80211_find_rxnode(ic, wh);
> +
>   /* Remove any HW padding after the 802.11 header. */
>   if (!(wh->i_fc[0] & IEEE80211_FC0_TYPE_CTL)) {
>   

Re: macppc bsd.mp pmap's hash lock

2021-05-11 Thread Dale Rahn
This structure really should be cache-line aligned, which should prevent it
from spilling across a page boundary.

The powerpc pmap was originally designed to have  8 'way' locks so that
only a single way would get locked, thus (as long as one doesn't have way
more than 8 cores) any core should be able to choose some way to replace
and get the work done. This also would have allowed limited recursion.
However at some point those were taken out. Note that before locking one of
the ways of the hashtable, it would hold the lock on the pmap that it was
inserting the mapping from (which could be the kernel). Not certain if the
kernel pmap lock is recursive or not.
The point was to allow multiple cores to access different regions of the
hashtable at the same time (minimal contention). However it would also
allow a core to reenter the lock_try on a different way where it should be
able to succeed (as long as the recursion doesn't go 8 deep?)

On Tue, May 11, 2021 at 12:42 AM George Koehler  wrote:

> I made a mistake in my last diff by deleting the memory barriers.
> Here is a diff that keeps membar_enter() and membar_exit().
>
> With my last diff, my G5 froze after 15 hours of uptime, and again
> after 10 hours of uptime.  I'm not sure whether the missing memory
> barriers caused the freeze, but I will be running this diff and hoping
> for fewer freezes.
>
> On Sat, 8 May 2021 18:59:35 +0200 (CEST)
> Mark Kettenis  wrote:
>
> > Good find!  On powerpc64 I avoid the issue because I guarantee that
> > the kernel mappings are never evicted from the hash.  But doing so on
> > powerpc would require more serious development.  I'm not sure we
> > really need a ticket lock for this, but since you already did the
> > work, let's stick with it for now.
>
> __ppc_lock (before and after this diff) doesn't give tickets like
> __mp_lock does, but __ppc_lock and __mp_lock are both recursive locks.
> I don't know an easy way to avoid the recursion, if some of the kernel
> mappings might not be in the hash.  My __ppc_lock diff should be easy
> if I don't make mistakes like wrong memory barriers.
>
> --George
>
> Index: arch/powerpc/include/mplock.h
> ===
> RCS file: /cvs/src/sys/arch/powerpc/include/mplock.h,v
> retrieving revision 1.4
> diff -u -p -r1.4 mplock.h
> --- arch/powerpc/include/mplock.h   15 Apr 2020 08:09:00 -  1.4
> +++ arch/powerpc/include/mplock.h   10 May 2021 23:33:00 -
> @@ -30,13 +30,13 @@
>  #define __USE_MI_MPLOCK
>
>  /*
> + * __ppc_lock exists because pte_spill_r() can't use __mp_lock.
>   * Really simple spinlock implementation with recursive capabilities.
>   * Correctness is paramount, no fancyness allowed.
>   */
>
>  struct __ppc_lock {
> -   volatile struct cpu_info *mpl_cpu;
> -   volatile long   mpl_count;
> +   volatile unsigned int   mpl_bolt;
>  };
>
>  #ifndef _LOCORE
> @@ -44,10 +44,6 @@ struct __ppc_lock {
>  void __ppc_lock_init(struct __ppc_lock *);
>  void __ppc_lock(struct __ppc_lock *);
>  void __ppc_unlock(struct __ppc_lock *);
> -int __ppc_release_all(struct __ppc_lock *);
> -int __ppc_release_all_but_one(struct __ppc_lock *);
> -void __ppc_acquire_count(struct __ppc_lock *, int);
> -int __ppc_lock_held(struct __ppc_lock *, struct cpu_info *);
>
>  #endif
>
> Index: arch/powerpc/powerpc/lock_machdep.c
> ===
> RCS file: /cvs/src/sys/arch/powerpc/powerpc/lock_machdep.c,v
> retrieving revision 1.9
> diff -u -p -r1.9 lock_machdep.c
> --- arch/powerpc/powerpc/lock_machdep.c 15 Apr 2020 08:09:00 -  1.9
> +++ arch/powerpc/powerpc/lock_machdep.c 10 May 2021 23:33:00 -
> @@ -1,6 +1,7 @@
>  /* $OpenBSD: lock_machdep.c,v 1.9 2020/04/15 08:09:00 mpi Exp $*/
>
>  /*
> + * Copyright (c) 2021 George Koehler 
>   * Copyright (c) 2007 Artur Grabowski 
>   *
>   * Permission to use, copy, modify, and distribute this software for any
> @@ -22,15 +23,29 @@
>  #include 
>
>  #include 
> -#include 
>
>  #include 
>
> +/*
> + * If __ppc_lock() crosses a page boundary in the kernel text, then it
> + * may cause a page fault (on G5 with ppc_nobat), and pte_spill_r()
> + * would recursively call __ppc_lock().  The lock must be in a valid
> + * state when the page fault happens.
> + *
> + * This lock has 2 fields, "cpuid" and "count", packed in a 32-bit
> + * "bolt", so we can use 32-bit atomic ops to ensure that our lock is
> + * always in a valid state.
> + */
> +#define BOLT(cpuid, count) ((cpuid) << 24 | (count) & 0x00ff)
> +#define BOLT_CPUID(bolt)   ((bolt) >> 24)
> +#define BOLT_COUNT(bolt)   ((bolt) & 0x00ff)
> +#define BOLT_LOCKED(bolt)  (BOLT_COUNT(bolt) != 0)
> +#define BOLT_UNLOCKED(bolt)(BOLT_COUNT(bolt) == 0)
> +
>  void
>  __ppc_lock_init(struct __ppc_lock *lock)
>  {
> -   lock->mpl_cpu = NULL;
> -   lock->mpl_count = 0;
> +   lock->mpl_bolt = 0;
>  }
>
>  #if 

athn(4): fix corrupt input frames

2021-05-11 Thread Stefan Sperling
kettenis@ noticed that an athn(4) node cache can contain bogus clients
with MAC addreses that look like bit-flipped versions of MAC addresses
of legitimate clients.

I think this is due to my recent fix to allow block ack requests to
pass through athn(4).

The driver calls ieee80211_find_rxnode() on such frames.
In hostap mode this function is responsible for creating new node cache
entries in case the transmitter's address is not known yet. This results
in bogus cache entries. This patch attempts to fix that issue by searching
the transmitter address in the cache and dropping the frame if the address
is unknown.

In client mode ieee80211_find_rxnode() always returns ic->ic_bss.
We can simply validate the transmitter address in the frame against
the AP's MAC adress which is always known.

In monitor mode we can pass such frames since they will be dropped
anyway after bpf_mtap.

Even if the address is correct other parts of the frame might still be
corrupt. I don't think there is a way to verify the frame's checksum if
the hardware cannot do it for us.
I hope that net80211's input path is robust enough to deal with this...

ok?

diff 61810dc72eb09d7c410d5b2c7982a940951e4dd4 /usr/src
blob - 816a3b09ee2f59b2a22c4a2d236e39272b10
file + sys/dev/ic/ar5008.c
--- sys/dev/ic/ar5008.c
+++ sys/dev/ic/ar5008.c
@@ -970,29 +970,52 @@ ar5008_rx_process(struct athn_softc *sc, struct mbuf_l
/* Finalize mbuf. */
m->m_pkthdr.len = m->m_len = len;
 
-   /* Grab a reference to the source node. */
wh = mtod(m, struct ieee80211_frame *);
-   ni = ieee80211_find_rxnode(ic, wh);
 
if (michael_mic_failure) {
/*
 * Check that it is not a control frame
 * (invalid MIC failures on valid ctl frames).
+* Validate the transmitter's address to avoid passing
+* corrupt frames with bogus addresses to net80211.
 */
-   if (!(wh->i_fc[0] & IEEE80211_FC0_TYPE_CTL) &&
-   (ic->ic_flags & IEEE80211_F_RSNON) &&
-   (ni->ni_rsncipher == IEEE80211_CIPHER_TKIP ||
-   ni->ni_rsngroupcipher == IEEE80211_CIPHER_TKIP)) {
-   /* Report Michael MIC failures to net80211. */
-   ic->ic_stats.is_rx_locmicfail++;
-   ieee80211_michael_mic_failure(ic, 0);
+   if ((wh->i_fc[0] & IEEE80211_FC0_TYPE_CTL)) {
+   switch (ic->ic_opmode) {
+#ifndef IEEE80211_STA_ONLY
+   case IEEE80211_M_HOSTAP:
+   if (ieee80211_find_node(ic, wh->i_addr2))
+   michael_mic_failure = 0;
+   break;
+#endif
+   case IEEE80211_M_STA:
+   if (IEEE80211_ADDR_EQ(wh->i_addr2,
+   ic->ic_bss->ni_macaddr))
+   michael_mic_failure = 0;
+   break;
+   case IEEE80211_M_MONITOR:
+   michael_mic_failure = 0;
+   break;
+   default:
+   break;
+   }
+   }
+
+   if (michael_mic_failure) {
+   /* Report Michael MIC failures to net80211. */
+   if ((ic->ic_rsnciphers & IEEE80211_CIPHER_TKIP) ||
+   ic->ic_rsngroupcipher == IEEE80211_CIPHER_TKIP) {
+   ic->ic_stats.is_rx_locmicfail++;
+   ieee80211_michael_mic_failure(ic, 0);
+   }
ifp->if_ierrors++;
-   ieee80211_release_node(ic, ni);
m_freem(m);
goto skip;
}
}
 
+   /* Grab a reference to the source node. */
+   ni = ieee80211_find_rxnode(ic, wh);
+
/* Remove any HW padding after the 802.11 header. */
if (!(wh->i_fc[0] & IEEE80211_FC0_TYPE_CTL)) {
u_int hdrlen = ieee80211_get_hdrlen(wh);



Re: fix rpki-client on alpine using libressl

2021-05-11 Thread Theo Buehler
On Tue, May 11, 2021 at 01:01:13PM +0200, Claudio Jeker wrote:
> So on Alpine Linux the libressl version is older then the fix to
> ASN1_time_parse (rev 1.16 of lib/libcrypto/asn1/a_time_tm.c).
> Because of this the expire times shown in the CSV and JSON output are all
> over the place.
> 
> Lets add explicit memset before calling ASN1_time_parse() to make this
> work even with older libressl versions. Alpine Linux should ship more
> up to date versions of libressl (but this is not a security critical
> library so why bother).
> 
> Btw. if you compile rpki-client on Alpine with OpenSSL this does not
> happen because the compat version of ASN1_time_parse has the fix.

It might be worth doing in mft.c as well even if it the memset() is only
really necessary for UTCTime. Up to you.

ok

> -- 
> :wq Claudio
> 
> Index: parser.c
> ===
> RCS file: /cvs/src/usr.sbin/rpki-client/parser.c,v
> retrieving revision 1.9
> diff -u -p -r1.9 parser.c
> --- parser.c  9 May 2021 11:18:57 -   1.9
> +++ parser.c  11 May 2021 10:49:36 -
> @@ -101,6 +101,7 @@ proc_parser_roa(struct entity *entp,
>   err(1, "X509_CRL_get0_nextUpdate failed");
>   goto out;
>   }
> + memset(_tm, 0, sizeof(expires_tm));
>   if (ASN1_time_parse(at->data, at->length, _tm,
>   V_ASN1_UTCTIME) != V_ASN1_UTCTIME) {
>   err(1, "ASN1_time_parse failed");
> @@ -126,6 +127,7 @@ proc_parser_roa(struct entity *entp,
>   err(1, "X509_get0_notafter failed");
>   goto out;
>   }
> + memset(_tm, 0, sizeof(expires_tm));
>   if (ASN1_time_parse(at->data, at->length, _tm,
>   V_ASN1_UTCTIME) != V_ASN1_UTCTIME) {
>   err(1, "ASN1_time_parse failed");
> Index: roa.c
> ===
> RCS file: /cvs/src/usr.sbin/rpki-client/roa.c,v
> retrieving revision 1.18
> diff -u -p -r1.18 roa.c
> --- roa.c 6 May 2021 17:03:57 -   1.18
> +++ roa.c 11 May 2021 10:49:11 -
> @@ -366,6 +366,7 @@ roa_parse(X509 **x509, const char *fn)
>   warnx("%s: X509_get0_notAfter failed", fn);
>   goto out;
>   }
> + memset(_tm, 0, sizeof(expires_tm));
>   if (ASN1_time_parse(at->data, at->length, _tm, 0) == -1) {
>   warnx("%s: ASN1_time_parse failed", fn);
>   goto out;
> 



Re: rpki-client fix possible uninitalised variables

2021-05-11 Thread Theo Buehler
On Tue, May 11, 2021 at 09:51:22AM +0200, Claudio Jeker wrote:
> Modern gcc warns about these variables being not initalized.
> 
> main.c: In function 'main':
> main.c:1064:11: warning: 'rrdppid' may be used uninitialized in this function 
> [-Wmaybe-uninitialized]
>else if (pid == rrdppid)
>^
> 
> rrdp_delta.c: In function 'start_publish_withdraw_elem':
> rrdp_delta.c:150:15: warning: 'uri' may be used uninitialized in this 
> function [-Wmaybe-uninitialized]
>   dxml->pxml = new_publish_xml(pub, uri, hash,
>^~~
>   hasHash ? sizeof(hash) : 0);
>   ~~~
> 
> rrdp_notification.c: In function 'notification_done':
> rrdp_notification.c:380:5: warning: 'last_s' may be used uninitialized in 
> this function [-Wmaybe-uninitialized]
>   if (last_s != nxml->serial)
>  ^
> 
> The main.c change just adds the same bit of code that the other
> sub-processes use (http and rsync). For the rrdp files I think the could
> is not reachable with the variable uninitalized but it is for sure not
> obvious so better make it explicit.

I agree that these are false positives in the rrdp code.

ok

> 
> -- 
> :wq Claudio
> 
> Index: main.c
> ===
> RCS file: /cvs/src/usr.sbin/rpki-client/main.c,v
> retrieving revision 1.139
> diff -u -p -r1.139 main.c
> --- main.c19 Apr 2021 17:04:35 -  1.139
> +++ main.c11 May 2021 07:49:09 -
> @@ -855,8 +855,10 @@ main(int argc, char *argv[])
>  
>   close(fd[0]);
>   rrdp = fd[1];
> - } else
> + } else {
>   rrdp = -1;
> + rrdppid = -1;
> + }
>  
>   /* TODO unveil cachedir and outputdir, no other access allowed */
>   if (pledge("stdio rpath wpath cpath fattr sendfd", NULL) == -1)
> Index: rrdp_delta.c
> ===
> RCS file: /cvs/src/usr.sbin/rpki-client/rrdp_delta.c,v
> retrieving revision 1.1
> diff -u -p -r1.1 rrdp_delta.c
> --- rrdp_delta.c  1 Apr 2021 16:04:48 -   1.1
> +++ rrdp_delta.c  10 May 2021 13:12:36 -
> @@ -115,7 +115,7 @@ start_publish_withdraw_elem(struct delta
>  int withdraw)
>  {
>   XML_Parser p = dxml->parser;
> - char *uri, hash[SHA256_DIGEST_LENGTH];
> + char *uri = NULL, hash[SHA256_DIGEST_LENGTH];
>   int i, hasUri = 0, hasHash = 0;
>   enum publish_type pub = PUB_UPD;
>  
> Index: rrdp_notification.c
> ===
> RCS file: /cvs/src/usr.sbin/rpki-client/rrdp_notification.c,v
> retrieving revision 1.4
> diff -u -p -r1.4 rrdp_notification.c
> --- rrdp_notification.c   15 Apr 2021 08:58:46 -  1.4
> +++ rrdp_notification.c   10 May 2021 13:14:18 -
> @@ -351,7 +351,7 @@ enum rrdp_task
>  notification_done(struct notification_xml *nxml, char *last_mod)
>  {
>   struct delta_item *d;
> - long long s, last_s;
> + long long s, last_s = 0;
>  
>   nxml->current->last_mod = last_mod;
>   nxml->current->session_id = xstrdup(nxml->session_id);
> 



fix rpki-client on alpine using libressl

2021-05-11 Thread Claudio Jeker
So on Alpine Linux the libressl version is older then the fix to
ASN1_time_parse (rev 1.16 of lib/libcrypto/asn1/a_time_tm.c).
Because of this the expire times shown in the CSV and JSON output are all
over the place.

Lets add explicit memset before calling ASN1_time_parse() to make this
work even with older libressl versions. Alpine Linux should ship more
up to date versions of libressl (but this is not a security critical
library so why bother).

Btw. if you compile rpki-client on Alpine with OpenSSL this does not
happen because the compat version of ASN1_time_parse has the fix.
-- 
:wq Claudio

Index: parser.c
===
RCS file: /cvs/src/usr.sbin/rpki-client/parser.c,v
retrieving revision 1.9
diff -u -p -r1.9 parser.c
--- parser.c9 May 2021 11:18:57 -   1.9
+++ parser.c11 May 2021 10:49:36 -
@@ -101,6 +101,7 @@ proc_parser_roa(struct entity *entp,
err(1, "X509_CRL_get0_nextUpdate failed");
goto out;
}
+   memset(_tm, 0, sizeof(expires_tm));
if (ASN1_time_parse(at->data, at->length, _tm,
V_ASN1_UTCTIME) != V_ASN1_UTCTIME) {
err(1, "ASN1_time_parse failed");
@@ -126,6 +127,7 @@ proc_parser_roa(struct entity *entp,
err(1, "X509_get0_notafter failed");
goto out;
}
+   memset(_tm, 0, sizeof(expires_tm));
if (ASN1_time_parse(at->data, at->length, _tm,
V_ASN1_UTCTIME) != V_ASN1_UTCTIME) {
err(1, "ASN1_time_parse failed");
Index: roa.c
===
RCS file: /cvs/src/usr.sbin/rpki-client/roa.c,v
retrieving revision 1.18
diff -u -p -r1.18 roa.c
--- roa.c   6 May 2021 17:03:57 -   1.18
+++ roa.c   11 May 2021 10:49:11 -
@@ -366,6 +366,7 @@ roa_parse(X509 **x509, const char *fn)
warnx("%s: X509_get0_notAfter failed", fn);
goto out;
}
+   memset(_tm, 0, sizeof(expires_tm));
if (ASN1_time_parse(at->data, at->length, _tm, 0) == -1) {
warnx("%s: ASN1_time_parse failed", fn);
goto out;



Re: bgpd fix for rde evaluate all

2021-05-11 Thread Claudio Jeker
On Tue, May 04, 2021 at 09:55:32AM +0200, Claudio Jeker wrote:
> Noticed by the arouteserver author Pier Carlo Chiodi the new rde evaluate
> all feature has a bug when a 2nd best route is withdrawn. In that case
> that route got not withdrawn from the adj-rib-out.
> 
> The problem is in up_generate_updates() and the fact that 'rde evaluate
> all' calls up_generate_updates() with old = NULL. For 'rde evaluate all'
> the code traverses the prefix list in new and once the last element is hit
> (new == NULL) the withdraw should be triggered. But since old == NULL it
> would just return without doing the withdraw.
> 
> The fix is to remove the old == NULL check from the withdraw case but to
> do that the prefix address and prefixlen needs to be extracted before
> entering the selection loop.
> 
> So extract the address once at start based on new or old (whichever is
> set). In case new and old are NULL just return (like before), since there
> is nothing todo.
> 

There is another bug that causes this to not work properly. This time it
is rde_evaluate_all() which is returning 0 (as in default mode) even
though there are peers with 'rde evaluate all' set. The problem is that
the rde_eval_all flag is tracked in the wrong spot and so if you change
the flag after prefixes have loaded and then withdraw a prefix it will not
realize that a peer needs evaluate all and skip to process this withdraw.

The following diff (rde.c part) changes the tracking of rde_eval_all. It
resets the flag during reload but also sets rde_eval_all when a new peer
is added.

The diff also includes the rde_update.c change from before.
-- 
:wq Claudio

Index: rde.c
===
RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
retrieving revision 1.520
diff -u -p -r1.520 rde.c
--- rde.c   6 May 2021 09:18:54 -   1.520
+++ rde.c   11 May 2021 08:28:58 -
@@ -113,6 +113,7 @@ volatile sig_atomic_trde_quit = 0;
 struct filter_head *out_rules, *out_rules_tmp;
 struct rde_memstats rdemem;
 int softreconfig;
+static int  rde_eval_all;
 
 extern struct rde_peer_head peerlist;
 extern struct rde_peer *peerself;
@@ -400,6 +402,9 @@ rde_dispatch_imsg_session(struct imsgbuf
fatalx("incorrect size of session request");
memcpy(, imsg.data, sizeof(pconf));
peer_add(imsg.hdr.peerid, );
+   /* make sure rde_eval_all is on if needed. */
+   if (pconf.flags & PEERFLAG_EVALUATE_ALL)
+   rde_eval_all = 1;
break;
case IMSG_NETWORK_ADD:
if (imsg.hdr.len - IMSG_HEADER_SIZE !=
@@ -2884,8 +2889,6 @@ rde_send_kroute(struct rib *rib, struct 
 /*
  * update specific functions
  */
-static int rde_eval_all;
-
 int
 rde_evaluate_all(void)
 {
@@ -2915,17 +2918,13 @@ rde_generate_updates(struct rib *rib, st
else
aid = old->pt->aid;
 
-   rde_eval_all = 0;
LIST_FOREACH(peer, , peer_l) {
/* skip ourself */
if (peer == peerself)
continue;
if (peer->state != PEER_UP)
continue;
-   /* handle evaluate all, keep track if it is needed */
-   if (peer->flags & PEERFLAG_EVALUATE_ALL)
-   rde_eval_all = 1;
-   else if (eval_all)
+   if ((peer->flags & PEERFLAG_EVALUATE_ALL) == 0 && eval_all)
/* skip default peers if the best path didn't change */
continue;
/* skip peers using a different rib */
@@ -3287,9 +3286,12 @@ rde_reload_done(void)
 
rde_filter_calc_skip_steps(out_rules);
 
+   /* make sure that rde_eval_all is correctly set after a config change */
+   rde_eval_all = 0;
+
/* check if filter changed */
LIST_FOREACH(peer, , peer_l) {
-   if (peer->conf.id == 0)
+   if (peer->conf.id == 0) /* ignore peerself*/
continue;
peer->reconf_out = 0;
peer->reconf_rib = 0;
@@ -3319,6 +3321,8 @@ rde_reload_done(void)
}
peer->export_type = peer->conf.export_type;
peer->flags = peer->conf.flags;
+   if (peer->flags & PEERFLAG_EVALUATE_ALL)
+   rde_eval_all = 1;
 
if (peer->reconf_rib) {
if (prefix_dump_new(peer, AID_UNSPEC,
@@ -3336,6 +3340,7 @@ rde_reload_done(void)
peer->reconf_out = 1;
}
}
+
/* bring ribs in sync */
for (rid = 0; rid < rib_size; rid++) {
struct rib *rib = rib_byid(rid);
Index: rde_update.c
===
RCS file: 

rpki-client fix possible uninitalised variables

2021-05-11 Thread Claudio Jeker
Modern gcc warns about these variables being not initalized.

main.c: In function 'main':
main.c:1064:11: warning: 'rrdppid' may be used uninitialized in this function 
[-Wmaybe-uninitialized]
   else if (pid == rrdppid)
   ^

rrdp_delta.c: In function 'start_publish_withdraw_elem':
rrdp_delta.c:150:15: warning: 'uri' may be used uninitialized in this function 
[-Wmaybe-uninitialized]
  dxml->pxml = new_publish_xml(pub, uri, hash,
   ^~~
  hasHash ? sizeof(hash) : 0);
  ~~~

rrdp_notification.c: In function 'notification_done':
rrdp_notification.c:380:5: warning: 'last_s' may be used uninitialized in this 
function [-Wmaybe-uninitialized]
  if (last_s != nxml->serial)
 ^

The main.c change just adds the same bit of code that the other
sub-processes use (http and rsync). For the rrdp files I think the could
is not reachable with the variable uninitalized but it is for sure not
obvious so better make it explicit.

-- 
:wq Claudio

Index: main.c
===
RCS file: /cvs/src/usr.sbin/rpki-client/main.c,v
retrieving revision 1.139
diff -u -p -r1.139 main.c
--- main.c  19 Apr 2021 17:04:35 -  1.139
+++ main.c  11 May 2021 07:49:09 -
@@ -855,8 +855,10 @@ main(int argc, char *argv[])
 
close(fd[0]);
rrdp = fd[1];
-   } else
+   } else {
rrdp = -1;
+   rrdppid = -1;
+   }
 
/* TODO unveil cachedir and outputdir, no other access allowed */
if (pledge("stdio rpath wpath cpath fattr sendfd", NULL) == -1)
Index: rrdp_delta.c
===
RCS file: /cvs/src/usr.sbin/rpki-client/rrdp_delta.c,v
retrieving revision 1.1
diff -u -p -r1.1 rrdp_delta.c
--- rrdp_delta.c1 Apr 2021 16:04:48 -   1.1
+++ rrdp_delta.c10 May 2021 13:12:36 -
@@ -115,7 +115,7 @@ start_publish_withdraw_elem(struct delta
 int withdraw)
 {
XML_Parser p = dxml->parser;
-   char *uri, hash[SHA256_DIGEST_LENGTH];
+   char *uri = NULL, hash[SHA256_DIGEST_LENGTH];
int i, hasUri = 0, hasHash = 0;
enum publish_type pub = PUB_UPD;
 
Index: rrdp_notification.c
===
RCS file: /cvs/src/usr.sbin/rpki-client/rrdp_notification.c,v
retrieving revision 1.4
diff -u -p -r1.4 rrdp_notification.c
--- rrdp_notification.c 15 Apr 2021 08:58:46 -  1.4
+++ rrdp_notification.c 10 May 2021 13:14:18 -
@@ -351,7 +351,7 @@ enum rrdp_task
 notification_done(struct notification_xml *nxml, char *last_mod)
 {
struct delta_item *d;
-   long long s, last_s;
+   long long s, last_s = 0;
 
nxml->current->last_mod = last_mod;
nxml->current->session_id = xstrdup(nxml->session_id);