Use after free in X config parser

2023-10-05 Thread Crystal Kolipe
This is an interesting one...

There is a use after free bug in the X config parser, which can be trivially
demonstrated by creating a config file with the first line being a comment and
the second line containing invalid syntax:

$ echo "#foo\nfoo" > custom_config
$ X -config custom_config

[...]

(++) Using config file: "custom_config"
(==) Using system config directory "/usr/X11R6/share/X11/xorg.conf.d"
Parse error on line 3 of section InputClass in file custom_config
"on" is not a valid keyword in this section.
(EE) 
Fatal server error:
(EE) no screens found(EE) 

[...]

So part of the error message is being read from memory filled with 0xdf
bytes, which is obviously a buffer that has already been freed (0xdf is the
junk pattern OpenBSD's malloc(3) writes over freed chunks).

Looking in /usr/xenocara/xserver/hw/xfree86/parser/scan.c, in xf86getToken(),
there is an XXX comment mentioning that, for some unspecified reason, the
configRBuf pointer value is returned in xf86_lex_val.str instead of a private
copy being made (as is done for, e.g., non-comment strings).

The comment seems to imply that xf86addComment() would handle this gracefully
and not free() the pointer, but clearly the pointer _is_ freed.  Why this
design was chosen over the seemingly more obvious approach of doing the same
malloc and strcpy that is done for regular strings is not clear.

Indeed, the following kludge does mitigate the issue, and the 0xdf bytes
disappear:

--- scan.c  Thu Oct  5 19:48:29 2023
+++ scan.c.cludge   Thu Oct  5 20:07:06 2023
@@ -335,7 +335,8 @@
 /* XXX no private copy.
  * Use xf86addComment when setting a comment.
  */
-xf86_lex_val.str = configRBuf;
+   xf86_lex_val.str = malloc(4096);
+   strcpy(xf86_lex_val.str, configRBuf);
 return COMMENT;
 }

 
But presumably there was a reason that this wasn't done all along, so
hopefully somebody who is more familiar with this code can come up with the
proper fix.



Re: [patch] [arm64] cpu.c patch based on amd64 idea, provides more debug for multicore kernel

2023-10-05 Thread S V
Thu, 5 Oct 2023, 22:17 Mark Kettenis :

> > > Really, if those secondary CPUs don't come up, your system is beyond
> > > repair.  You need to do some low-level debugging at that point and DDB
> > > isn't going to help you.  So no, let's keep this code as simple as we
> > > can.
> > >
> >
> > These cores start and work properly,
> > but something messes with the ordering
> > (maybe related to the shared caches per 2-core cluster; I'm not so
> > good with hardware).
> > So the 7th core will answer that it booted after the 6th core,
> > but with the 'while' loop we never get to the 6th core.
>
> That suggests that the interrupt that we think we're sending to the
> 6th core is actually going to the 7th core and vice-versa.  Feels like
> a firmware bug to me.  What hardware is this on?
>

These are different Baikal-M (arm64, Cortex-A57) based boards with more
or less the same firmware in different versions.
I can attach the dtb and a decompiled dts from one of them, if that helps.



> > > > Index: sys/arch/arm64//arm64/cpu.c
> > > > ===
> > > > RCS file: /cvs/src/sys/arch/arm64/arm64/cpu.c,v
> > > > retrieving revision 1.98
> > > > diff -u -p -r1.98 cpu.c
> > > > --- sys/arch/arm64//arm64/cpu.c   10 Aug 2023 19:29:32 -  1.98
> > > > +++ sys/arch/arm64//arm64/cpu.c   25 Sep 2023 13:24:39 -
> > > > @@ -1096,6 +1096,8 @@ cpu_start_secondary(struct cpu_info *ci,
> > > >  void
> > > >  cpu_boot_secondary(struct cpu_info *ci)
> > > >  {
> > > > + int i;
> > > > +
> > > >   atomic_setbits_int(&ci->ci_flags, CPUF_GO);
> > > >   __asm volatile("dsb sy; sev" ::: "memory");
> > > >
> > > > @@ -1105,8 +1107,16 @@ cpu_boot_secondary(struct cpu_info *ci)
> > > >*/
> > > >   arm_send_ipi(ci, ARM_IPI_NOP);
> > > >
> > > > - while ((ci->ci_flags & CPUF_RUNNING) == 0)
> > > > + for (i = 1000; (!(ci->ci_flags & CPUF_RUNNING)) && i>0;i--) {
> > > >   __asm volatile("wfe");
> > > > + }
> > > > + if (! (ci->ci_flags & CPUF_RUNNING)) {
> > > > + printf("cpu %d failed to start\n", ci->ci_cpuid);
> > > > +#if defined(MPDEBUG) && defined(DDB)
> > > > + printf("dropping into debugger; continue from here to resume boot\n");
> > > > + db_enter();
> > > > +#endif
> > > > + }
> > > >  }
> > > >
> > > >  void
> > > >
> > > >
> >


bober.patched.dts
Description: audio/vnd.dts


bober.patched.dtb
Description: Binary data


Re: fuse(4): make `fuse_rd_filtops' mpsafe

2023-10-05 Thread Vitaliy Makkoveev
ping

> On 27 Sep 2023, at 16:07, Vitaliy Makkoveev  wrote:
> 
> Introduce `fd_mtx' mutex(9) and use it for `fd_fbufs_in' fuse buffers
> queue and `fd_rklist' knotes list protection.
> 
> Index: sys/miscfs/fuse/fuse_device.c
> ===
> RCS file: /cvs/src/sys/miscfs/fuse/fuse_device.c,v
> retrieving revision 1.39
> diff -u -p -r1.39 fuse_device.c
> --- sys/miscfs/fuse/fuse_device.c 8 Sep 2023 20:00:28 -   1.39
> +++ sys/miscfs/fuse/fuse_device.c 27 Sep 2023 13:05:55 -
> @@ -19,13 +19,14 @@
> #include 
> #include 
> #include 
> +#include 
> #include 
> #include 
> +#include 
> #include 
> #include 
> #include 
> #include 
> -#include 
> 
> #include "fusefs_node.h"
> #include "fusefs.h"
> @@ -36,18 +37,24 @@
> #define   DPRINTF(fmt, arg...)
> #endif
> 
> +/*
> + * Locks used to protect struct members and global data
> + *   m   fd_mtx
> + */
> +
> SIMPLEQ_HEAD(fusebuf_head, fusebuf);
> 
> struct fuse_d {
> + struct mutex fd_mtx;
>   struct fusefs_mnt *fd_fmp;
>   int fd_unit;
> 
>   /*fusebufs queues*/
> - struct fusebuf_head fd_fbufs_in;
> + struct fusebuf_head fd_fbufs_in;/* [m] */
>   struct fusebuf_head fd_fbufs_wait;
> 
>   /* kq fields */
> - struct selinfo fd_rsel;
> + struct klist fd_rklist; /* [m] */
>   LIST_ENTRY(fuse_d) fd_list;
> };
> 
> @@ -67,12 +74,16 @@ int   fusewrite(dev_t, struct uio *, int);
> int   fusekqfilter(dev_t dev, struct knote *kn);
> int   filt_fuse_read(struct knote *, long);
> void  filt_fuse_rdetach(struct knote *);
> +int  filt_fuse_modify(struct kevent *, struct knote *);
> +int  filt_fuse_process(struct knote *, struct kevent *);
> 
> static const struct filterops fuse_rd_filtops = {
> - .f_flags= FILTEROP_ISFD,
> + .f_flags= FILTEROP_ISFD | FILTEROP_MPSAFE,
>   .f_attach   = NULL,
>   .f_detach   = filt_fuse_rdetach,
>   .f_event= filt_fuse_read,
> + .f_modify   = filt_fuse_modify,
> + .f_process  = filt_fuse_process,
> };
> 
> #ifdef FUSE_DEBUG
> @@ -142,6 +153,7 @@ fuse_device_cleanup(dev_t dev)
> 
>   /* clear FIFO IN */
>   lprev = NULL;
> + mtx_enter(&fd->fd_mtx);
>   SIMPLEQ_FOREACH_SAFE(f, &fd->fd_fbufs_in, fb_next, ftmp) {
>   DPRINTF("cleanup unprocessed msg in sc_fbufs_in\n");
>   if (lprev == NULL)
> @@ -155,6 +167,7 @@ fuse_device_cleanup(dev_t dev)
>   wakeup(f);
>   lprev = f;
>   }
> + mtx_leave(&fd->fd_mtx);
> 
>   /* clear FIFO WAIT*/
>   lprev = NULL;
> @@ -182,9 +195,11 @@ fuse_device_queue_fbuf(dev_t dev, struct
>   if (fd == NULL)
>   return;
> 
> + mtx_enter(&fd->fd_mtx);
>   SIMPLEQ_INSERT_TAIL(&fd->fd_fbufs_in, fbuf, fb_next);
> + knote_locked(&fd->fd_rklist, 0);
> + mtx_leave(&fd->fd_mtx);
>   stat_fbufs_in++;
> - selwakeup(&fd->fd_rsel);
> }
> 
> void
> @@ -221,6 +236,9 @@ fuseopen(dev_t dev, int flags, int fmt, 
>   fd->fd_unit = unit;
>   SIMPLEQ_INIT(&fd->fd_fbufs_in);
>   SIMPLEQ_INIT(&fd->fd_fbufs_wait);
> + mtx_init(&fd->fd_mtx, IPL_MPFLOOR);
> + klist_init_mutex(&fd->fd_rklist, &fd->fd_mtx);
> +
>   LIST_INSERT_HEAD(&fuse_d_list, fd, fd_list);
> 
>   stat_opened_fusedev++;
> @@ -278,6 +296,7 @@ fuseioctl(dev_t dev, u_long cmd, caddr_t
>   ioexch = (struct fb_ioctl_xch *)addr;
> 
>   /* Looking for uuid in fd_fbufs_in */
> + mtx_enter(&fd->fd_mtx);
>   SIMPLEQ_FOREACH(fbuf, &fd->fd_fbufs_in, fb_next) {
>   if (fbuf->fb_uuid == ioexch->fbxch_uuid)
>   break;
> @@ -285,6 +304,7 @@ fuseioctl(dev_t dev, u_long cmd, caddr_t
>   lastfbuf = fbuf;
>   }
>   if (fbuf == NULL) {
> + mtx_leave(&fd->fd_mtx);
>   printf("fuse: Cannot find fusebuf\n");
>   return (EINVAL);
>   }
> @@ -295,6 +315,8 @@ fuseioctl(dev_t dev, u_long cmd, caddr_t
>   else
>   SIMPLEQ_REMOVE_AFTER(&fd->fd_fbufs_in, lastfbuf,
>   fb_next);
> + mtx_leave(&fd->fd_mtx);
> + 
>   stat_fbufs_in--;
> 
>   /* Do not handle fbufs with bad len */
> @@ -389,13 +411,17 @@ fuseread(dev_t dev, struct uio *uio, int
>   if (fd == NULL)
>   return (ENXIO);
> 
> + mtx_enter(&fd->fd_mtx);
>   if (SIMPLEQ_EMPTY(&fd->fd_fbufs_in)) {
> + mtx_leave(&fd->fd_mtx);
> +
>   if (ioflag & O_NONBLOCK)
>   return (EAGAIN);
> 
>   goto end;
>   }
>   fbuf = SIMPLEQ_FIRST(&fd->fd_fbufs_in);
> + mtx_leave(&fd->fd_mtx);
> 
>   /* We get the whole fusebuf or nothing */
>   if (uio->uio_resid != FUSEBUFSIZE)
> @@ -419,7 +445,9 @@ fuseread(dev_t dev, struct uio *uio, int
> 
>   /* Remove the fbuf if it does not contains data */
>   

Re: video(4): make `video_filtops' mpsafe

2023-10-05 Thread Vitaliy Makkoveev
ping

> On 27 Sep 2023, at 23:19, Vitaliy Makkoveev  wrote:
> 
> Introduce `sc_mtx` mutex(9) and use it for `sc_frames_ready' and
> `sc_rklist' knotes list protection.
> 
> Index: sys/dev/video.c
> ===
> RCS file: /cvs/src/sys/dev/video.c,v
> retrieving revision 1.57
> diff -u -p -r1.57 video.c
> --- sys/dev/video.c   2 Jul 2022 08:50:41 -   1.57
> +++ sys/dev/video.c   27 Sep 2023 20:15:34 -
> @@ -20,12 +20,14 @@
> #include 
> #include 
> #include 
> +#include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> +#include 
> #include 
> #include 
> #include 
> @@ -34,6 +36,11 @@
> 
> #include 
> 
> +/*
> + * Locks used to protect struct members and global data
> + *   m   sc_mtx
> + */
> +
> #ifdef VIDEO_DEBUG
> int video_debug = 1;
> #define DPRINTF(l, x...) do { if ((l) <= video_debug) printf(x); } while (0)
> @@ -50,6 +57,8 @@ struct video_softc {
>   struct process  *sc_owner;  /* owner process */
>   uint8_t  sc_open;   /* device opened */
> 
> + struct mutex sc_mtx;
> +
>   int  sc_fsize;
>   uint8_t *sc_fbuffer;
>   caddr_t  sc_fbuffer_mmap;
> @@ -58,9 +67,9 @@ struct video_softc {
> #define   VIDMODE_NONE0
> #define   VIDMODE_MMAP1
> #define   VIDMODE_READ2
> - int  sc_frames_ready;
> + int  sc_frames_ready;   /* [m] */
> 
> - struct selinfo   sc_rsel;   /* read selector */
> + struct klist sc_rklist; /* [m] read selector */
> };
> 
> int   videoprobe(struct device *, void *, void *);
> @@ -105,6 +114,8 @@ videoattach(struct device *parent, struc
>   sc->sc_dev = parent;
>   sc->sc_fbufferlen = 0;
>   sc->sc_owner = NULL;
> + mtx_init(&sc->sc_mtx, IPL_MPFLOOR);
> + klist_init_mutex(&sc->sc_rklist, &sc->sc_mtx);
> 
>   if (sc->hw_if->get_bufsize)
>   sc->sc_fbufferlen = (sc->hw_if->get_bufsize)(sc->hw_hdl);
> @@ -205,21 +216,28 @@ videoread(dev_t dev, struct uio *uio, in
> 
>   DPRINTF(1, "resid=%zu\n", uio->uio_resid);
> 
> + mtx_enter(&sc->sc_mtx);
> +
>   if (sc->sc_frames_ready < 1) {
>   /* block userland read until a frame is ready */
> - error = tsleep_nsec(sc, PWAIT | PCATCH, "vid_rd", INFSLP);
> + error = msleep_nsec(sc, &sc->sc_mtx, PWAIT | PCATCH,
> + "vid_rd", INFSLP);
>   if (sc->sc_dying)
>   error = EIO;
> - if (error)
> + if (error) {
> + mtx_leave(&sc->sc_mtx);
>   return (error);
> + }
>   }
> + sc->sc_frames_ready--;
> +
> + mtx_leave(&sc->sc_mtx);
> 
>   /* move no more than 1 frame to userland, as per specification */
>   size = ulmin(uio->uio_resid, sc->sc_fsize);
>   if (!video_record_enable)
>   bzero(sc->sc_fbuffer, size);
>   error = uiomove(sc->sc_fbuffer, size, uio);
> - sc->sc_frames_ready--;
>   if (error)
>   return (error);
> 
> @@ -356,7 +374,9 @@ videoioctl(dev_t dev, u_long cmd, caddr_
>   (struct v4l2_buffer *)data);
>   if (!video_record_enable)
>   bzero(sc->sc_fbuffer_mmap + vb->m.offset, vb->length);
> + mtx_enter(&sc->sc_mtx);
>   sc->sc_frames_ready--;
> + mtx_leave(&sc->sc_mtx);
>   break;
>   case VIDIOC_STREAMON:
>   if (sc->hw_if->streamon)
> @@ -429,11 +449,8 @@ void
> filt_videodetach(struct knote *kn)
> {
>   struct video_softc *sc = kn->kn_hook;
> - int s;
> 
> - s = splhigh();
> - klist_remove_locked(&sc->sc_rsel.si_note, kn);
> - splx(s);
> + klist_remove(&sc->sc_rklist, kn);
> }
> 
> int
> @@ -447,11 +464,39 @@ filt_videoread(struct knote *kn, long hi
>   return (0);
> }
> 
> +int
> +filt_videomodify(struct kevent *kev, struct knote *kn)
> +{
> + struct video_softc *sc = kn->kn_hook;
> + int active;
> +
> + mtx_enter(&sc->sc_mtx);
> + active = knote_modify(kev, kn);
> + mtx_leave(&sc->sc_mtx);
> +
> + return (active);
> +}
> +
> +int
> +filt_videoprocess(struct knote *kn, struct kevent *kev)
> +{
> + struct video_softc *sc = kn->kn_hook;
> + int active;
> +
> + mtx_enter(&sc->sc_mtx);
> + active = knote_process(kn, kev);
> + mtx_leave(&sc->sc_mtx);
> +
> + return (active);
> +}
> +
> const struct filterops video_filtops = {
> - .f_flags= FILTEROP_ISFD,
> + .f_flags= FILTEROP_ISFD | FILTEROP_MPSAFE,
>   .f_attach   = NULL,
>   .f_detach   = filt_videodetach,
>   .f_event= filt_videoread,
> + .f_modify   = filt_videomodify,
> + .f_process  = filt_videoprocess,
> };
> 
> int
> @@ -459,7 +504,7 @@ videokqfilter(dev_t dev, struct knote 

Re: vscsi(4): make `vscsi_filtops' mpsafe

2023-10-05 Thread Vitaliy Makkoveev
ping

> On 28 Sep 2023, at 14:28, Vitaliy Makkoveev  wrote:
> 
> On Thu, Sep 28, 2023 at 01:16:17PM +0200, Claudio Jeker wrote:
>> On Thu, Sep 28, 2023 at 01:58:45PM +0300, Vitaliy Makkoveev wrote:
>>> filt_vscsiread() checks `sc_ccb_i2t' protected by `sc_state_mtx'
>>> mutex(9), so use it to protect `sc_klist' knotes list too.
>>> 
>>> Tested with iscsid(8).
>> 
> >> Your diff removes a device_unref(&sc->sc_dev) call in filt_vscsidetach()
>> which seems dubious to me since the reference is still taken in
>> vscsikqfilter().
>> 
> 
> Thanks for catching this. I accidentally removed extra line.
> 
> Index: sys/dev/vscsi.c
> ===
> RCS file: /cvs/src/sys/dev/vscsi.c,v
> retrieving revision 1.61
> diff -u -p -r1.61 vscsi.c
> --- sys/dev/vscsi.c   2 Jul 2022 08:50:41 -   1.61
> +++ sys/dev/vscsi.c   28 Sep 2023 11:24:24 -
> @@ -27,13 +27,18 @@
> #include 
> #include 
> #include 
> -#include 
> +#include 
> 
> #include 
> #include 
> 
> #include 
> 
> +/*
> + * Locks used to protect struct members and global data
> + *   s   sc_state_mtx
> + */
> +
> int   vscsi_match(struct device *, void *, void *);
> void  vscsi_attach(struct device *, struct device *, void *);
> void  vscsi_shutdown(void *);
> @@ -64,14 +69,13 @@ struct vscsi_softc {
> 
>   struct scsi_iopool  sc_iopool;
> 
> - struct vscsi_ccb_list   sc_ccb_i2t;
> + struct vscsi_ccb_list   sc_ccb_i2t; /* [s] */
>   struct vscsi_ccb_list   sc_ccb_t2i;
>   int sc_ccb_tag;
>   struct mutexsc_poll_mtx;
>   struct rwlock   sc_ioc_lock;
> 
> - struct selinfo  sc_sel;
> - struct mutexsc_sel_mtx;
> + struct klistsc_klist;   /* [s] */
> };
> 
> #define DEVNAME(_s) ((_s)->sc_dev.dv_xname)
> @@ -110,12 +114,16 @@ voidvscsi_ccb_put(void *, void *);
> 
> void  filt_vscsidetach(struct knote *);
> int   filt_vscsiread(struct knote *, long);
> +int  filt_vscsimodify(struct kevent *, struct knote *);
> +int  filt_vscsiprocess(struct knote *, struct kevent *);
> 
> const struct filterops vscsi_filtops = {
> - .f_flags= FILTEROP_ISFD,
> + .f_flags= FILTEROP_ISFD | FILTEROP_MPSAFE,
>   .f_attach   = NULL,
>   .f_detach   = filt_vscsidetach,
>   .f_event= filt_vscsiread,
> + .f_modify   = filt_vscsimodify,
> + .f_process  = filt_vscsiprocess,
> };
> 
> 
> @@ -133,15 +141,15 @@ vscsi_attach(struct device *parent, stru
> 
>   printf("\n");
> 
> - mtx_init(&sc->sc_state_mtx, IPL_BIO);
> + mtx_init(&sc->sc_state_mtx, IPL_MPFLOOR);
>   sc->sc_state = VSCSI_S_CLOSED;
> 
>   TAILQ_INIT(&sc->sc_ccb_i2t);
>   TAILQ_INIT(&sc->sc_ccb_t2i);
>   mtx_init(&sc->sc_poll_mtx, IPL_BIO);
> - mtx_init(&sc->sc_sel_mtx, IPL_BIO);
>   rw_init(&sc->sc_ioc_lock, "vscsiioc");
>   scsi_iopool_init(&sc->sc_iopool, sc, vscsi_ccb_get, vscsi_ccb_put);
> + klist_init_mutex(&sc->sc_klist, &sc->sc_state_mtx);
> 
>   saa.saa_adapter = &vscsi_switch;
>   saa.saa_adapter_softc = sc;
> @@ -181,6 +189,7 @@ vscsi_cmd(struct scsi_xfer *xs)
>   running = 1;
>   TAILQ_INSERT_TAIL(&sc->sc_ccb_i2t, ccb, ccb_entry);
>   }
> + knote_locked(&sc->sc_klist, 0);
>   mtx_leave(&sc->sc_state_mtx);
> 
>   if (!running) {
> @@ -189,8 +198,6 @@ vscsi_cmd(struct scsi_xfer *xs)
>   return;
>   }
> 
> - selwakeup(&sc->sc_sel);
> -
>   if (polled) {
>   mtx_enter(&sc->sc_poll_mtx);
>   while (ccb->ccb_xs != NULL)
> @@ -530,13 +537,10 @@ int
> vscsikqfilter(dev_t dev, struct knote *kn)
> {
>   struct vscsi_softc *sc = DEV2SC(dev);
> - struct klist *klist;
> 
>   if (sc == NULL)
>   return (ENXIO);
> 
> - klist = &sc->sc_sel.si_note;
> -
>   switch (kn->kn_filter) {
>   case EVFILT_READ:
>   kn->kn_fop = &vscsi_filtops;
> @@ -547,10 +551,7 @@ vscsikqfilter(dev_t dev, struct knote *k
>   }
> 
>   kn->kn_hook = sc;
> -
> - mtx_enter(&sc->sc_sel_mtx);
> - klist_insert_locked(klist, kn);
> - mtx_leave(&sc->sc_sel_mtx);
> + klist_insert(&sc->sc_klist, kn);
> 
>   /* device ref is given to the knote in the klist */
> 
> @@ -561,12 +562,8 @@ void
> filt_vscsidetach(struct knote *kn)
> {
>   struct vscsi_softc *sc = kn->kn_hook;
> - struct klist *klist = &sc->sc_sel.si_note;
> -
> - mtx_enter(&sc->sc_sel_mtx);
> - klist_remove_locked(klist, kn);
> - mtx_leave(&sc->sc_sel_mtx);
> 
> + klist_remove(&sc->sc_klist, kn);
>   device_unref(&sc->sc_dev);
> }
> 
> @@ -574,14 +571,34 @@ int
> filt_vscsiread(struct knote *kn, long hint)
> {
>   struct vscsi_softc *sc = kn->kn_hook;
> - int event = 0;
> +
> + return (!TAILQ_EMPTY(&sc->sc_ccb_i2t));
> +}
> +
> +int
> +filt_vscsimodify(struct kevent *kev, struct knote *kn)
> +{
> + struct vscsi_softc *sc = kn->kn_hook;
> + int 

Re: [patch] [arm64] cpu.c patch based on amd64 idea, provides more debug for multicore kernel

2023-10-05 Thread Mark Kettenis
> From: S V 
> Date: Thu, 5 Oct 2023 12:58:51 +0300
> 
> Thu, 5 Oct 2023, 09:59 Mark Kettenis :
> >
> > > Date: Thu,  5 Oct 2023 04:10:10 +
> > > From: Klemens Nanni 
> > >
> > > On Mon, Sep 25, 2023 at 01:33:31PM +, Klemens Nanni wrote:
> > > > On Tue, Jul 25, 2023 at 01:30:43PM +0300, Slava Voronzoff wrote:
> > > > > Hi, pinging and refreshing this patch
> > > > >
> > > > > What it does:
> > > > > allow arm64 cpus to break from the loop of waiting to start core and
> > > > > drop to DDB or OS.
> > > > >
> > > > > Patch based on same concept in amd64 cpu.c
> > > > >
> > > > > Any suggestions? Good to go?
> > > >
> > > > So instead of waiting possibly forever for secondary CPUs to come up,
> > > > you can continue debug the system and/or continue boot with less CPUs.
> > > >
> > > > Apart from the trailing empty line you introduce, the approach does
> > > > match amd64 (down to the for loop lacking a space after semicolon).
> > > >
> > > > That allows making progress on these machines and I don't see a downside,
> > > > so OK kn modulo the empty line.
> > > >
> > > > Any input from our arm64 hackers?
> > > >
> > > > Same diff ten days ago:
> > > > https://marc.info/?l=openbsd-bugs&m=169465443200821&w=2
> > >
> > > Anyone?
> >
> > Really, if those secondary CPUs don't come up, your system is beyond
> > repair.  You need to do some low-level debugging at that point and DDB
> > isn't going to help you.  So no, let's keep this code as simple as we
> > can.
> >
> 
> These cores start and work properly,
> but something messes with the ordering
> (maybe related to the shared caches per 2-core cluster; I'm not so
> good with hardware).
> So the 7th core will answer that it booted after the 6th core,
> but with the 'while' loop we never get to the 6th core.

That suggests that the interrupt that we think we're sending to the
6th core is actually going to the 7th core and vice-versa.  Feels like
a firmware bug to me.  What hardware is this on?

> > > Index: sys/arch/arm64//arm64/cpu.c
> > > ===
> > > RCS file: /cvs/src/sys/arch/arm64/arm64/cpu.c,v
> > > retrieving revision 1.98
> > > diff -u -p -r1.98 cpu.c
> > > --- sys/arch/arm64//arm64/cpu.c   10 Aug 2023 19:29:32 -  1.98
> > > +++ sys/arch/arm64//arm64/cpu.c   25 Sep 2023 13:24:39 -
> > > @@ -1096,6 +1096,8 @@ cpu_start_secondary(struct cpu_info *ci,
> > >  void
> > >  cpu_boot_secondary(struct cpu_info *ci)
> > >  {
> > > + int i;
> > > +
> > >   atomic_setbits_int(&ci->ci_flags, CPUF_GO);
> > >   __asm volatile("dsb sy; sev" ::: "memory");
> > >
> > > @@ -1105,8 +1107,16 @@ cpu_boot_secondary(struct cpu_info *ci)
> > >*/
> > >   arm_send_ipi(ci, ARM_IPI_NOP);
> > >
> > > - while ((ci->ci_flags & CPUF_RUNNING) == 0)
> > > + for (i = 1000; (!(ci->ci_flags & CPUF_RUNNING)) && i>0;i--) {
> > >   __asm volatile("wfe");
> > > + }
> > > + if (! (ci->ci_flags & CPUF_RUNNING)) {
> > > + printf("cpu %d failed to start\n", ci->ci_cpuid);
> > > +#if defined(MPDEBUG) && defined(DDB)
> > > + printf("dropping into debugger; continue from here to resume boot\n");
> > > + db_enter();
> > > +#endif
> > > + }
> > >  }
> > >
> > >  void
> > >
> > >
> 



Re: tcp syn cache unlock

2023-10-05 Thread Vitaliy Makkoveev
On Thu, Oct 05, 2023 at 11:09:52AM -0500, Scott Cheloha wrote:
> On Thu, Oct 05, 2023 at 12:57:24AM +0200, Alexander Bluhm wrote:
> > 
> > This is a first step to unlock TCP syn cache.  The timer function
> > is independent of the socket code.  That makes it easy to start
> > there.
> > 
> > [...]
> > 
> > Still missing:
> > - [...]
> > - Run timer without kernel lock.  I am not aware of such a feature.
> >   There is already some network code that could benefit from that.
> >   Can we get timer without kernel lock like TASKQ_MPSAFE implements
> >   it for tasks?
> 
> This patch adds a TIMEOUT_MPSAFE flag for use with TIMEOUT_PROC.
> Softint timeouts are a different story.
> 
> To run syn_cache_timer() without the kernel lock you would initialize
> it like this:
> 
>   timeout_set_flags(&sc->sc_timer, syn_cache_timer, sc, KCLOCK_NONE,
>   TIMEOUT_PROC | TIMEOUT_MPSAFE);
> 
> Use with caution, this needs another set of eyes.
> 

I don't like that softclock_thread() mixes kernel-locked and mpsafe timeout
processing. Could you use separate threads for them?



Re: /bin/ls print format bugs

2023-10-05 Thread Crystal Kolipe
Hi Ingo,

On Thu, Oct 05, 2023 at 05:32:07PM +0200, Ingo Schwarze wrote:
> In general, ls(1) strives to dynamically determine the required
> column width.  It already does that for the file size column.
> Given that device numbers use the same column, i think the solution
> that is cleanest, most robust, most aesthetically pleasing, least
> wasteful with respect to space, and easiest to maintain is simply
> doing the same for device numbers, see the patch below.

This looks good and works fine for me.



Re: tcp syn cache unlock

2023-10-05 Thread Scott Cheloha
On Thu, Oct 05, 2023 at 12:57:24AM +0200, Alexander Bluhm wrote:
> 
> This is a first step to unlock TCP syn cache.  The timer function
> is independent of the socket code.  That makes it easy to start
> there.
> 
> [...]
> 
> Still missing:
> - [...]
> - Run timer without kernel lock.  I am not aware of such a feature.
>   There is already some network code that could benefit from that.
>   Can we get timer without kernel lock like TASKQ_MPSAFE implements
>   it for tasks?

This patch adds a TIMEOUT_MPSAFE flag for use with TIMEOUT_PROC.
Softint timeouts are a different story.

To run syn_cache_timer() without the kernel lock you would initialize
it like this:

timeout_set_flags(&sc->sc_timer, syn_cache_timer, sc, KCLOCK_NONE,
TIMEOUT_PROC | TIMEOUT_MPSAFE);

Use with caution, this needs another set of eyes.

Index: share/man/man9/timeout.9
===
RCS file: /cvs/src/share/man/man9/timeout.9,v
retrieving revision 1.56
diff -u -p -r1.56 timeout.9
--- share/man/man9/timeout.9    1 Jan 2023 01:19:18 -   1.56
+++ share/man/man9/timeout.9    5 Oct 2023 16:09:33 -
@@ -193,11 +193,16 @@ Counts the time elapsed since the system
 The timeout's behavior may be configured with the bitwise OR of
 zero or more of the following
 .Fa flags :
-.Bl -tag -width TIMEOUT_PROC
+.Bl -tag -width TIMEOUT_MPSAFE
 .It Dv TIMEOUT_PROC
 Execute the timeout in a process context instead of the default
 .Dv IPL_SOFTCLOCK
 interrupt context.
+.It Dv TIMEOUT_MPSAFE
+Execute the timeout without the kernel lock.
+Requires the
+.Dv TIMEOUT_PROC
+flag.
 .El
 .El
 .Pp
@@ -367,8 +372,9 @@ The function
 .Fa fn
 must not block and must be safe to execute on any CPU in the system.
 .Pp
-Currently,
-all timeouts are executed under the kernel lock.
+Timeouts without the
+.Dv TIMEOUT_MPSAFE
+flag are executed under the kernel lock.
 .Sh RETURN VALUES
 .Fn timeout_add ,
 .Fn timeout_add_sec ,
Index: sys/sys/timeout.h
===
RCS file: /cvs/src/sys/sys/timeout.h,v
retrieving revision 1.47
diff -u -p -r1.47 timeout.h
--- sys/sys/timeout.h   31 Dec 2022 16:06:24 -  1.47
+++ sys/sys/timeout.h   5 Oct 2023 16:09:33 -
@@ -54,6 +54,7 @@ struct timeout {
 #define TIMEOUT_ONQUEUE0x02/* on any timeout queue */
 #define TIMEOUT_INITIALIZED0x04/* initialized */
 #define TIMEOUT_TRIGGERED  0x08/* running or ran */
+#define TIMEOUT_MPSAFE 0x10/* run without kernel lock */
 
 struct timeoutstat {
uint64_t tos_added; /* timeout_add*(9) calls */
Index: sys/kern/kern_timeout.c
===
RCS file: /cvs/src/sys/kern/kern_timeout.c,v
retrieving revision 1.95
diff -u -p -r1.95 kern_timeout.c
--- sys/kern/kern_timeout.c 29 Jul 2023 06:52:08 -  1.95
+++ sys/kern/kern_timeout.c 5 Oct 2023 16:09:34 -
@@ -75,6 +75,7 @@ struct circq timeout_wheel_kc[BUCKETS];   
 struct circq timeout_new;  /* [T] New, unscheduled timeouts */
 struct circq timeout_todo; /* [T] Due or needs rescheduling */
 struct circq timeout_proc; /* [T] Due + needs process context */
+struct circq timeout_proc_mpsafe;  /* [T] Process ctx + no kernel lock */
 
time_t timeout_level_width[WHEELCOUNT];    /* [I] Wheel level width (seconds) */
 struct timespec tick_ts;   /* [I] Length of a tick (1/hz secs) */
@@ -228,6 +229,7 @@ timeout_startup(void)
CIRCQ_INIT(&timeout_new);
CIRCQ_INIT(&timeout_todo);
CIRCQ_INIT(&timeout_proc);
+   CIRCQ_INIT(&timeout_proc_mpsafe);
for (b = 0; b < nitems(timeout_wheel); b++)
CIRCQ_INIT(&timeout_wheel[b]);
for (b = 0; b < nitems(timeout_wheel_kc); b++)
CIRCQ_INIT(&timeout_wheel_kc[b]);
@@ -261,10 +263,16 @@ void
timeout_set_flags(struct timeout *to, void (*fn)(void *), void *arg, int kclock,
 int flags)
 {
+   KASSERT(!ISSET(flags, ~(TIMEOUT_PROC | TIMEOUT_MPSAFE)));
+
to->to_func = fn;
to->to_arg = arg;
to->to_kclock = kclock;
to->to_flags = flags | TIMEOUT_INITIALIZED;
+
+   /* For now, only process context timeouts may be marked MP-safe. */
+   if (ISSET(to->to_flags, TIMEOUT_MPSAFE))
+   KASSERT(ISSET(to->to_flags, TIMEOUT_PROC));
 }
 
 void
@@ -659,7 +667,10 @@ softclock_process_kclock_timeout(struct 
if (!new && timespeccmp(&to->to_abstime, &kc->kc_late, <=))
tostat.tos_late++;
if (ISSET(to->to_flags, TIMEOUT_PROC)) {
-   CIRCQ_INSERT_TAIL(&timeout_proc, &to->to_list);
+   if (ISSET(to->to_flags, TIMEOUT_MPSAFE))
+   CIRCQ_INSERT_TAIL(&timeout_proc_mpsafe, &to->to_list);
+   else
+   CIRCQ_INSERT_TAIL(&timeout_proc, &to->to_list);
return;
}
timeout_run(to);
@@ -681,7 +692,10 @@ softclock_process_tick_timeout(struct ti
if (!new && delta < 0)

Re: /bin/ls print format bugs

2023-10-05 Thread Ingo Schwarze
Hi Crystal,

Crystal Kolipe wrote on Tue, Oct 03, 2023 at 06:47:42PM -0300:

> Two display format bugs seems to have crept in to ls due to the
> addition of scaled human readable values and large minor numbers.

I think you are right that the current situation isn't good.
Thank you for bringing this up.

In general, ls(1) strives to dynamically determine the required
column width.  It already does that for the file size column.
Given that device numbers use the same column, i think the solution
that is cleanest, most robust, most aesthetically pleasing, least
wasteful with respect to space, and easiest to maintain is simply
doing the same for device numbers, see the patch below.

Two remarks:
 * The reason for keeping the local variable "bcfile" is that
   (0, 0) represents a valid device (console).  There is no
   good reason why it should be in the DISPLAY struct, though.
 * The line "d.s_major = d.s_minor = 3;" is likely redundant because
   with bcfile == 0, nothing is supposed to use these fields.
   While local variables should only be initialized when needed, i
   think defensive programming recommends keeping data that is used
   by (even internal) APIs initialized to reasonable values rather
   than having random memory content lying around.  That reduces
   the risk of future code changes in *other* modules accidentally
   accessing random data, if developers get confused regarding
   which invariants the internal API guarantees or requires.

OK?
  Ingo


   $ cd /dev
   $ ls -l MAKEDEV *vid*  
  -r-xr-xr-x  1 root  wheel  12254 Oct  3 07:07 MAKEDEV
  lrwxr-xr-x  1 root  wheel  6 Aug 29  2016 video -> video0
  crw---  1 root  wheel  44, 0 Oct  3 13:34 video0
  crw---  1 root  wheel  44, 1 Oct  3 13:34 video1
   $ ls -lh MAKEDEV *vid*
  -r-xr-xr-x  1 root  wheel  12.0K Oct  3 07:07 MAKEDEV
  lrwxr-xr-x  1 root  wheel 6B Aug 29  2016 video -> video0
  crw---  1 root  wheel  44, 0 Oct  3 13:34 video0
  crw---  1 root  wheel  44, 1 Oct  3 13:34 video1
   $ ls -l null
  crw-rw-rw-  1 root  wheel  2, 2 Oct  5 00:32 null
   $ ls -l cua00  
  crw-rw  1 root  dialer  8, 128 Oct  3 13:34 cua00


Index: ls.c
===
RCS file: /cvs/src/bin/ls/ls.c,v
retrieving revision 1.54
diff -u -p -r1.54 ls.c
--- ls.c    7 Oct 2020 21:03:09 -   1.54
+++ ls.c    5 Oct 2023 14:47:37 -
@@ -436,6 +436,7 @@ display(FTSENT *p, FTSENT *list)
unsigned long long btotal;
blkcnt_t maxblock;
ino_t maxinode;
+   unsigned int maxmajor, maxminor;
int bcfile, flen, glen, ulen, maxflags, maxgroup, maxuser, maxlen;
int entries, needstats;
int width;
@@ -449,6 +450,7 @@ display(FTSENT *p, FTSENT *list)
btotal = maxblock = maxinode = maxlen = maxnlink = 0;
bcfile = 0;
maxuser = maxgroup = maxflags = 0;
+   maxmajor = maxminor = 0;
maxsize = 0;
for (cur = list, entries = 0; cur != NULL; cur = cur->fts_link) {
if (cur->fts_info == FTS_ERR || cur->fts_info == FTS_NS) {
@@ -523,9 +525,13 @@ display(FTSENT *p, FTSENT *list)
(void)strlcpy(np->group, group, glen + 1);
 
if (S_ISCHR(sp->st_mode) ||
-   S_ISBLK(sp->st_mode))
+   S_ISBLK(sp->st_mode)) {
bcfile = 1;
-
+   if (maxmajor < major(sp->st_rdev))
+   maxmajor = major(sp->st_rdev);
+   if (maxminor < minor(sp->st_rdev))
+   maxminor = minor(sp->st_rdev);
+   }
if (f_flags) {
>   np->flags = &np->data[ulen + 1 + glen + 1];
>   (void)strlcpy(np->flags, flags, flen + 1);
@@ -551,7 +557,6 @@ display(FTSENT *p, FTSENT *list)
d.entries = entries;
d.maxlen = maxlen;
if (needstats) {
-   d.bcfile = bcfile;
d.btotal = btotal;
(void)snprintf(buf, sizeof(buf), "%llu",
(unsigned long long)maxblock);
@@ -570,6 +575,17 @@ display(FTSENT *p, FTSENT *list)
d.s_size = strlen(buf);
} else
d.s_size = FMT_SCALED_STRSIZE-2; /* no - or '\0' */
+   d.s_major = d.s_minor = 3;
+   if (bcfile) {
+   (void)snprintf(buf, sizeof(buf), "%u", maxmajor);
+   d.s_major = strlen(buf);
+   (void)snprintf(buf, sizeof(buf), "%u", maxminor);
+   d.s_minor = strlen(buf);
+   if (d.s_size <= d.s_major + 2 + d.s_minor)
+   

Patch: solve IPSEC collisions for IKE1/L2TP peers coming from same IP address

2023-10-05 Thread mathieu . papineau
Hello,

This patch allows handling multiple IKE1/L2TP tunnels that come from the
same IP address.

It happens when clients are behind the same NAT gateway, thus seen using
the same IP address. Currently, the issue is that it works for only one
client at a time.

The idea is to rely on identifiers to distinguish clients. (This is why
they are also sent when doing a SADB_X_DELFLOW.)

Regarding storage of IPSEC policies, those that match the same route are
stored contiguously in ipo_list, glued by ipo_next_ipo_uses_same_route.
rn_match() points to the first item of a block, then it's just a matter
of finding the item containing the right identifiers.

Regarding the performance of ipsp_spd_lookup(), there is always a check on
ipo->ipo_next_ipo_uses_same_route, so the minimum cost is negligible.
Then, when tunnels are sharing a route, the search is a basic iteration
so this part may not scale very well.

The patch that follows is made against current sources. Some parts have
been written to keep the diff the shortest possible (first chunk of
isakmpd/pf_key_v2.c is definitely made with this in mind and should be
rewritten).

Affected files:
- sbin/isakmpd/ipsec.c
- sbin/isakmpd/pf_key_v2.c
- sys/net/pfkeyv2.c
- sys/net/pfkeyv2_parsemessage.c
- sys/netinet/ip_ipsp.c
- sys/netinet/ip_ipsp.h
- sys/netinet/ip_spd.c

Regards,

Mathieu J. PAPINEAU



diff -r 1f9cc3fc2a87 -r ec38c9134666 sbin/isakmpd/ipsec.c
--- sbin/isakmpd/ipsec.c	Thu Oct 05 10:26:02 2023 +0200
+++ sbin/isakmpd/ipsec.c	Mon Sep 18 11:54:00 2023 +0200
@@ -289,6 +289,20 @@
isa->dst_mask == NULL || isa2->dst_mask == NULL)
return 0;
 
+   /* Having different ports means different peers. */
+   struct sockaddr *dst1, *dst2;
+   if (sa->initiator)
+   sa->transport->vtbl->get_src(sa->transport, &dst1);
+   else
+   sa->transport->vtbl->get_dst(sa->transport, &dst1);
+   if (sa2->initiator)
+   sa2->transport->vtbl->get_src(sa2->transport, &dst2);
+   else
+   sa2->transport->vtbl->get_dst(sa2->transport, &dst2);
+   if (dst1 && dst2)
+   if (sockaddr_port(dst1) != sockaddr_port(dst2))
+   return 0;
+
return isa->src_net->sa_family == isa2->src_net->sa_family &&
memcmp(sockaddr_addrdata(isa->src_net),
sockaddr_addrdata(isa2->src_net),
diff -r 1f9cc3fc2a87 -r ec38c9134666 sbin/isakmpd/pf_key_v2.c
--- sbin/isakmpd/pf_key_v2.c	Thu Oct 05 10:26:02 2023 +0200
+++ sbin/isakmpd/pf_key_v2.c	Mon Sep 18 11:54:00 2023 +0200
@@ -1520,7 +1520,7 @@
if (!flow)
goto cleanup;
 
-   if (!delete) {
+   if (!delete || proto == IPSEC_PROTO_IPSEC_ESP) {
/* Setup the source ID, if provided. */
if (srcid) {
sid = calloc(
@@ -1980,20 +1980,40 @@
struct ipsec_sa *isa = sa->data;
struct sockaddr *dst, *src;
struct proto   *proto = TAILQ_FIRST(&sa->protos);
+   int sidtype = 0, didtype = 0;
+   size_t  sidlen = 0, didlen = 0;
+   u_int8_t   *sid = 0, *did = 0;
 
sa->transport->vtbl->get_dst(sa->transport, &dst);
sa->transport->vtbl->get_src(sa->transport, &src);
 
+   if (sa->id_i) {
+   if (sa->initiator)
+   sid = pf_key_v2_convert_id(sa->id_i,
+   sa->id_i_len, &sidlen, &sidtype);
+   else
+   did = pf_key_v2_convert_id(sa->id_i,
+   sa->id_i_len, &didlen, &didtype);
+   }
+   if (sa->id_r) {
+   if (sa->initiator)
+   did = pf_key_v2_convert_id(sa->id_r,
+   sa->id_r_len, &didlen, &didtype);
+   else
+   sid = pf_key_v2_convert_id(sa->id_r,
+   sa->id_r_len, &sidlen, &sidtype);
+   }
+
if (!incoming)
return pf_key_v2_flow(isa->src_net, isa->src_mask,
isa->dst_net, isa->dst_mask, isa->tproto, isa->sport,
-   isa->dport, proto->spi[0], proto->proto, src, dst, 1, 0,
-   0, 0, 0, 0, 0, 0, proto->data);
+   isa->dport, proto->spi[0], proto->proto, dst, src, 1, 0,
+   sidtype, sid, sidlen, didtype, did, didlen, proto->data);
else {
return pf_key_v2_flow(isa->dst_net, isa->dst_mask,
isa->src_net, isa->src_mask, isa->tproto, isa->dport,
isa->sport, proto->spi[1], proto->proto, src, dst, 1, 1,
-   0, 0, 0, 0, 0, 0, proto->data);
+   sidtype, sid, sidlen, didtype, did, didlen, proto->data);
}
 }
 
diff -r 1f9cc3fc2a87 -r ec38c9134666 sys/net/pfkeyv2.c
--- sys/net/pfkeyv2.c   Thu Oct 05 10:26:02 2023 +0200
+++ sys/net/pfkeyv2.c   Mon Sep 18 11:54:00 2023 +0200
@@ -1127,6 +1127,7 @@
int delflag = 0;
struct sockaddr_encap encapdst, encapnetmask;
struct ipsec_policy *ipo;
+

Re: ifq_start_task(): purge queue before exit when IFF_RUNNING flag is not set

2023-10-05 Thread David Gwynne



> On 5 Oct 2023, at 21:50, Vitaliy Makkoveev  wrote:
> 
> Otherwise `if_snd' could still contain packets. If any of them has
> associated data with a bumped reference counter, the corresponding
> refcnt_finalize() will sleep forever. I don't think we need to
> transmit these packets when the interface becomes running again.

there were a bunch of paths where the stack would implicitly take an interface 
down and up again around the actual change it was trying to make. from memory 
these kinds of things are disruptive anyway, so losing a few more packets might 
not be a problem. we do ride this out at the moment though.

> 
> I don't like this unlocked if_flags check we have in ifq_start_task().
> I guess READ_ONCE() is a much better way to load the value.

there are no inconsistent intermediate values you can read from if_flags. are 
you worried the compiler will load the value again and maybe we'll get a 
different state the second time?
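a minimal sketch of what READ_ONCE() guards against, with a hypothetical
should_purge() helper (not the kernel code): the macro forces a single
volatile load into a local, so the flag is sampled once and every later
test sees the same snapshot instead of a possible second load.

```c
#include <assert.h>

#define IFF_RUNNING 0x40	/* same bit value as in net/if.h */

/* Kernel-style READ_ONCE(): a volatile access the compiler may not
 * duplicate into two separate loads. */
#define READ_ONCE(x) (*(volatile __typeof(x) *)&(x))

/* Hypothetical helper: sample if_flags exactly once, then decide. */
static int
should_purge(unsigned int *if_flags)
{
	unsigned int flags = READ_ONCE(*if_flags);	/* one snapshot */

	return !(flags & IFF_RUNNING);
}
```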

> The problem reported in the "wg destroy hangs" [1] thread is the kind
> described above. Also, the current ifq_start_task() implementation
> prevents introducing reference counters for wg_peer and removing this
> wg_peer_destroy() hack:
> 
> NET_LOCK();
> while (!ifq_empty(&sc->sc_if.if_snd)) {
> NET_UNLOCK();
> tsleep_nsec(sc, PWAIT, "wg_ifq", 1000);
> NET_LOCK();
> }
> NET_UNLOCK();
> 
> 1. https://marc.info/?t=16964239631&r=1&w=2
> 
> ok?

im kind of surprised that when dealing with a problem that only seems to appear 
in wireguard you lean toward ifqs needing a fix. especially surprising 
when the wg code in question is a tsleep_nsec(1000) in a loop.

i would argue that wg should go down, wait for the stack to stop using it, and 
then do its own purge.
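that order can be sketched with a toy model (the fake_* names are made
up, not the wg driver): clear IFF_RUNNING first so the stack queues
nothing new, then purge the send queue directly instead of sleeping in a
loop until it happens to drain.

```c
#include <assert.h>

#define IFF_RUNNING 0x40

/* Toy queue standing in for struct ifqueue. */
struct fake_ifq {
	int len;	/* queued packet count */
};

static void
fake_ifq_purge(struct fake_ifq *q)
{
	q->len = 0;	/* drop everything still queued */
}

/* Sketch of the teardown order argued for above: take the interface
 * down first, then purge our own send queue.  No sleep loop is needed
 * because nothing can enqueue once IFF_RUNNING is cleared. */
static void
fake_wg_down(unsigned int *if_flags, struct fake_ifq *snd)
{
	*if_flags &= ~IFF_RUNNING;
	fake_ifq_purge(snd);
}
```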

> 
> Index: sys/net/ifq.c
> ===
> RCS file: /cvs/src/sys/net/ifq.c,v
> retrieving revision 1.51
> diff -u -p -r1.51 ifq.c
> --- sys/net/ifq.c 5 Oct 2023 11:08:56 - 1.51
> +++ sys/net/ifq.c 5 Oct 2023 11:49:40 -
> @@ -140,8 +140,11 @@ ifq_start_task(void *p)
> struct ifqueue *ifq = p;
> struct ifnet *ifp = ifq->ifq_if;
> 
> - if (!ISSET(ifp->if_flags, IFF_RUNNING) ||
> -ifq_empty(ifq) || ifq_is_oactive(ifq))
> + if (!ISSET(READ_ONCE(ifp->if_flags), IFF_RUNNING)) {
> + ifq_purge(ifq);
> + return;
> + }
> + if (ifq_empty(ifq) || ifq_is_oactive(ifq))
> return;
> 
> ifp->if_qstart(ifq);
> 



ifq_start_task(): purge queue before exit when IFF_RUNNING flag is not set

2023-10-05 Thread Vitaliy Makkoveev
Otherwise `if_snd' could still contain packets. If any of them has
associated data with a bumped reference counter, the corresponding
refcnt_finalize() will sleep forever. I don't think we need to
transmit these packets when the interface becomes running again.

I don't like this unlocked if_flags check we have in ifq_start_task().
I guess READ_ONCE() is a much better way to load the value.

The problem reported in the "wg destroy hangs" [1] thread is the kind
described above. Also, the current ifq_start_task() implementation
prevents introducing reference counters for wg_peer and removing this
wg_peer_destroy() hack:

 NET_LOCK();
 while (!ifq_empty(&sc->sc_if.if_snd)) {
 NET_UNLOCK();
 tsleep_nsec(sc, PWAIT, "wg_ifq", 1000);
 NET_LOCK();
 }
 NET_UNLOCK();

1. https://marc.info/?t=16964239631&r=1&w=2

ok?

Index: sys/net/ifq.c
===
RCS file: /cvs/src/sys/net/ifq.c,v
retrieving revision 1.51
diff -u -p -r1.51 ifq.c
--- sys/net/ifq.c   5 Oct 2023 11:08:56 -   1.51
+++ sys/net/ifq.c   5 Oct 2023 11:49:40 -
@@ -140,8 +140,11 @@ ifq_start_task(void *p)
struct ifqueue *ifq = p;
struct ifnet *ifp = ifq->ifq_if;
 
-   if (!ISSET(ifp->if_flags, IFF_RUNNING) ||
-   ifq_empty(ifq) || ifq_is_oactive(ifq))
+   if (!ISSET(READ_ONCE(ifp->if_flags), IFF_RUNNING)) {
+   ifq_purge(ifq);
+   return;
+   }
+   if (ifq_empty(ifq) || ifq_is_oactive(ifq))
return;
 
ifp->if_qstart(ifq);



Re: wg destroy hangs

2023-10-05 Thread Alexander Bluhm
On Thu, Oct 05, 2023 at 07:15:23AM +0200, Kirill Miazine wrote:
> > This diff checks IFF_RUNNING flag within while (!ifq_empty()) loop of
> > wg_peer_destroy(). If the flag is not set queue will be purged and check
> > performed again. I intentionally keep netlock to prevent ifconfig
> > manipulations on the interface.
> 
> I confirm that just the diff below solved the issue

> > +* XXX: `if_snd' of stopped interface could still packets

This sentence is missing a verb.  ... could still contain packets?
Or: `if_snd' of stopped interface does not consume packets

OK bluhm@

> > Index: sys/net/if_wg.c
> > ===
> > RCS file: /cvs/src/sys/net/if_wg.c,v
> > retrieving revision 1.31
> > diff -u -p -r1.31 if_wg.c
> > --- sys/net/if_wg.c 26 Sep 2023 15:16:44 -  1.31
> > +++ sys/net/if_wg.c 4 Oct 2023 23:09:14 -
> > @@ -509,6 +509,13 @@ wg_peer_destroy(struct wg_peer *peer)
> >   
> > NET_LOCK();
> > while (!ifq_empty(&sc->sc_if.if_snd)) {
> > +   /*
> > +* XXX: `if_snd' of stopped interface could still packets
> > +*/
> > +   if (!ISSET(sc->sc_if.if_flags, IFF_RUNNING)) {
> > +   ifq_purge(&sc->sc_if.if_snd);
> > +   continue;
> > +   }
> > NET_UNLOCK();
> > tsleep_nsec(sc, PWAIT, "wg_ifq", 1000);
> > NET_LOCK();
> > 



Re: [patch] [arm64] cpu.c patch based on amd64 idea, provides more debug for multicore kernel

2023-10-05 Thread S V
Thu, 5 Oct 2023, 09:59 Mark Kettenis :
>
> > Date: Thu,  5 Oct 2023 04:10:10 +
> > From: Klemens Nanni 
> >
> > On Mon, Sep 25, 2023 at 01:33:31PM +, Klemens Nanni wrote:
> > > On Tue, Jul 25, 2023 at 01:30:43PM +0300, Slava Voronzoff wrote:
> > > > Hi, pinging and refreshing this patch
> > > >
> > > > What it does:
> > > > allow arm64 cpus to break from the loop of waiting to start core and
> > > > drop to DDB or OS.
> > > >
> > > > Patch based on same concept in amd64 cpu.c
> > > >
> > > > Any suggestions? Good to go?
> > >
> > > So instead of waiting possibly forever for secondary CPUs to come up,
> > > you can continue debugging the system and/or continue booting with fewer CPUs.
> > >
> > > Apart from the trailing empty line you introduce, the approach does
> > > match amd64 (down to the for loop lacking a space after semicolon).
> > >
> > > That allows making progress on these machines and I don't see a downside,
> > > so OK kn modulo the empty line.
> > >
> > > Any input from our arm64 hackers?
> > >
> > > Same diff ten days ago: 
> > > https://marc.info/?l=openbsd-bugs&m=169465443200821&w=2
> >
> > Anyone?
>
> Really, if those secondary CPUs don't come up, your system is beyond
> repair.  You need to do some low-level debugging at that point and DDB
> isn't going to help you.  So no, let's keep this code as simple as we
> can.
>

These cores are starting and working properly, but something messes with
the order (maybe related to the shared caches per 2-core cluster; I'm
not that hardware-savvy). So the 7th core will answer that it booted
after the 6th core, but with the 'while' loop we never get to the 6th
core.
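the difference can be sketched with a hypothetical wait_for_running()
helper (not the actual cpu.c code): bound the spin count so a core that
never sets CPUF_RUNNING is reported instead of hanging the boot on the
infinite while loop.

```c
#include <assert.h>

#define CPUF_RUNNING 0x0008	/* illustrative flag value */

/* Stand-in for the cpu_boot_secondary() loop in the diff: spin at most
 * `spins` iterations waiting for the flag (the real loop executes a
 * "wfe" each time), then report whether the core came up. */
static int
wait_for_running(volatile unsigned int *ci_flags, int spins)
{
	int i;

	for (i = spins; !(*ci_flags & CPUF_RUNNING) && i > 0; i--)
		;	/* would be: __asm volatile("wfe") */
	return (*ci_flags & CPUF_RUNNING) != 0;
}
```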

>
> > Index: sys/arch/arm64//arm64/cpu.c
> > ===
> > RCS file: /cvs/src/sys/arch/arm64/arm64/cpu.c,v
> > retrieving revision 1.98
> > diff -u -p -r1.98 cpu.c
> > --- sys/arch/arm64//arm64/cpu.c   10 Aug 2023 19:29:32 -  1.98
> > +++ sys/arch/arm64//arm64/cpu.c   25 Sep 2023 13:24:39 -
> > @@ -1096,6 +1096,8 @@ cpu_start_secondary(struct cpu_info *ci,
> >  void
> >  cpu_boot_secondary(struct cpu_info *ci)
> >  {
> > + int i;
> > +
> >   atomic_setbits_int(&ci->ci_flags, CPUF_GO);
> >   __asm volatile("dsb sy; sev" ::: "memory");
> >
> > @@ -1105,8 +1107,16 @@ cpu_boot_secondary(struct cpu_info *ci)
> >*/
> >   arm_send_ipi(ci, ARM_IPI_NOP);
> >
> > - while ((ci->ci_flags & CPUF_RUNNING) == 0)
> > + for (i = 1000; (!(ci->ci_flags & CPUF_RUNNING)) && i>0;i--) {
> >   __asm volatile("wfe");
> > + }
> > + if (! (ci->ci_flags & CPUF_RUNNING)) {
> > + printf("cpu %d failed to start\n", ci->ci_cpuid);
> > +#if defined(MPDEBUG) && defined(DDB)
> > + printf("dropping into debugger; continue from here to resume boot\n");
> > + db_enter();
> > +#endif
> > + }
> >  }
> >
> >  void
> >
> >



Re: [patch] [arm64] cpu.c patch based on amd64 idea, provides more debug for multicore kernel

2023-10-05 Thread Mark Kettenis
> Date: Thu,  5 Oct 2023 04:10:10 +
> From: Klemens Nanni 
> 
> On Mon, Sep 25, 2023 at 01:33:31PM +, Klemens Nanni wrote:
> > On Tue, Jul 25, 2023 at 01:30:43PM +0300, Slava Voronzoff wrote:
> > > Hi, pinging and refreshing this patch
> > > 
> > > What it does:
> > > allow arm64 cpus to break from the loop of waiting to start core and
> > > drop to DDB or OS.
> > > 
> > > Patch based on same concept in amd64 cpu.c
> > > 
> > > Any suggestions? Good to go?
> > 
> > So instead of waiting possibly forever for secondary CPUs to come up,
> > you can continue debugging the system and/or continue booting with fewer CPUs.
> > 
> > Apart from the trailing empty line you introduce, the approach does
> > match amd64 (down to the for loop lacking a space after semicolon).
> > 
> > That allows making progress on these machines and I don't see a downside,
> > so OK kn modulo the empty line.
> > 
> > Any input from our arm64 hackers?
> > 
> > Same diff ten days ago: 
> > https://marc.info/?l=openbsd-bugs&m=169465443200821&w=2
> 
> Anyone?

Really, if those secondary CPUs don't come up, your system is beyond
repair.  You need to do some low-level debugging at that point and DDB
isn't going to help you.  So no, let's keep this code as simple as we
can.


> Index: sys/arch/arm64//arm64/cpu.c
> ===
> RCS file: /cvs/src/sys/arch/arm64/arm64/cpu.c,v
> retrieving revision 1.98
> diff -u -p -r1.98 cpu.c
> --- sys/arch/arm64//arm64/cpu.c   10 Aug 2023 19:29:32 -  1.98
> +++ sys/arch/arm64//arm64/cpu.c   25 Sep 2023 13:24:39 -
> @@ -1096,6 +1096,8 @@ cpu_start_secondary(struct cpu_info *ci,
>  void
>  cpu_boot_secondary(struct cpu_info *ci)
>  {
> + int i;
> +
>   atomic_setbits_int(&ci->ci_flags, CPUF_GO);
>   __asm volatile("dsb sy; sev" ::: "memory");
>  
> @@ -1105,8 +1107,16 @@ cpu_boot_secondary(struct cpu_info *ci)
>*/
>   arm_send_ipi(ci, ARM_IPI_NOP);
>  
> - while ((ci->ci_flags & CPUF_RUNNING) == 0)
> + for (i = 1000; (!(ci->ci_flags & CPUF_RUNNING)) && i>0;i--) {
>   __asm volatile("wfe");
> + }
> + if (! (ci->ci_flags & CPUF_RUNNING)) {
> + printf("cpu %d failed to start\n", ci->ci_cpuid);
> +#if defined(MPDEBUG) && defined(DDB)
> + printf("dropping into debugger; continue from here to resume boot\n");
> + db_enter();
> +#endif
> + }
>  }
>  
>  void
> 
>