Re: [patch] Sort of fix for game "phantasia"

2023-09-18 Thread Stuart Henderson
On 2023/09/18 23:38, S V wrote:
> ===
> RCS file: /cvs/src/games/phantasia/misc.c,v
> retrieving revision 1.21
> diff -u -p -r1.21 misc.c
> --- games/phantasia/misc.c10 Jan 2016 13:35:10 -  1.21
> +++ games/phantasia/misc.c18 Sep 2023 20:37:16 -
> @@ -1412,6 +1412,7 @@ error(char *whichfile)
>  
>   warn("%s", whichfile);
>   fprintf(stderr, "Please run 'setup' to determine the problem.\n");
> + fprintf(stderr, "Also note, that you can add your user to 'games' 
> group.\n");

...the system administrator can add your user...

but then, your user will be able to fill /var, so it's not a whole lot
better than setgid games.

And actually, if you really want to play it (ignoring the problem, or if
you have mitigated the issue by placing /var/games on another filesystem),
the current manual does give the observant reader exactly the information
needed to do so,




Re: [patch] Sort of fix for game "phantasia"

2023-09-18 Thread Theo de Raadt
No way.  I do not think that is good advice, either, because that
user can then fill /var.


S V  wrote:

> Awesome, You and William Ahern really showed me that is "security
> mindset". I simply didn't think about some of this points. Thanks for
> that.
> 
> Can I propose as my "last" attempt on this topic to add more
> clarification "how to fix it by user on his machine" with this patch
> 
> 2023-09-16 5:14 GMT+03:00, Theo de Raadt :
> >
> > revision 1.18
> > date: 2015/11/24 03:10:10;  author: deraadt;  state: Exp;  lines: +1 -2;
> > commitid: NZfzN0SfUUHBE4HF;
> > In 1995, all of the games were setuid games.  At end of 1996, I took them
> > all
> > to setgid games, and we started wittling them down.  Nearly 10 years later
> > I
> > am removing all setgid from the games.  If any of these have score files
> > they
> > are now broken, and I hope various folk repair them.  I have argued for
> > years
> > (and received pushback...) that the score file features must be removed, or
> > rewritten to use private files, because setgid is the wrong tool.
> > ok tedu
> >
> > We will not bring back setgid-supported scorefiles.
> >
> >
> > S V  wrote:
> >
> >> Why don't drop it in this case?
> >>
> >> сб, 16 сент. 2023 г., 05:03 Theo de Raadt :
> >>
> >>  Nope, 'setgid games' is intentionally dead.
> >>
> >>  We will not be bringing it back.
> >>
> >>  S V  wrote:
> >>
> >>  > It is pretty to not have this game running, but included so
> >>  >
> >>  > I just added chmod ugo+rw "game files" after chown root:games in
> >> Makefile
> >>  >
> >>  > and add notice about it in man file
> >>  >
> >>  > It is unbroke game and place it to playable state.
> >>  >
> >>  > Thanks in advance!
> >>  >
> >>  > -- Slava Voronzoff
> >>
> >
> 
> 
> -- 
> Nerfur Dragon
> -==(UDIC)==-



Re: [patch] Sort of fix for game "phantasia"

2023-09-18 Thread S V
Awesome, You and William Ahern really showed me that is "security
mindset". I simply didn't think about some of this points. Thanks for
that.

Can I propose as my "last" attempt on this topic to add more
clarification "how to fix it by user on his machine" with this patch

2023-09-16 5:14 GMT+03:00, Theo de Raadt :
>
> revision 1.18
> date: 2015/11/24 03:10:10;  author: deraadt;  state: Exp;  lines: +1 -2;
> commitid: NZfzN0SfUUHBE4HF;
> In 1995, all of the games were setuid games.  At end of 1996, I took them
> all
> to setgid games, and we started wittling them down.  Nearly 10 years later
> I
> am removing all setgid from the games.  If any of these have score files
> they
> are now broken, and I hope various folk repair them.  I have argued for
> years
> (and received pushback...) that the score file features must be removed, or
> rewritten to use private files, because setgid is the wrong tool.
> ok tedu
>
> We will not bring back setgid-supported scorefiles.
>
>
> S V  wrote:
>
>> Why don't drop it in this case?
>>
>> сб, 16 сент. 2023 г., 05:03 Theo de Raadt :
>>
>>  Nope, 'setgid games' is intentionally dead.
>>
>>  We will not be bringing it back.
>>
>>  S V  wrote:
>>
>>  > It is pretty to not have this game running, but included so
>>  >
>>  > I just added chmod ugo+rw "game files" after chown root:games in
>> Makefile
>>  >
>>  > and add notice about it in man file
>>  >
>>  > It is unbroke game and place it to playable state.
>>  >
>>  > Thanks in advance!
>>  >
>>  > -- Slava Voronzoff
>>
>


-- 
Nerfur Dragon
-==(UDIC)==-
Index: games/phantasia/misc.c
===
RCS file: /cvs/src/games/phantasia/misc.c,v
retrieving revision 1.21
diff -u -p -r1.21 misc.c
--- games/phantasia/misc.c  10 Jan 2016 13:35:10 -  1.21
+++ games/phantasia/misc.c  18 Sep 2023 20:37:16 -
@@ -1412,6 +1412,7 @@ error(char *whichfile)
 
warn("%s", whichfile);
fprintf(stderr, "Please run 'setup' to determine the problem.\n");
+   fprintf(stderr, "Also note, that you can add your user to 'games' 
group.\n");
exit(1);
 }
 /**/
Index: games/phantasia/phantasia.6
===
RCS file: /cvs/src/games/phantasia/phantasia.6,v
retrieving revision 1.3
diff -u -p -r1.3 phantasia.6
--- games/phantasia/phantasia.6 11 Jul 2022 03:11:49 -  1.3
+++ games/phantasia/phantasia.6 18 Sep 2023 20:37:16 -
@@ -13,9 +13,11 @@
 .Sh DESCRIPTION
 .Bf -symbolic
 .Nm
-is currently unusable because it relies on
+is currently partially broken because it relies on
 .Xr setgid 2
-to allow multiple users read and write access to the same files.
+to allow multiple users read and write access to the same files. 
+Please add your user to group 'games' to allow access to needed files, 
+if you really want to play.
 .Ef
 .Pp
 .Nm


Re: man 9 intro - improvements [and learning for me]?

2023-09-18 Thread Ingo Schwarze
Hi Christoff,

of course you are free to work on whatever interests you, but if you are
looking for advice, i'd respectfully recommend that you try to work on
specifics rather than on generalities first, in particular when you
feel as if your experience in contributing isn't above average.
That applies even in the domain of documentation.

Christoff Humphries wrote on Mon, Sep 18, 2023 at 12:21:48PM +:

> I went searching for documentation about the kernel internals and was
> used to the intro(9) man page from NetBSD
> https://man.netbsd.org/intro.9 that had a lot more details. Would it 
> be a worthwhile project to attempt to do the same for OpenBSD?

Doing something like that may *sound* simple at first, but actually,
i can think of few documentation changes that would be harder to get
right and harder to get committed.

Unless i'm totally off track, getting that right requires

 1) a solid understanding of all areas of the kernel and
 2) a good understanding of what our kernel developers
actually need to know for their everyday work.

If you have that knowledge, it's *still* much easier to improve
individual pages that are lacking in quality than improving the top-level
overview.  Note that item 2 above tells you which of the pages are
probably most worthy of attention.

Besides, the NetBSD intro(9) page does not strike me as particularly
convincing.  *If* the lines in that page agree with the .Nd one-line
descriptions in the indivual pages, then it provides almost nothing
of value - but causes a maintenance burden because it needs to be
edited whenever any .Nd line changes.  If the lines mismatch, then
the .Nd lines in the indivifual pages can likely be improved.
I did not check which is the case - possibly both are.

We did have a few pages like that in the past, but most of those
got deleted because they provided little value.  For example,
compare these two:

  https://man.openbsd.org/OpenBSD-5.6/string.3
  https://man.openbsd.org/OpenBSD-current/string.3

A very small number still exists, perhaps most notably

  https://man.openbsd.org/crypto.3 and
  https://man.openbsd.org/ssl.3

The benefit of those is *not* that they exhaustively list everything -
indeed, apropos(1) is better at that job than a manually maintained
table of contents - but that they provide some high-level information
how the libraries as a whole are intended to be used that is hard to
figure out from individual pages.  Also, these pages do not duplicate
.Nd lines.

> I understand the annoyance of folks talking about what they intend or
> are going to do with no actual fruit, but wanted to check that the
> intro(9) is lacking and the information is not just stored somewhere
> else (other than "apropos -s 9 .").

Sorry, i lack the experience in kernel development that would be
required to judge whether any information could usefully be added
to intro(9).

Yours,
  Ingo


> I want to learn the internals of the OpenBSD kernel, too, and will do
> as mulander (and friends) did by learning a bit of code daily
> https://blog.tintagel.pl/2017/09/10/openbsd-daily-recap.html.
> 
> Seeking the learn and contribute, especially in the uvm, vmd/vmm, and
> possibly filesystem subsystems.



Re: Use counters_read(9) from ddb(4)

2023-09-18 Thread Alexander Bluhm
On Fri, Sep 15, 2023 at 04:18:13PM +0200, Martin Pieuchot wrote:
> On 11/09/23(Mon) 21:05, Martin Pieuchot wrote:
> > On 06/09/23(Wed) 23:13, Alexander Bluhm wrote:
> > > On Wed, Sep 06, 2023 at 12:23:33PM -0500, Scott Cheloha wrote:
> > > > On Wed, Sep 06, 2023 at 01:04:19PM +0100, Martin Pieuchot wrote:
> > > > > Debugging OOM is hard.  UVM uses per-CPU counters and sadly
> > > > > counters_read(9) needs to allocate memory.  This is not acceptable in
> > > > > ddb(4).  As a result I cannot see the content of UVM counters in OOM
> > > > > situations.
> > > > > 
> > > > > Diff below introduces a *_static() variant of counters_read(9) that
> > > > > takes a secondary buffer to avoid calling malloc(9).  Is it fine?  Do
> > > > > you have a better idea?  Should we make it the default or using the
> > > > > stack might be a problem?
> > > > 
> > > > Instead of adding a second interface I think we could get away with
> > > > just extending counters_read(9) to take a scratch buffer as an optional
> > > > fourth parameter:
> > > > 
> > > > void
> > > > counters_read(struct cpumem *cm, uint64_t *output, unsigned int n,
> > > > uint64_t *scratch);
> > > > 
> > > > "scratch"?  "temp"?  "tmp"?
> > > 
> > > scratch is fine for me
> > 
> > Fine with me.
> 
> Here's a full diff, works for me(tm), ok?

OK bluhm@

> Index: sys/kern/kern_sysctl.c
> ===
> RCS file: /cvs/src/sys/kern/kern_sysctl.c,v
> retrieving revision 1.418
> diff -u -p -r1.418 kern_sysctl.c
> --- sys/kern/kern_sysctl.c16 Jul 2023 03:01:31 -  1.418
> +++ sys/kern/kern_sysctl.c15 Sep 2023 13:29:53 -
> @@ -519,7 +519,7 @@ kern_sysctl(int *name, u_int namelen, vo
>   unsigned int i;
>  
>   memset(, 0, sizeof(mbs));
> - counters_read(mbstat, counters, MBSTAT_COUNT);
> + counters_read(mbstat, counters, MBSTAT_COUNT, NULL);
>   for (i = 0; i < MBSTAT_TYPES; i++)
>   mbs.m_mtypes[i] = counters[i];
>  
> Index: sys/kern/subr_evcount.c
> ===
> RCS file: /cvs/src/sys/kern/subr_evcount.c,v
> retrieving revision 1.15
> diff -u -p -r1.15 subr_evcount.c
> --- sys/kern/subr_evcount.c   5 Dec 2022 08:58:49 -   1.15
> +++ sys/kern/subr_evcount.c   15 Sep 2023 14:01:55 -
> @@ -101,7 +101,7 @@ evcount_sysctl(int *name, u_int namelen,
>  {
>   int error = 0, s, nintr, i;
>   struct evcount *ec;
> - u_int64_t count;
> + uint64_t count, scratch;
>  
>   if (newp != NULL)
>   return (EPERM);
> @@ -129,7 +129,7 @@ evcount_sysctl(int *name, u_int namelen,
>   if (ec == NULL)
>   return (ENOENT);
>   if (ec->ec_percpu != NULL) {
> - counters_read(ec->ec_percpu, , 1);
> + counters_read(ec->ec_percpu, , 1, );
>   } else {
>   s = splhigh();
>   count = ec->ec_count;
> Index: sys/kern/subr_percpu.c
> ===
> RCS file: /cvs/src/sys/kern/subr_percpu.c,v
> retrieving revision 1.10
> diff -u -p -r1.10 subr_percpu.c
> --- sys/kern/subr_percpu.c3 Oct 2022 14:10:53 -   1.10
> +++ sys/kern/subr_percpu.c15 Sep 2023 14:16:41 -
> @@ -159,17 +159,19 @@ counters_free(struct cpumem *cm, unsigne
>  }
>  
>  void
> -counters_read(struct cpumem *cm, uint64_t *output, unsigned int n)
> +counters_read(struct cpumem *cm, uint64_t *output, unsigned int n,
> +uint64_t *scratch)
>  {
>   struct cpumem_iter cmi;
> - uint64_t *gen, *counters, *temp;
> + uint64_t *gen, *counters, *temp = scratch;
>   uint64_t enter, leave;
>   unsigned int i;
>  
>   for (i = 0; i < n; i++)
>   output[i] = 0;
>  
> - temp = mallocarray(n, sizeof(uint64_t), M_TEMP, M_WAITOK);
> + if (scratch == NULL)
> + temp = mallocarray(n, sizeof(uint64_t), M_TEMP, M_WAITOK);
>  
>   gen = cpumem_first(, cm);
>   do {
> @@ -202,7 +204,8 @@ counters_read(struct cpumem *cm, uint64_
>   gen = cpumem_next(, cm);
>   } while (gen != NULL);
>  
> - free(temp, M_TEMP, n * sizeof(uint64_t));
> + if (scratch == NULL)
> + free(temp, M_TEMP, n * sizeof(uint64_t));
>  }
>  
>  void
> @@ -305,7 +308,8 @@ counters_free(struct cpumem *cm, unsigne
>  }
>  
>  void
> -counters_read(struct cpumem *cm, uint64_t *output, unsigned int n)
> +counters_read(struct cpumem *cm, uint64_t *output, unsigned int n,
> +uint64_t *scratch)
>  {
>   uint64_t *counters;
>   unsigned int i;
> Index: sys/net/pfkeyv2_convert.c
> ===
> RCS file: /cvs/src/sys/net/pfkeyv2_convert.c,v
> retrieving revision 1.80
> diff -u -p -r1.80 pfkeyv2_convert.c
> --- sys/net/pfkeyv2_convert.c 7 Aug 2023 03:35:06 -   1.80
> 

Re: man 9 intro - improvements [and learning for me]?

2023-09-18 Thread Jason McIntyre
On Mon, Sep 18, 2023 at 12:21:48PM +, Christoff Humphries wrote:
> Greetings all.
> 
> I went searching for documentation about the kernel internals and was
> used to the intro(9) man page from NetBSD
> https://man.netbsd.org/intro.9 that had a lot more details. Would it 
> be a worthwhile project to attempt to do the same for OpenBSD?
> 
> I understand the annoyance of folks talking about what they intend or
> are going to do with no actual fruit, but wanted to check that the
> intro(9) is lacking and the information is not just stored somewhere
> else (other than "apropos -s 9 .").
> 
> I want to learn the internals of the OpenBSD kernel, too, and will do
> as mulander (and friends) did by learning a bit of code daily
> https://blog.tintagel.pl/2017/09/10/openbsd-daily-recap.html.
> 
> Seeking the learn and contribute, especially in the uvm, vmd/vmm, and
> possibly filesystem subsystems.
> 
> Thanks in advance!
> 

hi.

as far as i know, there is no real kernel overview page. so i guess
intro(9) would be the place. you could certainly try to do something
similar to netbsd's page - it would be more useful than what we have now.

jmc



man 9 intro - improvements [and learning for me]?

2023-09-18 Thread Christoff Humphries
Greetings all.

I went searching for documentation about the kernel internals and was
used to the intro(9) man page from NetBSD
https://man.netbsd.org/intro.9 that had a lot more details. Would it 
be a worthwhile project to attempt to do the same for OpenBSD?

I understand the annoyance of folks talking about what they intend or
are going to do with no actual fruit, but wanted to check that the
intro(9) is lacking and the information is not just stored somewhere
else (other than "apropos -s 9 .").

I want to learn the internals of the OpenBSD kernel, too, and will do
as mulander (and friends) did by learning a bit of code daily
https://blog.tintagel.pl/2017/09/10/openbsd-daily-recap.html.

Seeking the learn and contribute, especially in the uvm, vmd/vmm, and
possibly filesystem subsystems.

Thanks in advance!



Re: dt(4), hardclock(9): move interval, profile providers to dedicated callback

2023-09-18 Thread Martin Pieuchot
On 17/09/23(Sun) 11:22, Scott Cheloha wrote:
> v2 is attached.

Thanks.

> Clockintrs now have an argument.  If we pass the PCB as argument, we
> can avoid doing a linear search to find the PCB during the interrupt.
> 
> One thing I'm unsure about is whether I need to add a "barrier" flag
> to clockintr_cancel() and/or clockintr_disestablish().  This would
> cause the caller to block until the clockintr callback function has
> finished executing, which would ensure that it was safe to free the
> PCB.

Please do not reinvent the wheel.  Try to imitate what other kernel APIs
do.  Look at task_add(9) and timeout_add(9).  Call the functions add/del()
to match existing APIs, then we can add a clockintr_del_barrier() if needed.
Do not introduce functions before we need them.  I hope we won't need
it.

> On Mon, Sep 04, 2023 at 01:39:25PM +0100, Martin Pieuchot wrote:
> > On 25/08/23(Fri) 21:00, Scott Cheloha wrote:
> > > On Thu, Aug 24, 2023 at 07:21:29PM +0200, Martin Pieuchot wrote:
> > [...]
> > > The goal of clockintr is to provide a machine-independent API for
> > > scheduling clock interrupts.  You can use it to implement something
> > > like hardclock() or statclock().  We are already using it to implement
> > > these functions, among others.
> > 
> > After reading all the code and the previous manuals, I understand it as
> > a low-level per-CPU timeout API with nanosecond precision.  Is that it?
> 
> Yes.
> 
> The distinguishing feature is that it is usually wired up to a
> platform backend, so it can deliver the interrupt at the requested
> expiration time with relatively low error.

Why shouldn't we use it and prefer timeout then?  That's unclear to me
and I'd love to have this clearly documented.

What happened to the manual? 
 
> > Apart from running periodically on a given CPU an important need for
> > dt(4) is to get a timestamps for every event.  Currently nanotime(9)
> > is used.  This is global to the system.  I saw that ftrace use different
> > clocks per-CPU which might be something to consider now that we're
> > moving to a per-CPU API.
> > 
> > It's all about cost of the instrumentation.  Note that if we use a
> > different per-CPU timestamp it has to work outside of clockintr because
> > probes can be anywhere.
> > 
> > Regarding clockintr_establish(), I'd rather have it *not* allocated the
> > `clockintr'.  I'd prefer waste some more space in every PCB.  The reason
> > for this is to keep live allocation in dt(4) to dt_pcb_alloc() only to
> > be able to not go through malloc(9) at some point in the future to not
> > interfere with the rest of the kernel.  Is there any downside to this?
> 
> You're asking me to change from callee-allocated clockintrs to
> caller-allocated clockintrs.  I don't want to do this.

Why not?  What is your plan?  You want to put many clockintrs in the
kernel?  I don't understand where you're going.  Can you explain?  From
my point of view this design decision is only added complexity for no
good reason

> I am hoping to expirment with using a per-CPU pool for clockintrs
> during the next release cycle.  I think keeping all the clockintrs on
> a single page in memory will have cache benefits when reinserting
> clockintrs into the sorted data structure.

I do not understand this wish for micro optimization.

The API has only on dynamic consumer: dt(4), I tell you I'd prefer to
allocate the structure outside of your API and you tell me you don't
want.  You want to use pool.  Really I don't understand...  Please let's
design an API for our needs and not based on best practises from outside.

Or am I completely missing something from the clockintr world conquest
plan?

If you want to play with pools, they are many other stuff you can do :o)
I just don't understand.

> > Can we have a different hook for the interval provider?
> 
> I think I have added this now?
> 
> "profile" uses dt_prov_profile_intr().
> 
> "interval" uses dt_prov_interval_intr().
> 
> Did you mean a different kind of "hook"?

That's it and please remove DT_ENTER().  There's no need for the use of
the macro inside dt(4).  I thought I already mentioned it.

> > Since we need only one clockintr and we don't care about the CPU
> > should we pick a random one?  Could that be implemented by passing
> > a NULL "struct cpu_info *" pointer to clockintr_establish()?  So
> > multiple "interval" probes would run on different CPUs...
> 
> It would be simpler to just stick to running the "interval" provider
> on the primary CPU.

Well the primary CPU is already too loaded with interrupts on OpenBSD.
So if we can avoid it for no good reason, I'd prefer.

> If you expect a large number of "interval" probes to be active at
> once, it might help to choose the CPU round-robin style.  But that
> would be an optimization for a later step.

Let's do that now!  There's no limitation with your API right?

> > I'd rather keep "struct dt_softc" private to not create too convoluted
> > code.
> 
> I can't figure out how to 

Re: Folks thanks to all who attended P2k23 Hackathon in Dublin

2023-09-18 Thread Tom Smyth
Folks,

thanks to all who submitted undeadly reports on p2k23 hackathon to undeadly,
if anyone else would like to send a rough list of cool stuff they were
working on to me Ill submit an update to undeadly also,
I appreciate all the time you folks put into the hackathon and Hope
you enjoyed it  as much as I enjoyed hosting you good folks,


all the best,

Tom Smyth

On Fri, 8 Sept 2023 at 18:20, Tom Smyth  wrote:
>
> Folks
> thanks to all  who attended P2k23  Hackathon in Dublin
>
> if any of you would like to email me off list what you managed to
> achieve/  or increse understanding of  or any interesting ideas
> collaborations that happened during the week
>
> please feel free to email me them and Ill include them in an undeadly
> hackathon report..
>
> It was a real pleasure to host you all and learn from the hackathon
> hallway track :)
>
> Thanks
> Tom Smyth
>
> --
> Kindest regards,
> Tom Smyth.



-- 
Kindest regards,
Tom Smyth.



Re: hotplug(4): introduce `hotplug_mtx' mutex(9) and make `hotplugread_filterops' mp safe

2023-09-18 Thread Vitaliy Makkoveev
On Mon, Sep 18, 2023 at 02:03:08PM +0300, Vitaliy Makkoveev wrote:
> Also use this mutex to protect `evqueue_head', `evqueue_tail' and
> `evqueue_count'.
> 

Sorry, the right diff:

Index: sys/dev/hotplug.c
===
RCS file: /cvs/src/sys/dev/hotplug.c,v
retrieving revision 1.23
diff -u -p -r1.23 hotplug.c
--- sys/dev/hotplug.c   8 Sep 2023 20:00:27 -   1.23
+++ sys/dev/hotplug.c   18 Sep 2023 11:09:12 -
@@ -22,27 +22,39 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #define HOTPLUG_MAXEVENTS  64
 
+/*
+ * Locks used to protect struct members and global data
+ *  M  hotplug_mtx
+ */
+
+static struct mutex hotplug_mtx = MUTEX_INITIALIZER(IPL_MPFLOOR);
+
 static int opened;
 static struct hotplug_event evqueue[HOTPLUG_MAXEVENTS];
-static int evqueue_head, evqueue_tail, evqueue_count;
-static struct selinfo hotplug_sel;
+static int evqueue_head, evqueue_tail, evqueue_count;  /* [M] */
+static struct klist hotplug_klist; /* [M] */
 
 void filt_hotplugrdetach(struct knote *);
 int  filt_hotplugread(struct knote *, long);
+int  filt_hotplugmodify(struct kevent *, struct knote *);
+int  filt_hotplugprocess(struct knote *, struct kevent *);
 
 const struct filterops hotplugread_filtops = {
-   .f_flags= FILTEROP_ISFD,
+   .f_flags= FILTEROP_ISFD | FILTEROP_MPSAFE,
.f_attach   = NULL,
.f_detach   = filt_hotplugrdetach,
.f_event= filt_hotplugread,
+   .f_modify   = filt_hotplugmodify,
+   .f_process  = filt_hotplugprocess,
 };
 
 #define EVQUEUE_NEXT(p) (p == HOTPLUG_MAXEVENTS - 1 ? 0 : p + 1)
@@ -60,6 +72,8 @@ hotplugattach(int count)
evqueue_head = 0;
evqueue_tail = 0;
evqueue_count = 0;
+
+   klist_init_mutex(_klist, _mtx);
 }
 
 void
@@ -87,7 +101,9 @@ hotplug_device_detach(enum devclass clas
 int
 hotplug_put_event(struct hotplug_event *he)
 {
+   mtx_enter(_mtx);
if (evqueue_count == HOTPLUG_MAXEVENTS && opened) {
+   mtx_leave(_mtx);
printf("hotplug: event lost, queue full\n");
return (1);
}
@@ -98,24 +114,21 @@ hotplug_put_event(struct hotplug_event *
evqueue_tail = EVQUEUE_NEXT(evqueue_tail);
else 
evqueue_count++;
+   knote_locked(_klist, 0);
wakeup();
-   selwakeup(_sel);
+   mtx_leave(_mtx);
return (0);
 }
 
 int
 hotplug_get_event(struct hotplug_event *he)
 {
-   int s;
-
if (evqueue_count == 0)
return (1);
 
-   s = splbio();
*he = evqueue[evqueue_tail];
evqueue_tail = EVQUEUE_NEXT(evqueue_tail);
evqueue_count--;
-   splx(s);
return (0);
 }
 
@@ -137,8 +150,11 @@ hotplugclose(dev_t dev, int flag, int mo
 {
struct hotplug_event he;
 
+   mtx_enter(_mtx);
while (hotplug_get_event() == 0)
continue;
+   mtx_leave(_mtx);
+   klist_invalidate(_klist);
opened = 0;
return (0);
 }
@@ -152,16 +168,23 @@ hotplugread(dev_t dev, struct uio *uio, 
if (uio->uio_resid != sizeof(he))
return (EINVAL);
 
-again:
-   if (hotplug_get_event() == 0)
-   return (uiomove(, sizeof(he), uio));
-   if (flags & IO_NDELAY)
-   return (EAGAIN);
-
-   error = tsleep_nsec(, PRIBIO | PCATCH, "htplev", INFSLP);
-   if (error)
-   return (error);
-   goto again;
+   mtx_enter(_mtx);
+   while (hotplug_get_event()) {
+   if (flags & IO_NDELAY) {
+   mtx_leave(_mtx);
+   return (EAGAIN);
+   }
+   
+   error = msleep_nsec(, _mtx, PRIBIO | PCATCH,
+   "htplev", INFSLP);
+   if (error) {
+   mtx_leave(_mtx);
+   return (error);
+   }
+   }
+   mtx_leave(_mtx);
+
+   return (uiomove(, sizeof(he), uio));
 }
 
 int
@@ -183,32 +206,22 @@ hotplugioctl(dev_t dev, u_long cmd, cadd
 int
 hotplugkqfilter(dev_t dev, struct knote *kn)
 {
-   struct klist *klist;
-   int s;
-
switch (kn->kn_filter) {
case EVFILT_READ:
-   klist = _sel.si_note;
kn->kn_fop = _filtops;
break;
default:
return (EINVAL);
}
 
-   s = splbio();
-   klist_insert_locked(klist, kn);
-   splx(s);
+   klist_insert(_klist, kn);
return (0);
 }
 
 void
 filt_hotplugrdetach(struct knote *kn)
 {
-   int s;
-
-   s = splbio();
-   klist_remove_locked(_sel.si_note, kn);
-   splx(s);
+   klist_remove(_klist, kn);
 }
 
 int
@@ -217,4 +230,28 @@ filt_hotplugread(struct knote *kn, long 
kn->kn_data = evqueue_count;
 
return (evqueue_count > 0);
+}
+

hotplug(4): introduce `hotplug_mtx' mutex(9) and make `hotplugread_filterops' mp safe

2023-09-18 Thread Vitaliy Makkoveev
Also use this mutex to protect `evqueue_head', `evqueue_tail' and
`evqueue_count'.

Index: sys/dev/hotplug.c
===
RCS file: /cvs/src/sys/dev/hotplug.c,v
retrieving revision 1.23
diff -u -p -r1.23 hotplug.c
--- sys/dev/hotplug.c   8 Sep 2023 20:00:27 -   1.23
+++ sys/dev/hotplug.c   18 Sep 2023 10:57:07 -
@@ -22,27 +22,39 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #define HOTPLUG_MAXEVENTS  64
 
+/*
+ * Locks used to protect struct members and global data
+ *  M  hotplug_mtx
+ */
+
+static struct mutex hotplug_mtx = MUTEX_INITIALIZER(IPL_MPFLOOR);
+
 static int opened;
 static struct hotplug_event evqueue[HOTPLUG_MAXEVENTS];
-static int evqueue_head, evqueue_tail, evqueue_count;
-static struct selinfo hotplug_sel;
+static int evqueue_head, evqueue_tail, evqueue_count;  /* [M] */
+static struct klist hotplug_klist; /* [M] */
 
 void filt_hotplugrdetach(struct knote *);
 int  filt_hotplugread(struct knote *, long);
+int  filt_hotplugmodify(struct kevent *, struct knote *);
+int  filt_hotplugprocess(struct knote *, struct kevent *);
 
 const struct filterops hotplugread_filtops = {
-   .f_flags= FILTEROP_ISFD,
+   .f_flags= FILTEROP_ISFD  | FILTEROP_MPSAFE,
.f_attach   = NULL,
.f_detach   = filt_hotplugrdetach,
.f_event= filt_hotplugread,
+   .f_modify   = filt_hotplugmodify,
+   .f_process  = filt_hotplugprocess,
 };
 
 #define EVQUEUE_NEXT(p) (p == HOTPLUG_MAXEVENTS - 1 ? 0 : p + 1)
@@ -60,6 +72,8 @@ hotplugattach(int count)
evqueue_head = 0;
evqueue_tail = 0;
evqueue_count = 0;
+
+   klist_init_mutex(_klist, _mtx);
 }
 
 void
@@ -87,7 +101,9 @@ hotplug_device_detach(enum devclass clas
 int
 hotplug_put_event(struct hotplug_event *he)
 {
+   mtx_enter(_mtx);
if (evqueue_count == HOTPLUG_MAXEVENTS && opened) {
+   mtx_leave(_mtx);
printf("hotplug: event lost, queue full\n");
return (1);
}
@@ -98,24 +114,21 @@ hotplug_put_event(struct hotplug_event *
evqueue_tail = EVQUEUE_NEXT(evqueue_tail);
else 
evqueue_count++;
+   knote_locked(_klist, 0);
wakeup();
-   selwakeup(_sel);
+   mtx_leave(_mtx);
return (0);
 }
 
 int
 hotplug_get_event(struct hotplug_event *he)
 {
-   int s;
-
if (evqueue_count == 0)
return (1);
 
-   s = splbio();
*he = evqueue[evqueue_tail];
evqueue_tail = EVQUEUE_NEXT(evqueue_tail);
evqueue_count--;
-   splx(s);
return (0);
 }
 
@@ -137,8 +150,11 @@ hotplugclose(dev_t dev, int flag, int mo
 {
struct hotplug_event he;
 
+   mtx_enter(_mtx);
while (hotplug_get_event() == 0)
continue;
+   mtx_leave(_mtx);
+   klist_invalidate(_klist);
opened = 0;
return (0);
 }
@@ -152,16 +168,23 @@ hotplugread(dev_t dev, struct uio *uio, 
if (uio->uio_resid != sizeof(he))
return (EINVAL);
 
-again:
-   if (hotplug_get_event() == 0)
-   return (uiomove(, sizeof(he), uio));
-   if (flags & IO_NDELAY)
-   return (EAGAIN);
-
-   error = tsleep_nsec(, PRIBIO | PCATCH, "htplev", INFSLP);
-   if (error)
-   return (error);
-   goto again;
+   mtx_enter(_mtx);
+   while (hotplug_get_event()) {
+   if (flags & IO_NDELAY) {
+   mtx_leave(_mtx);
+   return (EAGAIN);
+   }
+   
+   error = msleep_nsec(, _mtx, PRIBIO | PCATCH,
+   "htplev", INFSLP);
+   if (error) {
+   mtx_leave(_mtx);
+   return (error);
+   }
+   }
+   mtx_leave(_mtx);
+
+   return (uiomove(, sizeof(he), uio));
 }
 
 int
@@ -183,32 +206,22 @@ hotplugioctl(dev_t dev, u_long cmd, cadd
 int
 hotplugkqfilter(dev_t dev, struct knote *kn)
 {
-   struct klist *klist;
-   int s;
-
switch (kn->kn_filter) {
case EVFILT_READ:
-   klist = _sel.si_note;
kn->kn_fop = _filtops;
break;
default:
return (EINVAL);
}
 
-   s = splbio();
-   klist_insert_locked(klist, kn);
-   splx(s);
+   klist_insert_locked(_klist, kn);
return (0);
 }
 
 void
 filt_hotplugrdetach(struct knote *kn)
 {
-   int s;
-
-   s = splbio();
-   klist_remove_locked(_sel.si_note, kn);
-   splx(s);
+   klist_remove(_klist, kn);
 }
 
 int
@@ -217,4 +230,28 @@ filt_hotplugread(struct knote *kn, long 
kn->kn_data = evqueue_count;
 
return (evqueue_count > 0);
+}
+
+int
+filt_hotplugmodify(struct kevent *kev, struct knote *kn)
+{
+   int active;
+
+   

hyperv(4): use shared netlock to protect if_list and ifa_list walkthrough and data

2023-09-18 Thread Vitaliy Makkoveev
Context switch looks fine here.

Index: sys/dev/pv/hypervic.c
===
RCS file: /cvs/src/sys/dev/pv/hypervic.c,v
retrieving revision 1.19
diff -u -p -r1.19 hypervic.c
--- sys/dev/pv/hypervic.c   11 Apr 2023 00:45:08 -  1.19
+++ sys/dev/pv/hypervic.c   18 Sep 2023 08:35:02 -
@@ -846,7 +846,7 @@ kvp_get_ip_info(struct hv_kvp *kvp, cons
struct sockaddr_in6 *sin6, sa6;
uint8_t enaddr[ETHER_ADDR_LEN];
uint8_t ipaddr[INET6_ADDRSTRLEN];
-   int i, j, lo, hi, s, af;
+   int i, j, lo, hi, af;
 
/* Convert from the UTF-16LE string format to binary */
for (i = 0, j = 0; j < ETHER_ADDR_LEN; i += 6) {
@@ -870,16 +870,14 @@ kvp_get_ip_info(struct hv_kvp *kvp, cons
return (-1);
}
 
-   KERNEL_LOCK();
-   s = splnet();
+   NET_LOCK_SHARED();
 
TAILQ_FOREACH(ifp, , if_list) {
if (!memcmp(LLADDR(ifp->if_sadl), enaddr, ETHER_ADDR_LEN))
break;
}
if (ifp == NULL) {
-   splx(s);
-   KERNEL_UNLOCK();
+   NET_UNLOCK_SHARED();
return (-1);
}
 
@@ -919,8 +917,7 @@ kvp_get_ip_info(struct hv_kvp *kvp, cons
else if (ifa6ll != NULL)
ifa = ifa6ll;
else {
-   splx(s);
-   KERNEL_UNLOCK();
+   NET_UNLOCK_SHARED();
return (-1);
}
}
@@ -956,8 +953,7 @@ kvp_get_ip_info(struct hv_kvp *kvp, cons
break;
}
 
-   splx(s);
-   KERNEL_UNLOCK();
+   NET_UNLOCK_SHARED();
 
return (0);
 }



Re: Mellanox driver : add 100G_LR4 capability

2023-09-18 Thread Jonathan Matthew
On Fri, Sep 15, 2023 at 09:48:16AM +0200, Olivier Croquin wrote:
> Hi,
> 
> The media capability 100GBase_LR4 is not listed in the mcx driver.
> 
> Could you please take a look at this short patch ? I found the value of 23
> in the Linux mlx driver.

Thanks, I've committed it.

> Is this enough to say that QSFP28 100GBase_LR are supported with the mcx
> driver ?

As much as anything else, sure.