Re: amd64: add tsc_delay(), a TSC-based delay(9) implementation

2020-08-23 Thread Scott Cheloha
On Sun, Aug 23, 2020 at 11:45:22PM -0500, Scott Cheloha wrote:
> 
> [...]
> 
> > > This patch (or something equivalent) is a prerequisite to running the
> > > lapic timer in oneshot or TSC deadline mode.  Using the lapic timer to
> > > implement delay(9) when it isn't running in periodic mode is too
> > > complicated.  However, using the i8254 for delay(9) is too slow.  We
> > > need an alternative.
> > 
> > Hmm, but what are we going to use on machines where the TSC isn't
> > constant/invariant?
> 
> Probably fall back on the i8254?  Unless someone wants to add yet
> another delay(9) implementation to amd64...
> 
> > In what respect is the i8254 too slow?  Does it take more than a
> > microsecond to read it?
> 
> On my machine, the portion of gettick() *within* the mutex runs in ~19
> microseconds.
> 
> That's before any overhead from mtx_enter(9).  I think having multiple
> threads in delay(9) should be relatively rare, but you have to keep
> that in mind.
> 
> No idea what the overhead would look like on real hardware.  I'm
> pretty sure my i8254 is emulated.
> 
> > We could use the HPET I suppose, whic may be a bit better.
> 
> It's better.  No mutex.  On my machine it takes ~11 microseconds.
> It's a start.

Hmmm, now I'm worried I have screwed something up or misconfigured
something.

It doesn't seem right that it would take 20K cycles to read the HPET
on this machine.

Am I way off?  Or is 20K actually a reasonable number?

For comparison, lapic_gettick() completes in... 80 nanoseconds (?) on
the same machine.  Relevant sysctls:

$ sysctl hw.{model,setperf,perfpolicy} machdep.{tscfreq,invarianttsc}
hw.model=Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
hw.setperf=100
hw.perfpolicy=high
machdep.tscfreq=211200
machdep.invarianttsc=1

... if it really takes that long, then "high precision" is a bit of a
misnomer.



Re: aggr.4 and trunk.4: omit common ifconfig options

2020-08-23 Thread Jason McIntyre
On Mon, Aug 24, 2020 at 01:26:06AM +0200, Klemens Nanni wrote:
> ifconfig(8)'s TRUNK (LINK AGGREGATION) nicely combines the two drivers
> and I'd like to further omit common stuff from the drive specific
> manuals.
> 
> This aids in the overall design of having options documented in
> ifconfig(8) alone unless they're inherently driver specific, e.g.
> `trunkproto' which stays in trunk(4).
> 
> sys/net/if_trunk.c and sys/net/trunklacp.h confirm that trunk(4) has
> indeed the same defaults as aggr(4) when it comes to LACP mode and
> timeout:
> 
>   #define>LACP_DEFAULT_MODE>  >   1 /* Active Mode */
>   #define>LACP_DEFAULT_TIMEOUT>   >   0 /* Slow Timeout */
> 
> Feedback? OK?
> 

ok.
jmc

> 
> Index: share/man/man4/aggr.4
> ===
> RCS file: /cvs/src/share/man/man4/aggr.4,v
> retrieving revision 1.2
> diff -u -p -r1.2 aggr.4
> --- share/man/man4/aggr.4 5 Jul 2019 05:22:57 -   1.2
> +++ share/man/man4/aggr.4 23 Aug 2020 23:10:46 -
> @@ -63,30 +63,11 @@ and
>  .Xr netstart 8
>  using the following options:
>  .Bl -tag -width Ds
> -.It Cm lacpmode Cm active Ns | Ns Cm passive
> -Set the LACP mode to either
> -.Cm active
> -or
> -.Cm passive .
> -The default is active mode.
> -.It Cm lacptimeout Cm fast Ns | Ns Cm slow
> -Set the LACP timeout speed to either
> -.Cm fast
> -or
> -.Cm slow .
> -The default is slow timeouts.
>  .It Cm lladdr Ar etheraddr Ns | Ns Cm random
>  Change the link layer address (MAC address) of the interface.
>  This should be specified as six colon-separated hex values, or can
>  be chosen randomly.
>  By default a random MAC address is generated when an interface is created.
> -.It Cm trunkport Ar child-iface
> -Add
> -.Ar child-iface
> -as a port.
> -.It Cm -trunkport Ar child-iface
> -Remove the port
> -.Ar child-iface .
>  .El
>  .\" document the ioctls?
>  .Pp
> Index: share/man/man4/trunk.4
> ===
> RCS file: /cvs/src/share/man/man4/trunk.4,v
> retrieving revision 1.30
> diff -u -p -r1.30 trunk.4
> --- share/man/man4/trunk.412 Aug 2018 23:50:31 -  1.30
> +++ share/man/man4/trunk.423 Aug 2020 23:12:53 -
> @@ -34,15 +34,6 @@ A
>  interface can be created using the
>  .Ic ifconfig trunk Ns Ar N Ic create
>  command.
> -It can use different link aggregation protocols specified
> -using the
> -.Ic trunkproto Ar proto
> -option.
> -Child interfaces can be added using the
> -.Ic trunkport Ar child-iface
> -option and removed using the
> -.Ic -trunkport Ar child-iface
> -option.
>  .Pp
>  The driver currently supports the trunk protocols
>  .Ic broadcast ,
> Index: sbin/ifconfig/ifconfig.8
> ===
> RCS file: /cvs/src/sbin/ifconfig/ifconfig.8,v
> retrieving revision 1.356
> diff -u -p -r1.356 ifconfig.8
> --- sbin/ifconfig/ifconfig.8  8 Aug 2020 06:06:24 -   1.356
> +++ sbin/ifconfig/ifconfig.8  23 Aug 2020 23:21:28 -
> @@ -1824,13 +1824,14 @@ interfaces:
>  .It Cm lacpmode Cm active Ns | Ns Cm passive
>  Set the LACP trunk mode to either
>  .Cm active
> -or
> +(default) or
>  .Cm passive .
>  .It Cm lacptimeout Cm fast Ns | Ns Cm slow
>  Set the LACP timeout speed to either
>  .Cm fast
>  or
> -.Cm slow .
> +.Cm slow
> +(default).
>  .It Cm trunkport Ar child-iface
>  Add
>  .Ar child-iface
> 



Re: amd64: add tsc_delay(), a TSC-based delay(9) implementation

2020-08-23 Thread Scott Cheloha
On Mon, Aug 24, 2020 at 01:55:45AM +0200, Mark Kettenis wrote:
> > Date: Sun, 23 Aug 2020 18:11:12 -0500
> > From: Scott Cheloha 
> > 
> > Hi,
> > 
> > Other BSDs use the TSC to implement delay(9) if the TSC is constant
> > and invariant.  Here's a patch to add something similar to our kernel.
> 
> If the TSC is fine as a timecounter it should be absolutely fine for
> use as delay().  And we could even use if the TSC isn't synchronized
> between CPUs.

Yep, it's nice.

> > This patch (or something equivalent) is a prerequisite to running the
> > lapic timer in oneshot or TSC deadline mode.  Using the lapic timer to
> > implement delay(9) when it isn't running in periodic mode is too
> > complicated.  However, using the i8254 for delay(9) is too slow.  We
> > need an alternative.
> 
> Hmm, but what are we going to use on machines where the TSC isn't
> constant/invariant?

Probably fall back on the i8254?  Unless someone wants to add yet
another delay(9) implementation to amd64...

> In what respect is the i8254 too slow?  Does it take more than a
> microsecond to read it?

On my machine, the portion of gettick() *within* the mutex runs in ~19
microseconds.

That's before any overhead from mtx_enter(9).  I think having multiple
threads in delay(9) should be relatively rare, but you have to keep
that in mind.

No idea what the overhead would look like on real hardware.  I'm
pretty sure my i8254 is emulated.

> We could use the HPET I suppose, whic may be a bit better.

It's better.  No mutex.  On my machine it takes ~11 microseconds.
It's a start.

> > As for the patch, it works for me here, though I'd appreciate a few
> > tests.  I admit that comparing function pointers is ugly, but I think
> > this is as simple as it can be without implementing some sort of
> > framework for "registering" delay(9) implementations and comparing
> > them and selecting the "best" implementation.
> 
> What about:
> 
>   if (delay_func == NULL)
>   delay_func = lapic_delay;

Nah, can't do that.  delay_func is initialized to i8254_delay().  Look
in amd64/machdep.c.

I'm curious what NetBSD and Dragonfly have done about this.  Lemme
look around.

The whole "all the clocks on amd64 are slow or broken" issue isn't
unique to us.



Re: amd64: add tsc_delay(), a TSC-based delay(9) implementation

2020-08-23 Thread Mark Kettenis
> Date: Sun, 23 Aug 2020 18:11:12 -0500
> From: Scott Cheloha 
> 
> Hi,
> 
> Other BSDs use the TSC to implement delay(9) if the TSC is constant
> and invariant.  Here's a patch to add something similar to our kernel.

If the TSC is fine as a timecounter it should be absolutely fine for
use as delay().  And we could even use if the TSC isn't synchronized
between CPUs.

> 
> This patch (or something equivalent) is a prerequisite to running the
> lapic timer in oneshot or TSC deadline mode.  Using the lapic timer to
> implement delay(9) when it isn't running in periodic mode is too
> complicated.  However, using the i8254 for delay(9) is too slow.  We
> need an alternative.

Hmm, but what are we going to use on machines where the TSC isn't
constant/invariant?

In what respect is the i8254 too slow?  Does it take more than a
microsecond to read it?

We could use the HPET I suppose, whic may be a bit better.

> As for the patch, it works for me here, though I'd appreciate a few
> tests.  I admit that comparing function pointers is ugly, but I think
> this is as simple as it can be without implementing some sort of
> framework for "registering" delay(9) implementations and comparing
> them and selecting the "best" implementation.

What about:

if (delay_func == NULL)
delay_func = lapic_delay;

> I'm not sure I put the prototypes in the right headers.  We don't have
> a tsc.h but cpuvar.h looks sorta-correct for tsc_delay().

I think cpuvar.h is fine since it has other TSC-related stuff.
However, with my suggestion above you can drop that.

> FreeBSD's x86/delay.c may be of note:
> 
> https://github.com/freebsd/freebsd/blob/ed96335a07b688c39e16db8856232e5840bc22ac/sys/x86/x86/delay.c
> 
> Thoughts?
> 
> Index: amd64/tsc.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/amd64/tsc.c,v
> retrieving revision 1.20
> diff -u -p -r1.20 tsc.c
> --- amd64/tsc.c   23 Aug 2020 21:38:47 -  1.20
> +++ amd64/tsc.c   23 Aug 2020 22:59:25 -
> @@ -26,6 +26,7 @@
>  
>  #include 
>  #include 
> +#include 
>  
>  #define RECALIBRATE_MAX_RETRIES  5
>  #define RECALIBRATE_SMI_THRESHOLD5
> @@ -252,7 +253,8 @@ tsc_timecounter_init(struct cpu_info *ci
>   tsc_timecounter.tc_quality = -1000;
>   tsc_timecounter.tc_user = 0;
>   tsc_is_invariant = 0;
> - }
> + } else
> + delay_func = tsc_delay;
>  
>   tc_init(_timecounter);
>  }
> @@ -342,4 +344,15 @@ tsc_sync_ap(struct cpu_info *ci)
>  {
>   tsc_post_ap(ci);
>   tsc_post_ap(ci);
> +}
> +
> +void
> +tsc_delay(int usecs)
> +{
> + uint64_t interval, start;
> +
> + interval = (uint64_t)usecs * tsc_frequency / 100;
> + start = rdtsc_lfence();
> + while (rdtsc_lfence() - start < interval)
> + CPU_BUSY_CYCLE();
>  }
> Index: amd64/lapic.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/amd64/lapic.c,v
> retrieving revision 1.55
> diff -u -p -r1.55 lapic.c
> --- amd64/lapic.c 3 Aug 2019 14:57:51 -   1.55
> +++ amd64/lapic.c 23 Aug 2020 22:59:25 -
> @@ -41,6 +41,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -569,7 +570,8 @@ skip_calibration:
>* Now that the timer's calibrated, use the apic timer routines
>* for all our timing needs..
>*/
> - delay_func = lapic_delay;
> + if (delay_func != tsc_delay)
> + delay_func = lapic_delay;
>   initclock_func = lapic_initclocks;
>   }
>  }
> Index: include/cpuvar.h
> ===
> RCS file: /cvs/src/sys/arch/amd64/include/cpuvar.h,v
> retrieving revision 1.10
> diff -u -p -r1.10 cpuvar.h
> --- include/cpuvar.h  9 Aug 2019 15:20:05 -   1.10
> +++ include/cpuvar.h  23 Aug 2020 22:59:25 -
> @@ -102,4 +102,6 @@ void tsc_sync_drift(int64_t);
>  void tsc_sync_bp(struct cpu_info *);
>  void tsc_sync_ap(struct cpu_info *);
>  
> +void tsc_delay(int);
> +
>  #endif
> Index: include/i82489var.h
> ===
> RCS file: /cvs/src/sys/arch/amd64/include/i82489var.h,v
> retrieving revision 1.18
> diff -u -p -r1.18 i82489var.h
> --- include/i82489var.h   4 Oct 2018 05:00:40 -   1.18
> +++ include/i82489var.h   23 Aug 2020 22:59:26 -
> @@ -128,4 +128,6 @@ extern void lapic_calibrate_timer(struct
>  extern void lapic_startclock(void);
>  extern void lapic_initclocks(void);
>  
> +extern void lapic_delay(int);
> +
>  #endif
> 
> 



aggr.4 and trunk.4: omit common ifconfig options

2020-08-23 Thread Klemens Nanni
ifconfig(8)'s TRUNK (LINK AGGREGATION) nicely combines the two drivers
and I'd like to further omit common stuff from the drive specific
manuals.

This aids in the overall design of having options documented in
ifconfig(8) alone unless they're inherently driver specific, e.g.
`trunkproto' which stays in trunk(4).

sys/net/if_trunk.c and sys/net/trunklacp.h confirm that trunk(4) has
indeed the same defaults as aggr(4) when it comes to LACP mode and
timeout:

#define>LACP_DEFAULT_MODE>  >   1 /* Active Mode */
#define>LACP_DEFAULT_TIMEOUT>   >   0 /* Slow Timeout */

Feedback? OK?


Index: share/man/man4/aggr.4
===
RCS file: /cvs/src/share/man/man4/aggr.4,v
retrieving revision 1.2
diff -u -p -r1.2 aggr.4
--- share/man/man4/aggr.4   5 Jul 2019 05:22:57 -   1.2
+++ share/man/man4/aggr.4   23 Aug 2020 23:10:46 -
@@ -63,30 +63,11 @@ and
 .Xr netstart 8
 using the following options:
 .Bl -tag -width Ds
-.It Cm lacpmode Cm active Ns | Ns Cm passive
-Set the LACP mode to either
-.Cm active
-or
-.Cm passive .
-The default is active mode.
-.It Cm lacptimeout Cm fast Ns | Ns Cm slow
-Set the LACP timeout speed to either
-.Cm fast
-or
-.Cm slow .
-The default is slow timeouts.
 .It Cm lladdr Ar etheraddr Ns | Ns Cm random
 Change the link layer address (MAC address) of the interface.
 This should be specified as six colon-separated hex values, or can
 be chosen randomly.
 By default a random MAC address is generated when an interface is created.
-.It Cm trunkport Ar child-iface
-Add
-.Ar child-iface
-as a port.
-.It Cm -trunkport Ar child-iface
-Remove the port
-.Ar child-iface .
 .El
 .\" document the ioctls?
 .Pp
Index: share/man/man4/trunk.4
===
RCS file: /cvs/src/share/man/man4/trunk.4,v
retrieving revision 1.30
diff -u -p -r1.30 trunk.4
--- share/man/man4/trunk.4  12 Aug 2018 23:50:31 -  1.30
+++ share/man/man4/trunk.4  23 Aug 2020 23:12:53 -
@@ -34,15 +34,6 @@ A
 interface can be created using the
 .Ic ifconfig trunk Ns Ar N Ic create
 command.
-It can use different link aggregation protocols specified
-using the
-.Ic trunkproto Ar proto
-option.
-Child interfaces can be added using the
-.Ic trunkport Ar child-iface
-option and removed using the
-.Ic -trunkport Ar child-iface
-option.
 .Pp
 The driver currently supports the trunk protocols
 .Ic broadcast ,
Index: sbin/ifconfig/ifconfig.8
===
RCS file: /cvs/src/sbin/ifconfig/ifconfig.8,v
retrieving revision 1.356
diff -u -p -r1.356 ifconfig.8
--- sbin/ifconfig/ifconfig.88 Aug 2020 06:06:24 -   1.356
+++ sbin/ifconfig/ifconfig.823 Aug 2020 23:21:28 -
@@ -1824,13 +1824,14 @@ interfaces:
 .It Cm lacpmode Cm active Ns | Ns Cm passive
 Set the LACP trunk mode to either
 .Cm active
-or
+(default) or
 .Cm passive .
 .It Cm lacptimeout Cm fast Ns | Ns Cm slow
 Set the LACP timeout speed to either
 .Cm fast
 or
-.Cm slow .
+.Cm slow
+(default).
 .It Cm trunkport Ar child-iface
 Add
 .Ar child-iface



amd64: add tsc_delay(), a TSC-based delay(9) implementation

2020-08-23 Thread Scott Cheloha
Hi,

Other BSDs use the TSC to implement delay(9) if the TSC is constant
and invariant.  Here's a patch to add something similar to our kernel.

This patch (or something equivalent) is a prerequisite to running the
lapic timer in oneshot or TSC deadline mode.  Using the lapic timer to
implement delay(9) when it isn't running in periodic mode is too
complicated.  However, using the i8254 for delay(9) is too slow.  We
need an alternative.

As for the patch, it works for me here, though I'd appreciate a few
tests.  I admit that comparing function pointers is ugly, but I think
this is as simple as it can be without implementing some sort of
framework for "registering" delay(9) implementations and comparing
them and selecting the "best" implementation.

I'm not sure I put the prototypes in the right headers.  We don't have
a tsc.h but cpuvar.h looks sorta-correct for tsc_delay().

FreeBSD's x86/delay.c may be of note:

https://github.com/freebsd/freebsd/blob/ed96335a07b688c39e16db8856232e5840bc22ac/sys/x86/x86/delay.c

Thoughts?

Index: amd64/tsc.c
===
RCS file: /cvs/src/sys/arch/amd64/amd64/tsc.c,v
retrieving revision 1.20
diff -u -p -r1.20 tsc.c
--- amd64/tsc.c 23 Aug 2020 21:38:47 -  1.20
+++ amd64/tsc.c 23 Aug 2020 22:59:25 -
@@ -26,6 +26,7 @@
 
 #include 
 #include 
+#include 
 
 #define RECALIBRATE_MAX_RETRIES5
 #define RECALIBRATE_SMI_THRESHOLD  5
@@ -252,7 +253,8 @@ tsc_timecounter_init(struct cpu_info *ci
tsc_timecounter.tc_quality = -1000;
tsc_timecounter.tc_user = 0;
tsc_is_invariant = 0;
-   }
+   } else
+   delay_func = tsc_delay;
 
tc_init(_timecounter);
 }
@@ -342,4 +344,15 @@ tsc_sync_ap(struct cpu_info *ci)
 {
tsc_post_ap(ci);
tsc_post_ap(ci);
+}
+
+void
+tsc_delay(int usecs)
+{
+   uint64_t interval, start;
+
+   interval = (uint64_t)usecs * tsc_frequency / 100;
+   start = rdtsc_lfence();
+   while (rdtsc_lfence() - start < interval)
+   CPU_BUSY_CYCLE();
 }
Index: amd64/lapic.c
===
RCS file: /cvs/src/sys/arch/amd64/amd64/lapic.c,v
retrieving revision 1.55
diff -u -p -r1.55 lapic.c
--- amd64/lapic.c   3 Aug 2019 14:57:51 -   1.55
+++ amd64/lapic.c   23 Aug 2020 22:59:25 -
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -569,7 +570,8 @@ skip_calibration:
 * Now that the timer's calibrated, use the apic timer routines
 * for all our timing needs..
 */
-   delay_func = lapic_delay;
+   if (delay_func != tsc_delay)
+   delay_func = lapic_delay;
initclock_func = lapic_initclocks;
}
 }
Index: include/cpuvar.h
===
RCS file: /cvs/src/sys/arch/amd64/include/cpuvar.h,v
retrieving revision 1.10
diff -u -p -r1.10 cpuvar.h
--- include/cpuvar.h9 Aug 2019 15:20:05 -   1.10
+++ include/cpuvar.h23 Aug 2020 22:59:25 -
@@ -102,4 +102,6 @@ void tsc_sync_drift(int64_t);
 void tsc_sync_bp(struct cpu_info *);
 void tsc_sync_ap(struct cpu_info *);
 
+void tsc_delay(int);
+
 #endif
Index: include/i82489var.h
===
RCS file: /cvs/src/sys/arch/amd64/include/i82489var.h,v
retrieving revision 1.18
diff -u -p -r1.18 i82489var.h
--- include/i82489var.h 4 Oct 2018 05:00:40 -   1.18
+++ include/i82489var.h 23 Aug 2020 22:59:26 -
@@ -128,4 +128,6 @@ extern void lapic_calibrate_timer(struct
 extern void lapic_startclock(void);
 extern void lapic_initclocks(void);
 
+extern void lapic_delay(int);
+
 #endif



Re: timekeep: fixing large skews on amd64 with RDTSCP

2020-08-23 Thread Hrvoje Popovski
On 23.8.2020. 16:50, Claudio Jeker wrote:
> On Sun, Aug 23, 2020 at 04:06:01PM +0200, Christian Weisgerber wrote:
>> Scott Cheloha:
>>
>>> This "it might slow down the network stack" thing keeps coming up, and
>>> yet nobody can point to (a) who expressed this concern or (b) what the
>>> penalty is in practice.
>>
>> It was kettenis@ who simply raised the question and asked for
>> comments from the network people.
>>
>> I think we should just go ahead and use rdtsc_lfence() in
>> tsc_get_timecount().  It is *correct*.
> 
> I agree. For the network stack there are bigger fishes to fry befor such a
> micro optimisation would even matter.
> 

Hi all,

as you said, with this diff forwarding performance is little slower.

6 x Intel(R) Xeon(R) CPU E5-2643 v2 @ 3.50GHz, 3600.54 MHz, 06-3e-04
without diff 1.11 Mpps
with diff 1.04 Mpps

12 x Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz, 2400.01 MHz, 06-3e-04
without diff 650 Kpps
with diff 600 Kpps




top: toggle routing tables

2020-08-23 Thread Klemens Nanni
Add `t' to swap the WAIT column with RTABLE (and vice versa);  WAIT
is wide enough to fit RTABLE, somewhat adds additional value to STATE
and seems therefore most appropiate to hide in favour of RTABLE.

Internally, I renamed the existing CMD_rtable command to filter routing
tables into CMD_rtableid in order to use CMD_rtable for showing them as
that seems in line with how CMD_threads is named to show threads, etc.

format_header() semantics are slightly reworked/improved now that there
are two changing fields;  instead of conditionally changing, it now
always updates it accordingly - i think that makes it clearer overall.

format_next_process() now uses strlcpy() instead of snprintf() for plain
strings as I had to touch those lines anyway.

Filtering rtables with `T' does not toggle the column, just like
filtering users with `u' does not toggle between user and thread id.

Feedback? OK?

Index: display.c
===
RCS file: /cvs/src/usr.bin/top/display.c,v
retrieving revision 1.64
diff -u -p -r1.64 display.c
--- display.c   23 Aug 2020 21:11:55 -  1.64
+++ display.c   23 Aug 2020 22:39:47 -
@@ -824,6 +824,7 @@ show_help(void)
"r count pid  - renice process `pid' to nice value `count'\n"
"S- toggle the display of system processes\n"
"s time   - change delay between displays to `time' seconds\n"
+   "t- toggle the display of routing tables\n"
"T [-]rtable  - show processes associated with routing table 
`rtable'\n"
"   (T+ shows all, T -rtable hides rtable)\n"
"u [-]user- show processes for `user' (u+ shows all, u -user 
hides user)\n"
Index: machine.c
===
RCS file: /cvs/src/usr.bin/top/machine.c,v
retrieving revision 1.108
diff -u -p -r1.108 machine.c
--- machine.c   23 Aug 2020 21:11:55 -  1.108
+++ machine.c   23 Aug 2020 22:38:15 -
@@ -75,8 +75,9 @@ struct handle {
 static char header[] =
"  PID XPRI NICE  SIZE   RES STATE WAIT  TIMECPU 
COMMAND";
 
-/* 0123456   -- field to fill in starts at header+6 */
+/* offsets in the header line to start alternative columns */
 #define UNAME_START 6
+#define RTABLE_START 46
 
 #define Proc_format \
"%5d %-8.8s %3d %4d %5s %5s %-9s %-7.7s %6s %5.2f%% %s"
@@ -226,16 +227,20 @@ machine_init(struct statics *statics)
 }
 
 char *
-format_header(char *second_field)
+format_header(char *second_field, char *eighth_field)
 {
-   char *field_name, *thread_field = " TID";
-   char *ptr;
+   char *second_fieldp = second_field, *eighth_fieldp = eighth_field, *ptr;
 
-   field_name = second_field ? second_field : thread_field;
-
-   ptr = header + UNAME_START;
-   while (*field_name != '\0')
-   *ptr++ = *field_name++;
+   if (second_field != NULL) {
+   ptr = header + UNAME_START;
+   while (*second_fieldp != '\0')
+   *ptr++ = *second_fieldp++;
+   }
+   if (eighth_field != NULL) {
+   ptr = header + RTABLE_START;
+   while (*eighth_fieldp != '\0')
+   *ptr++ = *eighth_fieldp++;
+   }
return (header);
 }
 
@@ -414,7 +419,7 @@ get_process_info(struct system_info *si,
 int (*compare) (const void *, const void *))
 {
int show_idle, show_system, show_threads, show_uid, show_pid, show_cmd;
-   int show_rtable, hide_rtable, hide_uid;
+   int show_rtableid, hide_rtableid, hide_uid;
int total_procs, active_procs;
struct kinfo_proc **prefp, *pp;
int what = KERN_PROC_ALL;
@@ -446,8 +451,8 @@ get_process_info(struct system_info *si,
show_uid = sel->uid != (uid_t)-1;
hide_uid = sel->huid != (uid_t)-1;
show_pid = sel->pid != (pid_t)-1;
-   show_rtable = sel->rtableid != -1;
-   hide_rtable = sel->hrtableid != -1;
+   show_rtableid = sel->rtableid != -1;
+   hide_rtableid = sel->hrtableid != -1;
show_cmd = sel->command != NULL;
 
/* count up process states and get pointers to interesting procs */
@@ -476,8 +481,8 @@ get_process_info(struct system_info *si,
(!hide_uid || pp->p_ruid != sel->huid) &&
(!show_uid || pp->p_ruid == sel->uid) &&
(!show_pid || pp->p_pid == sel->pid) &&
-   (!hide_rtable || pp->p_rtableid != sel->hrtableid) 
&&
-   (!show_rtable || pp->p_rtableid == sel->rtableid) &&
+   (!hide_rtableid || pp->p_rtableid != 
sel->hrtableid) &&
+   (!show_rtableid || pp->p_rtableid == sel->rtableid) 
&&
(!show_cmd || cmd_matches(pp, sel->command))) {
*prefp++ = pp;

Re: Improve semantics and punctuation in ifconfig.8

2020-08-23 Thread Jason McIntyre
On Sat, Aug 22, 2020 at 02:18:48PM -0700, Evan Silberman wrote:
> Not to provoke a deep philosophical debate about the difference between
> Ql, Cm, and Sy, but surely Em isn't the best choice here. A couple other
> nits that bothered me in the same general region, I guess your taste
> might vary.
> 
> Evan Silberman
> 

hi

> ---
> commit 233b62ce4e6cd3388e68f407e50758111b3eeffc (ifconfig.8)
> from: Evan Silberman 
> date: Sat Aug 22 21:14:25 2020 UTC
>  
>  Improve markup and punctuation concerning groups
>  
>  - They're groups, not scare-quotes-"groups"

oddly enough, this usage is probably close to what Em is meant for ;)
quoting here is just another way to add that emphasis. but i agree it
doesn;t make the text more readable.

>  - The markup for the default groups is debatable, but Cm seems closest.
>You might also like: Ql, Sy

i would prefer to use one macro rather than two. i agree Em is not
great. i would just use Dq.

>  - Parenthesized plural(s) is (are) distracting
>  

i heartily agree.

so, if no one chips in soon, i will commit your diff, but with a
straight Em->Dq replacement.

jmc

> diff cb34809c69289319bd78d14b4f5ed7d4e93b080f 
> c3c1549c9f1e94ec03820de542eaf10eaf0dc3f0
> blob - 9e6ad912b385de338033598de04f0184d015224e
> blob + 968886c1a6d4849c4362ab2d759cc72c71d125ed
> --- sbin/ifconfig/ifconfig.8
> +++ sbin/ifconfig/ifconfig.8
> @@ -238,8 +238,8 @@ transmit messages through that interface.
>  If possible, the interface will be reset to disable reception as well.
>  This action automatically disables routes using the interface.
>  .It Cm group Ar group-name
> -Assign the interface to a
> -.Dq group .
> +Assign the interface to a group.
> +The
>  .Ar group-name
>  may not be longer than 15 characters and must not end with a digit.
>  Any interface can be in multiple groups.
> @@ -254,36 +254,35 @@ Some interfaces belong to specific groups by default:
>  .Bl -dash -width Ds -compact
>  .It
>  All interfaces are members of the
> -.Em all
> +.Cm all
>  interface group.
>  .It
>  Cloned interfaces are members of their interface family group.
>  For example, a PPP interface such as
> -.Em ppp0
> +.Ql ppp0
>  is a member of the
> -.Em ppp
> +.Cm ppp
>  interface family group.
>  .It
>  .Xr pppx 4
>  interfaces are members of the
> -.Em pppx
> +.Cm pppx
>  interface group.
>  .It
> -The interface(s) the default route(s) point to are members of the
> -.Em egress
> +The interfaces the default routes point to are members of the
> +.Cm egress
>  interface group.
>  .It
>  IEEE 802.11 wireless interfaces are members of the
> -.Em wlan
> +.Cm wlan
>  interface group.
>  .It
>  Any interfaces used for network booting are members of the
> -.Em netboot
> +.Cm netboot
>  interface group.
>  .El
>  .It Cm -group Ar group-name
> -Remove the interface from the given
> -.Dq group .
> +Remove the interface from the given group.
>  .It Cm hwfeatures
>  Display the interface hardware features:
>  .Pp
> 



Re: top: filter by routing table

2020-08-23 Thread Remi Locherer
On Sun, Aug 23, 2020 at 10:45:14PM +0200, Klemens Nanni wrote:
> On Sun, Aug 23, 2020 at 10:39:21PM +0200, Remi Locherer wrote:
> > I like the feature and it works as advertised.
> > 
> > It would be nice to have a column that displays the rtable id of
> > each process when T is used. When I type "T-0" I see a list of procs
> > not in rtable 0.  But I still do not know in which one they are.
> That's certainly possible, but we need to pick a column which is not
> only suitable to omit but also wide enough to fit "RTABLE" as
> description, I'd say.
> 
> Are you OK with the diff as is?  We can take care of the rest as a
> separate diff.

sure! ok remi@



Re: top: filter by routing table

2020-08-23 Thread Klemens Nanni
On Sun, Aug 23, 2020 at 10:39:21PM +0200, Remi Locherer wrote:
> I like the feature and it works as advertised.
> 
> It would be nice to have a column that displays the rtable id of
> each process when T is used. When I type "T-0" I see a list of procs
> not in rtable 0.  But I still do not know in which one they are.
That's certainly possible, but we need to pick a column which is not
only suitable to omit but also wide enough to fit "RTABLE" as
description, I'd say.

Are you OK with the diff as is?  We can take care of the rest as a
separate diff.



Re: top: filter by routing table

2020-08-23 Thread Remi Locherer
On Sat, Aug 22, 2020 at 05:20:56PM -0600, Todd C. Miller wrote:
> This looks good to me but I've refrained from commenting simply
> because I don't use rtables at all myself.  Can we get some feedback
> from people who actually use rtables?
> 
>  - todd
> 

I like the feature and it works as advertised.

It would be nice to have a column that displays the rtable id of
each process when T is used. When I type "T-0" I see a list of procs
not in rtable 0.  But I still do not know in which one they are.

Remi



Re: awk: implement mktime() function

2020-08-23 Thread Jason McIntyre
On Sun, Aug 23, 2020 at 10:25:37AM -0600, Todd C. Miller wrote:
> Both gawk and mawk include mktime() in their time functions.
> We have strftime() and systime() but no mktime().  The following
> diff makes our awk more compatible with other implementations.
> 
>  - todd
> 

hi/

i think the text is fine. could you add mktime to the list of
extensions in STANDARDS too?

jmc

> Index: usr.bin/awk/awk.1
> ===
> RCS file: /cvs/src/usr.bin/awk/awk.1,v
> retrieving revision 1.56
> diff -u -p -u -r1.56 awk.1
> --- usr.bin/awk/awk.1 24 Jul 2020 01:57:06 -  1.56
> +++ usr.bin/awk/awk.1 23 Aug 2020 16:22:46 -
> @@ -684,6 +684,41 @@ This version of
>  provides the following functions for obtaining and formatting time
>  stamps.
>  .Bl -tag -width indent
> +.It Fn mktime datespec
> +Converts
> +.Fa datespec
> +into a timestamp in the same form as a value returned by
> +.Fn systime .
> +The
> +.Fa datespec
> +is a string composed of six or seven numbers separated by whitespace:
> +.Bd -literal -offset indent
> + MM DD HH MM SS [DST]
> +.Ed
> +.Pp
> +The fields in
> +.Fa datespec
> +are as follows:
> +.Bl -tag -width ""
> +.It YYY
> +Year: a four-digit year, including the century.
> +.It MM
> +Month: a number from 1 to 12.
> +.It DD
> +Day: a number from 1 to 31.
> +.It HH
> +Hour: a number from 0 to 23.
> +.It MM
> +Minute: a number from 0 to 59.
> +.It SS
> +Second: a number from 0 to 60 (permitting a leap second).
> +.It DST
> +Daylight Saving Time: a positive or zero value indicates that
> +DST is or is not in effect.
> +If DST is not specified, or is negative,
> +.Fn mktime
> +will attempt to determine the correct value.
> +.El
>  .It Fn strftime "[format [, timestamp]]"
>  Formats
>  .Ar timestamp
> @@ -696,6 +731,8 @@ manual page, as well as any arbitrary te
>  The
>  .Ar timestamp
>  must be in the same form as a value returned by
> +.Fn mktime
> +and
>  .Fn systime .
>  If
>  .Ar timestamp



awk: implement mktime() function

2020-08-23 Thread Todd C . Miller
Both gawk and mawk include mktime() in their time functions.
We have strftime() and systime() but no mktime().  The following
diff makes our awk more compatible with other implementations.

 - todd

Index: usr.bin/awk/awk.1
===
RCS file: /cvs/src/usr.bin/awk/awk.1,v
retrieving revision 1.56
diff -u -p -u -r1.56 awk.1
--- usr.bin/awk/awk.1   24 Jul 2020 01:57:06 -  1.56
+++ usr.bin/awk/awk.1   23 Aug 2020 16:22:46 -
@@ -684,6 +684,41 @@ This version of
 provides the following functions for obtaining and formatting time
 stamps.
 .Bl -tag -width indent
+.It Fn mktime datespec
+Converts
+.Fa datespec
+into a timestamp in the same form as a value returned by
+.Fn systime .
+The
+.Fa datespec
+is a string composed of six or seven numbers separated by whitespace:
+.Bd -literal -offset indent
+ MM DD HH MM SS [DST]
+.Ed
+.Pp
+The fields in
+.Fa datespec
+are as follows:
+.Bl -tag -width ""
+.It YYY
+Year: a four-digit year, including the century.
+.It MM
+Month: a number from 1 to 12.
+.It DD
+Day: a number from 1 to 31.
+.It HH
+Hour: a number from 0 to 23.
+.It MM
+Minute: a number from 0 to 59.
+.It SS
+Second: a number from 0 to 60 (permitting a leap second).
+.It DST
+Daylight Saving Time: a positive or zero value indicates that
+DST is or is not in effect.
+If DST is not specified, or is negative,
+.Fn mktime
+will attempt to determine the correct value.
+.El
 .It Fn strftime "[format [, timestamp]]"
 Formats
 .Ar timestamp
@@ -696,6 +731,8 @@ manual page, as well as any arbitrary te
 The
 .Ar timestamp
 must be in the same form as a value returned by
+.Fn mktime
+and
 .Fn systime .
 If
 .Ar timestamp
Index: usr.bin/awk/awk.h
===
RCS file: /cvs/src/usr.bin/awk/awk.h,v
retrieving revision 1.26
diff -u -p -u -r1.26 awk.h
--- usr.bin/awk/awk.h   26 Jun 2020 15:57:39 -  1.26
+++ usr.bin/awk/awk.h   22 Aug 2020 21:17:49 -
@@ -160,6 +160,7 @@ extern Cell *symtabloc; /* SYMTAB */
 #define FRSHIFT20
 #define FSYSTIME   21
 #define FSTRFTIME  22
+#define FMKTIME23
 
 /* Node:  parse tree is made of nodes, with Cell's at bottom */
 
Index: usr.bin/awk/lex.c
===
RCS file: /cvs/src/usr.bin/awk/lex.c,v
retrieving revision 1.25
diff -u -p -u -r1.25 lex.c
--- usr.bin/awk/lex.c   30 Jul 2020 17:45:44 -  1.25
+++ usr.bin/awk/lex.c   22 Aug 2020 21:18:16 -
@@ -75,6 +75,7 @@ const Keyword keywords[] = {  /* keep sor
{ "log",FLOG,   BLTIN },
{ "lshift", FLSHIFT,BLTIN },
{ "match",  MATCHFCN,   MATCHFCN },
+   { "mktime", FMKTIME,BLTIN },
{ "next",   NEXT,   NEXT },
{ "nextfile",   NEXTFILE,   NEXTFILE },
{ "or", FFOR,   BLTIN },
Index: usr.bin/awk/run.c
===
RCS file: /cvs/src/usr.bin/awk/run.c,v
retrieving revision 1.67
diff -u -p -u -r1.67 run.c
--- usr.bin/awk/run.c   11 Aug 2020 16:57:05 -  1.67
+++ usr.bin/awk/run.c   22 Aug 2020 21:35:49 -
@@ -1594,7 +1594,7 @@ Cell *bltin(Node **a, int n)  /* builtin 
FILE *fp;
int status = 0;
time_t tv;
-   struct tm *tm;
+   struct tm *tm, tmbuf;
 
t = ptoi(a[0]);
x = execute(a[1]);
@@ -1748,6 +1748,26 @@ Cell *bltin(Node **a, int n) /* builtin 
u = EOF;
else
u = fflush(fp);
+   break;
+   case FMKTIME:
+   memset(, 0, sizeof(tmbuf));
+   tm = 
+   t = sscanf(getsval(x), "%d %d %d %d %d %d %d",
+   >tm_year, >tm_mon, >tm_mday, >tm_hour,
+   >tm_min, >tm_sec, >tm_isdst);
+   switch (t) {
+   case 6:
+   tmbuf.tm_isdst = -1;/* let mktime figure it out */
+   /* FALLTHROUGH */
+   case 7:
+   tm->tm_year -= 1900;
+   tm->tm_mon--;
+   u = mktime(tm);
+   break;
+   default:
+   u = -1;
+   break;
+   }
break;
case FSYSTIME:
u = time((time_t *) 0);



Re: timekeep: fixing large skews on amd64 with RDTSCP

2020-08-23 Thread Claudio Jeker
On Sun, Aug 23, 2020 at 04:06:01PM +0200, Christian Weisgerber wrote:
> Scott Cheloha:
> 
> > This "it might slow down the network stack" thing keeps coming up, and
> > yet nobody can point to (a) who expressed this concern or (b) what the
> > penalty is in practice.
> 
> It was kettenis@ who simply raised the question and asked for
> comments from the network people.
> 
> I think we should just go ahead and use rdtsc_lfence() in
> tsc_get_timecount().  It is *correct*.

I agree. For the network stack there are bigger fishes to fry befor such a
micro optimisation would even matter.

-- 
:wq Claudio



Re: timekeep: fixing large skews on amd64 with RDTSCP

2020-08-23 Thread Christian Weisgerber
Scott Cheloha:

> This "it might slow down the network stack" thing keeps coming up, and
> yet nobody can point to (a) who expressed this concern or (b) what the
> penalty is in practice.

It was kettenis@ who simply raised the question and asked for
comments from the network people.

I think we should just go ahead and use rdtsc_lfence() in
tsc_get_timecount().  It is *correct*.

-- 
Christian "naddy" Weisgerber  na...@mips.inka.de



Re: timekeep: fixing large skews on amd64 with RDTSCP

2020-08-23 Thread Mark Kettenis
> Date: Sat, 22 Aug 2020 22:05:44 -0500
> From: Scott Cheloha 
> 
> On Tue, Jul 28, 2020 at 10:02:07AM +0300, Paul Irofti wrote:
> > 
> > [...]
> > 
> > Is the issue with LFENCE slowing down the network stack settled? That was
> > the argument against it last time.
> 
> ... a month passes.  Nobody says anything.
> 
> This "it might slow down the network stack" thing keeps coming up, and
> yet nobody can point to (a) who expressed this concern or (b) what the
> penalty is in practice.
> 
> Note that the alternative is "your timecounter might not be monotonic
> between threads".  For me, that's already a dealbreaker.
> 
> But for sake of discussion let's look at some data.  For those of you
> watching from home, please follow along!  I would like to know what
> your results look like.
> 
> To start, here is a microbenchmarking program for clock_gettime(2) on
> amd64.  If you have the userspace timecounter, then
> 
>   clock_gettime(CLOCK_MONOTONIC, ...);
> 
> is a suitable surrogate for nanouptime(9), so this microbenchmark can
> actually tell us about how nanouptime(9) or nanotime(9) would be
> impacted by a comparable change in the kernel timecounter.
> 
> --
> 
> /*
>  * clock_gettime-bench.c
>  */
> #include 
> #include 
> #include 
> #include 
> #include 
> 
> static uint64_t
> rdtsc_lfence(void)
> {
>   uint32_t hi, lo;
> 
>   __asm volatile("lfence; rdtsc; lfence" : "=d" (hi), "=a" (lo));
>   return ((uint64_t)hi << 32) | lo;
> }
> 
> int
> main(int argc, char *argv[])
> {
>   struct timespec now;
>   uint64_t begin, end;
>   long long count, i;
>   const char *errstr;
> 
>   if (argc != 2) {
>   fprintf(stderr, "usage: %s count\n", getprogname());
>   return 1;
>   }
>   count = strtonum(argv[1], 1, LLONG_MAX, );
>   if (errstr != NULL)
>   errx(1, "count is %s: %s", errstr, argv[1]);
> 
>   begin = rdtsc_lfence();
>   for (i = 0; i < count; i++)
>   clock_gettime(CLOCK_MONOTONIC, );
>   end = rdtsc_lfence();
> 
>   printf("%lld\t%llu\n", count, end - begin);
> 
>   return 0;
> }
> 
> --
> 
> Now consider a benchmark of 100K clock_gettime(2) calls against the
> userspace timecounter.
> 
> $ clock_gettime-bench 10
> 10  15703664
> 
> Let's collect 10K of these benchmarks -- our samples -- atop an
> unpatched libc.  Use the shell script below.  Note that we throw out
> samples where we hit a context switch.
> 
> --
> 
> #! /bin/sh
> 
> [ $# -ne 1 ] && exit 1
> RESULTS=$1
> shift
> 
> TIME=$(mktemp) || exit 1
> TMP=$(mktemp) || exit 1
> 
> # Collect 10K samples.
> i=0
> while [ $i -lt 1 ]; do
>   # Call clock_gettime(2) 100K times.
>   /usr/bin/time -l ~/scratch/clock_gettime-bench 10 > $TMP 2> $TIME
>   # Ignore this sample if a context switch occurred.
>   if egrep -q '[1-9][0-9]* +(in)?voluntary context' $TIME; then
>   continue
>   fi
>   cat $TMP >> $RESULTS
>   i=$((i + 1))
> done
> 
> rm $TMP $TIME
> 
> --
> 
> Run it like this:
> 
> $ ksh bench.sh unpatched.out
> 
> That will take ~5-10 minutes at most.
> 
> Next, we'll patch libc to add the LFENCE to the userspace timecounter.
> 
> Index: usertc.c
> ===
> RCS file: /cvs/src/lib/libc/arch/amd64/gen/usertc.c,v
> retrieving revision 1.2
> diff -u -p -r1.2 usertc.c
> --- usertc.c  8 Jul 2020 09:17:48 -   1.2
> +++ usertc.c  22 Aug 2020 22:18:47 -
> @@ -19,10 +19,10 @@
>  #include 
>  
>  static inline u_int
> -rdtsc(void)
> +rdtsc_lfence(void)
>  {
>   uint32_t hi, lo;
> - asm volatile("rdtsc" : "=a"(lo), "=d"(hi));
> + asm volatile("lfence; rdtsc" : "=a"(lo), "=d"(hi));
>   return ((uint64_t)lo)|(((uint64_t)hi)<<32);
>  }
>  
> @@ -31,7 +31,7 @@ tc_get_timecount(struct timekeep *tk, u_
>  {
>   switch (tk->tk_user) {
>   case TC_TSC:
> - *tc = rdtsc();
> + *tc = rdtsc_lfence();
>   return 0;
>   }
>  
> --
> 
> Recompile and reinstall libc.
> 
> Then rerun the benchmark.  Be careful not to overwrite our results
> from the unpatched libc:
> 
> $ ksh bench.sh patched.out
> 
> --
> 
> Alright, now let's compare the results.  I'm not a mathemagician so I
> use ministat and trust it implicitly.  A stat jock could probably do
> this in R or with some python, but I am not that clever, so I will
> stick with ministat.
> 
> There is no ministat port for OpenBSD, but it is pretty trivial to
> clone this github repo and build it on -current:
> 
> https://github.com/thorduri/ministat
> 
> --
> 
> Okay, you have ministat?
> 
> Let's compare the results.  We want the 2nd column in the output
> (-C2).  I'm not interested in the graph (-q), given our population
> size.  We have N=1, so let's push the CI up (-c 99.5).
> 
> $ ~/repo/ministat/ministat -C2 -q -c99.5 unpatched.out patched.out
> x unpatched.out
> + patched.out
> N   Min   

Re: pf: remove ptr_array from struct pf_ruleset

2020-08-23 Thread Klemens Nanni
On Mon, Jul 20, 2020 at 05:07:03PM +0200, Klemens Nanni wrote:
> On Mon, Jul 20, 2020 at 01:14:00PM +0200, Alexandr Nedvedicky wrote:
> > I took a closer look at your change and related area.  Below is an alternate
> > way to fix the bug you've found.
> Thanks for bringing it up again, I forgot to reply earlier.
> 
> > there are few details worth to note:
> > 
> > ptr_array matters to main ruleset only, because it is the only 
> > ruleset covered by MD5 sum for pfsync.
> Correct, as is directly seen by the only caller pf_commit_rules():
> 
> 819 /* Calculate checksum for the main ruleset */
> 820 if (rs == _main_ruleset) {
> 821 error = pf_setup_pfsync_matching(rs);
> 822 if (error != 0)
> 823 return (error);
> 824 }
> 
> > it looks like we should also recompute the MD5sum if we remove the rule
> > from the main ruleset 
> Yes, but that is a separate bug, regardless of how we handle checksum
> calculation and rule lookups.
> 
> My plan was to tackle this next, but reusing existing code instead:
> 
> After my initial diff, pf_setup_pfsync_matching() would merely update
> the checksum, so we could rename the function to something more
> appropiate and call it from pf_purge_rule() instead of duplicating it
> there.
Here is a new diff that amends the previous with the following:

- rename pf_setup_pfsync_matching() to pf_calc_chksum()
- make it void and simplify callees
- rehash in pf_purge_rule() to account for removed `once' rules

> > my diff introduces a new member of pr_ruleset structure, which keeps the
> > array size so we can pass it to free(9f) later.
> I persued this first (as done in other free() size diffs) but eventually
> opted for the removal of ptr_array for the sake of simplicity.
> 
> > I have no pfsync deployment readily available for testing. Will be glad
> > if you can give my diff a try.
> Works as expected in both regards: pfsync picks the right rule and the
> checksum updates after rules expired.
I have tested this with my usual setup where states keep syncing and
`once' rules disappear from the main ruleset on the machine I hit,
compared to -CURRENT however watching `pfctl -vsi | grep Check' no shows
the checksum updating accordingly.

Admins using `once' rules are hopefully aware of this caveat already,
but now the checksum actually indicates out-of-sync rulesets and does
no longer present the same checksum for different rulesets.

Feedback? OK?


Index: if_pfsync.c
===
RCS file: /cvs/src/sys/net/if_pfsync.c,v
retrieving revision 1.277
diff -u -p -r1.277 if_pfsync.c
--- if_pfsync.c 21 Aug 2020 22:59:27 -  1.277
+++ if_pfsync.c 23 Aug 2020 12:09:39 -
@@ -500,6 +500,7 @@ pfsync_state_import(struct pfsync_state 
struct pfi_kif  *kif;
int pool_flags;
int error = ENOMEM;
+   int n = 0;
 
if (sp->creatorid == 0) {
DPFPRINTF(LOG_NOTICE, "pfsync_state_import: "
@@ -524,9 +525,11 @@ pfsync_state_import(struct pfsync_state 
 */
if (sp->rule != htonl(-1) && sp->anchor == htonl(-1) &&
(flags & (PFSYNC_SI_IOCTL | PFSYNC_SI_CKSUM)) && ntohl(sp->rule) <
-   pf_main_ruleset.rules.active.rcount)
-   r = pf_main_ruleset.rules.active.ptr_array[ntohl(sp->rule)];
-   else
+   pf_main_ruleset.rules.active.rcount) {
+   TAILQ_FOREACH(r, pf_main_ruleset.rules.active.ptr, entries)
+   if (ntohl(sp->rule) == n++)
+   break;
+   } else
r = _default_rule;
 
if ((r->max_states && r->states_cur >= r->max_states))
Index: pfvar.h
===
RCS file: /cvs/src/sys/net/pfvar.h,v
retrieving revision 1.495
diff -u -p -r1.495 pfvar.h
--- pfvar.h 28 Jul 2020 16:47:42 -  1.495
+++ pfvar.h 23 Aug 2020 12:09:39 -
@@ -924,7 +924,6 @@ struct pf_ruleset {
struct pf_rulequeue  queues[2];
struct {
struct pf_rulequeue *ptr;
-   struct pf_rule  **ptr_array;
u_int32_trcount;
u_int32_tticket;
int  open;
Index: pf_ioctl.c
===
RCS file: /cvs/src/sys/net/pf_ioctl.c,v
retrieving revision 1.354
diff -u -p -r1.354 pf_ioctl.c
--- pf_ioctl.c  21 Jul 2020 14:13:17 -  1.354
+++ pf_ioctl.c  23 Aug 2020 13:50:53 -
@@ -97,7 +97,7 @@ intpf_rollback_rules(u_int32_t, char
 voidpf_remove_queues(void);
 int pf_commit_queues(void);
 voidpf_free_queues(struct pf_queuehead *);
-int 

Re: video -c: showing auto white balance temperature

2020-08-23 Thread Marcus Glocker
On Sun, 23 Aug 2020 08:59:10 +0100
Laurence Tratt  wrote:

> On Sun, Aug 23, 2020 at 09:17:57AM +0200, Marcus Glocker wrote:
> 
> Hello Marcus,
> 
> > Sorry for the delay - I'm back to business mode and need to take
> > care about naive project managers with crazy requirements the most
> > of the day  
> 
> Rather you than me :)
> 
> > See inline some (nitpicking) feedback and adapted diff.  Ok for
> > you?  
> 
> This is fine except:
> 
> > @@ -1684,7 +1705,7 @@ setup(struct video *vid)
> >  * after the video stream has been started since some cams
> > only
> >  * process this control while the video stream is on.
> >  */
> > -   dev_set_ctrl_auto_white_balance(vid, 0, 1);
> > +   dev_set_ctrl_auto(vid, V4L2_CID_AUTO_WHITE_BALANCE, 0, 1);
> >  
> > if (vid->mode & M_OUT_XV)
> > net_wm_supported(vid);  
> 
> This segfaults because we're passing the wrong constant in. This chunk
> should be:
> 
> @@ -1684,7 +1707,7 @@ setup(struct video *vid)
>* after the video stream has been started since some cams
> only
>* process this control while the video stream is on.
>*/
> - dev_set_ctrl_auto_white_balance(vid, 0, 1);
> + dev_set_ctrl_auto(vid, CTRL_WHITE_BALANCE_TEMPERATURE, 0, 1);
>  
>   if (vid->mode & M_OUT_XV)
>   net_wm_supported(vid);

Oh dear, thanks for spotting :-)

> 
> Laurie
> 



Re: timekeep: fixing large skews on amd64 with RDTSCP

2020-08-23 Thread Paul Irofti
Hi,

As I see this is addressed to me I will reply. I am in the mountains until the 
end of the month with poor internet connection.

First, it was not me that stated that the network stack is blocking the change. 
Somebody said that during the initial development of the user timeclock when I 
proposed the same change. 

I love that you actually provided data to analyze the issue! Statements like 
that should always come with data to support them like you did now.

Just to make it clear, I am OK with this, but I am not a network guy so I don't 
know the real issue (if there was any).

Paul


În 23 august 2020 06:05:44 EEST, Scott Cheloha  a scris:
>On Tue, Jul 28, 2020 at 10:02:07AM +0300, Paul Irofti wrote:
>> 
>> [...]
>> 
>> Is the issue with LFENCE slowing down the network stack settled? That
>was
>> the argument against it last time.
>
>... a month passes.  Nobody says anything.
>
>This "it might slow down the network stack" thing keeps coming up, and
>yet nobody can point to (a) who expressed this concern or (b) what the
>penalty is in practice.
>
>Note that the alternative is "your timecounter might not be monotonic
>between threads".  For me, that's already a dealbreaker.
>
>But for sake of discussion let's look at some data.  For those of you
>watching from home, please follow along!  I would like to know what
>your results look like.
>
>To start, here is a microbenchmarking program for clock_gettime(2) on
>amd64.  If you have the userspace timecounter, then
>
>   clock_gettime(CLOCK_MONOTONIC, ...);
>
>is a suitable surrogate for nanouptime(9), so this microbenchmark can
>actually tell us about how nanouptime(9) or nanotime(9) would be
>impacted by a comparable change in the kernel timecounter.
>
>--
>
>/*
> * clock_gettime-bench.c
> */
>#include 
>#include 
>#include 
>#include 
>#include 
>
>static uint64_t
>rdtsc_lfence(void)
>{
>   uint32_t hi, lo;
>
>   __asm volatile("lfence; rdtsc; lfence" : "=d" (hi), "=a" (lo));
>   return ((uint64_t)hi << 32) | lo;
>}
>
>int
>main(int argc, char *argv[])
>{
>   struct timespec now;
>   uint64_t begin, end;
>   long long count, i;
>   const char *errstr;
>
>   if (argc != 2) {
>   fprintf(stderr, "usage: %s count\n", getprogname());
>   return 1;
>   }
>   count = strtonum(argv[1], 1, LLONG_MAX, );
>   if (errstr != NULL)
>   errx(1, "count is %s: %s", errstr, argv[1]);
>
>   begin = rdtsc_lfence();
>   for (i = 0; i < count; i++)
>   clock_gettime(CLOCK_MONOTONIC, );
>   end = rdtsc_lfence();
>
>   printf("%lld\t%llu\n", count, end - begin);
>
>   return 0;
>}
>
>--
>
>Now consider a benchmark of 100K clock_gettime(2) calls against the
>userspace timecounter.
>
>$ clock_gettime-bench 10
>10  15703664
>
>Let's collect 10K of these benchmarks -- our samples -- atop an
>unpatched libc.  Use the shell script below.  Note that we throw out
>samples where we hit a context switch.
>
>--
>
>#! /bin/sh
>
>[ $# -ne 1 ] && exit 1
>RESULTS=$1
>shift
>
>TIME=$(mktemp) || exit 1
>TMP=$(mktemp) || exit 1
>
># Collect 10K samples.
>i=0
>while [ $i -lt 1 ]; do
>   # Call clock_gettime(2) 100K times.
>   /usr/bin/time -l ~/scratch/clock_gettime-bench 10 > $TMP 2> $TIME
>   # Ignore this sample if a context switch occurred.
>   if egrep -q '[1-9][0-9]* +(in)?voluntary context' $TIME; then
>   continue
>   fi
>   cat $TMP >> $RESULTS
>   i=$((i + 1))
>done
>
>rm $TMP $TIME
>
>--
>
>Run it like this:
>
>$ ksh bench.sh unpatched.out
>
>That will take ~5-10 minutes at most.
>
>Next, we'll patch libc to add the LFENCE to the userspace timecounter.
>
>Index: usertc.c
>===
>RCS file: /cvs/src/lib/libc/arch/amd64/gen/usertc.c,v
>retrieving revision 1.2
>diff -u -p -r1.2 usertc.c
>--- usertc.c   8 Jul 2020 09:17:48 -   1.2
>+++ usertc.c   22 Aug 2020 22:18:47 -
>@@ -19,10 +19,10 @@
> #include 
> 
> static inline u_int
>-rdtsc(void)
>+rdtsc_lfence(void)
> {
>   uint32_t hi, lo;
>-  asm volatile("rdtsc" : "=a"(lo), "=d"(hi));
>+  asm volatile("lfence; rdtsc" : "=a"(lo), "=d"(hi));
>   return ((uint64_t)lo)|(((uint64_t)hi)<<32);
> }
> 
>@@ -31,7 +31,7 @@ tc_get_timecount(struct timekeep *tk, u_
> {
>   switch (tk->tk_user) {
>   case TC_TSC:
>-  *tc = rdtsc();
>+  *tc = rdtsc_lfence();
>   return 0;
>   }
> 
>--
>
>Recompile and reinstall libc.
>
>Then rerun the benchmark.  Be careful not to overwrite our results
>from the unpatched libc:
>
>$ ksh bench.sh patched.out
>
>--
>
>Alright, now let's compare the results.  I'm not a mathemagician so I
>use ministat and trust it implicitly.  A stat jock could probably do
>this in R or with some python, but I am not that clever, so I will
>stick with ministat.
>
>There is no ministat port for OpenBSD, but it is pretty trivial to

Re: video -c: showing auto white balance temperature

2020-08-23 Thread Laurence Tratt
On Sun, Aug 23, 2020 at 09:17:57AM +0200, Marcus Glocker wrote:

Hello Marcus,

> Sorry for the delay - I'm back to business mode and need to take care
> about naive project managers with crazy requirements the most of the day

Rather you than me :)

> See inline some (nitpicking) feedback and adapted diff.  Ok for you?

This is fine except:

> @@ -1684,7 +1705,7 @@ setup(struct video *vid)
>* after the video stream has been started since some cams only
>* process this control while the video stream is on.
>*/
> - dev_set_ctrl_auto_white_balance(vid, 0, 1);
> + dev_set_ctrl_auto(vid, V4L2_CID_AUTO_WHITE_BALANCE, 0, 1);
>  
>   if (vid->mode & M_OUT_XV)
>   net_wm_supported(vid);

This segfaults because we're passing the wrong constant in. This chunk
should be:

@@ -1684,7 +1707,7 @@ setup(struct video *vid)
 * after the video stream has been started since some cams only
 * process this control while the video stream is on.
 */
-   dev_set_ctrl_auto_white_balance(vid, 0, 1);
+   dev_set_ctrl_auto(vid, CTRL_WHITE_BALANCE_TEMPERATURE, 0, 1);
 
if (vid->mode & M_OUT_XV)
net_wm_supported(vid);


Laurie



Re: video -c: showing auto white balance temperature

2020-08-23 Thread Marcus Glocker
Hello Laurie,

Sorry for the delay - I'm back to business mode and need to take care
about naive project managers with crazy requirements the most of the day
...

See inline some (nitpicking) feedback and adapted diff.  Ok for you?

Thanks,
Marcus

On Sun, 9 Aug 2020 13:09:34 +0100
Laurence Tratt  wrote:

> On Sat, Aug 08, 2020 at 11:29:41PM +0200, Marcus Glocker wrote:
> 
> Hello Marcus,
> 
>  One other thing has occurred to me -- but can be done in a
>  future patch -- is that we probably want to be able to do:
> 
>    $ video white_balance_temperature=auto  
> 
> Please find attached a patch which does this. It also makes setting
> auto controls more generic for if/when we gain more. If you set (say)
> "saturation=auto" we warn, but don't error. So for example:
> 
>   $ video -d
>   $ video white_balance_temperature=3000
>   white_balance_temperature: auto -> 3000
>   $ video white_balance_temperature=auto
>   white_balance_temperature: 3000 -> auto
>   $ video white_balance_temperature=auto
>   white_balance_temperature: auto -> auto
>   $ video white_balance_temperature=3000 saturation=auto
>   white_balance_temperature: auto -> 3000
>   saturation: no automatic control found
> 
> 
> Laurie
> 
> 
> Index: video.c
> ===
> RCS file: /cvs/xenocara/app/video/video.c,v
> retrieving revision 1.35
> diff -u -r1.35 video.c
> --- video.c   9 Aug 2020 06:51:04 -   1.35
> +++ video.c   9 Aug 2020 11:58:35 -
> @@ -219,8 +219,8 @@
>  int dev_init(struct video *);
>  int dev_set_ctrl_abs(struct video *vid, int, int);
>  void dev_set_ctrl_rel(struct video *, int, int);
> -void dev_set_ctrl_auto_white_balance(struct video *, int, int);
>  int dev_get_ctrl_auto(struct video *, int);
> +void dev_set_ctrl_auto(struct video *, int, int, int);
>  void dev_reset_ctrls(struct video *);
>  
>  int parse_ctrl(struct video *, int, char **);
> @@ -1037,7 +1037,7 @@
>* The spec requires auto-white balance to be off
> before
>* we can set the white balance temperature.
>*/
> - dev_set_ctrl_auto_white_balance(vid, 0, 0);
> + dev_set_ctrl_auto(vid, ctrl, 0, 0);
>   }
>   if (val > ctrls[ctrl].max)
>   val = ctrls[ctrl].max;
> @@ -1082,12 +1082,15 @@
>  }
>  
>  void
> -dev_set_ctrl_auto_white_balance(struct video *vid, int value, int
> reset) +dev_set_ctrl_auto(struct video *vid, int ctrl, int value, int
> reset) {
>   struct dev *d = >dev;
>   struct v4l2_control control;
>  
> - control.id = V4L2_CID_AUTO_WHITE_BALANCE;
> + if (!ctrls[ctrl].id_auto)
> + return;
> +
> + control.id = ctrls[ctrl].id_auto;
>   if (ioctl(d->fd, VIDIOC_G_CTRL, ) != 0) {
>   warn("VIDIOC_G_CTRL");
>   return;
> @@ -1134,8 +1137,7 @@
>   if (!ctrls[i].supported)
>   continue;
>   dev_set_ctrl_abs(vid, i, ctrls[i].def);
> - if (i == CTRL_WHITE_BALANCE_TEMPERATURE)
> - dev_set_ctrl_auto_white_balance(vid, 1, 0);
> + dev_set_ctrl_auto(vid, i, 1, 0);
>   }
>  }
>  
> @@ -1303,7 +1305,7 @@
>   }
>   }
>   if (i == CTRL_LAST)
> - warnx("%s: unknown control",
> *argv);   
> + warnx("%s: unknown control", *argv);
>   continue;
>   }

Lets commit this separately since it's not related to the main change.

> @@ -1315,21 +1317,34 @@
>   warnx("%s: no value", *argv);
>   break;
>   }
> - val_new = strtonum(p, -32768, 32768,
> );
> - if (errstr != NULL) {
> - warnx("%s: %s", *argv, errstr);
> - return 0;
> - }
> - val_old = ctrls[i].cur;
>   auto_old = dev_get_ctrl_auto(vid, i);
> - if (dev_set_ctrl_abs(vid, i, val_new) == 0) {
> - if (auto_old) {
> - fprintf(stderr, "%s: auto ->
> %d\n",
> - ctrls[i].name,
> ctrls[i].cur);
> + val_old = ctrls[i].cur;
> + if (strcmp(p, "auto") == 0) {
> + if (ctrls[i].id_auto == 0) {
> + fprintf(stderr, "%s: no
> automatic control found\n",
> + ctrls[i].name);
> + } else if (!auto_old) {
> + fprintf(stderr, "%s: %d ->
> auto\n",
> + ctrls[i].name,
> val_old);
> + dev_set_ctrl_auto(vid, i, 1,
> 0); } else {
> -