Re: Kill NFS-only kqueue poller thread

2020-05-31 Thread Martin Pieuchot
On 30/05/20(Sat) 09:25, Theo de Raadt wrote:
> [...] 
> What does this have to do with threads?  That is an implimentation detail.

This implementation detail is specific to NFS, no other FS do anything
like that.  So I'm questioning whether calling kthread_create(9) inside a
kqueue(2) handler, which is called from kevent(2) and could be called
from poll(2) is the wanted behavior.

> This is about behaviour.  If you open a FFS directory, do you eventually
> get updates?

FFS and all the other FS have the same behavior when it comes to poll(2)
and kqueue(2).  They don't check for I/O and return true.  NFS tries to
be clever for kqueue(2)-only, I'm questioning the added value of that
cleverness.

> Should NFS not try to act the same?

That's exactly what my diff is doing: bring it in sync with other FS.



Re: userland clock_gettime proof of concept

2020-05-31 Thread Paul Irofti

On 2020-05-31 07:28, Theo de Raadt wrote:

PowerPC Mac OS X had a userland gettimeofday(2) using the cpu's
timebase and a "common page" from the kernel.  Their common page also
had executable code for gettimeofday, memcpy, pthread_self, and a few
other functions.


We are desperately avoiding the model where such code is exported.
It becomes a target.


Indeed.

Are we settled on timekeep as a name? Do you want to rename it to 
something else? Make it more generic?




Re: Kill NFS-only kqueue poller thread

2020-05-31 Thread Martin Pieuchot
On 30/05/20(Sat) 15:49, Visa Hankala wrote:
> [...] 
> Local filesystems can observe changes at the source, which makes polling
> unnecessary. NFS clients do not have that benefit. The NFSv3 protocol
> lacks a mechanism to notify clients of changes.
> 
> The NFS polling mechanism is in use for example when running tail -f
> on a remote file. Directory monitoring can be utilized for example
> by a directory browser application or a distributed batch processing
> system.
> 
> > > It is true that the code is not perfect, but having even basic and
> > > best-effort functionality with kevent(2) and NFS is better than nothing
> > > in my opinion. The kqueue subsystem should not dumb down for emulating
> > > poll(2) and select(2).
> > 
> > In that case we've to accept or work-around with the fact that a thread
> > is allocated, an operation that can fail, in the kqfilter().  Do you have
> > an opinion on that?  What should we do for other FSs is uniformity wanted?
> 
> Can't the emulation of poll(2) and select(2) operate without the poller?
> These system calls are supposed to return true for regular files for
> reading and writing. No polling of remote state is needed for that.
> 
> The read behaviour of poll(2) and select(2) differs from kevent(2)'s
> EVFILT_READ. Consequently, there has to be an indicator that makes
> f_event work in poll(2)/select(2) mode. I guess the same indicator can
> control nfs_kqfilter().

So are you saying that poll(2) is broken on NFS and suggest that I extend
the kqfilter to match this behavior?



Re: [patch] azalia: Intel 300 Series HD Audio

2020-05-31 Thread Benjamin Baier
On Fri, 29 May 2020 11:25:44 +0200
Bruno Flueckiger  wrote:

> Hi,
> 
> My brand new laptop HP EliteBook 850 G6 comes with an Intel 300 Series
> HD Audio device rev 0x11. The device shows up as not configured in the
> dmesg. The PCI config space of the device identifies its subclass as
> PCI_SUBCLASS_MULTIMEDIA_AUDIO instead of PCI_SUBCLASS_MULTIMEDIA_HDAUDIO
> 
> The patch below makes the device work just fine on my laptop.
> 
> Cheers,
> Bruno
> 
> Index: sys/dev/pci/azalia.c
> ===
> RCS file: /cvs/src/sys/dev/pci/azalia.c,v
> retrieving revision 1.255
> diff -u -p -r1.255 azalia.c
> --- sys/dev/pci/azalia.c  18 Apr 2020 21:55:56 -  1.255
> +++ sys/dev/pci/azalia.c  28 May 2020 13:48:10 -
> @@ -481,7 +481,8 @@ azalia_pci_match(struct device *parent,
> 
>   pa = aux;
>   if (PCI_CLASS(pa->pa_class) == PCI_CLASS_MULTIMEDIA
> - && PCI_SUBCLASS(pa->pa_class) == PCI_SUBCLASS_MULTIMEDIA_HDAUDIO)
> + && (PCI_SUBCLASS(pa->pa_class) == PCI_SUBCLASS_MULTIMEDIA_HDAUDIO
> + || PCI_SUBCLASS(pa->pa_class) == PCI_SUBCLASS_MULTIMEDIA_AUDIO))
>   return 1;
>   return 0;
>  }
> 

Hi.

Does your Laptop run with the latest BIOS? There was one released on May 11th.
https://support.hp.com/de-de/drivers/selfservice/swdetails/hp-elitebook-850-g6-notebook-pc/26609805/swItemId/ob-251060-1
The release notes state: Fixes an issue where the audio on the system stops 
functioning after the Intel Active Management Technology (AMT) option is 
disabled.

The azalia patch is in but I would prefer HP to fix their BIOS instead.

Greetings Ben



NFS kqueue handlers & poll(2)/select(2) compatibility

2020-05-31 Thread Martin Pieuchot
NFS poll(2)/select(2) and kqueue(2) behaviors are incoherent.  Diff
below uses the kernel-only NOTE_IMM hint to make the kqueue handlers
behave like the current poll handler: the poller is bypassed.

The new EVFILT_WRITE handler doesn't check for NOTE_IMM because it is
unlikely to introduce regression.

Is this a preferred approach?  Ok?

Index: nfs/nfs_kq.c
===
RCS file: /cvs/src/sys/nfs/nfs_kq.c,v
retrieving revision 1.30
diff -u -p -r1.30 nfs_kq.c
--- nfs/nfs_kq.c7 Apr 2020 13:27:52 -   1.30
+++ nfs/nfs_kq.c31 May 2020 08:43:36 -
@@ -50,9 +50,12 @@
 #include 
 
 void   nfs_kqpoll(void *);
+intnfs_kqwatch(struct vnode *);
+void   nfs_kqunwatch(struct vnode *);
 
 void   filt_nfsdetach(struct knote *);
 intfilt_nfsread(struct knote *, long);
+intfilt_nfswrite(struct knote *, long);
 intfilt_nfsvnode(struct knote *, long);
 
 struct kevq {
@@ -182,11 +185,19 @@ void
 filt_nfsdetach(struct knote *kn)
 {
struct vnode *vp = (struct vnode *)kn->kn_hook;
-   struct kevq *ke;
 
klist_remove(>v_selectinfo.si_note, kn);
 
/* Remove the vnode from watch list */
+   if ((kn->kn_sfflags & NOTE_IMM) == 0)
+   nfs_kqunwatch(vp);
+}
+
+void
+nfs_kqunwatch(struct vnode *vp)
+{
+   struct kevq *ke;
+
rw_enter_write(_lock);
SLIST_FOREACH(ke, , kev_link) {
if (ke->vp == vp) {
@@ -238,6 +249,22 @@ filt_nfsread(struct knote *kn, long hint
 }
 
 int
+filt_nfswrite(struct knote *kn, long hint)
+{
+   /*
+* filesystem is gone, so set the EOF flag and schedule
+* the knote for deletion.
+*/
+   if (hint == NOTE_REVOKE) {
+   kn->kn_flags |= (EV_EOF | EV_ONESHOT);
+   return (1);
+   }
+
+   kn->kn_data = 0;
+   return (1);
+}
+
+int
 filt_nfsvnode(struct knote *kn, long hint)
 {
if (kn->kn_sfflags & hint)
@@ -256,6 +283,13 @@ static const struct filterops nfsread_fi
.f_event= filt_nfsread,
 };
 
+const struct filterops nfswrite_filtops = {
+   .f_flags= FILTEROP_ISFD,
+   .f_attach   = NULL,
+   .f_detach   = filt_nfsdetach,
+   .f_event= filt_nfswrite,
+};
+
 static const struct filterops nfsvnode_filtops = {
.f_flags= FILTEROP_ISFD,
.f_attach   = NULL,
@@ -269,10 +303,6 @@ nfs_kqfilter(void *v)
struct vop_kqfilter_args *ap = v;
struct vnode *vp;
struct knote *kn;
-   struct kevq *ke;
-   int error = 0;
-   struct vattr attr;
-   struct proc *p = curproc;   /* XXX */
 
vp = ap->a_vp;
kn = ap->a_kn;
@@ -286,6 +316,9 @@ nfs_kqfilter(void *v)
case EVFILT_READ:
kn->kn_fop = _filtops;
break;
+   case EVFILT_WRITE:
+   kn->kn_fop = _filtops;
+   break;
case EVFILT_VNODE:
kn->kn_fop = _filtops;
break;
@@ -298,7 +331,27 @@ nfs_kqfilter(void *v)
/*
 * Put the vnode to watched list.
 */
-   
+   if ((kn->kn_sfflags & NOTE_IMM) == 0) {
+   int error;
+
+   error = nfs_kqwatch(vp);
+   if (error)
+   return (error);
+   }
+
+   klist_insert(>v_selectinfo.si_note, kn);
+
+   return (0);
+}
+
+int
+nfs_kqwatch(struct vnode *vp)
+{
+   struct proc *p = curproc;   /* XXX */
+   struct vattr attr;
+   struct kevq *ke;
+   int error = 0;
+
/*
 * Fetch current attributes. It's only needed when the vnode
 * is not watched yet, but we need to do this without lock
@@ -338,8 +391,6 @@ nfs_kqfilter(void *v)
 
/* kick the poller */
wakeup(pnfskq);
-
-   klist_insert(>v_selectinfo.si_note, kn);
 
 out:
rw_exit_write(_lock);



Re: symmetric toeplitz hashing

2020-05-31 Thread Theo Buehler
On Fri, May 29, 2020 at 02:41:07PM +1000, David Gwynne wrote:
> This is another bit of the puzzle for supporting multiple rx rings
> and receive side scaling (RSS) on nics. It borrows heavily from
> DragonflyBSD, but I've made some tweaks on the way.
> 
> For background on the dfly side, I recommend having a look at
> https://leaf.dragonflybsd.org/~sephe/AsiaBSDCon%20-%20Dfly.pdf.
> 
> From my point of view, the interesting thing is that they came up
> with a way to use Toeplitz hashing so the kernel AND network
> interfaces hash packets so packets in both directions onto the same
> bucket. The other interesting thing is that the optimised the hash
> calculation by building a cache of all the intermediate results
> possible for each input byte. Their hash calculation is simply
> xoring these intermediate reults together.
> 
> I've made some tweaks compared to dfly for how the caching is
> calculated and used, so it's not an exactly 1:1 port of the dfly
> code. If anyone is interested in the tweaks, let me know.
> 
> So this diff adds an API for the kernel to use for calculating a
> hash for ip addresses and ports, and adds a function for network
> drivers to call that gives them a key to use with RSS. If all drivers
> use the same key, then the same flows should be steered to the same
> place when they enter the network stack regardless of which hardware
> they came in on.
> 
> I've tested it with vmx(4) and some quick and dirty hacks to the
> network stack (and with some magical observability), and can see
> things like tcpbench push packets onto the same numbered ifq/txring
> that the "nic" picks for the rxring and therefore ifiq into the
> stack. We're going to try it on some more drivers soon.
> 
> The way this is set up now, if a nic driver wants to do RSS, you
> add stoeplitz as a dependency in the kernel config file, which
> causes this code to be included in the build.
> 
> There's some discussion to be had about the best way to integrate
> this on the IP stack side, but that is about where this API is
> called from, not the internals of it per se.
> 
> Thoughts? ok?

It's a nice construction. Some tiny nits and one comment/question below.

> 
> Index: share/man/man9/stoeplitz_to_key.9
> ===
> RCS file: share/man/man9/stoeplitz_to_key.9
> diff -N share/man/man9/stoeplitz_to_key.9
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++ share/man/man9/stoeplitz_to_key.9 29 May 2020 04:01:26 -
> @@ -0,0 +1,126 @@
> +.\" $OpenBSD$
> +.\"
> +.\" Copyright (c) 2020 David Gwynne 
> +.\"
> +.\" Permission to use, copy, modify, and distribute this software for any
> +.\" purpose with or without fee is hereby granted, provided that the above
> +.\" copyright notice and this permission notice appear in all copies.
> +.\"
> +.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
> +.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
> +.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
> +.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
> +.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
> +.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
> +.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
> +.\"
> +.Dd $Mdocdate: May 29 2020 $
> +.Dt STOEPLITZ_TO_KEY 9
> +.Os
> +.Sh NAME
> +.Nm stoeplitz_to_key ,
> +.Nm stoeplitz_hash_ip4 ,
> +.Nm stoeplitz_hash_ip4port ,
> +.Nm stoeplitz_hash_ip6 ,
> +.Nm stoeplitz_hash_ip4port
> +.Nd Symmetric Toeplitz Hash API
> +.Sh SYNOPSIS
> +.In net/toeplitz.h
> +.Ft void
> +.Fn stoeplitz_to_key "uint8_t *key" "size_t keylen"
> +.Ft uint16_t
> +.Fo stoeplitz_hash_ip4
> +.Fa "uint32_t srcaddr"
> +.Fa "uint32_t dstaddr"
> +.Fc
> +.Ft uint16_t
> +.Fo stoeplitz_hash_ip4port
> +.Fa "uint32_t srcaddr"
> +.Fa "uint32_t dstaddr"
> +.Fa "uint16_t srcport"
> +.Fa "uint16_t dstport"
> +.Fc
> +.Ft uint16_t
> +.Fo stoeplitz_hash_ip6
> +.Fa "const struct in6_addr *srcaddr"
> +.Fa "const struct in6_addr *dstaddr"
> +.Fc
> +.Ft uint16_t
> +.Fo stoeplitz_hash_ip6port
> +.Fa "const struct in6_addr *srcaddr"
> +.Fa "const struct in6_addr *dstaddr"
> +.Fa "uint16_t srcport"
> +.Fa "uint16_t dstport"
> +.Fc
> +.Sh DESCRIPTION
> +The Toeplitz hash algorithm is commonly used by network interface
> +controllers to to generate a short hash based on the value of fields
> +in network packet headers.
> +.\" mention RSS?
> +The resulting hash value can be used as a flow identifier, which
> +in turn can be used to consistently select a context for processing
> +packets using those fields.
> +Traditionally, the Toeplitz hash produces different results depending
> +on the order of inputs, ie, adding port 80 then 1234 as inputs would
> +produce a different result to hashing port 1234 then 80.
> +.Pp
> +The symmetric Toeplitz API uses a key selected to generate the same
> +hash result regardless of the 

Re: Kill NFS-only kqueue poller thread

2020-05-31 Thread Visa Hankala
On Sun, May 31, 2020 at 09:40:47AM +0200, Martin Pieuchot wrote:
> On 30/05/20(Sat) 15:49, Visa Hankala wrote:
> > [...] 
> > Local filesystems can observe changes at the source, which makes polling
> > unnecessary. NFS clients do not have that benefit. The NFSv3 protocol
> > lacks a mechanism to notify clients of changes.
> > 
> > The NFS polling mechanism is in use for example when running tail -f
> > on a remote file. Directory monitoring can be utilized for example
> > by a directory browser application or a distributed batch processing
> > system.
> > 
> > > > It is true that the code is not perfect, but having even basic and
> > > > best-effort functionality with kevent(2) and NFS is better than nothing
> > > > in my opinion. The kqueue subsystem should not dumb down for emulating
> > > > poll(2) and select(2).
> > > 
> > > In that case we've to accept or work-around with the fact that a thread
> > > is allocated, an operation that can fail, in the kqfilter().  Do you have
> > > an opinion on that?  What should we do for other FSs is uniformity wanted?
> > 
> > Can't the emulation of poll(2) and select(2) operate without the poller?
> > These system calls are supposed to return true for regular files for
> > reading and writing. No polling of remote state is needed for that.
> > 
> > The read behaviour of poll(2) and select(2) differs from kevent(2)'s
> > EVFILT_READ. Consequently, there has to be an indicator that makes
> > f_event work in poll(2)/select(2) mode. I guess the same indicator can
> > control nfs_kqfilter().
> 
> So are you saying that poll(2) is broken on NFS and suggest that I extend
> the kqfilter to match this behavior?

I am not saying that. I am saying that poll(2) and kevent(2) have
different semantics with regular files. Except that the write filter
is missing at the moment, the current NFS poll(2) and kevent(2)
behaviour looks about similar to the other filesystems.

poll(2) returns true immediately, whereas kevent(2) EVFILT_READ returns
true once the file pointer is not at the end of file (unless NOTE_EOF
is set).

For the semantics of poll(2), the NFS poller is pointless. However,
the poller is necessary for approximating kevent(2) semantics on NFS.



Re: sparc64 boot issue on qemu

2020-05-31 Thread Mark Cave-Ayland
On 30/05/2020 00:19, Jason A. Donenfeld wrote:

> Note that you need to run this with `-nographic`, because the kernel
> crashes when trying to use vgafb on sparc64/qemu. I've witnessed two
> varieties crashes:
> 
> - https://data.zx2c4.com/openbsd-6.7-sparc64-vga-panic-miniroot67.png
> This happens when booting up miniroot67.fs
> 
> - https://data.zx2c4.com/openbsd-6.7-sparc64-vga-panic-after-installation.png
> This happens after installation openbsd onto disk properly, and then
> booting up into it.
> 
> Passing `-nographic` prevents these from happening, since vgafb doesn't
> bind to anything.
> 
> I don't have a bsd.gdb in order to addr2line this, but if the miniroot
> panic is related to the normal panic, and we then assume alignment
> issues in fb_get_console_metrics, then I wonder if the below patch would
> make a difference. On the other hand, a "data access fault" makes it
> seem more likely that OF_interpret is just getting bogus addresses from
> buggy qemu firmware.
> 
> I probably have another two hours to go in waiting for this thing to
> build...
> 
> Jason
> 
> --- a/sys/arch/sparc64/dev/fb.c
> +++ b/sys/arch/sparc64/dev/fb.c
> @@ -507,6 +507,7 @@ int
>  fb_get_console_metrics(int *fontwidth, int *fontheight, int *wtop, int 
> *wleft)
>  {
>   cell_t romheight, romwidth, windowtop, windowleft;
> + uint64_t romheight_64, romwidth_64, windowtop_64, windowleft_64;
> 
>   /*
>* Get the PROM font metrics and address
> @@ -520,10 +521,15 @@ fb_get_console_metrics(int *fontwidth, int *fontheight, 
> int *wtop, int *wleft)
>   windowtop == 0 || windowleft == 0)
>   return (1);
> 
> - *fontwidth = (int)*(uint64_t *)romwidth;
> - *fontheight = (int)*(uint64_t *)romheight;
> - *wtop = (int)*(uint64_t *)windowtop;
> - *wleft = (int)*(uint64_t *)windowleft;
> + memcpy(_64, (void *)romheight, sizeof(romheight_64));
> + memcpy(_64, (void *)romwidth, sizeof(romwidth_64));
> + memcpy(_64, (void *)windowtop, sizeof(windowtop_64));
> + memcpy(_64, (void *)windowleft, sizeof(windowleft_64));
> +
> + *fontwidth = (int)romwidth_64;
> + *fontheight = (int)romheight_64;
> + *wtop = (int)windowtop_64;
> + *wleft = (int)windowleft_64;
> 
>   return (0);
>  }

AFAICT the problem here is the Forth being used at
https://github.com/openbsd/src/blob/master/sys/arch/sparc64/dev/fb.c#L511: 
since the
addr word isn't part of the IEEE-1275 specification, it is currently 
unimplemented in
OpenBIOS.

Why is addr needed here? Does the fb.c driver try and change these values 
rather than
just read them?


ATB,

Mark.



Re: sparc64 boot issue on qemu

2020-05-31 Thread Mark Cave-Ayland
On 30/05/2020 10:54, Otto Moerbeek wrote:

> https://cdn.openbsd.org/pub/OpenBSD/snapshots/sparc64/
> contains the unpatched miniroot.
> 
> https://www.drijf.net/openbsd/disk.qcow2
> 
> is the disk image based on the miniroot containing the patch in the
> firts post in this thread.
> 
> Thanks for looking into this.
> 
> Note that we did *not* observe boot failure on any real sparc64
> hardware. The bootblock changes I did for the 6.7 release were tested
> on many different machines.

Thanks for the test case which enables me to reproduce the issue. With 
?fcode-verbose
enabled you see this at the end of the FCode execution:

...
...
5acf :  [ 0x8b7 ]
5ad0 : b(lit) [ 0x10 ]
5ad6 :  [ 0x81e ]
5ad7 : 0= [ 0x34 ]
5ad8 : swap [ 0x49 ]
5ad9 : drop [ 0x46 ]
5ada : b?branch [ 0x14 ]
   (offset) 5
5ade : (compile)  [ 0x8c8 ]
5adf : (compile) b(>resolve) [ 0xb2 ]
OpenBSD IEEE 1275 Bootblock 2.0
Booting from device /pci@1fe,0/pci@1,1/ide@3/ide@1/cdrom@0
Try superblock read
FFS v1
ufs-open complete
.Looking for ofwboot in directory...
.
..
ofwboot
Found it
.Loading 1a1c8  bytes of file...
Copying 2000 bytes to 4000
Copying 2000 bytes to 6000
Copying 2000 bytes to 8000
Copying 2000 bytes to a000
Copying 2000 bytes to c000
Copying 2000 bytes to e000
Copying 2000 bytes to 1
Copying 2000 bytes to 12000
Copying 2000 bytes to 14000
Copying 2000 bytes to 16000
Copying 2000 bytes to 18000
Copying 2000 bytes to 1a000
Copying 2000 bytes to 1c000
Copying 2000 bytes to 1e000
5ae0 : expect [ 0x8a ]


Now that 0x8a is completely wrong since according to
https://github.com/openbsd/src/blob/master/sys/arch/sparc64/stand/bootblk/bootblk.fth
the last instruction should be exit which is 0x33.

Since the FCode itself is located at load-base (0x4000) it looks to me from the 
above
debug that you're loading ofwboot at the same address, overwriting the FCode. 
Once
do-boot has finished executing, the FCode interpreter returns to execute the 
exit
word which has now been overwritten: so instead of returning to the updated 
client
context via exit to execute ofwboot, it executes expect which asks for input 
from the
keyboard and then crashes because the stack is incorrect.

My recommendation would be to load ofwboot at 0x6000 instead of 0x4000 which I
believe will fix the issue. It's interesting you mention that this works on real
hardware, since it doesn't agree with my reading of the IEEE-1275 specification 
so
you're certainly relying on some undocumented behaviour here.


ATB,

Mark.



Re: userland clock_gettime proof of concept

2020-05-31 Thread Paul Irofti
On Sun, May 31, 2020 at 12:25:00AM -0400, George Koehler wrote:
> On Sat, 30 May 2020 19:21:30 +0300
> Paul Irofti  wrote:
> 
> > Here is an updated diff with no libc bump.  Please use this one for
> > further testing.
> 
> Your diff does amd64.
> Here is a diff to add macppc.  Apply after your diff.

Cool! Thanks for doing this!

> I have only tested clock_gettime(2) with CLOCK_REALTIME,
> by doing loops in Ruby like, $ ruby27 -e '1.times{p Time.now}'
> The time increased steadily, and ktrace showed only a few system calls
> to clock_gettime(2).

I am attaching a diff that includes minimal regression tests for this.
You can also try testing with real programs such as Firefox.

> I changed __amd64 to __amd64__ because I didn't find __powerpc.  I'm
> not sure, but one might move the list of arches to dlfcn/Makefile.inc
> and do -DTIMEKEEP, like how thread/Makefile.inc does -DFUTEX.  One
> might drop the tc_get_timecount function pointer and just always call
> the function #ifdef TIMEKEEP.

That could work. First we have to decide on a name. Or maybe we already
have, I don't know.

> PowerPC Mac OS X had a userland gettimeofday(2) using the cpu's
> timebase and a "common page" from the kernel.  Their common page also
> had executable code for gettimeofday, memcpy, pthread_self, and a few
> other functions.  --George

That's a no-no for security reasons. The diff looks good. Please try it
with more tests and real programs and report back.


diff --git regress/lib/libc/timekeep/Makefile regress/lib/libc/timekeep/Makefile
new file mode 100644
index 000..a7f3080290d
--- /dev/null
+++ regress/lib/libc/timekeep/Makefile
@@ -0,0 +1,5 @@
+#  $OpenBSD$
+
+PROGS= test_clock_gettime test_time_skew test_gettimeofday
+
+.include 
diff --git regress/lib/libc/timekeep/test_clock_gettime.c 
regress/lib/libc/timekeep/test_clock_gettime.c
new file mode 100644
index 000..859ec368215
--- /dev/null
+++ regress/lib/libc/timekeep/test_clock_gettime.c
@@ -0,0 +1,43 @@
+/* $OpenBSD$ */
+/*
+ * Copyright (c) 2020 Paul Irofti 
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#include 
+#include 
+
+#define ASSERT_EQ(a, b) assert((a) == (b))
+
+void
+check()
+{
+   struct timespec tp = {0};
+
+   ASSERT_EQ(0, clock_gettime(CLOCK_REALTIME, ));
+   ASSERT_EQ(0, clock_gettime(CLOCK_MONOTONIC, ));
+   ASSERT_EQ(0, clock_gettime(CLOCK_BOOTTIME, ));
+   ASSERT_EQ(0, clock_gettime(CLOCK_UPTIME, ));
+
+
+   ASSERT_EQ(0, clock_gettime(CLOCK_PROCESS_CPUTIME_ID, ));
+   ASSERT_EQ(0, clock_gettime(CLOCK_THREAD_CPUTIME_ID, ));
+
+}
+
+int main()
+{
+   check();
+   return 0;
+}
diff --git regress/lib/libc/timekeep/test_gettimeofday.c 
regress/lib/libc/timekeep/test_gettimeofday.c
new file mode 100644
index 000..ea90a1be7e0
--- /dev/null
+++ regress/lib/libc/timekeep/test_gettimeofday.c
@@ -0,0 +1,37 @@
+/* $OpenBSD$ */
+/*
+ * Copyright (c) 2020 Paul Irofti 
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#include 
+#include 
+
+#define ASSERT_EQ(a, b) assert((a) == (b))
+
+void
+check()
+{
+   struct timeval tv = {0};
+   struct timezone tzp;
+
+   ASSERT_EQ(0, gettimeofday(, NULL));
+   ASSERT_EQ(0, gettimeofday(, ));
+}
+
+int main()
+{
+   check();
+   return 0;
+}
diff --git regress/lib/libc/timekeep/test_time_skew.c 
regress/lib/libc/timekeep/test_time_skew.c
new file mode 100644
index 000..dfa9481c091
--- /dev/null
+++ regress/lib/libc/timekeep/test_time_skew.c
@@ -0,0 +1,55 @@
+/* $OpenBSD$ */
+/*
+ * Copyright (c) 2020 Paul Irofti 
+ *
+ * Permission to 

FS poll(2)/select(2) behavior with kqfilter handlers

2020-05-31 Thread Martin Pieuchot
FS poll handlers always return true kqueue handlers are different.  So
the diff below introduces a new NOTE_IMM hint to be able to match the
existing poll(2)/select(2) behavior.

This hint is obviously kernel-only.  This is related to the NFS poller
discussion so I'm bringing the diff now in order to move forward.

Comments, oks?

Index: sys/event.h
===
RCS file: /cvs/src/sys/sys/event.h,v
retrieving revision 1.38
diff -u -p -r1.38 event.h
--- sys/event.h 25 May 2020 15:54:10 -  1.38
+++ sys/event.h 31 May 2020 08:34:38 -
@@ -128,6 +128,10 @@ struct klist {
 };
 
 #ifdef _KERNEL
+/*
+ * data/hint flags for EVFILT_{READ|WRITE}, not shared with userspace
+ */
+#define NOTE_IMM   0x1000  /* Immediate read event */
 
 #define EVFILT_MARKER  0xf /* placemarker for tailq */
 
Index: isofs/cd9660/cd9660_vnops.c
===
RCS file: /cvs/src/sys/isofs/cd9660/cd9660_vnops.c,v
retrieving revision 1.83
diff -u -p -r1.83 cd9660_vnops.c
--- isofs/cd9660/cd9660_vnops.c 7 Apr 2020 13:27:51 -   1.83
+++ isofs/cd9660/cd9660_vnops.c 31 May 2020 08:34:38 -
@@ -1036,6 +1036,9 @@ filt_cd9660read(struct knote *kn, long h
return (1);
}
 
+   if (kn->kn_sfflags & NOTE_IMM)
+   return (1);
+
return (kn->kn_data != 0);
 }
 
Index: miscfs/fuse/fuse_vnops.c
===
RCS file: /cvs/src/sys/miscfs/fuse/fuse_vnops.c,v
retrieving revision 1.59
diff -u -p -r1.59 fuse_vnops.c
--- miscfs/fuse/fuse_vnops.c7 Apr 2020 13:27:51 -   1.59
+++ miscfs/fuse/fuse_vnops.c31 May 2020 08:34:38 -
@@ -188,6 +188,9 @@ filt_fusefsread(struct knote *kn, long h
return (1);
}
 
+   if (kn->kn_sfflags & NOTE_IMM)
+   return (1);
+
return (kn->kn_data != 0);
 }
 
Index: msdosfs/msdosfs_vnops.c
===
RCS file: /cvs/src/sys/msdosfs/msdosfs_vnops.c,v
retrieving revision 1.132
diff -u -p -r1.132 msdosfs_vnops.c
--- msdosfs/msdosfs_vnops.c 7 Apr 2020 13:27:52 -   1.132
+++ msdosfs/msdosfs_vnops.c 31 May 2020 08:34:38 -
@@ -2013,6 +2013,10 @@ filt_msdosfsread(struct knote *kn, long 
kn->kn_fflags |= NOTE_EOF;
return (1);
}
+
+   if (kn->kn_sfflags & NOTE_IMM)
+   return (1);
+
return (kn->kn_data != 0);
 }
 
Index: nfs/nfs_kq.c
===
RCS file: /cvs/src/sys/nfs/nfs_kq.c,v
retrieving revision 1.30
diff -u -p -r1.30 nfs_kq.c
--- nfs/nfs_kq.c7 Apr 2020 13:27:52 -   1.30
+++ nfs/nfs_kq.c31 May 2020 08:36:21 -
@@ -234,6 +234,10 @@ filt_nfsread(struct knote *kn, long hint
kn->kn_fflags |= NOTE_EOF;
return (1);
}
+
+   if (kn->kn_sfflags & NOTE_IMM)
+   return (1);
+
 return (kn->kn_data != 0);
 }
 
Index: tmpfs/tmpfs_vnops.c
===
RCS file: /cvs/src/sys/tmpfs/tmpfs_vnops.c,v
retrieving revision 1.40
diff -u -p -r1.40 tmpfs_vnops.c
--- tmpfs/tmpfs_vnops.c 7 Apr 2020 13:27:52 -   1.40
+++ tmpfs/tmpfs_vnops.c 31 May 2020 08:34:38 -
@@ -2665,6 +2665,9 @@ filt_tmpfsread(struct knote *kn, long hi
return (1);
}
 
+   if (kn->kn_sfflags & NOTE_IMM)
+   return (1);
+
return (kn->kn_data != 0);
 }
 
Index: ufs/ufs/ufs_vnops.c
===
RCS file: /cvs/src/sys/ufs/ufs/ufs_vnops.c,v
retrieving revision 1.150
diff -u -p -r1.150 ufs_vnops.c
--- ufs/ufs/ufs_vnops.c 7 Apr 2020 13:27:52 -   1.150
+++ ufs/ufs/ufs_vnops.c 31 May 2020 08:34:38 -
@@ -1973,6 +1973,9 @@ filt_ufsread(struct knote *kn, long hint
return (1);
}
 
+   if (kn->kn_sfflags & NOTE_IMM)
+   return (1);
+
return (kn->kn_data != 0);
 }
 



Re: sparc64 boot issue on qemu

2020-05-31 Thread Otto Moerbeek
On Sun, May 31, 2020 at 09:49:34AM +0100, Mark Cave-Ayland wrote:

> On 30/05/2020 10:54, Otto Moerbeek wrote:
> 
> > https://cdn.openbsd.org/pub/OpenBSD/snapshots/sparc64/
> > contains the unpatched miniroot.
> > 
> > https://www.drijf.net/openbsd/disk.qcow2
> > 
> > is the disk image based on the miniroot containing the patch in the
> > firts post in this thread.
> > 
> > Thanks for looking into this.
> > 
> > Note that we did *not* observe boot failure on any real sparc64
> > hardware. The bootblock changes I did for the 6.7 release were tested
> > on many different machines.
> 
> Thanks for the test case which enables me to reproduce the issue. With 
> ?fcode-verbose
> enabled you see this at the end of the FCode execution:
> 
> ...
> ...
> 5acf :  [ 0x8b7 ]
> 5ad0 : b(lit) [ 0x10 ]
> 5ad6 :  [ 0x81e ]
> 5ad7 : 0= [ 0x34 ]
> 5ad8 : swap [ 0x49 ]
> 5ad9 : drop [ 0x46 ]
> 5ada : b?branch [ 0x14 ]
>(offset) 5
> 5ade : (compile)  [ 0x8c8 ]
> 5adf : (compile) b(>resolve) [ 0xb2 ]
> OpenBSD IEEE 1275 Bootblock 2.0
> Booting from device /pci@1fe,0/pci@1,1/ide@3/ide@1/cdrom@0
> Try superblock read
> FFS v1
> ufs-open complete
> .Looking for ofwboot in directory...
> .
> ..
> ofwboot
> Found it
> .Loading 1a1c8  bytes of file...
> Copying 2000 bytes to 4000
> Copying 2000 bytes to 6000
> Copying 2000 bytes to 8000
> Copying 2000 bytes to a000
> Copying 2000 bytes to c000
> Copying 2000 bytes to e000
> Copying 2000 bytes to 1
> Copying 2000 bytes to 12000
> Copying 2000 bytes to 14000
> Copying 2000 bytes to 16000
> Copying 2000 bytes to 18000
> Copying 2000 bytes to 1a000
> Copying 2000 bytes to 1c000
> Copying 2000 bytes to 1e000
> 5ae0 : expect [ 0x8a ]
> 
> 
> Now that 0x8a is completely wrong since according to
> https://github.com/openbsd/src/blob/master/sys/arch/sparc64/stand/bootblk/bootblk.fth
> the last instruction should be exit which is 0x33.
> 
> Since the FCode itself is located at load-base (0x4000) it looks to me from 
> the above
> debug that you're loading ofwboot at the same address, overwriting the FCode. 
> Once
> do-boot has finished executing, the FCode interpreter returns to execute the 
> exit
> word which has now been overwritten: so instead of returning to the updated 
> client
> context via exit to execute ofwboot, it executes expect which asks for input 
> from the
> keyboard and then crashes because the stack is incorrect.
> 
> My recommendation would be to load ofwboot at 0x6000 instead of 0x4000 which I
> believe will fix the issue. It's interesting you mention that this works on 
> real
> hardware, since it doesn't agree with my reading of the IEEE-1275 
> specification so
> you're certainly relying on some undocumented behaviour here.
> 
> 
> ATB,
> 
> Mark.

Thanks, the following works indeed. 

-Otto

Index: bootblk.fth
===
RCS file: /cvs/src/sys/arch/sparc64/stand/bootblk/bootblk.fth,v
retrieving revision 1.9
diff -u -p -r1.9 bootblk.fth
--- bootblk.fth 2 Apr 2020 06:06:22 -   1.9
+++ bootblk.fth 31 May 2020 13:17:25 -
@@ -716,7 +716,8 @@ create cur-blockno -1 l, -1 l,  \ Curren
 2drop
 ;
 
-" load-base " evaluate constant loader-base
+\\ Do not overwrite bootblocks
+" load-base " evaluate 2000 + constant loader-base
 
 : load-file-signon ( load-file len boot-path len -- load-file len boot-path 
len )
." Loading file" space 2over type cr ." from device" space 2dup type cr



Re: userland clock_gettime proof of concept

2020-05-31 Thread Mark Kettenis
> Date: Sun, 31 May 2020 00:25:00 -0400
> From: George Koehler 
> 
> On Sat, 30 May 2020 19:21:30 +0300
> Paul Irofti  wrote:
> 
> > Here is an updated diff with no libc bump.  Please use this one for
> > further testing.
> 
> Your diff does amd64.
> Here is a diff to add macppc.  Apply after your diff.
> 
> I have only tested clock_gettime(2) with CLOCK_REALTIME,
> by doing loops in Ruby like, $ ruby27 -e '1.times{p Time.now}'
> The time increased steadily, and ktrace showed only a few system calls
> to clock_gettime(2).
> 
> I copied ppc_mftb() from /sys/arch/powerpc/powerpc/cpu.h and renamed
> it to tc_get_timecount_md(), because #include  doesn't
> provide ppc_mftb() if not _KERNEL.  It would be better to edit the
> kernel headers to give ppc_mftb() if _LIBC, but I haven't done that.

I think copying the code is fine.  It's only two lines of inline
assembly and I don't think it will ever change (despite what we just
went through with the clang switch) ;).

> I changed __amd64 to __amd64__ because I didn't find __powerpc.  I'm
> not sure, but one might move the list of arches to dlfcn/Makefile.inc
> and do -DTIMEKEEP, like how thread/Makefile.inc does -DFUTEX.  One
> might drop the tc_get_timecount function pointer and just always call
> the function #ifdef TIMEKEEP.

Yes, we prefer the __xxx__ variants in OpenBSD code; thanks for
catching that.  The benefit of the TIMEKEEP define would be that we
can eliminate the fallback code completely on architectures that don't
implement this functionality.

> PowerPC Mac OS X had a userland gettimeofday(2) using the cpu's
> timebase and a "common page" from the kernel.  Their common page also
> had executable code for gettimeofday, memcpy, pthread_self, and a few
> other functions.  --George
> 
> Index: lib/libc/arch/powerpc/gen/Makefile.inc
> ===
> RCS file: /cvs/src/lib/libc/arch/powerpc/gen/Makefile.inc,v
> retrieving revision 1.14
> diff -u -p -r1.14 Makefile.inc
> --- lib/libc/arch/powerpc/gen/Makefile.inc12 Apr 2012 16:14:09 -  
> 1.14
> +++ lib/libc/arch/powerpc/gen/Makefile.inc31 May 2020 03:20:58 -
> @@ -3,3 +3,4 @@ SRCS+= fabs.c
>  SRCS+= fpgetmask.c fpsetmask.c
>  SRCS+= fpgetround.c fpsetround.c
>  SRCS+= fpgetsticky.c fpsetsticky.c
> +SRCS+= usertc.c
> --- /dev/null Sat May 30 23:21:20 2020
> +++ lib/libc/arch/powerpc/gen/usertc.cSat May 30 19:37:59 2020
> @@ -0,0 +1,44 @@
> +/*   $OpenBSD$ */
> +/*
> + * Copyright (C) 1995, 1996 Wolfgang Solfrank.
> + * Copyright (C) 1995, 1996 TooLs GmbH.
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *notice, this list of conditions and the following disclaimer in the
> + *documentation and/or other materials provided with the distribution.
> + * 3. All advertising materials mentioning features or use of this software
> + *must display the following acknowledgement:
> + *   This product includes software developed by TooLs GmbH.
> + * 4. The name of TooLs GmbH may not be used to endorse or promote products
> + *derived from this software without specific prior written permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY TOOLS GMBH ``AS IS'' AND ANY EXPRESS OR
> + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
> + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
> + * IN NO EVENT SHALL TOOLS GMBH BE LIABLE FOR ANY DIRECT, INDIRECT, 
> INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED 
> TO,
> + * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR 
> PROFITS;
> + * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
> + * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
> + * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
> + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include 
> +
> +uint64_t
> +tc_get_timecount_md(void)
> +{
> + u_long scratch;
> + u_int64_t tb;
> +
> + __asm volatile ("1: mftbu %0; mftb %L0; mftbu %1;"
> + " cmpw 0,%0,%1; bne 1b" : "=r"(tb), "=r"(scratch));
> + return tb;
> +}
> --- lib/libc/dlfcn/init.c.before  Sat May 30 23:26:35 2020
> +++ lib/libc/dlfcn/init.c Sat May 30 18:00:45 2020
> @@ -70,7 +70,7 @@
>  
>  /* provide definitions for these */
>  const dl_cb *_dl_cb __relro = NULL;
> -#if defined(__amd64)
> +#if defined(__amd64__) || defined(__powerpc__)
>  uint64_t (*const tc_get_timecount)(void) = tc_get_timecount_md;
>  #else
>  uint64_t (*const tc_get_timecount)(void) = NULL;
> --- 

Re: userland clock_gettime proof of concept

2020-05-31 Thread Mark Kettenis
> Date: Sun, 31 May 2020 00:25:00 -0400
> From: George Koehler 
> 
> On Sat, 30 May 2020 19:21:30 +0300
> Paul Irofti  wrote:
> 
> > Here is an updated diff with no libc bump.  Please use this one for
> > further testing.
> 
> Your diff does amd64.
> Here is a diff to add macppc.  Apply after your diff.
> 
> I have only tested clock_gettime(2) with CLOCK_REALTIME,
> by doing loops in Ruby like, $ ruby27 -e '1.times{p Time.now}'
> The time increased steadily, and ktrace showed only a few system calls
> to clock_gettime(2).
> 
> I copied ppc_mftb() from /sys/arch/powerpc/powerpc/cpu.h and renamed
> it to tc_get_timecount_md(), because #include  doesn't
> provide ppc_mftb() if not _KERNEL.  It would be better to edit the
> kernel headers to give ppc_mftb() if _LIBC, but I haven't done that.
> 
> I changed __amd64 to __amd64__ because I didn't find __powerpc.  I'm
> not sure, but one might move the list of arches to dlfcn/Makefile.inc
> and do -DTIMEKEEP, like how thread/Makefile.inc does -DFUTEX.  One
> might drop the tc_get_timecount function pointer and just always call
> the function #ifdef TIMEKEEP.
> 
> PowerPC Mac OS X had a userland gettimeofday(2) using the cpu's
> timebase and a "common page" from the kernel.  Their common page also
> had executable code for gettimeofday, memcpy, pthread_self, and a few
> other functions.  --George

Oh, and on the diff itself:

> Index: lib/libc/arch/powerpc/gen/Makefile.inc
> ===
> RCS file: /cvs/src/lib/libc/arch/powerpc/gen/Makefile.inc,v
> retrieving revision 1.14
> diff -u -p -r1.14 Makefile.inc
> --- lib/libc/arch/powerpc/gen/Makefile.inc12 Apr 2012 16:14:09 -  
> 1.14
> +++ lib/libc/arch/powerpc/gen/Makefile.inc31 May 2020 03:20:58 -
> @@ -3,3 +3,4 @@ SRCS+= fabs.c
>  SRCS+= fpgetmask.c fpsetmask.c
>  SRCS+= fpgetround.c fpsetround.c
>  SRCS+= fpgetsticky.c fpsetsticky.c
> +SRCS+= usertc.c
> --- /dev/null Sat May 30 23:21:20 2020
> +++ lib/libc/arch/powerpc/gen/usertc.cSat May 30 19:37:59 2020
> @@ -0,0 +1,44 @@
> +/*   $OpenBSD$ */
> +/*
> + * Copyright (C) 1995, 1996 Wolfgang Solfrank.
> + * Copyright (C) 1995, 1996 TooLs GmbH.
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *notice, this list of conditions and the following disclaimer in the
> + *documentation and/or other materials provided with the distribution.
> + * 3. All advertising materials mentioning features or use of this software
> + *must display the following acknowledgement:
> + *   This product includes software developed by TooLs GmbH.
> + * 4. The name of TooLs GmbH may not be used to endorse or promote products
> + *derived from this software without specific prior written permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY TOOLS GMBH ``AS IS'' AND ANY EXPRESS OR
> + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
> + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
> + * IN NO EVENT SHALL TOOLS GMBH BE LIABLE FOR ANY DIRECT, INDIRECT, 
> INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED 
> TO,
> + * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR 
> PROFITS;
> + * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
> + * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
> + * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
> + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include 
> +
> +uint64_t
> +tc_get_timecount_md(void)
> +{
> + u_long scratch;
> + u_int64_t tb;
> +
> + __asm volatile ("1: mftbu %0; mftb %L0; mftbu %1;"
> + " cmpw 0,%0,%1; bne 1b" : "=r"(tb), "=r"(scratch));
> + return tb;
> +}

Would be good not to mux uint64_t and u_int64_t in the same bit of
code.  I'd use uint64_t and replace u_long with uint32_t.


> --- lib/libc/dlfcn/init.c.before  Sat May 30 23:26:35 2020
> +++ lib/libc/dlfcn/init.c Sat May 30 18:00:45 2020
> @@ -70,7 +70,7 @@
>  
>  /* provide definitions for these */
>  const dl_cb *_dl_cb __relro = NULL;
> -#if defined(__amd64)
> +#if defined(__amd64__) || defined(__powerpc__)
>  uint64_t (*const tc_get_timecount)(void) = tc_get_timecount_md;
>  #else
>  uint64_t (*const tc_get_timecount)(void) = NULL;
> --- sys/arch/macppc/macppc/clock.c.before Sat May 30 23:28:00 2020
> +++ sys/arch/macppc/macppc/clock.cSat May 30 20:35:47 2020
> @@ -57,7 +57,7 @@
>  static int32_t ticks_per_intr;
>  
>  static struct timecounter tb_timecounter = {
> - 

Re: [patch] azalia: Intel 300 Series HD Audio

2020-05-31 Thread Bruno Flueckiger
On 31.05., Benjamin Baier wrote:
> On Fri, 29 May 2020 11:25:44 +0200
> Bruno Flueckiger  wrote:
>
> > Hi,
> >
> > My brand new laptop HP EliteBook 850 G6 comes with an Intel 300 Series
> > HD Audio device rev 0x11. The device shows up as not configured in the
> > dmesg. The PCI config space of the device identifies its subclass as
> > PCI_SUBCLASS_MULTIMEDIA_AUDIO instead of PCI_SUBCLASS_MULTIMEDIA_HDAUDIO
> >
> > The patch below makes the device work just fine on my laptop.
> >
> > Cheers,
> > Bruno
> >
> > Index: sys/dev/pci/azalia.c
> > ===
> > RCS file: /cvs/src/sys/dev/pci/azalia.c,v
> > retrieving revision 1.255
> > diff -u -p -r1.255 azalia.c
> > --- sys/dev/pci/azalia.c18 Apr 2020 21:55:56 -  1.255
> > +++ sys/dev/pci/azalia.c28 May 2020 13:48:10 -
> > @@ -481,7 +481,8 @@ azalia_pci_match(struct device *parent,
> >
> > pa = aux;
> > if (PCI_CLASS(pa->pa_class) == PCI_CLASS_MULTIMEDIA
> > -   && PCI_SUBCLASS(pa->pa_class) == PCI_SUBCLASS_MULTIMEDIA_HDAUDIO)
> > +   && (PCI_SUBCLASS(pa->pa_class) == PCI_SUBCLASS_MULTIMEDIA_HDAUDIO
> > +   || PCI_SUBCLASS(pa->pa_class) == PCI_SUBCLASS_MULTIMEDIA_AUDIO))
> > return 1;
> > return 0;
> >  }
> >
>
> Hi.
>
> Does your Laptop run with the latest BIOS? There was one released on May 11th.
> https://support.hp.com/de-de/drivers/selfservice/swdetails/hp-elitebook-850-g6-notebook-pc/26609805/swItemId/ob-251060-1
> The release notes state: Fixes an issue where the audio on the system stops 
> functioning after the Intel Active Management Technology (AMT) option is 
> disabled.
>
> The azalia patch is in but I would prefer HP to fix their BIOS instead.
>
> Greetings Ben
>

Hi Ben,

I've upgraded the firmware to the latest available release, but the
audio device still reports with the wrong subclass in its PCI config
space.

I agree that HP should rather fix their firmware. I still try to find
out how to report this problem to HP in a way that doesn't get ignored.

Cheers,
Bruno



Re: smtpd: make smarthost to use SNI when relaying

2020-05-31 Thread Sebastien Marie
Hi,

updated diff after millert@ and beck@ remarks:
- use union to collapse in_addr + in6_addr
- doesn't allocate buffer and directly use s->relay->domain->name

Thanks.
-- 
Sebastien Marie


diff 73b535ef4537e8454483912fc3420bc304759e96 /home/semarie/repos/openbsd/src
blob - d384692a0e43de47d645142a6b99e72b7d83b687
file + usr.sbin/smtpd/mta_session.c
--- usr.sbin/smtpd/mta_session.c
+++ usr.sbin/smtpd/mta_session.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -1604,6 +1605,10 @@ mta_cert_init_cb(void *arg, int status, const char *na
struct mta_session *s = arg;
void *ssl;
char *xname = NULL, *xcert = NULL;
+   union {
+   struct in_addr in4;
+   struct in6_addr in6;
+   } addrbuf;
 
if (s->flags & MTA_WAIT)
mta_tree_pop(_tls_init, s->id);
@@ -1623,6 +1628,22 @@ mta_cert_init_cb(void *arg, int status, const char *na
free(xcert);
if (ssl == NULL)
fatal("mta: ssl_mta_init");
+
+   /*
+* RFC4366 (SNI): Literal IPv4 and IPv6 addresses are not
+* permitted in "HostName".
+*/
+   if (s->relay->domain->as_host == 1) {
+   if (inet_pton(AF_INET, s->relay->domain->name, ) != 1 &&
+   inet_pton(AF_INET6, s->relay->domain->name, ) != 1) 
{
+   log_debug("%016"PRIx64" mta tls setting SNI name=%s",
+   s->id, s->relay->domain->name);
+   if (SSL_set_tlsext_host_name(ssl, 
s->relay->domain->name) == 0)
+   log_warnx("%016"PRIx64" mta tls setting SNI 
failed",
+  s->id);
+   }
+   }
+
io_start_tls(s->io, ssl);
 }
 



Re: userland clock_gettime proof of concept

2020-05-31 Thread Paul Irofti

On 2020-05-31 20:46, Mark Kettenis wrote:

From: Paul Irofti 
Date: Sun, 31 May 2020 19:12:54 +0300

On 2020-05-31 18:25, Theo de Raadt wrote:

Mark Kettenis  wrote:


I changed __amd64 to __amd64__ because I didn't find __powerpc.  I'm
not sure, but one might move the list of arches to dlfcn/Makefile.inc
and do -DTIMEKEEP, like how thread/Makefile.inc does -DFUTEX.  One
might drop the tc_get_timecount function pointer and just always call
the function #ifdef TIMEKEEP.


Yes, we prefer the __xxx__ variants in OpenBSD code; thanks for
catching that.  The benefit of the TIMEKEEP define would be that we
can eliminate the fallback code completely on architectures that don't
implement this functionality.


...


Yeah, I just followed the dlfcn/dlfcn_stubs.c example from libc. Which I
see now it is commented out...


--- lib/libc/dlfcn/init.c.beforeSat May 30 23:26:35 2020
+++ lib/libc/dlfcn/init.c   Sat May 30 18:00:45 2020
@@ -70,7 +70,7 @@
   
   /* provide definitions for these */

   const dl_cb *_dl_cb __relro = NULL;
-#if defined(__amd64)
+#if defined(__amd64__) || defined(__powerpc__)
   uint64_t (*const tc_get_timecount)(void) = tc_get_timecount_md;
   #else
   uint64_t (*const tc_get_timecount)(void) = NULL;


1) I think adding _md to the name is superflous.  There will never
 be a MI version, so tc_get_timecount() is enough.


What about pvclock(4)?


What about it?  Seems to me what you're really thinking of here is how
to support more than just one timecounter for a specific architecture.
Your function pointer is not really going to help in that case.
You'll need to dispatch to the right function based on some sort of
machine-specific clock ID.

Oh and BTW, I don't think you're ever going to support pvclock(4).
Take a look at the code and think how you would do all that magic in
userland...


2) I hope we can get away from #ifdef __ arch__.
 Maybe this can be split into architectures which
a) have a function called tc_get_timecount()
 or
b) tc_get_timecount is #define'd to NULL, though I don't
   know which MD include file to do that in


If we go with something like this or with something like -DTIMEKEEP, how
do we handle the different PROTO_WRAP vs. PROTO_NORMAL declarations?
Split them in MD headers? But then we end up in the same place. Sort of.


Forget about all that for a moment.  Here is an alternative suggestion:

On sparc64 we need to support both tick_timecounter and
sys_tick_timecounter.  So we need some sort of clockid value to
distnguish between those two.  I already suggested to use the tc_user
field of the timecounter for that.  0 means that a timecounter is not
usable in userland, a (small) positive integer means a specific
timecounter type.  The code in libc will need to know whether a
particular timecounter type can be supported.  My proposal would be to
implement a function *on all architecture* that takes the clockid as
an argument and returns a pointer to the function that implements
support for that timecounter.  On architectures without support, ir
when called with a clockid that isn't supported, that function would
simply return NULL.


Sure. All architectures will register their clocks with a unique ID in 
timetc.h, right? And then we do clockfun[clockid]() in libc, right?




Re: userland clock_gettime proof of concept

2020-05-31 Thread Paul Irofti

On 2020-05-31 20:51, Theo de Raadt wrote:

(There has been some pressure to get this in before it covers all the
architectures and this kind of discussion is why I think such a
premature "and then we'll fix it in the tree" procedure is wrong).


Again, I hope not from me. I am in no rush with this diff nor do I want 
to put any pressure to get this in. I am quite happy that you feel this 
is a good thing and I am also happy that you are helping me get this in 
proper shape.




Re: userland clock_gettime proof of concept

2020-05-31 Thread Theo de Raadt
Paul Irofti  wrote:

> > Forget about all that for a moment.  Here is an alternative suggestion:
> >
> > On sparc64 we need to support both tick_timecounter and
> > sys_tick_timecounter.  So we need some sort of clockid value to
> > distnguish between those two.  I already suggested to use the tc_user
> > field of the timecounter for that.  0 means that a timecounter is not
> > usable in userland, a (small) positive integer means a specific
> > timecounter type.  The code in libc will need to know whether a
> > particular timecounter type can be supported.  My proposal would be to
> > implement a function *on all architecture* that takes the clockid as
> > an argument and returns a pointer to the function that implements
> > support for that timecounter.  On architectures without support, ir
> > when called with a clockid that isn't supported, that function would
> > simply return NULL.
> 
> Sure. All architectures will register their clocks with a unique ID in
> timetc.h, right? And then we do clockfun[clockid]() in libc, right?

No, don't do that on every call -- instead, get the function pointer once.



Re: userland clock_gettime proof of concept

2020-05-31 Thread Mark Kettenis
> From: Paul Irofti 
> Date: Sun, 31 May 2020 19:12:54 +0300
> 
> On 2020-05-31 18:25, Theo de Raadt wrote:
> > Mark Kettenis  wrote:
> > 
> >>> I changed __amd64 to __amd64__ because I didn't find __powerpc.  I'm
> >>> not sure, but one might move the list of arches to dlfcn/Makefile.inc
> >>> and do -DTIMEKEEP, like how thread/Makefile.inc does -DFUTEX.  One
> >>> might drop the tc_get_timecount function pointer and just always call
> >>> the function #ifdef TIMEKEEP.
> >>
> >> Yes, we prefer the __xxx__ variants in OpenBSD code; thanks for
> >> catching that.  The benefit of the TIMEKEEP define would be that we
> >> can eliminate the fallback code completely on architectures that don't
> >> implement this functionality.
> > 
> > ...
> 
> Yeah, I just followed the dlfcn/dlfcn_stubs.c example from libc. Which I 
> see now it is commented out...
> 
> >>> --- lib/libc/dlfcn/init.c.before  Sat May 30 23:26:35 2020
> >>> +++ lib/libc/dlfcn/init.c Sat May 30 18:00:45 2020
> >>> @@ -70,7 +70,7 @@
> >>>   
> >>>   /* provide definitions for these */
> >>>   const dl_cb *_dl_cb __relro = NULL;
> >>> -#if defined(__amd64)
> >>> +#if defined(__amd64__) || defined(__powerpc__)
> >>>   uint64_t (*const tc_get_timecount)(void) = tc_get_timecount_md;
> >>>   #else
> >>>   uint64_t (*const tc_get_timecount)(void) = NULL;
> > 
> > 1) I think adding _md to the name is superflous.  There will never
> > be a MI version, so tc_get_timecount() is enough.
> 
> What about pvclock(4)?

What about it?  Seems to me what you're really thinking of here is how
to support more than just one timecounter for a specific architecture.
Your function pointer is not really going to help in that case.
You'll need to dispatch to the right function based on some sort of
machine-specific clock ID.

Oh and BTW, I don't think you're ever going to support pvclock(4).
Take a look at the code and think how you would do all that magic in
userland...

> > 2) I hope we can get away from #ifdef __ arch__.
> > Maybe this can be split into architectures which
> >a) have a function called tc_get_timecount()
> > or
> >b) tc_get_timecount is #define'd to NULL, though I don't
> >   know which MD include file to do that in
> 
> If we go with something like this or with something like -DTIMEKEEP, how 
> do we handle the different PROTO_WRAP vs. PROTO_NORMAL declarations? 
> Split them in MD headers? But then we end up in the same place. Sort of.

Forget about all that for a moment.  Here is an alternative suggestion:

On sparc64 we need to support both tick_timecounter and
sys_tick_timecounter.  So we need some sort of clockid value to
distnguish between those two.  I already suggested to use the tc_user
field of the timecounter for that.  0 means that a timecounter is not
usable in userland, a (small) positive integer means a specific
timecounter type.  The code in libc will need to know whether a
particular timecounter type can be supported.  My proposal would be to
implement a function *on all architecture* that takes the clockid as
an argument and returns a pointer to the function that implements
support for that timecounter.  On architectures without support, ir
when called with a clockid that isn't supported, that function would
simply return NULL.



Re: userland clock_gettime proof of concept

2020-05-31 Thread Theo de Raadt
Mark Kettenis  wrote:

> On sparc64 we need to support both tick_timecounter and
> sys_tick_timecounter.  So we need some sort of clockid value to
> distnguish between those two.  I already suggested to use the tc_user
> field of the timecounter for that.  0 means that a timecounter is not
> usable in userland, a (small) positive integer means a specific
> timecounter type.  The code in libc will need to know whether a
> particular timecounter type can be supported.  My proposal would be to
> implement a function *on all architecture* that takes the clockid as
> an argument and returns a pointer to the function that implements
> support for that timecounter.  On architectures without support, ir
> when called with a clockid that isn't supported, that function would
> simply return NULL.

I agree.

The alternative being tried here is to do it all at link-time.  I don't
think that is flexible enough to cover all the architectures.
Determining this at startup, and following a pointer is the natural
approach.

(There has been some pressure to get this in before it covers all the
architectures and this kind of discussion is why I think such a
premature "and then we'll fix it in the tree" procedure is wrong).



loongson/macppc default and staff login class limits coherency

2020-05-31 Thread Lucas
Hello tech@,

While using my macppc to try a port I noticed there is a discrepancy
between login classes limits: default datasize-cur is defined to 2048M
while staff datasize-cur is 512M. This happened in
/src/etc/etc.macppc/login.conf 1.12 as "grow limits a bit because clang
is a pig." A similar change was introduced for loongson in that commit
too.

I checked all the arch-specific login.conf for similar discrepancy just
between default and staff classes. Only these are affected, and only
datasize-cur is below default's. Find a diff for bumping it.

Out of curiosity, what is the rationale for bumping limits in one arch
but not the others? For example, default class' datasize-cur in amd64
is only 768MB, despite them being more beefy in general than most of
macppc devices, yet this ship a 2048MB datasize-cur.

-Lucas

PD: is there a easy way in CVS to look at all the affected files by
one commit, very much like Git? for this I used `cvs log -d` and a AWK
script)


Index: etc.loongson/login.conf
===
RCS file: /home/cvs/src/etc/etc.loongson/login.conf,v
retrieving revision 1.13
diff -u -p -r1.13 login.conf
--- etc.loongson/login.conf 12 Mar 2020 15:32:21 -  1.13
+++ etc.loongson/login.conf 31 May 2020 20:45:30 -
@@ -71,7 +71,7 @@ daemon:\
 # Staff have fewer restrictions and can login even when nologins are set.
 #
 staff:\
-   :datasize-cur=768M:\
+   :datasize-cur=1024M:\
:datasize-max=infinity:\
:maxproc-max=512:\
:maxproc-cur=128:\
Index: etc.macppc/login.conf
===
RCS file: /home/cvs/src/etc/etc.macppc/login.conf,v
retrieving revision 1.13
diff -u -p -r1.13 login.conf
--- etc.macppc/login.conf   23 May 2020 13:16:03 -  1.13
+++ etc.macppc/login.conf   31 May 2020 20:45:50 -
@@ -70,7 +70,7 @@ daemon:\
 # Staff have fewer restrictions and can login even when nologins are set.
 #
 staff:\
-   :datasize-cur=512M:\
+   :datasize-cur=2048M:\
:datasize-max=infinity:\
:maxproc-max=512:\
:maxproc-cur=128:\



Re: Error reporting in ikev2_ike_sa_alive (was: Improve error reporting in pfkey_sa_last_used)

2020-05-31 Thread Tobias Heider
On Tue, May 26, 2020 at 12:08:08PM -0400, matthew j weaver wrote:
> During childsa last use checks, iked debug logs results, per SA, after a
> successful pfkey_sa_last_used call.
> 
> This patch makes logging behavior more closely match that, on error.
> 
> I chose log_warn instead of log_debug since iked will complain about the
> nonzero errno after pfkey_reply:
>   pfkey_sa_last_used: message: No such process
> 
> With this patch an operator can at least troubleshoot which SAs are
> causing the trouble.
> 
> Comments? Make sense?
> 
> thank you, all
> matthew weaver

I don't think this is a good idea
With your diff the log gets spammed with 'Undefined error: 0' for child SAs
that have never been used.
Also log_warn seems a bit too much as those errors are rarely serious.



Re: loongson/macppc default and staff login class limits coherency

2020-05-31 Thread Theo de Raadt
Lucas  wrote:

> Hello tech@,
> 
> While using my macppc to try a port I noticed there is a discrepancy
> between login classes limits: default datasize-cur is defined to 2048M
> while staff datasize-cur is 512M. This happened in
> /src/etc/etc.macppc/login.conf 1.12 as "grow limits a bit because clang
> is a pig." A similar change was introduced for loongson in that commit
> too.
> 
> I checked all the arch-specific login.conf for similar discrepancy just
> between default and staff classes. Only these are affected, and only
> datasize-cur is below default's. Find a diff for bumping it.
> 
> Out of curiosity, what is the rationale for bumping limits in one arch
> but not the others? For example, default class' datasize-cur in amd64
> is only 768MB, despite them being more beefy in general than most of
> macppc devices, yet this ship a 2048MB datasize-cur.

Your diff is proposing cranking the amount of memory 1 process can use,
to 3/4 of what the machine physical maxes out to.

I think that is unreasonable.

Our maximums are intended to be functional and reasonable, while yours
are not reasonable.  At that point we might as well remove all limits,
no?



Re: sparc64 boot issue on qemu

2020-05-31 Thread Theo de Raadt
Mark Cave-Ayland  wrote:

> On 30/05/2020 00:19, Jason A. Donenfeld wrote:
> 
> > Note that you need to run this with `-nographic`, because the kernel
> > crashes when trying to use vgafb on sparc64/qemu. I've witnessed two
> > varieties crashes:
> > 
> > - https://data.zx2c4.com/openbsd-6.7-sparc64-vga-panic-miniroot67.png
> > This happens when booting up miniroot67.fs
> > 
> > - 
> > https://data.zx2c4.com/openbsd-6.7-sparc64-vga-panic-after-installation.png
> > This happens after installation openbsd onto disk properly, and then
> > booting up into it.
> > 
> > Passing `-nographic` prevents these from happening, since vgafb doesn't
> > bind to anything.
> > 
> > I don't have a bsd.gdb in order to addr2line this, but if the miniroot
> > panic is related to the normal panic, and we then assume alignment
> > issues in fb_get_console_metrics, then I wonder if the below patch would
> > make a difference. On the other hand, a "data access fault" makes it
> > seem more likely that OF_interpret is just getting bogus addresses from
> > buggy qemu firmware.
> > 
> > I probably have another two hours to go in waiting for this thing to
> > build...
> > 
> > Jason
> > 
> > --- a/sys/arch/sparc64/dev/fb.c
> > +++ b/sys/arch/sparc64/dev/fb.c
> > @@ -507,6 +507,7 @@ int
> >  fb_get_console_metrics(int *fontwidth, int *fontheight, int *wtop, int 
> > *wleft)
> >  {
> > cell_t romheight, romwidth, windowtop, windowleft;
> > +   uint64_t romheight_64, romwidth_64, windowtop_64, windowleft_64;
> > 
> > /*
> >  * Get the PROM font metrics and address
> > @@ -520,10 +521,15 @@ fb_get_console_metrics(int *fontwidth, int 
> > *fontheight, int *wtop, int *wleft)
> > windowtop == 0 || windowleft == 0)
> > return (1);
> > 
> > -   *fontwidth = (int)*(uint64_t *)romwidth;
> > -   *fontheight = (int)*(uint64_t *)romheight;
> > -   *wtop = (int)*(uint64_t *)windowtop;
> > -   *wleft = (int)*(uint64_t *)windowleft;
> > +   memcpy(_64, (void *)romheight, sizeof(romheight_64));
> > +   memcpy(_64, (void *)romwidth, sizeof(romwidth_64));
> > +   memcpy(_64, (void *)windowtop, sizeof(windowtop_64));
> > +   memcpy(_64, (void *)windowleft, sizeof(windowleft_64));
> > +
> > +   *fontwidth = (int)romwidth_64;
> > +   *fontheight = (int)romheight_64;
> > +   *wtop = (int)windowtop_64;
> > +   *wleft = (int)windowleft_64;
> > 
> > return (0);
> >  }
> 
> AFAICT the problem here is the Forth being used at
> https://github.com/openbsd/src/blob/master/sys/arch/sparc64/dev/fb.c#L511: 
> since the
> addr word isn't part of the IEEE-1275 specification, it is currently 
> unimplemented in
> OpenBIOS.
> 
> Why is addr needed here? Does the fb.c driver try and change these values 
> rather than
> just read them?

Why does that matter?

sparc64 isn't a IEEE-1275 openfirmware.

It is a Sun openfirmware, meaning it is more than the vague
specification.  An emulation must be able to emulate THE REAL HARDWARE.

This should work.

For another 64-bit cell_t usage see dev/prtc.c.

For another "addr" usage, see romgetcursoraddr()



Re: smtpd: make smarthost to use SNI when relaying

2020-05-31 Thread Bob Beck
looks good to me

ok beck@

On Sun, May 31, 2020 at 03:38:00PM +0200, Sebastien Marie wrote:
> Hi,
> 
> updated diff after millert@ and beck@ remarks:
> - use union to collapse in_addr + in6_addr
> - doesn't allocate buffer and directly use s->relay->domain->name
> 
> Thanks.
> -- 
> Sebastien Marie
> 
> 
> diff 73b535ef4537e8454483912fc3420bc304759e96 /home/semarie/repos/openbsd/src
> blob - d384692a0e43de47d645142a6b99e72b7d83b687
> file + usr.sbin/smtpd/mta_session.c
> --- usr.sbin/smtpd/mta_session.c
> +++ usr.sbin/smtpd/mta_session.c
> @@ -26,6 +26,7 @@
>  #include 
>  #include 
>  
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1604,6 +1605,10 @@ mta_cert_init_cb(void *arg, int status, const char *na
>   struct mta_session *s = arg;
>   void *ssl;
>   char *xname = NULL, *xcert = NULL;
> + union {
> + struct in_addr in4;
> + struct in6_addr in6;
> + } addrbuf;
>  
>   if (s->flags & MTA_WAIT)
>   mta_tree_pop(_tls_init, s->id);
> @@ -1623,6 +1628,22 @@ mta_cert_init_cb(void *arg, int status, const char *na
>   free(xcert);
>   if (ssl == NULL)
>   fatal("mta: ssl_mta_init");
> +
> + /*
> +  * RFC4366 (SNI): Literal IPv4 and IPv6 addresses are not
> +  * permitted in "HostName".
> +  */
> + if (s->relay->domain->as_host == 1) {
> + if (inet_pton(AF_INET, s->relay->domain->name, ) != 1 &&
> + inet_pton(AF_INET6, s->relay->domain->name, ) != 1) 
> {
> + log_debug("%016"PRIx64" mta tls setting SNI name=%s",
> + s->id, s->relay->domain->name);
> + if (SSL_set_tlsext_host_name(ssl, 
> s->relay->domain->name) == 0)
> + log_warnx("%016"PRIx64" mta tls setting SNI 
> failed",
> +s->id);
> + }
> + }
> +
>   io_start_tls(s->io, ssl);
>  }
>  
> 



Re: userland clock_gettime proof of concept

2020-05-31 Thread Theo de Raadt
Paul Irofti  wrote:

> On 2020-05-31 19:17, Theo de Raadt wrote:
> > Paul Irofti  wrote:
> >
> >> Yeah, I just followed the dlfcn/dlfcn_stubs.c example from libc. Which
> >> I see now it is commented out...
> >>
> > --- lib/libc/dlfcn/init.c.beforeSat May 30 23:26:35 2020
> > +++ lib/libc/dlfcn/init.c   Sat May 30 18:00:45 2020
> > @@ -70,7 +70,7 @@
> >  /* provide definitions for these */
> >const dl_cb *_dl_cb __relro = NULL;
> > -#if defined(__amd64)
> > +#if defined(__amd64__) || defined(__powerpc__)
> >uint64_t (*const tc_get_timecount)(void) = tc_get_timecount_md;
> >#else
> >uint64_t (*const tc_get_timecount)(void) = NULL;
> >>>
> >>> 1) I think adding _md to the name is superflous.  There will never
> >>>  be a MI version, so tc_get_timecount() is enough.
> >>
> >> What about pvclock(4)?
> >
> > What about it?  Is it MI?
> 
> It is used by two architectures. There is also glxpcib. Of course we
> can have a copy of each in arch/*/usertc.c

You plan to flip between supporting functions on the fly?

How do you know to flip?

It sounds insane and designing for a problem which doesn't exist.

Obviously on an architecture there must be *one function* that does
the job, using whatever it finds available.



Re: sparc64 boot issue on qemu

2020-05-31 Thread Klemens Nanni
On Sun, May 31, 2020 at 03:22:45PM +0200, Otto Moerbeek wrote:
> On Sun, May 31, 2020 at 09:49:34AM +0100, Mark Cave-Ayland wrote:
> > Thanks for the test case which enables me to reproduce the issue. With 
> > ?fcode-verbose
> > enabled you see this at the end of the FCode execution:
FWIW, on latest OpenBoot on machines such as the T4-2 there exists no
other variable but fcode-debug?.

> > Now that 0x8a is completely wrong since according to
> > https://github.com/openbsd/src/blob/master/sys/arch/sparc64/stand/bootblk/bootblk.fth
> > the last instruction should be exit which is 0x33.
> > 
> > Since the FCode itself is located at load-base (0x4000) it looks to me from 
> > the above
> > debug that you're loading ofwboot at the same address, overwriting the 
> > FCode. Once
> > do-boot has finished executing, the FCode interpreter returns to execute 
> > the exit
> > word which has now been overwritten: so instead of returning to the updated 
> > client
> > context via exit to execute ofwboot, it executes expect which asks for 
> > input from the
> > keyboard and then crashes because the stack is incorrect.
> > 
> > My recommendation would be to load ofwboot at 0x6000 instead of 0x4000 
> > which I
> > believe will fix the issue. It's interesting you mention that this works on 
> > real
> > hardware, since it doesn't agree with my reading of the IEEE-1275 
> > specification so
> > you're certainly relying on some undocumented behaviour here.
Neither Forth nor boot blocks are my area of good expertise, but your
analysis reads fine to me.

> Thanks, the following works indeed. 
I threw the diff (with minor crank) onto a guest domain on my T4-2 box
that I'm currently using for CURRENT ofwboot testing and it continues to
work just fine:

{0} ok setenv fcode-debug? true 
fcode-debug? =  true
{0} ok boot -V  
NOTICE: Entering OpenBoot.
NOTICE: Fetching Guest MD from HV.
NOTICE: Starting additional cpus.
NOTICE: Initializing LDC services.
NOTICE: Probing PCI devices.
NOTICE: Finished PCI probing.

SPARC T4-2, No Keyboard
Copyright (c) 1998, 2018, Oracle and/or its affiliates. All rights 
reserved.
OpenBoot 4.38.16, 8. GB memory available, Serial #xxx.
Ethernet address xxx, Host ID: xxx.



Boot device: /virtual-devices@100/channel-devices@200/disk@0  File and 
args: -V
OpenBSD IEEE 1275 Bootblock 2.1
Booting from device /virtual-devices@100/channel-devices@200/disk@0
Try superblock read
FFS v2
ufs-open complete
.Looking for ofwboot in directory...
.
..
home
tmp
usr
var
bsd
sys
bsd.rd
altroot
bin
dev
etc
mnt
root
sbin
.cshrc
.profile
ofwboot
Found it
.Loading 1a398  bytes of file...
Copying 4000 bytes to 6000 
Copying 4000 bytes to a000 
Copying 4000 bytes to e000 
Copying 4000 bytes to 12000 
Copying 4000 bytes to 16000 
Copying 4000 bytes to 1a000 
Copying 2800 bytes to 1e000 
>> OpenBSD BOOT 1.21



Re: snmp(1) initial UTF-8 support

2020-05-31 Thread Ingo Schwarze
Hi Martijn,

Martijn van Duren wrote on Tue, May 19, 2020 at 10:25:37PM +0200:

> So according to RFC2579 an octetstring can contain UTF-8 characters if  
> so described in the DISPLAY-HINT. One of the main consumers of this is
> SnmpAdminString, which is used quite a lot.
> 
> Now that we trimmed a little fat from snmp's oid, I wanted to fill it
> up again and implemented the bare minimum DISPLAY-HINT parsing that is
> required for SnmpAdminString. Other parts of the syntax can be atter
> at a later state if desirable.
> 
> Now I decided to implement this a little different from net-snmp,
> because they throw incomplete sequences directly to the terminal,
> which I recon is not a sane approach.

No, definitely not.  Never throw invalid or incomplete UTF-8 at the
terminal unless the user unambiguously requests just that (like
with `printf "a\x80z\n"` or `cat /usr/bin/cc` or something like that).

> I choose to take the approach taken by the likes of wall(1) and replace
> invalid and incomplete sequences with a '?'.

If the specification says that the user is only allowed to put valid
UTF-8 into octetstrings (and not arbitrary bytes), replacing invalid
bytes at output time is both acceptable and feels like the only
sensible option, unless something like vis(3) is considered more
appropriate.

However, the reason why wall(1) uses '?' as the replacement character
is because wall(1) must not assume that the output device can handle
UTF-8 and has to work safely if all the output device can handle is
ASCII.  So, wall(1) has to live with the inconvenience that when you
see a question mark, you don't know whether it really is a question
mark or whether it originally used to be garbage input.

If i understand correctly, here we are talking about output that is
UTF-8 anyway.  So instead of (ab)using the question mark, you might
consider using the U+FFFD REPLACEMENT CHARACTER.  The intended
purpose of that one is to stand in for invalid and inconplete
byte sequences, and also for characters that cannot be displayed.

> This because DISPLAY-HINT already states that too long sequences
> strings should be cut short at the final multibyte character fitting
> inside the specified length, meaning that output can already get
> mangled.
> 
> This also brings me to the question how we should handle DISPLAY-HINT
> in the case of -Oa and -Ox. Net-snmp prioritizes DISPLAY-HINT over
> these, but it has the option to disable this by disabling MIB-loading
> via -m''; This implementation doesn't give us the that option (yet?).
> The current diff follows net-snmp, but there is something to say to
> prioritize -Ox over DISPLAY-HINT, so an admin can use it to debug
> snmp-output without it being mangled. Any feedback here is welcome.
> 
> Once it's clear if this is the right approach I'll do a thorough search
> through mib.h on which objects actually can use this new definition.

I can't really comment on what snmp(1) should do, or which settings
should override which other settings.

Do i understand correctly that you want to make -Oa print invalid
UTF-8 to the terminal?  If so, that would sound reasonable to me,
for the following reason.  Printing non-printable ASCII to the
terminal is often dangerous, it can screw up or reconfigure the
terminal, so -Oa is already an option that must be used with care,
just like `cat /usr/bin/cc`.  While printing invalid UTF-8 is not
a good idea in general, it is less dangerous than printing non-printable
ASCII, so just passing the bytes through for -Oa doesn't make
anything more dangerous than it already is.

Maybe -Oa should have a warning in the manual page about the dangers
of using it on untrusted data?  Not sure.

Oh, by the way, when -Oa is not specified and there is UTF-8
content, don't forget to still filter out non-printable bytes:
both non-printable ASCII bytes and non-printable UTF-8 characters.
Right?

> Index: smi.c
> ===
> RCS file: /cvs/src/usr.bin/snmp/smi.c,v
[...]
> @@ -598,6 +638,60 @@ smi_foreach(struct oid *oid)
>  }
>  
>  int
> +isu8cont(unsigned char c)
> +{
> + return (c & (0x80 | 0x40)) == 0x80;
> +}
> +
> +char *
> +smi_displayhint_os(struct textconv *tc, const char *buf, size_t buflen)
> +{
> + size_t octetlength, i, j, start;
> + char *displayformat;
> + char *rbuf;
> + mbstate_t mbs;
> +
> + errno = 0;
> + octetlength = (size_t) strtol(tc->tc_display_hint, , 10);
> + if (octetlength == 0 && errno != 0) {
> + errno = EINVAL;
> + return NULL;
> + }
> + if (displayformat[0] == 't') {
> + rbuf = malloc(octetlength + 1);

NULL pointer access ahead on ENOMEM.

> + if (strcmp(nl_langinfo(CODESET), "UTF-8") == 0) {

On OpenBSD, that is not the normal way of checking whether
the locale is UTF-8.  Also, it is not portable because the CODESET
values are not standardized.

We usually do

#include 

if (MB_CUR_MAX > 

Re: userland clock_gettime proof of concept

2020-05-31 Thread Theo de Raadt
Mark Kettenis  wrote:

> > I changed __amd64 to __amd64__ because I didn't find __powerpc.  I'm
> > not sure, but one might move the list of arches to dlfcn/Makefile.inc
> > and do -DTIMEKEEP, like how thread/Makefile.inc does -DFUTEX.  One
> > might drop the tc_get_timecount function pointer and just always call
> > the function #ifdef TIMEKEEP.
> 
> Yes, we prefer the __xxx__ variants in OpenBSD code; thanks for
> catching that.  The benefit of the TIMEKEEP define would be that we
> can eliminate the fallback code completely on architectures that don't
> implement this functionality.

...

> > --- lib/libc/dlfcn/init.c.beforeSat May 30 23:26:35 2020
> > +++ lib/libc/dlfcn/init.c   Sat May 30 18:00:45 2020
> > @@ -70,7 +70,7 @@
> >  
> >  /* provide definitions for these */
> >  const dl_cb *_dl_cb __relro = NULL;
> > -#if defined(__amd64)
> > +#if defined(__amd64__) || defined(__powerpc__)
> >  uint64_t (*const tc_get_timecount)(void) = tc_get_timecount_md;
> >  #else
> >  uint64_t (*const tc_get_timecount)(void) = NULL;

1) I think adding _md to the name is superflous.  There will never
   be a MI version, so tc_get_timecount() is enough.

2) I hope we can get away from #ifdef __ arch__.
   Maybe this can be split into architectures which
  a) have a function called tc_get_timecount()
   or
  b) tc_get_timecount is #define'd to NULL, though I don't
 know which MD include file to do that in



Re: userland clock_gettime proof of concept

2020-05-31 Thread Paul Irofti

On 2020-05-31 18:25, Theo de Raadt wrote:

Mark Kettenis  wrote:


I changed __amd64 to __amd64__ because I didn't find __powerpc.  I'm
not sure, but one might move the list of arches to dlfcn/Makefile.inc
and do -DTIMEKEEP, like how thread/Makefile.inc does -DFUTEX.  One
might drop the tc_get_timecount function pointer and just always call
the function #ifdef TIMEKEEP.


Yes, we prefer the __xxx__ variants in OpenBSD code; thanks for
catching that.  The benefit of the TIMEKEEP define would be that we
can eliminate the fallback code completely on architectures that don't
implement this functionality.


...


Yeah, I just followed the dlfcn/dlfcn_stubs.c example from libc. Which I 
see now it is commented out...



--- lib/libc/dlfcn/init.c.beforeSat May 30 23:26:35 2020
+++ lib/libc/dlfcn/init.c   Sat May 30 18:00:45 2020
@@ -70,7 +70,7 @@
  
  /* provide definitions for these */

  const dl_cb *_dl_cb __relro = NULL;
-#if defined(__amd64)
+#if defined(__amd64__) || defined(__powerpc__)
  uint64_t (*const tc_get_timecount)(void) = tc_get_timecount_md;
  #else
  uint64_t (*const tc_get_timecount)(void) = NULL;


1) I think adding _md to the name is superflous.  There will never
be a MI version, so tc_get_timecount() is enough.


What about pvclock(4)?


2) I hope we can get away from #ifdef __ arch__.
Maybe this can be split into architectures which
   a) have a function called tc_get_timecount()
or
   b) tc_get_timecount is #define'd to NULL, though I don't
  know which MD include file to do that in


If we go with something like this or with something like -DTIMEKEEP, how 
do we handle the different PROTO_WRAP vs. PROTO_NORMAL declarations? 
Split them in MD headers? But then we end up in the same place. Sort of.




Re: sparc64 boot issue on qemu

2020-05-31 Thread Mark Cave-Ayland
On 31/05/2020 15:58, Theo de Raadt wrote:

>> AFAICT the problem here is the Forth being used at
>> https://github.com/openbsd/src/blob/master/sys/arch/sparc64/dev/fb.c#L511: 
>> since the
>> addr word isn't part of the IEEE-1275 specification, it is currently 
>> unimplemented in
>> OpenBIOS.
>>
>> Why is addr needed here? Does the fb.c driver try and change these values 
>> rather than
>> just read them?
> 
> Why does that matter?
> 
> sparc64 isn't a IEEE-1275 openfirmware.
> 
> It is a Sun openfirmware, meaning it is more than the vague
> specification.  An emulation must be able to emulate THE REAL HARDWARE.
> 
> This should work.

Well there are plenty of SUN-ims already included in OpenBIOS to enable Solaris 
to
boot as far it does; I'm not against them, I was just commenting that this was 
the
reason why it is currently unimplemented.

> For another 64-bit cell_t usage see dev/prtc.c.
> 
> For another "addr" usage, see romgetcursoraddr()

A simple addr implementation for Forth values should be fairly easy to put 
together.
Since I don't have access to any Sun hardware, can someone confirm the 
semantics of
the addr word for me? In particular what does it return for:

- Values (presumably this is a pointer to a 64-bit value?)
- Defers (does it return a pointer to the deferred word?)
- Words  (is it the same as the ' word?)


ATB,

Mark.



Re: userland clock_gettime proof of concept

2020-05-31 Thread Theo de Raadt
Paul Irofti  wrote:

> Yeah, I just followed the dlfcn/dlfcn_stubs.c example from libc. Which
> I see now it is commented out...
> 
> >>> --- lib/libc/dlfcn/init.c.before  Sat May 30 23:26:35 2020
> >>> +++ lib/libc/dlfcn/init.c Sat May 30 18:00:45 2020
> >>> @@ -70,7 +70,7 @@
> >>> /* provide definitions for these */
> >>>   const dl_cb *_dl_cb __relro = NULL;
> >>> -#if defined(__amd64)
> >>> +#if defined(__amd64__) || defined(__powerpc__)
> >>>   uint64_t (*const tc_get_timecount)(void) = tc_get_timecount_md;
> >>>   #else
> >>>   uint64_t (*const tc_get_timecount)(void) = NULL;
> >
> > 1) I think adding _md to the name is superflous.  There will never
> > be a MI version, so tc_get_timecount() is enough.
> 
> What about pvclock(4)?

What about it?  Is it MI?

> > 2) I hope we can get away from #ifdef __ arch__.
> > Maybe this can be split into architectures which
> >a) have a function called tc_get_timecount()
> > or
> >b) tc_get_timecount is #define'd to NULL, though I don't
> >   know which MD include file to do that in
> 
> If we go with something like this or with something like -DTIMEKEEP,
> how do we handle the different PROTO_WRAP vs. PROTO_NORMAL
> declarations? Split them in MD headers? But then we end up in the same
> place. Sort of.

Sorry you lost me here.  But go ahead, continue with your plan which will
result in 6-line chunk which will look something like this:

#if defined || defined || defined || defined || defined || defined ||
 defined || defined || defined || defined || defined || defined ||
 defined || defined




Re: userland clock_gettime proof of concept

2020-05-31 Thread Paul Irofti

On 2020-05-31 19:17, Theo de Raadt wrote:

Paul Irofti  wrote:


Yeah, I just followed the dlfcn/dlfcn_stubs.c example from libc. Which
I see now it is commented out...


--- lib/libc/dlfcn/init.c.beforeSat May 30 23:26:35 2020
+++ lib/libc/dlfcn/init.c   Sat May 30 18:00:45 2020
@@ -70,7 +70,7 @@
 /* provide definitions for these */
   const dl_cb *_dl_cb __relro = NULL;
-#if defined(__amd64)
+#if defined(__amd64__) || defined(__powerpc__)
   uint64_t (*const tc_get_timecount)(void) = tc_get_timecount_md;
   #else
   uint64_t (*const tc_get_timecount)(void) = NULL;


1) I think adding _md to the name is superflous.  There will never
 be a MI version, so tc_get_timecount() is enough.


What about pvclock(4)?


What about it?  Is it MI?


It is used by two architectures. There is also glxpcib. Of course we can 
have a copy of each in arch/*/usertc.c





2) I hope we can get away from #ifdef __ arch__.
 Maybe this can be split into architectures which
a) have a function called tc_get_timecount()
 or
b) tc_get_timecount is #define'd to NULL, though I don't
   know which MD include file to do that in


If we go with something like this or with something like -DTIMEKEEP,
how do we handle the different PROTO_WRAP vs. PROTO_NORMAL
declarations? Split them in MD headers? But then we end up in the same
place. Sort of.


Sorry you lost me here.  


I was talking about kettenis@'s idea that this will help us lose the 
wrapper on architectures that do not support user clock_gettime.


In lib/libc/hidden/time.h we have PROTO_WRAP(clock_gettime); instead of 
PROTO_NORMAL(clock_gettime) and all the libc calls to clock_gettime() 
are now WRAP(clock_gettime).


So we will have to keep the wrapper even for architectures that just 
fallback to the system call.


The only way not to do that is to add a different macro? Or to create MD 
files for wrap and non-wrap architectures.


> [aggressive defined-defined comment]

I am more than happy to lose the ifdef maze as you know I always have 
been in such situations.


The reason I put that there is because I could not figure out another 
way to do that nicely.


My diff has already split this into arch's that have tc_get_timecount() 
and those that do not; your point a).


We need to do point b): identify a proper include file.

Paul



Re: symmetric toeplitz hashing

2020-05-31 Thread Theo Buehler
> At the end of this loop, key[b] contains two copies of the cyclically
> permuted skey next to each other. When building the cache, you scan
> through the bits of val, xor the corresponding keys in if they're set
> and then throw away half of the 32 bits when assigning
> scache->bytes[val] = res;
> 
> So I think you can use "uint16_t keys[NBBY];" and "uint16_t res = 0;",
> replace j < 32 by j < 16 and 31 - j by 15 - j and you'll get the exact
> same result.

In other words, the first nested loop can be simplified to this:

for (b = 0; b < NBBY; ++b)
key[b] = skey << b | skey >> (NBSK - b);

and instead of populating the the key[] array up front, you could do:

void
stoeplitz_cache_init(struct stoeplitz_cache *scache, stoeplitz_key skey)
{
unsigned int b, shift, val;

/*
 * Cache the results of all possible bit combinations of
 * one byte.
 */
for (val = 0; val < 256; ++val) {
uint16_t res = 0;

for (b = 0; b < NBBY; ++b) {
shift = NBBY - b - 1;
if (val & (1 << shift))
res ^= skey << b | skey >> (NBSK - b);
}
scache->bytes[val] = res;
}
}